Besclient persistent connection

We have a bunch of servers in our environment, especially OpenShift Linux cluster nodes, that don’t receive the UDP notification (the query from the BigFix server). I tailed the besclient logs on an OpenShift Linux server while running a query, and the client never logged receiving it, so I suspect the server is blocking UDP. The OpenShift cluster runs its own iptables. I’m not sure whether it blocks UDP, but the node doesn’t receive the query.

So I am thinking of pushing the setting “_BESRelay_PersistentConnection_Enabled = 1” only to those servers that don’t get the UDP packet. I am not sure how to target the query and action using relevance. I don’t want to blast that setting across all 10K nodes.

As I understand it, the persistent connection only actually gets established if the system does not receive UDP notifications, so it should be safe to blast away (but test first of course).

I think the relay limits it to 3 persistent connections per client subnet, though, so it may not fit your use case. It’s meant to get around NAT or site firewalls; you may still need to open your host-based firewalls to udp/52311, at least within your network. Then the clients that establish the persistent connections will forward UDP notifications to the others on their subnet.

tcp/udp 52311 is open across the network. It is only the OpenShift clusters that have NAT/iptables enabled, and those are not cooperating with UDP. We have 200-250 OpenShift clusters across dev, test, and prod. I think management might not agree to tinker with the OpenShift cluster config just for besclient.

If I use the relevance below to find which servers are not responding to the UDP ping, I don’t get a correct result, because there will be hundreds of servers where we have never executed a query.

not exists last command times whose(2*day > (now - it)) of client
That returns “False”. I think that means the “UDP ping” is not getting a response?

If I use this relevance, as per https://bigfix.me/relevance/details/3021681:
last command time of client | ( maximum of creation times of files whose (length of name of it = 12 AND creation time of it > now - 5*day AND exists lines containing " command received" of it) of folders "Logs" of folders "__Global" of data folders of client )

Then I don’t get a correct result either, because the besclient logs do contain “command received” even though the client is not responding to UDP (queries) at all.

Jason’s point was that even if you enable the setting on multiple systems, any endpoints that do successfully receive UDP notifications won’t be able to establish a persistent connection. So that is an approach you can take if you don’t have a way to target the openshift clusters generally.

You may not be able to use the last command time of client portion at the front of your relevance, since that probably includes commands retrieved from polling requests. The relevance is also returning a timestamp string instead of a boolean True/False, which is needed for fixlet applicability. If you change your check for " command received" to return whether a file exists with that text, I would expect it to work.

If you think there are other “command received” entries, then pull the client logs back so you can confirm and test relevance against them directly.
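For testing against logs pulled back from a client, a minimal Python sketch along these lines could list which daily logs contain the marker text. It assumes the logs are plain-text files with 12-character names such as 20190412.log (the same length filter the relevance above uses). The function name and the `*.log` glob are illustrative and not part of BigFix.

```python
from pathlib import Path


def logs_with_command_received(log_dir, needle=" command received"):
    """Return (file name, matching line count) for each daily log that
    contains `needle`. Only files with 12-character names are considered,
    mirroring the `length of name of it = 12` filter in the relevance."""
    hits = []
    for path in sorted(Path(log_dir).glob("*.log")):
        if len(path.name) != 12:
            continue  # skip anything that is not named like YYYYMMDD.log
        text = path.read_text(errors="replace")
        matching = [line for line in text.splitlines() if needle in line]
        if matching:
            hits.append((path.name, len(matching)))
    return hits
```

Reading the matching lines themselves (rather than just counting them) is what would show whether they come from UDP notifications or from something else, which is the open question here.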

Thanks Steve. We will be rolling out the BES client upgrade to 9.5.11 across 10K servers in our Unix environment starting next week. It may take a couple of months to roll out across the board, depending on change management approval. During the upgrade we will enable “_BESRelay_PersistentConnection_Enabled = 1”. I am hoping that after that the OpenShift nodes and some DMZ hosts will respond to queries.

We have updated the besclient to 9.5.11 and enabled the persistent connection. For some reason, some clients don’t make a TCP connection to the relay servers.

Both client1 and client2 are on the same subnet.
client1 is not making any TCP connection to the relay even though it has the persistent connection setting enabled.
$ netstat -an 5|grep 52311
udp 0 0 0.0.0.0:52311 0.0.0.0:*
udp 0 0 0.0.0.0:52311 0.0.0.0:*
udp 0 0 0.0.0.0:52311 0.0.0.0:*

client2: this client wasn’t making any TCP connection to the relay server yesterday, but all of a sudden today it started making the TCP connection and queries started responding.

$ netstat -an 5|grep 52311
tcp 0 0 1.2.3.4:55490 5.26.7.8:52311 ESTABLISHED
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:55490 5.26.7.8:52311 ESTABLISHED
udp 0 0 0.0.0.0:52311 0.0.0.0:*

I can’t pin down the reason for this odd behaviour.

One interesting thing I have noticed: client1 is not able to make an “ESTABLISHED” connection to one of the relay servers.

client1:
$ netstat -an 5 |grep -i 52311
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT <<< –
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT <<< –
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
udp 0 0 0.0.0.0:52311 0.0.0.0:*
udp 0 0 0.0.0.0:52311 0.0.0.0:*

client2:
client2, however, has the “ESTABLISHED” connection to the relay server.
$ netstat -an 5|grep 52311
tcp 0 0 8.9.10.11:55490 5.26.7.8:52311 ESTABLISHED <<< –
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 8.9.10.11:55490 5.26.7.8:52311 ESTABLISHED <<< –
udp 0 0 0.0.0.0:52311 0.0.0.0:*

What does the log of client1 say? Would it be possible to attach a few lines from the latest relay selection it did?

The TIME_WAIT sockets might refer to temporary connections that client1 has previously used for sending reports, gathering actions, etc. so they’re not necessarily referring to failed attempts to establish persistent connections.
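To compare clients without reading the raw output by eye, a small Python sketch like the one below could summarise the TCP states seen towards the relay port. It assumes the six-column `netstat -an` layout pasted above (protocol, two queue columns, local address, remote address, state) and collapses the lines repeated by the interval sampling. The function name is hypothetical.

```python
from collections import Counter

BES_PORT = 52311  # port used throughout this thread


def tcp_states_to_relay(netstat_output, port=BES_PORT):
    """Count distinct TCP connections to remote port `port`, per state."""
    seen = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # UDP listener lines have no state column, so they are skipped here.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if remote.rsplit(":", 1)[-1] != str(port):
            continue
        seen.add((local, remote, state))  # de-duplicate repeated samples
    return Counter(state for _, _, state in seen)


sample = """\
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:53402 5.6.7.9:52311 TIME_WAIT
udp 0 0 0.0.0.0:52311 0.0.0.0:*
"""
print(tcp_states_to_relay(sample))
```

An ESTABLISHED entry is only a hint: it could be the persistent connection or a short-lived connection used to send a report, so the client log remains the place to confirm it.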

Now client1 is keeping the “ESTABLISHED” connection. My colleague thought it could be a TTL on the relay side. I am not sure whether he fixed something on the relay side or not.

When I execute the query, it creates another “ESTABLISHED” connection to the relay server. Before that it wasn’t connecting to the relay server, and the query wasn’t responding at all.

client1
$ netstat -an 5|grep 52311
tcp 0 0 1.2.3.4:56428 5.6.7.9:52311 ESTABLISHED <<—
udp 0 0 0.0.0.0:52311 0.0.0.0:*
tcp 0 0 1.2.3.4:56428 5.6.7.9:52311 ESTABLISHED <<—
udp 0 0 0.0.0.0:52311 0.0.0.0:*

This is what I see in the client1 besclient logs.

Beginning Relay Select
At 11:02:14 -0400 -
RegisterOnce: Attempting secure registration with …
Unrestricted mode
Configuring listener without wake-on-lan
Registered with url ,…
Registration Server version 9.5.11.191 , Relay version 9.5.11.191
Relay does not require authentication.
Client has an AuthenticationCertificate
RegisterOnce: Client is entitled to open a persistent connection.
Relay selected: xyz.abc.com. at: 5.6.7.9:52311 on: IPV4 (Using setting IPV4ThenIPV6)
[ThreadTime:11:02:14] ShutdownListener
[ThreadTime:11:02:14] Setup Listener success: reusing existing socket.
At 11:03:16 -0400 -
[ThreadTime:11:03:14] OpenPersistentConnection: socket opened successfully.
At 11:03:40 -0400 -

Client log looks good, there are all the expected messages for a successful persistent connection.

The second connection might be temporary and refer to the sending of the report containing the answer to the query.

Be aware that even with persistent connections enabled, a client will only make the connection when necessary. This is to avoid overloading the relays with persistent connections. There are client and relay settings to tune the behavior, but by default:

  • A client will only make a persistent connection if it cannot receive UDP messages from the relay.
  • Only three clients in a subnet will make persistent connections. Those three will be used to forward UDP notifications to other clients within the subnet (essentially, tunnelling the notifications through local firewalls or NAT).
  • A relay will only accept 100 persistent connections in total.
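As a rough way to see what those defaults imply at scale, here is a back-of-the-envelope Python sketch. The defaults of 3 per subnet and 100 per relay come from the list above; treating each cluster as exactly one subnet, and every subnet as using its full allowance, are assumptions made only for illustration.

```python
import math


def persistent_connection_budget(subnets, per_subnet=3, relay_limit=100):
    """Return (connections wanted, relays needed) if every subnet opens
    its full default allowance of persistent connections."""
    wanted = subnets * per_subnet
    relays_needed = math.ceil(wanted / relay_limit)
    return wanted, relays_needed


# Hypothetical: 250 clusters, one subnet each.
print(persistent_connection_budget(250))  # (750, 8)
```

With the defaults, one relay covers at most 33 such subnets (100 // 3), so a few hundred blocked subnets would need either several relays or tuned limits.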

It is just that queries sometimes don’t get a response from OpenShift or other Unix servers, even with persistent connection enabled. Persistent connection doesn’t seem to be necessary for responding to the query.
What I am trying to achieve with persistent connection and queries is to check server status. Sometimes our vSphere cluster goes down and hundreds of VMs get recycled, and SSHing to hundreds of VMs is a painful process. I have a script that checks via BigFix whether a server has been recycled and verifies that the file systems/NAS are mounted correctly. I do that check via action/query. If the query gets no response from even 20, 30, or 40 of the servers, that defeats the purpose of the persistent connection: the team has to scramble and log in to all the hosts to verify everything is mounted.

I also do a lot of reporting via the API using queries. Sometimes servers don’t respond to the query at all, so I don’t get correct package reporting from them.

I have installed a 7-node OpenShift cluster to narrow down why the query doesn’t get a response even when persistent connection is configured in the besclient.

It turns out that iptables is not letting the UDP query in when the client is not connected via a persistent connection.

Before:
$/etc/sysconfig/iptables

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

After adding 52311/tcp and 52311/udp to the iptables config, the query responds. That said, we have 200+ OpenShift clusters. A query might not get a response even if the besclient has “persistent connection” enabled: some of the OpenShift cluster nodes will respond via a persistent connection, but some may not be able to.

$/etc/sysconfig/iptables
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -p tcp -j ACCEPT --dport 52311 <<<
-A INPUT -p udp -j ACCEPT --dport 52311 <<<
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
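The difference between the “Before” and “After” rule sets can be checked mechanically. The Python sketch below is a deliberately simplified first-match model of the INPUT chain, not a real iptables evaluator: it only understands `-p`, `--dport`, `-i lo`, `--state`/`--ctstate`, and `-j`, and it models a NEW inbound packet arriving on a non-loopback interface. The function name is hypothetical.

```python
import shlex


def first_match_target(rules_text, proto="udp", dport=52311, chain="INPUT"):
    """Return the -j target of the first rule in `chain` that would match
    a NEW inbound packet, or the chain policy if no rule matches."""
    policy = None
    for raw in rules_text.splitlines():
        line = raw.strip()
        if line.startswith(f":{chain} "):
            policy = line.split()[1]  # e.g. ":INPUT ACCEPT [0:0]"
            continue
        if not line.startswith(f"-A {chain} "):
            continue
        tokens = shlex.split(line)[2:]
        opts, i = {}, 0
        while i < len(tokens):  # collect "-flag value" pairs
            if tokens[i].startswith("-") and i + 1 < len(tokens):
                opts[tokens[i]] = tokens[i + 1]
                i += 2
            else:
                i += 1
        if "-p" in opts and opts["-p"] != proto:
            continue
        if "--dport" in opts and opts["--dport"] != str(dport):
            continue
        if opts.get("-i") == "lo":
            continue  # packet is assumed to arrive on a non-loopback interface
        states = opts.get("--state") or opts.get("--ctstate")
        if states and "NEW" not in states.split(","):
            continue  # RELATED,ESTABLISHED does not cover an unsolicited packet
        return opts.get("-j")
    return policy


BEFORE = """\
:INPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
"""
AFTER = BEFORE.replace(
    "-A INPUT -i lo -j ACCEPT",
    "-A INPUT -p udp -j ACCEPT --dport 52311\n-A INPUT -i lo -j ACCEPT",
)
print(first_match_target(BEFORE), first_match_target(AFTER))
```

Under this model the unsolicited UDP notification falls through to the final REJECT in the “Before” set. Traffic returning on a TCP connection that the client opened itself would match the RELATED,ESTABLISHED rule instead, which is consistent with a node answering queries through a persistent connection without any 52311 rule.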

“Server1”, with an iptables rule for 52311: the query responds
$iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere tcp dpt:52311
ACCEPT udp -- anywhere anywhere udp dpt:52311
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER-ISOLATION all -- anywhere anywhere
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain DOCKER (1 references)
target prot opt source destination

Chain DOCKER-ISOLATION (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere

$grep -i persis /var/opt/BESClient/besclient.config
[Software\BigFix\EnterpriseClient\Settings\Client\_BESClient_PersistentConnection_Enabled]

“Server2”, with no iptables rule for 52311: the query still responds, via the persistent connection setting
$iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER-ISOLATION all -- anywhere anywhere
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain DOCKER (1 references)
target prot opt source destination

Chain DOCKER-ISOLATION (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere

Yeah, this was what Jason described above. Since only 3 endpoints in a subnet can establish a persistent connection by default, all the other nodes of the cluster have to be able to receive UDP from one of the 3 persistent connection nodes. I’m assuming you’ve changed the iptables rules based on your findings, so then I would expect it to work.