Adjusting Command Polling for Endpoints where UDP is not enabled

FatScottishGuy · August 15, 2022, 2:10pm

Scenario: Patching gets a limited window. All actions must start at a specific time and hard-stop within x amount of time, where x is shorter than the current command polling interval.

Issue: If the command polling interval is longer than that window, the command poll may not pick up the job in time, and the patching won’t occur.

Solution?: I know there are a few ways to make sure endpoints without UDP communicate frequently, such as peer nesting. However, I was thinking that something like this might also work if peer nesting isn’t an option. Can you give me your pros and cons on it, or perhaps suggest a better way of doing it?

//Check for UDP not being present
if {not exist last command time of client | not exists lines containing " command received" of files whose(12 = length of name of it AND (name of it ends with ".log" OR name of it ends with ".bkg") AND exists lines of it) of folders "Logs" of folders "__Global" of data folders of client}
//Set command polling to 30 mins if UDP not present
setting "_BESClient_Comm_CommandPollEnable"="1" on "{parameter "action issue date" of action}" for client
setting "_BESClient_Comm_CommandPollIntervalSeconds"="1800" on "{parameter "action issue date" of action}" for client

//Otherwise both checks above found a received command, so treat UDP as present
else
//Set command polling to 6 hours if UDP is present
setting "_BESClient_Comm_CommandPollEnable"="1" on "{parameter "action issue date" of action}" for client
setting "_BESClient_Comm_CommandPollIntervalSeconds"="21600" on "{parameter "action issue date" of action}" for client
endif

I plan to run this as a policy action on all devices, so that if UDP is enabled or detected again on a device, the action puts it back to 6 hours.

trn · August 15, 2022, 2:34pm

I’m not at all sure that PeerNest would help in this situation - it would make the downloads faster, but the client would not receive speedier notification of new content.

On the subject of new content, could the patch actions be created earlier, so the client will have had time to interrogate the server and learn of the action it is to take during the required time slot?

Persistent connection may be the way for you to go: https://help.hcltechsw.com/bigfix/9.5/platform/Platform/Config/c_persistenconn.html

FatScottishGuy · August 15, 2022, 3:06pm

I have persistent connections enabled on the relay as a policy action but nothing set on the clients by default. The settings are:

"MaxNumberOfPersistentConnections"="100"
"MaxNumberOfPersistentConnectionsPerSubnet"="3"

The problem with Persistent Connections is that I have to then pick the servers that I want to have persistently connected and modify their settings too.

That said, I could locate all the servers that have no UDP, figure out their subnet and point them to the correct relay(s) with the setting enabled for them.

This might solve some of the issues.

JasonWalker · August 15, 2022, 3:20pm

I would enable Persistent Connections on all of the clients, honestly.
One thing I think we should make clearer about the Persistent Connections configuration is that there is some intelligence built into it. During Relay Selection, the client will ask the Relay for a test UDP ping, and the client only establishes the Persistent Connection when it doesn’t receive the UDP message.

So you can enable Persistent Connections, and the clients will skip using it when it’s not needed (that is, when the UDP message is received). Once established, the Persistent-connected client forwards UDP messages from the Relay to other clients in the same subnet. Those other clients skip setting up their own Persistent Connection once they receive forwarded UDP messages from three other clients. That’s how MaxNumberOfPersistentConnectionsPerSubnet is applied: not by lists of subnet registrations, but by “how many forwarded tests did I receive”.
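
One way to picture the decision described above is the small Python sketch below. It is only a model of the behaviour as this post describes it, not the client's actual implementation. The function name and parameters are hypothetical, and the default of 3 simply mirrors the MaxNumberOfPersistentConnectionsPerSubnet value quoted earlier in the thread.

```python
def should_open_persistent_connection(
    direct_udp_received: bool,
    forwarded_tests_received: int,
    max_per_subnet: int = 3,
) -> bool:
    """Model of the client-side choice described in the post (hypothetical names)."""
    # The Relay's test UDP ping arrived, so ordinary notifications work.
    if direct_udp_received:
        return False
    # Enough neighbours on the subnet are already forwarding notifications.
    if forwarded_tests_received >= max_per_subnet:
        return False
    # No direct UDP and too few forwarders: fall back to a Persistent Connection.
    return True
```

Under this model, a client that hears nothing at all opens a connection, while a fourth UDP-blocked client on a subnet with three forwarders does not.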

When I get some free time I’d like to put in an RFE to add client inspectors for this. There seem to be several use cases for knowing the UDP connection test statuses, so I’d like to see “requesting udp status of client” as a boolean to know whether the client asked for a test, “allowed udp status of selected relay” to test whether the Relay allows it, “receiving udp from selected relay of client” to know whether it got a response from the Relay, “receiving udp from forwarders of client” to know whether the client received forwarded notifications from other clients, “number of UDP forwarders of client” to know how many, and “time of udp connectivity test” to know when the test occurred. It would also be useful to have an ActionScript command like “initiate udp notification check” to perform the test on demand.

trn · August 15, 2022, 3:22pm

Those boolean properties would be useful.

JasonWalker · August 15, 2022, 3:22pm

Not to mention adding a “time of last command received” to avoid all that log parsing. I’d have to check, but it might not be possible in the log to distinguish between “command received by udp notification” versus “I noticed a command received when I ran a Command Poll”

FatScottishGuy · August 15, 2022, 3:27pm

Does “last command time of client” not cover the last UDP time?

FatScottishGuy · August 15, 2022, 3:29pm

This! This changes everything for persistent connections

JasonWalker · August 15, 2022, 3:32pm

I, uh, forgot about that one, I should probably check

JasonWalker · August 15, 2022, 4:47pm

Ok, I ran a few quick tests, and it looks like ‘last command time of client’ is the most useful, bearing in mind that after the client starts up, ‘last command time of client’ will be nonexistent until it receives a new notification. (In the Fixlet Debugger, this needs to be run in Client Evaluation Mode.)

The “parse through the logs” approach is useful because it can reach back to before the current client startup, but it gives a false positive with Command Polling. When I blocked the traffic in Windows Firewall, the ForceRefresh that the client detected as the result of a Command Poll looked the same in the client log as a notified one, while ‘last command time of client’ shows that we did not really receive a notification for it:

At 11:16:37 -0500 - 
   PollForCommands: Requesting commands
   PollForCommands: commands to process: 1
At 11:16:38 -0500 - 
   ForceRefreshMV command received.  Version difference, gathering action site.
At 11:16:39 -0500 -  

q: last command time of client
A: Singular expression refers to nonexistent object.

So if you base your checks on Command Polling, I expect once you actually receive a new notification based on the Command Poll, you’ll probably turn the Command Polling back off when it should have stayed enabled. The only way I see out of that false-positive is rather ugly - you’d have to look again through the log files for a line matching

PollForCommands: commands to process: 0
and be sure that there is no number other than ‘0’ after the ‘commands to process:’ string. That should guarantee that Command Polling is not responsible for the client detecting the new command, so a UDP notification must have done it.
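
A rough sketch of that log check, written in Python rather than relevance, might look like the following. The attribution rule (a command counts as poll-driven when it follows a PollForCommands line reporting a non-zero count) is an assumption drawn from the single log excerpt above, and the function names are made up for illustration.

```python
import re

# Matches the poll summary line seen in the client log excerpt.
POLL_RE = re.compile(r"PollForCommands: commands to process: (\d+)")

SAMPLE = """\
At 11:16:37 -0500 -
   PollForCommands: Requesting commands
   PollForCommands: commands to process: 1
At 11:16:38 -0500 -
   ForceRefreshMV command received.  Version difference, gathering action site.
At 11:16:39 -0500 -
"""


def classify_commands(log_lines):
    """Return (line, source) pairs; source is 'poll' or 'notification'."""
    pending_from_poll = 0  # commands the most recent poll said it would process
    results = []
    for raw in log_lines:
        line = raw.strip()
        match = POLL_RE.search(line)
        if match:
            pending_from_poll = int(match.group(1))
            continue
        if " command received" in line:
            if pending_from_poll > 0:
                # Assumed to be one of the commands the poll fetched.
                pending_from_poll -= 1
                results.append((line, "poll"))
            else:
                results.append((line, "notification"))
    return results


def saw_udp_notification(log_lines):
    """True if at least one command cannot be explained by a Command Poll."""
    return any(source == "notification" for _, source in classify_commands(log_lines))


print(saw_udp_notification(SAMPLE.splitlines()))  # prints False
```

For the excerpt above this reports no UDP notification, which matches the nonexistent ‘last command time of client’ in the same test.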

With the firewall open and the client receiving notifications, the results on ‘last command time of client’ are easier to deal with:

q: last command time of client
a: Mon, 15 Aug 2022 09:19:38 -0500

I’d note also that there are (at least) two different types of notification: a Command Notification (like a new action or “Force Refresh”), and a DownloadPing Notification (where the client has previously requested a download from its Relay, and the Relay is now notifying the client that the download is available and ready for the client). Both of these update ‘last command time of client’.

If I was going to build something along these lines, to change the client behavior for Command Polling, I’d probably use both the ‘last command time of client’ and ‘last relay select time’, so maybe wait an hour after connecting to the Relay before enabling Command Polling if no commands have been received, and turn Command Polling off any time ‘last command time of client’ is newer than ‘last relay select time’.

q: last relay select time
A: Mon, 15 Aug 2022 09:28:35 -0500
T: 0.030 ms

q: last command time of client
A: Mon, 15 Aug 2022 11:34:42 -0500
T: 0.038 ms
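
Put as logic, the idea above could be sketched like this in Python. The function name, the three-way return value, and the one-hour grace period as a parameter are illustrative assumptions. The two timestamps stand in for the results of ‘last relay select time’ and ‘last command time of client’, with None meaning the inspector returned nothing.

```python
from datetime import datetime, timedelta
from typing import Optional


def command_polling_decision(
    last_relay_select: datetime,
    last_command: Optional[datetime],
    now: datetime,
    grace: timedelta = timedelta(hours=1),
) -> Optional[bool]:
    """Return True to enable polling, False to disable it, None to leave it alone."""
    # A notification newer than relay selection means UDP is getting through.
    if last_command is not None and last_command > last_relay_select:
        return False
    # No notification since relay selection and the grace period has passed.
    if now - last_relay_select >= grace:
        return True
    # Still inside the grace period: make no change yet.
    return None


# Timestamps taken from the QnA output above (-0500 offset dropped for brevity).
relay_select = datetime(2022, 8, 15, 9, 28, 35)
last_command = datetime(2022, 8, 15, 11, 34, 42)
print(command_polling_decision(relay_select, last_command, datetime(2022, 8, 15, 12, 0, 0)))  # prints False
```

With the values shown, the command time is newer than the relay select time, so the sketch says Command Polling can be switched off.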

But what I usually recommend is Command Polling everywhere, set to two to three hours, and Persistent Connections everywhere.

FatScottishGuy · August 15, 2022, 8:51pm

Does the “ForceRefreshMV command received” notification come only when a refresh is issued from the console?

Are GatherHashMV and GatherActionMV typical of UDP traffic only?