Linux Fixlet taking 24 hours to run

My Linux Fixlets are taking approximately 24 hours to run, and I can’t seem to figure out why.

The actual fixlet itself seems to run quickly, but it is queued for about 24 hours before it runs

This is not an issue with Windows fixlets - they run pretty much right away

I fear it is a simple problem/fix, but I’m not seeing any difference between the Fixlets.

Anyone have any thoughts on this?

My first bet would be that the Linux-based endpoints have a software firewall on them that is preventing the BigFix Agent from receiving UDP notifications on the BigFix port. This can be verified via the Client logs by looking for entries such as the following (indicating that UDP notifications are being received):

  • GatherHashMV command received
  • GatherActionMV command received
  • ForceRefresh command received

Please see https://developer.ibm.com/answers/questions/275256/how-does-the-bigfix-server-notify-the-relays-and-c/ for more information.
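A quick way to check for those entries is to grep the client log. This is only a sketch: it assumes the default Linux client log directory (/var/opt/BESClient/__BESData/__Global/Logs) with one log per day named YYYYMMDD.log, and the function name count_notifications is just something I made up for illustration.

```shell
# ASSUMPTION: default Linux client log directory; adjust if your install differs.
LOG_DIR="${LOG_DIR:-/var/opt/BESClient/__BESData/__Global/Logs}"

# Print how many UDP-triggered notification entries a client log contains.
# $1 = path to a client log file
count_notifications() {
    grep -c -E 'GatherHashMV command received|GatherActionMV command received|ForceRefresh command received' "$1"
}

# Usage (today's log):
#   count_notifications "$LOG_DIR/$(date +%Y%m%d).log"
```

A count of 0 over a day in which you issued actions would suggest the notifications are not arriving.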

If you find that UDP notifications are in fact being blocked, consider re-configuring the firewall to allow the traffic (there are Fixlets in BES Support that can assist with this), or alternatively if you’re running v9.5.11+, you can enable Persistent Connections.


Thanks for the info, I was able to find the GatherHashMV in the log file, so it appears the UDP notifications are reaching the client.

I did notice that the clock on the client was off by about 3 hours, so I had that corrected. I then restarted the BES service, after which the actions I started this morning completed (these normally take 24ish hours to complete).

However, when I tried to start another action, it became ‘stuck’ and has currently been going for an hour, although it should be a simple service restart (the Fixlet doesn’t appear to matter, I just wanted it for your info).

Which Linux distribution and release?

Some time ago, my colleague had exactly the same situation on a Linux computer. The cause was that the default Linux firewall was blocking the UDP messages that signal new commands, and the command polling interval had been set to 24 hours at installation.

When you restart the client, it checks whether there are any commands to process, so any pending actions for that computer will run after the restart.
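To see how command polling is configured on the client, you can look for the _BESClient_Comm_CommandPollEnable and _BESClient_Comm_CommandPollIntervalSeconds settings. The sketch below assumes the Linux client keeps its settings in /var/opt/BESClient/besclient.config as a bracketed section header followed by a "value = ..." line; the helper name show_command_poll_settings is hypothetical.

```shell
# ASSUMPTION: default path of the Linux client configuration file.
BES_CONFIG="${BES_CONFIG:-/var/opt/BESClient/besclient.config}"

# Print each command-polling setting header plus the line that follows it.
# $1 = path to besclient.config
show_command_poll_settings() {
    grep -A1 '_BESClient_Comm_CommandPoll' "$1"
}

# Usage (as root):
#   show_command_poll_settings "$BES_CONFIG"
```

If nothing is printed, the settings are not set locally and the client is using its built-in defaults.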


Red Hat Enterprise Linux Server 7.6 (Maipo)

I walked through @Aram’s suggestions, and as far as I can tell, the traffic is getting through without issue. It does seem lately (within the last week or so) that some of the actions have been going faster: ~3.5 hours instead of 24 hours.

I have noticed that custom Fixlets seem to run faster than the built-in Fixlets, so maybe it’s something there.

Glad to hear it’s running more quickly, but ~3.5 hours is still much longer than expected for a given endpoint to react to an action. I would recommend having a look at the Client’s logs to:

  • ensure the Client is notified of the action (which you seem to have confirmed), and is able to gather it in a timely manner (perhaps there are delays in gathering the action?)
  • see what other activity may be occurring that may be causing delays

Is the ~3.5 hour timeframe based on the Client’s log, or based on reports in the BigFix Console, WebUI, or Web Reports?

On RHEL 7.6, you’d use firewall-cmd to view and manage the firewall rules. The same is true for CentOS 7; refer to the commands below.

[root@cent-2 ~]# firewall-cmd --state
running

[root@cent-2 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens160
  sources:
  services: dhcpv6-client ssh
  ports: 52311/udp 52311/tcp 22/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

These indicate the firewall is Active, but is allowing 52311/udp and 52311/tcp (along with 22/tcp for my SSH connections)

If the 52311 ports are missing, you can add them via

firewall-cmd --zone=public --add-port=52311/tcp --permanent 
firewall-cmd --zone=public --add-port=52311/udp --permanent
firewall-cmd --reload
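
If you want to push this to many endpoints, a check-then-add wrapper avoids touching machines that already have the rule. This is a rough sketch rather than a tested Fixlet: ensure_port and the FIREWALL_CMD override are names I made up, and it relies on firewall-cmd --query-port exiting 0 when the port is already allowed.

```shell
# Add a port to the permanent firewalld config only if it is not already there.
# $1 = zone (e.g. public), $2 = port/protocol (e.g. 52311/udp)
# FIREWALL_CMD is a hypothetical override so the function can be dry-run with a stub.
ensure_port() {
    if "${FIREWALL_CMD:-firewall-cmd}" --zone="$1" --permanent --query-port="$2" >/dev/null 2>&1; then
        echo "$2 already present in zone $1"
    else
        "${FIREWALL_CMD:-firewall-cmd}" --zone="$1" --permanent --add-port="$2" >/dev/null &&
            echo "$2 added to zone $1"
    fi
}

# Usage (as root), followed by a reload so the permanent rules take effect:
#   ensure_port public 52311/tcp
#   ensure_port public 52311/udp
#   firewall-cmd --reload
```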

It’s less common, but if the client is already allowing inbound 52311/udp, then there could be a firewall problem somewhere in the Relay chain. Each Relay should allow inbound 52311/tcp from its parent Relay, so that the Relay itself can be notified of new actions.
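
One way to test the Relay chain is to try opening a TCP connection to 52311 on the child Relay from its parent. The sketch below assumes bash and the coreutils timeout command are available; check_tcp and the hostname in the usage line are hypothetical.

```shell
# Return 0 if a TCP connection to host:port opens within 5 seconds.
# $1 = host, $2 = port
check_tcp() {
    # bash's /dev/tcp pseudo-device opens a TCP socket; $0 and $1 inside the
    # quoted script are the host and port passed after it.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage, run on the parent Relay against the child Relay:
#   check_tcp child-relay.example.com 52311 && echo reachable || echo blocked
```

This only shows that the port is reachable at the TCP level, not that the Relay service behind it is healthy.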