Anyone have any good advice or troubleshooting tips to determine why we see a lot of pending downloads on open actions?
We have _BESGather_Download_CacheLimitMB set to 102400 on our root server but the \sha1 directory only has a small number of files in it.
Some spots to check would be whether the clients are set to do direct downloads (_BESClient_Download_Direct), or whether a high-level Relay is set to do the downloads itself instead of checking upstream to the root server (_BESGather_Download_CheckParentFlag).
Your sha1 directory may have a small number of files, but what does their total size add up to? It looks like you are configuring a cache size of 100 GB, but MS patches can be a couple of gigabytes apiece, so that can add up quickly. Also, if you’re using something like OS Deployment, an OS image can easily be ten or twenty gigabytes each, and those count toward the download cache size too.
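If it helps, here is a minimal Python sketch for totalling up a cache directory and comparing it to the configured limit. The function name `cache_usage` is made up for illustration, the path in the usage comment is a placeholder you would replace with your own sha1 directory, and it assumes the limit is in MB as the setting name suggests. It only measures what is on disk; it does not know how BigFix itself accounts for the cache.

```python
import os

def cache_usage(cache_dir, limit_mb):
    """Return (file_count, total_bytes, fraction_of_limit) for cache_dir.

    limit_mb is assumed to be the value of _BESGather_Download_CacheLimitMB.
    """
    count = 0
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it rather than abort.
                continue
            count += 1
    limit_bytes = limit_mb * 1024 * 1024
    return count, total, total / limit_bytes

# Example usage (placeholder path, substitute your own sha1 directory):
# count, total, frac = cache_usage(r"D:\path\to\sha1", 102400)
# print(f"{count} files, {total / 1024**3:.2f} GiB, {frac:.1%} of limit")
```

If the fraction comes back well under 100%, the cache limit itself is probably not what is holding the downloads up.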
In the Action Status on the Console, you should be able to see whether the download has been cached by the root server. If it has, then you can check for cache sizes on the clients and the relay chain down to the client to see where it might be getting stuck.
There’s also a case where, once the client requests a download, it expects the Relay to let the client know when that download is ready (Relays do the same when downloading from their parent relay). So blocking downstream communication (TCP/52311 from the Server or Parent Relay to Child Relays; or UDP/52311 from a Relay to its clients) can result in longer “Pending Download” times. You’d also see a symptom where clients take a long time to respond when you send new Actions.
If those notifications are getting blocked, the download retry minutes and download retry times described at https://help.hcltechsw.com/bigfix/9.5/platform/Platform/Config/r_client_set.html#r_client_set__dwld can become a lot more important. You might also look at persistent connections or command polling as workarounds if the downstream notifications are being blocked.
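For the TCP side of the blocked-notification case above, a quick reachability check can be sketched in Python. This is only an assumption-laden sketch: `tcp_port_open` is a hypothetical helper name, the host in the usage comment is a placeholder, and it should be run from the Server or parent Relay toward the child Relay so it follows the same direction as the notification. It does not test UDP/52311 from a Relay to its clients, since UDP has no connection handshake to observe.

```python
import socket

def tcp_port_open(host, port=52311, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

# Example usage (placeholder hostname):
# print(tcp_port_open("child-relay.example.com"))
```

A False result here only says the TCP connection failed from wherever you ran it; a firewall, a stopped Relay service, or a DNS problem would all look the same.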
50 items, 5.74GB
We don’t set _BESClient_Download_Direct or _BESGather_Download_CheckParentFlag so whatever they default to is what we are using.
It is slowly growing in number of files and, of course, total size.
If the files are being downloaded, could it be related to some kind of proxy issue?
For what it’s worth, we have been seeing a lot of these symptoms lately. We haven’t been able to pinpoint (or prove) the exact root cause, since it tends to happen during critical deployments where all the pressure is to resolve rather than troubleshoot. From what I’ve been able to ascertain, in our case it’s relay issues: certain relays get way too out of sync and take ages to propagate actions and process downloads. We have been fixing it by resetting the gather state on the relays that have a lot of clients showing these symptoms, and that resolves it. We have been doing this for several months now. The same relays do not seem to relapse, but a different relay will have the same problem the following month, and so on, so it does seem to be a slow build-up problem. I would like to pin it down eventually. Hope this helps you.
Thanks for that.
We wanted to use the relay dashboard in the console, but it is broken (case opened).