We are managing an environment where high-level BES Relays are intermittently pausing uploads because the UploadManagerData\BufferDir\sha1\ directory fills up. This does not affect low-level relays.
Actions Taken So Far: To address this, we increased the capacity of the relays and set an automatic cleanup window:
- _BESRelay_UploadManager_BufferDirectoryMaxSize: Increased to 1GB (to prevent the 20MB default cap).
- _BESRelay_UploadManager_BufferDirectoryMaxCount: Increased to 100,000.
- _BESRelay_UploadManager_CleanupHours: Set to 72 hours.
The Current Issue: While the frequency of the issue has decreased, some high-level relays appear to be ignoring the CleanupHours setting. We are still finding .sha1 files that are over a month old sitting in the buffer, which eventually blocks the processing of new files until they are manually deleted and the service is restarted.
Questions:
- Are there known scenarios where the UploadManager ignores the CleanupHours setting?
- Is there a specific "lock" file or process that might prevent the relay service from purging these old files?
- Does the version of BigFix Inventory impact how these files are flagged for cleanup?
- Besides _BESRelay_Log_Verbose, are there specific log entries we should look for to troubleshoot the "Cleanup" cycle?
Any insights or similar experiences would be greatly appreciated!
The short answer is that the behavior is probably caused by a known problem tracked in Defect Article KB0121993 … in any case, I suggest contacting the support team for further investigation.
The long answer is that, when an upload is performed, each file is split into chunks … when all chunks have been received, the complete file is moved to the sha1 folder to be uploaded further (or expanded on the root server) as soon as possible …
But because relays can change their parent relay from time to time, some chunks can be left orphaned while the file is uploaded through another relay … the CleanupHours setting exists to clean up these orphaned chunks only.
The CleanupHours setting does not clean up anything in the sha1 folder, and no purging activity exists for this folder simply because none is needed: files, once received, need to be uploaded (or expanded on the server) … if a file remains in the sha1 folder, it means there is a problem that needs to be fixed, that's all.
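Since files left in the sha1 folder point to a problem rather than to a missed cleanup, it can help to list them by age before deleting anything. Below is a minimal Python sketch for that, not a BigFix tool: the function name `find_stale_files` is hypothetical, the buffer path has to be supplied by you, and it only reads file modification times.

```python
import time
from pathlib import Path


def find_stale_files(buffer_dir, max_age_hours, now=None):
    """Return (path, age_in_hours) for files older than max_age_hours, oldest first.

    buffer_dir is whatever directory you want to inspect (for example the
    relay's sha1 buffer folder); nothing is deleted or modified.
    """
    now = time.time() if now is None else now
    stale = []
    for path in Path(buffer_dir).rglob("*"):
        try:
            if not path.is_file():
                continue
            mtime = path.stat().st_mtime
        except OSError:
            # The relay may move or remove a file while we are scanning.
            continue
        age_hours = (now - mtime) / 3600
        if age_hours > max_age_hours:
            stale.append((path, age_hours))
    # Oldest files first, since those are the most likely to be stuck.
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

Calling `find_stale_files(r"<path to>\UploadManagerData\BufferDir\sha1", 72)` on an affected relay would give the list of files to correlate with the relay and parent relay logs.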
The main reasons a file can remain in the sha1 folder (at least those I have in mind at the moment) are:
- The upload manager is not properly configured: maxsize/maxcount values too low on the parent relay, not enough free disk space on the parent relay machine or server, …
  In general, the log files on the relay and parent relay provide enough information to analyze this type of issue … in some cases verbose tracing needs to be enabled.
- The AV exclusion is not properly in place …
  In this case some errors may be visible in the log files, but the better way to investigate is to use the procmon utility and verify that the exclusion is correctly in place.
- The presence of bugs, such as the known one, KB0121993, which is probably causing the issue in this case.
  In this case I suggest contacting the BigFix support team to investigate what is going on.
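For the first reason above, a quick way to see how close a buffer directory is to its limits is to count the files, sum their sizes, and check the free disk space. The sketch below does only that; the function names are hypothetical, the limits are whatever you have configured (1GB and 100,000 in the question), and it is an assumption that the relay measures the buffer in a comparable way.

```python
import shutil
from pathlib import Path


def buffer_usage(buffer_dir):
    """Summarize file count, total size and free disk space for buffer_dir."""
    file_count = 0
    total_bytes = 0
    for path in Path(buffer_dir).rglob("*"):
        try:
            if path.is_file():
                total_bytes += path.stat().st_size
                file_count += 1
        except OSError:
            # Files can disappear while the relay is processing them.
            continue
    free_bytes = shutil.disk_usage(buffer_dir).free
    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "free_bytes": free_bytes,
    }


def limit_warnings(usage, max_size_bytes, max_count):
    """Return human-readable warnings when usage reaches the given limits."""
    warnings = []
    if usage["total_bytes"] >= max_size_bytes:
        warnings.append(
            f"size {usage['total_bytes']} bytes reached limit {max_size_bytes}"
        )
    if usage["file_count"] >= max_count:
        warnings.append(
            f"file count {usage['file_count']} reached limit {max_count}"
        )
    return warnings
```

Running this on both the relay and its parent relay would show whether the parent is the one hitting its limits, which the relay-side numbers alone would not reveal.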
Just as reference: