This week I have heard from two large BigFix users that they are having problems with the function that downloads AIX Technology Levels (TLs) to an NFS repository. An AIX TL can be larger than 12 GB and contain more than 2,000 files. In each case, this function was previously working.
I have spent several hours debugging these problems with each user, and have found they have identical symptoms. I am wondering if there has been a change to the underlying BigFix function, or a huge coincidence.
Here are the symptoms:
- BigFix Platform Server running on Linux
- One customer at 9.5.15, the other customer at 9.5.12
- One customer uses a Proxy, one does not
- The NFS repository download fails. Some of the files are downloaded successfully, and others fail with the following message: Download error: Unexpected HTTP response: 416
- On the Platform server, the directory `/var/opt/BESServer/wwwrootbes/bfmirror/downloads/ActiveDownloads` contains dozens (or in some cases, hundreds) of files. These files are different sizes, fairly large, and seem to be complete, but never get moved to the `sha1` directory.
- I first thought it may be related to file size, but the files that do get moved to the `sha1` folder and downstream relays are of various sizes, large and small. I’ve verified that some of the files stuck in the `ActiveDownloads` folder are smaller than some of the ones that have made it to `sha1`.
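
Since 416 is the HTTP "Range Not Satisfiable" status, which a server returns when a requested byte range starts beyond the end of the resource, one thing worth comparing is the size and hash of each stuck file against what the action expects. The following is a minimal sketch of how the `ActiveDownloads` folder could be inventoried. It assumes the expected SHA-1 and size values are available from the action's prefetch statements, and that the server resumes partial downloads with range requests. Neither is confirmed here, and the function names `sha1_of_file` and `inventory` are hypothetical helpers, not part of BigFix.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 of a file, reading in chunks so multi-GB files are safe."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(directory):
    """Return (name, size_in_bytes, sha1_hex) for each regular file, sorted by name."""
    rows = []
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():  # skip subdirectories
            rows.append((entry.name, entry.stat().st_size, sha1_of_file(entry)))
    return rows
```

Calling `inventory("/var/opt/BESServer/wwwrootbes/bfmirror/downloads/ActiveDownloads")` on the Platform server gives a list that can be compared by hand with the expected values. A stuck file whose size and SHA-1 already match would suggest the download itself completed and the failure happens afterwards.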
Debugging steps I’ve taken:
- Stopped the server, completely cleared out the cache and the `ActiveDownloads` folder, restarted the server, and launched the action again: same results
- Verified that `ulimit -n` returns at least 4096 for the Platform server, downlevel relays, and the target system of the TL download
- Followed instructions to clean up downstream relays (remove bfemapfile.xml, GatherState.xml, bfsites)
- Ensured plenty of cache space available (> 20GB free) in /var/opt/BESServer and /var/opt/BESRelay, and in /var/opt/BESClient on the target system of the TL download
- Updated the Client Download limits on the target system of the TL download:
  - `_BESClient_Download_NormalStageDiskLimitMB = 20480`
  - `_BESClient_Download_PreCacheStageDiskLimitMB = 20480`
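
The disk space and open-file checks above can be repeated on each host with a short script. This is a minimal sketch, assuming Python 3 is available on the server, relays, and target system. The helper names are hypothetical, the thresholds are the ones quoted in the list, and `resource.getrlimit` reports the limit of the process running the script, which may differ from the limit the BigFix service itself runs under.

```python
import resource
import shutil


def free_gb(path):
    """Free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / (1024 ** 3)


def open_file_limit():
    """Soft limit on open files for this process (what `ulimit -n` shows in a shell)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def check_host(paths, min_free_gb=20, min_open_files=4096):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for path in paths:
        try:
            available = free_gb(path)
        except FileNotFoundError:
            problems.append(f"{path}: not present on this host")
            continue
        if available < min_free_gb:
            problems.append(f"{path}: only {available:.1f} GiB free")
    limit = open_file_limit()
    # RLIM_INFINITY means "no limit", so only flag finite limits that are too low.
    if limit != resource.RLIM_INFINITY and limit < min_open_files:
        problems.append(f"open file limit is {limit}, below {min_open_files}")
    return problems
```

On the Platform server this could be called as `check_host(["/var/opt/BESServer"])`, on a relay with `/var/opt/BESRelay`, and on the target system with `/var/opt/BESClient`.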
One other data point - NFS download works fine on my own BigFix environment (10.0), but I am on a very fast IBM network. I successfully downloaded the exact same TL that was failing in the other environments.
Any suggestions for what I can try, or what might have recently changed to cause these working environments to stop working?