Problems with download to NFS repository

This week two large BigFix users told me they are having problems with the function that downloads AIX Technology Levels (TLs) to an NFS repository. An AIX TL can be larger than 12GB and contain more than 2000 files. In both cases, this function was previously working.

I have spent several hours debugging these problems with each user, and have found that they have identical symptoms. I am wondering whether there has been a change to the underlying BigFix function, or whether this is a huge coincidence.

Here are the symptoms:

  • BigFix Platform Server running on Linux
  • One customer at 9.5.15, the other customer at 9.5.12
  • One customer uses a Proxy, one does not
  • The NFS repository download fails. Some of the files are downloaded successfully, and others fail with the following message: Download error: Unexpected HTTP response: 416
  • On the Platform server, the directory /var/opt/BESServer/wwwrootbes/bfmirror/downloads/ActiveDownloads contains dozens (or in some cases, hundreds) of files. These files are different sizes, fairly large, and seem to be complete, but never get moved to the sha1 directory.
  • I first thought it may be related to file size, but the files that do get moved to the sha1 folder and downstream relays are of various sizes - large and small. I’ve verified that some of the files stuck in the ActiveDownloads folder are smaller than some of the ones that have made it to sha1
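
For reference, HTTP 416 is "Range Not Satisfiable": the server returns it when a request's Range header starts at or beyond the end of the file. One possible explanation (an assumption on my part, I have not confirmed that BigFix does this) is that a resume is being attempted on a file in ActiveDownloads that is already complete. Below is a minimal Python sketch I could use to test that idea. It lists the size and SHA-1 of each stuck file so they can be compared with the size and sha1 that the action expects. The function names are my own and are not part of BigFix.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_would_get_416(local_size, remote_size):
    """True if a 'Range: bytes=<local_size>-' request would be unsatisfiable.

    A byte range is unsatisfiable when its first byte position is at or
    past the end of the resource, which is the case for a complete file.
    """
    return local_size >= remote_size


def summarize_downloads(directory):
    """Return (name, size in bytes, sha1) for every regular file in a directory."""
    return [
        (path.name, path.stat().st_size, sha1_of(path))
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    ]
```

If a stuck file's size and SHA-1 already match the expected values, the file is complete, and a byte-range resume starting at its current size would be unsatisfiable. That would be consistent with the 416, although it would not explain why the file was never moved to the sha1 directory.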

Debugging steps I’ve taken:

  • Stopped the server, completely cleared out the cache and the ActiveDownloads folder, restarted the server, and launched the action again - same results
  • Verified that ulimit -n returns at least 4096 for Platform server, downlevel relays, and target system of TL download
  • Followed instructions to clean up downstream relays (remove bfemapfile.xml, GatherState.xml, bfsites)
  • Ensured plenty of cache space available (> 20GB free) in /var/opt/BESServer, /var/opt/BESRelay
  • Ensured plenty of cache space available (> 20GB free) in /var/opt/BESClient on the target system of the TL download
  • Updated the Client Download limits on target system of TL download
    _BESClient_Download_NormalStageDiskLimitMB = 20480
    _BESClient_Download_PreCacheStageDiskLimitMB = 20480
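
For anyone who wants to repeat the disk space and file descriptor checks on their own systems, here is a minimal Python sketch of them. It assumes Python 3 is available on the system being checked, and the function names and default thresholds are just my own choices based on the values above.

```python
import os
import resource
import shutil


def free_gb(path):
    """Free space, in GB, on the filesystem that holds the given path."""
    return shutil.disk_usage(path).free / 1024 ** 3


def soft_nofile_limit():
    """Soft open-file limit of the current process (what 'ulimit -n' shows)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def preflight(paths, min_free_gb=20, min_nofile=4096):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for path in paths:
        if not os.path.isdir(path):
            problems.append(f"{path}: directory not found")
        elif free_gb(path) < min_free_gb:
            problems.append(f"{path}: only {free_gb(path):.1f} GB free")
    limit = soft_nofile_limit()
    # An unlimited soft limit is reported as RLIM_INFINITY and always passes.
    if limit != resource.RLIM_INFINITY and limit < min_nofile:
        problems.append(f"open file limit is {limit}, expected at least {min_nofile}")
    return problems
```

On the Platform server this could be called as preflight(["/var/opt/BESServer", "/var/opt/BESRelay"]). Note that it reports the limit of the process running the script, which is not necessarily the limit the BigFix services are running with; on Linux, /proc/<pid>/limits shows the limits of an already running process.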

One other data point - NFS download works fine on my own BigFix environment (10.0), but I am on a very fast IBM network. I successfully downloaded the exact same TL that was failing in the other environments.

Any suggestions for what I can try, or what might have recently changed to cause these working environments to stop working?
