We had a recent issue that was rather cumbersome to recover from, and I am wondering if it is a me thing or something that could be a useful BigFix addition. I was rolling some software out and it hit at exactly the same time as a core network upgrade. The job ran on the systems, but since the 15th it has been in a perpetual state of “Running”, with the “can’t clear downloads” message spamming every second.
I looked at the folders and they were empty, so I deleted them and restarted the client, and it did the same thing until I restarted the computer. Now here is the issue: these are tightly controlled point of sale systems (can’t remote PowerShell) spread across about 27 states. There was no way I could find that would work remotely, and the only thing the client was actually doing was sending reports back up, nothing else. Any jobs after 10-15 would be forever “pending downloads” or “waiting”.
Is there something else that could have been done other than calling and rebooting a bunch of systems by hand?
Generally this is caused by running a patch, installer, or other executable that was downloaded, but failing to run the program silently.
The program may be attempting to display a confirmation or some other prompt to the user, but because it is invoked silently and the interface is never displayed, the program does not terminate. Since it is executing from the site’s __Download folder, the BESClient cannot clear the folder to replace it with the next action’s downloads.
To fix this, you can attempt to identify which process has the download folder locked, and terminate that process.
You could reboot the computer (obviously not desirable).
You could use the timeout settings described at List of settings and detailed descriptions so that BESClient will kill spawned child processes if they take too long (be sure to set the timeout long enough that no expected long-running processes, like a Windows Upgrade, are killed unexpectedly).
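To illustrate the idea behind the timeout option (not the BESClient settings themselves), here is a minimal Python sketch of a parent that kills a child process once it runs past a deadline. The function name `run_with_timeout` and the timeout values are made up for illustration; the actual client settings are the ones in the linked list.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit code, or None if it was killed for running too long.

    Illustrative only: this is not how BESClient is implemented.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child and waited for it.
        # Grandchildren spawned by the child are NOT killed by this call.
        return None


# A stand-in for a "silent" installer that is really waiting on a hidden prompt.
HUNG_CHILD = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Calling `run_with_timeout(HUNG_CHILD, 1)` returns `None` after about a second instead of blocking for the full 30. The same trade-off applies as with the client settings: the deadline has to be longer than the slowest legitimate install.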
Yeah, I wish that were the case though. I used an nsis installer that had no user interface, no user prompts, and basically just unpacked files into a directory. I tested that install alone multiple times with no issue before I put it into bigfix - where that was tested multiple times. It was odd. I will check the timeout settings though, so thank you for that.
Yeah, I saw that, @orbiton. I stuck that back to test; we also had a system that we rebooted, which fixed it, and we just needed the fire put out. I will stick that back for sure.
The timeout settings Jason mentioned are a good idea.
Another option is to not run the installer from the __Download folder, but at the same time, I actually prefer to run out of that folder most of the time because cleanup is handled for you.
You can set a custom timeout for each run or wait command that is different than the global one, and that is the best option in addition to setting a much longer timeout globally.
Killing the installer that is running and stuck, or rebooting the system, are really the only options. If you can run actions through BigFix through another site, then you could use BigFix to clear this, but if the action is stuck in the running state, then you probably can’t use BigFix to clear this.
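As a rough aid for finding the stuck installer, the following Python sketch narrows a process list down to the entries whose executable lives under a given folder. It assumes you have already collected (PID, executable path) pairs by some other means; the function name and the example paths are hypothetical. A process can also hold a folder open without its executable living there, so this only catches the common case.

```python
from pathlib import PureWindowsPath


def processes_under_folder(processes, folder):
    """Return the (pid, exe_path) pairs whose executable sits somewhere under folder.

    processes: iterable of (pid, exe_path) pairs; exe_path may be None or empty
    when the path could not be read for that process.
    """
    root = PureWindowsPath(folder)
    hits = []
    for pid, exe_path in processes:
        if not exe_path:
            continue
        exe = PureWindowsPath(exe_path)
        # PureWindowsPath compares case-insensitively, matching Windows path rules.
        if root in exe.parents:
            hits.append((pid, str(exe)))
    return hits


# Hypothetical data for illustration; real paths depend on the client install.
EXAMPLE_FOLDER = r"C:\BES Client\__BESData\actionsite\__Download"
EXAMPLE_PROCESSES = [
    (101, r"C:\Windows\System32\svchost.exe"),
    (202, r"c:\bes client\__besdata\actionsite\__download\setup.exe"),
    (303, None),
]
```

With the example data, only PID 202 is reported, which would be the candidate to terminate.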
Yeah, I was calling it from __Download, and up until this time I have had zero issues with it, which is why it struck me as odd that it happened at the same time as a maintenance window. It was either that or maybe AV was doing something dumb.
I’ll just copy it out of the downloads directory and put it somewhere we already clean up for other things; it definitely is not worth the headache. I really don’t want to pull a 40-hour weekend again over this.
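The copy-out approach can be sketched in Python as below, assuming a staging directory that some other job already cleans up. The function name `stage_installer` and all paths are hypothetical. The point is only that the installer runs from a copy, so a hung process holds the staging folder rather than __Download.

```python
import shutil
from pathlib import Path


def stage_installer(downloaded_file, staging_dir):
    """Copy the downloaded installer into staging_dir and return the staged path.

    The original file is left in place so the client can still clear its own
    download folder; cleaning up staging_dir is assumed to happen elsewhere.
    """
    source = Path(downloaded_file)
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged = staging / source.name
    shutil.copy2(source, staged)  # copy2 also preserves timestamps
    return staged
```

The returned path is what you would launch instead of the file in the download folder. The downside mentioned earlier in the thread still applies: cleanup of the staged copy is no longer handled for you.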