Component in Baseline Still Running

We have a baseline that contains most of the Office scrubbers (to uninstall all versions of Office). Using the Office 2007 scrubber as an example, we are seeing situations where the 2007 component in the baseline never reports as complete (its status stays at Running).

When we look at the log of the scrubber, it clearly has finished within several minutes; the end of its log shows:

Removal result: 0 - SUCCESS
Removal end.

It even shows a return code of 0 in the SWD_DeploymentResults.log:

Mon 07/17/2017 10:03:23.93 
Action ID: 76165 
Command: cscript "OffScrub07.vbs" ALL /Quiet /Log C:\Windows\temp 
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.

Return code: 0 

This is that component's last entry in the BF log:

At 10:03:23 -0400 - actionsite (http://FQDN.com:52311/cgi-bin/bfgather.exe/actionsite)
   Command succeeded parameter "baseFolder" =  "__Download/" (group:76155,action:76165)
   Command succeeded move "__Download/DF2CDE0131594840A663AF619D0A3257D3213658" "__Download/OffScrub07.vbs"  (group:76155,action:76165)
   Command succeeded parameter "mainSWDLogFolder" = "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData/__Global/SWDDeployData" (group:76155,action:76165)
   Command succeeded folder create "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData/__Global/SWDDeployData" (group:76155,action:76165)
   Command succeeded parameter "logFile" = "SWD_DeploymentResults.log" (group:76155,action:76165)
   Command succeeded delete No 'C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\CustomSite_SWD_Production\__createfile' exists to delete, no failure reported (group:76155,action:76165)
   Command succeeded parameter "logFolder" = "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData/__Global/SWDDeployData" (group:76155,action:76165)
   Command succeeded delete No 'C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\CustomSite_SWD_Production\run.bat' exists to delete, no failure reported (group:76155,action:76165)
   Command succeeded createfile until  (group:76155,action:76165)
   Command succeeded move __createfile run.bat (group:76155,action:76165)
   Command succeeded override wait (group:76155,action:76165)
   Command succeeded override hidden=true (group:76155,action:76165)
   Command succeeded override completion=job (group:76155,action:76165)
   Command started - wait run.bat (group:76155,action:76165)

It’s happening more frequently than we would like, and we haven’t been able to pin down what differs between the machines where it works and those where it doesn’t. This is obviously a problem because the rest of the baseline doesn’t run, which means the system doesn’t reboot after the uninstallers and then doesn’t install O365.

Would changing the override to completion=process be worth trying?

Is there any way to force a completion on the individual component if it doesn’t return one within a defined amount of time?

I have had similar situations where misbehaving fixlets caused my baseline to hang. What I usually do is 1) use run instead of the normal “wait”, 2) start a timer, and 3) abort if the timer is exceeded or some other failure is detected (e.g. critical errors in some log file). Using run instead of wait frees me to monitor the spawned process as I see fit, instead of getting stuck by a possibly never-ending “wait”.

There is a timeout thread here and I can also provide more detailed info with actionscript if that would be helpful.

Thanks. These are uninstallers, so I think they are best run serially. I think “run” would kick one off and move on to the next component. The SWD process by default overrides wait, using completion=job. I’m trying completion=process (which, from my understanding, negates the override of “wait” and uses its default behavior). Of course, I can’t speak much to those optional differences, and the fact that it works perfectly on some machines and not on others limits what I can really do.
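
For reference, a minimal sketch of that change, reusing the override lines that appear in the client log above. Only the completion value differs. It assumes the SWD-generated action can be copied and edited, and it relies on the understanding that completion=process waits only on the launched process, while completion=job also waits on any child processes left in the same job:

```
// Same override block the SWD task generates, with completion switched
// from job to process (assumption: this is an editable custom copy of the action).
override wait
hidden=true
completion=process
wait run.bat
```

If a stray child process is what keeps the job from completing, this would let the component finish once run.bat itself exits. That cause is a guess, not something the logs above confirm.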

I almost have to say that this is a BigFix agent issue: the process is finishing and returning an exit code of 0, but the agent isn’t picking that up and continuing to the next component in the baseline.

Sorry, I’ll try to be clearer. My solution also “waits”, by using the pause command; it just gives me more control in instances where misbehaving installers/uninstallers cause the BigFix agent to wait indefinitely (usually through no fault of its own). Here is a snippet where I used this:

runhidden "{parameter "ScriptPath"}"

parameter "TimeToWaitFromNow" = "{now + (30 * minute)}"

// The installer can get stuck occasionally while copying files.
// Attempt to recognize and handle that situation by timing out.
pause while {exists process whose (name of it contains case insensitive regex (parameter "WatchedProcess")) and (now < parameter "TimeToWaitFromNow" as time)}

// Sometimes the install seems to succeed, but some core files fail to upgrade and leave the device in an unusable state. If this happens, change the registry key back to the previous version so the applicability relevance will fail and the operator knows to take action again.
if {(exists lines whose (it contains case insensitive regex "Install failed|Update failed to apply") of file (parameter "UpgradeLogFilePath")) or exists process whose (name of it contains case insensitive regex (parameter "WatchedProcess"))}
  regset64 "[{parameter "RegPath"}]" "Version"="{parameter "PreviousVersion"}"
else
  parameter "StepSucceeded" = "True"
endif

if {not exists parameter "StepSucceeded"}
  dos echo {parameter "ErrorLevelToReport"} - {parameter "CurrentStep"} may have not installed correctly. >> "{parameter "LogFile"}"
  regset64 "[{parameter "UpdatesRegPath"}]" "Result from {parameter "CurrentStep"}"="{parameter "ErrorLevelToReport"}"
endif
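
Applied to the scrubber from the original question, the same pattern might look like the sketch below. It is untested. The command line is taken from the SWD log above, the parameter name "ScrubDeadline" and the 30-minute limit are arbitrary assumptions, and matching on cscript.exe assumes no other script host is running on the machine at the time:

```
// Launch the scrubber without blocking the action.
runhidden cscript.exe "__Download\OffScrub07.vbs" ALL /Quiet /Log C:\Windows\temp

// Hypothetical deadline parameter; 30 minutes is an arbitrary choice.
parameter "ScrubDeadline" = "{now + (30 * minute)}"

// Wait until the script host exits or the deadline passes, whichever comes first.
pause while {exists process whose (name of it as lowercase = "cscript.exe") and (now < parameter "ScrubDeadline" as time)}

// Fail this component if the scrubber is still running after the deadline.
continue if {not exists process whose (name of it as lowercase = "cscript.exe")}
```

Because the pause keeps the action from finishing until the scrubber exits, the uninstallers still run one at a time. If the deadline passes, the component ends with a definite status instead of sitting at Running.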