Restarts in Automation Plans

We have an automation plan for our SQL clusters. It works well overall, but sometimes the plan does not proceed after one of the reboot steps.

Related steps in our automation plan:

  • Restart Endpoint on Pending Restart and Wait for Restart to Complete
  • Baseline for installation of Windows cumulative updates
  • Restart Endpoint and Wait for Restart to Complete

The baseline installation completes, and the endpoint is then restarted according to the automation plan. After the reboot, the client does not report the Restart action as “Completed”; the action remains in the “Pending Reboot” state. As a result, the automation plan is stuck.

We have to log on to these servers and perform another reboot manually. After this second reboot, the automation plan proceeds.

I assume that pending operations are still being processed on the server before the BigFix agent starts, so the server is still in the “Pending Reboot” state when the BigFix agent comes alive. This is my theory anyway.

Has anyone else run into this, or can anyone think of a solution or workaround for this problem?
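
One way to test this theory might be to run a few relevance queries in the Fixlet Debugger (QnA) on an affected server right after the first reboot. This is only a sketch: `pending restart` is the client's own inspector, while the three registry locations are the commonly cited Windows pending-reboot markers, and it is an assumption that one of them is what keeps these servers in the “Pending Reboot” state.

```
Q: pending restart
Q: exists value "PendingFileRenameOperations" of key "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager" of native registry
Q: exists key "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending" of native registry
Q: exists key "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired" of native registry
```

If the first query is still true after the reboot and one of the registry checks is true as well, that would at least be consistent with operations still being queued when the agent starts.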

Thanks,
Stefan

Some forum members have reported that they need a second reboot to completely resolve pending operations. Perhaps add a second restart task at the end of your plan?

We found an extensive article that discusses managing pending restarts in an automation plan:
https://help.hcltechsw.com/bigfix/9.5/lifecycle/Lifecycle/ServerAutomation/SAUsersGuide/Server_Automation_plan/sa_pendingrestart.html

Our problem is that the automation plan does not move to the next step. It is stuck waiting for the current reboot to complete, so adding yet another reboot as the next step would not help. Is there a way to control this behavior, including how long the Restart Endpoint step waits before moving on to the next step?

An update regarding this problem.

As a workaround, we initiate the restarts using the following actionscript instead of the native Restart command.

// Restart the endpoint
runhidden shutdown.exe /R /T 05

// 300 second delay in action script
parameter "startTime"="{now}"
pause while {(now-time(parameter "startTime") < 300*second)}

This seems to work. The five-minute delay is there to prevent the next action in the automation plan from starting before the reboot has completed.

Can anyone think of any bad side effects of using this approach?

Thanks,
Stefan

You can also consider modifying the step’s execution configuration (gear icon) to fail incomplete targets after xx minutes. Further, you can add a failure step (paper with plus sign icon) that will be executed if the step fails.
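
As a rough sketch of how a failure step could be triggered deliberately, a small check step placed after the restart might look like the following. It assumes that the `pending restart` inspector reflects the same state that leaves the Restart action in “Pending Reboot”, which has not been confirmed in this thread; the idea of a separate check step is purely illustrative.

```
// Sketch only: make this step fail if the client still reports a pending restart.
// Assumption: "pending restart" reflects the state that keeps the plan stuck.
continue if {not pending restart}
```

If the expression evaluates to false, the action fails at that line, so the failure step attached to it could then perform the additional reboot.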

Yes, we looked into these options as well, but the method above seems to work as expected in this situation.

I have read in another thread that the “Restart” command is the correct approach, but in this situation the native “shutdown” command seems to work better.

Still curious if there could be any drawbacks with this approach in an automation plan.