Forced Reboots via BigFix

We are trying to have BigFix reboot our servers on a fixed schedule. We have used the “Restart Computers” task with mixed results. We have tried forcing a reboot with the shutdown -r -f command, and that actually caused the servers to shut down, not reboot. We’ve tried the restart command as well, also with mixed results: sometimes it works, and other times the BES Client service on the server has to be restarted before the reboot happens. Why is it so difficult to get a reboot working consistently with BigFix? Are we approaching this wrong or missing something?

Thanks in advance for your help.

To reboot a computer immediately, use the following actionscript:

restart 0

I haven’t used it in actionscript often enough to say for sure, but sometimes at the OS level a restart command with a short delay is more forceful than one with no delay:

restart 10

The short explanation for the delay: If it takes more than 5 minutes for a client to run a new action (particularly if there are no file downloads involved) then there is very likely a problem preventing UDP notifications from working. Most actions should happen within 15 to 90 seconds if UDP notifications are working all the way down the chain from root server to relays to clients. (assuming the client is awake)


If you are trying to reboot a system right now by sending an action with that actionscript, then how soon the server gets the command depends on it receiving the UDP notification from its parent relay. If it gets the UDP notification, it should take something like 30 seconds for it to get the action, process it, and run it. It should reboot fairly quickly.

If the server you are trying to reboot does not get the UDP notification, then it will not get the reboot action until it polls for commands, gathers, or you restart the client service.

If you are creating a policy action that reboots the server at the same time every day or week, then once the client has that action (through UDP notification, gather, command polling, or a restart of the client service), it should do the reboot as scheduled from then on.

By default command polling is not enabled. I recommend enabling it for every 6 hours for all clients and every hour for clients on WiFi that roam networks frequently. (obviously WiFi does not apply to servers) Command polling puts some small extra load on your relays, but not the root server unless there are many clients connected to the root server, and even then it should not be significant.

I believe the default gather interval is every 24 hours. This should be the maximum time that it takes for a server to get a new policy action.
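
To make the timing above concrete, here is a rough Python sketch of the worst-case wait before a client sees a new action. It is a simplified model, not anything BigFix ships: the function name and its parameters are made up for illustration, and the 24-hour gather default is just the figure I believe applies.

```python
from typing import Optional


def worst_case_pickup_hours(
    udp_ok: bool,
    polling_hours: Optional[float] = None,  # None models command polling disabled
    gather_hours: float = 24.0,             # assumed default gather interval
) -> float:
    """Rough upper bound, in hours, on how long a client waits for a new action."""
    if udp_ok:
        # With working UDP notifications the wait is seconds, modeled here as 0.
        return 0.0
    # Without UDP, the client finds out at its next poll or gather, whichever is sooner.
    intervals = [gather_hours]
    if polling_hours is not None:
        intervals.append(polling_hours)
    return min(intervals)


if __name__ == "__main__":
    print(worst_case_pickup_hours(udp_ok=False))                     # polling off
    print(worst_case_pickup_hours(udp_ok=False, polling_hours=6.0))  # polling every 6 hours
    print(worst_case_pickup_hours(udp_ok=True))                      # UDP working
```

With UDP blocked and polling off, the bound is the full gather interval; turning on 6-hour polling cuts it to 6 hours.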

In order to make sure the servers are getting the UDP notifications so that they process new actions almost instantly, you need to make sure the firewall on each server allows incoming UDP traffic on port 52311. Do the same on any hardware firewalls between the client and its relay, and on all firewalls between the relays and the root server.
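
If you want to sanity-check whether UDP datagrams can cross a firewall at all, a generic probe like the Python sketch below may help. It is not a BigFix tool and does not speak the BigFix notification protocol; the function names are hypothetical, and it only shows whether an arbitrary datagram reaches a listening socket.

```python
import socket
from typing import Optional, Tuple


def open_listener(port: int, host: str = "0.0.0.0") -> socket.socket:
    """Bind a UDP socket; raises OSError if the port is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_probe(target: str, port: int, payload: bytes = b"udp-probe") -> None:
    """Send a single datagram; UDP gives no confirmation of delivery."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (target, port))


def wait_for_probe(sock: socket.socket, timeout: float = 5.0) -> Optional[Tuple[str, int]]:
    """Return the sender's (address, port), or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        _data, sender = sock.recvfrom(2048)
        return sender
    except socket.timeout:
        return None


if __name__ == "__main__":
    # Loopback demo on an ephemeral port; a real check would run the listener
    # on the client side and send_probe from the relay side.
    listener = open_listener(0, host="127.0.0.1")
    port = listener.getsockname()[1]
    send_probe("127.0.0.1", port)
    print(wait_for_probe(listener, timeout=2.0) is not None)
    listener.close()
```

Keep in mind that the BES Client normally listens on 52311 itself, so binding that port on a client will likely fail while the service is running.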

I’ll give this a try; we currently DON’T have command polling enabled. Also, do you know of a way to schedule reboots, or any fixlet/task for that matter, on a monthly basis, say on the second Wednesday at 6 PM? BigFix seems to lack the basic cron functionality that I’m used to from other patching products.

Yes, you can definitely do this.

When you take an action, you can configure the timing on which that action repeats, but the Take Action dialog itself may be limited in the intervals and scheduling it lets you set.

Those options in the dialog really just create relevance that constrains when the action runs. You can write any relevance to do this yourself and put it in a baseline so that the baseline runs on that exact schedule, then include any fixlets or tasks you want in that baseline, which will cause them to run on that schedule. You could also put the scheduling relevance into the fixlet or task itself, but that is problematic if you want that fixlet or task to run on one schedule on one set of machines and a different schedule on a different set of machines.

Here is a screenshot of the execution tab of the take action dialog that approximates what you are looking to do, but it will not enforce the 2nd Wednesday part. You would have to take the action after the first Wednesday but before the 2nd and it should do what you want, but not as precisely as you could define using custom relevance:

The above may cause it to slip a week every March because February has fewer than 30 days. If there were an option to “Reapply this action while relevant, waiting 28 days”, then that shouldn’t happen, but even so, this approach isn’t as precise as writing relevance for the specific constraints.
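
Since the exact rule is easier to see in code than in a dialog, here is a small Python sketch of the date logic. This is not BigFix relevance, and the function names are mine. The second Wednesday is the only Wednesday that falls on day 8 through 14 of a month, so custom relevance would need to express that same test.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() counts Monday as 0


def second_wednesday(year: int, month: int) -> date:
    """Return the second Wednesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Wednesday (0 if the 1st is a Wednesday).
    offset = (WEDNESDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def is_second_wednesday(d: date) -> bool:
    """The second Wednesday is the only Wednesday on days 8 through 14."""
    return d.weekday() == WEDNESDAY and 8 <= d.day <= 14


if __name__ == "__main__":
    runs = [second_wednesday(2024, m) for m in range(1, 13)]
    gaps = {(b - a).days for a, b in zip(runs, runs[1:])}
    print(runs[0], runs[1], runs[2])  # 2024-01-10 2024-02-14 2024-03-13
    print(sorted(gaps))               # [28, 35]
```

The gap between consecutive second Wednesdays is either 28 or 35 days, which is why no single fixed waiting period tracks it exactly.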

I’ll admit I suck at relevance, I’m a sysadmin, not a developer. Do you have any examples I could reference for writing this relevance?

Thanks for all your assistance, much appreciated.

I tried the Restart commands as you suggested and am still getting mixed results. 2 of the 9 that I tried both of these on had to have the BES Client restarted in order for the reboot to take place. 1 of the servers that I happened to be logged on to had the BF dialog pop up on it and wouldn’t reboot until I clicked on Take Action.

I’m going to try this next.

if{0 < number of logged on users}
restart 30
else
restart 1
endif

I just edited my above post with an easier option + screenshot.

What actionscript did you use that wouldn’t reboot until you clicked Take Action?

It should be that the number of seconds you specify is how long the user has to accept the reboot, otherwise it will just go. Perhaps restart 0 is unlimited time for the user to respond.

I do use actionscript like you just pasted in and I like it for handling reboots. One time out for users, a faster one if no users.

I had the prompt happen on the restart 10 only, not the restart 0. The script that I posted just seemed to ignore if anyone is logged in, which is what we want to do.

As to the schedule, I’ll give that a try and see how it works. Any other suggestions you might have would be greatly appreciated.

As to how we are going to do this, we have 200+ servers and we have them all tagged as being in certain Reboot Groups. So we will send down the monthly patches via a baseline in one Action and then send the reboot down scheduled in another action.

So, I’m trying to set this up for a trial using 1 day instead of 30 days. In the “Run only on” constraint section, are the greyed-out selections selected or NOT selected? For some reason I can’t upload a screenshot because I’m a new user here. My screen looks different from your screenshot, so that’s why I’m a bit confused.

I believe greyed out means it is not selected.

You must check the box on the far left of “Run only on” in order for those to become selectable.

Well, this script worked in testing, but it’s not working across the board. Why can’t BigFix just reboot a device without any issues? This is basic functionality that I would expect to work. We scheduled a reboot for a server for 6 PM using the above script; all lines of the script show completed, but the action is showing a status of Pending Restart. The server never rebooted. I’ve had an open PMR with IBM on this for over a month now and they don’t understand what the issue is. Is anyone else experiencing this? If so, what is the workaround?

If you think the issue is with BigFix, another option besides using the Action command “RESTART” is to issue a SHUTDOWN command to the OS.

DOS SHUTDOWN /R /F /T 0 /d P:0:0

This should cause your targeted systems to restart immediately and register the event as an Other (Planned) shutdown. You can adjust the timing option as needed.

I strongly recommend NOT calling the OS shutdown commands from within actionscript, as you may end up with a corrupted agent depending on how the OS does the shutdown. In most cases it’s probably OK, but there are some large, notable exceptions (AIX with a Hardware Management device is one).

We make the call to the OS to perform the reboot. If it doesn’t complete, there could be other issues with the endpoint preventing it. Also, a status of Pending Restart doesn’t mean the endpoint DIDN’T restart, though in this case you said it didn’t.

Additionally, if the client is busy when 6 PM comes along and you give it a short timeframe to perform this restart, it may not be able to get to the action before it expires.

I’ll give this a shot and see what happens. I want to say we’ve tried this already and it didn’t work either.
Thanks!

If this doesn’t work, then it’s showing that BigFix isn’t able to do it because the OS can’t do it either. This command says restart no matter what… and we just pass the command directly to the OS to execute.

Do you know for sure that the system did not actually reboot? Also, it may not happen at exactly 6pm.

I experienced a similar issue with executing an automatic reboot after a fixlet completed. The behavior was that console interaction was required to complete the reboot process. The answer is that GPO had “Interactive Logon: Do Not Require CTRL-ALT-Delete” set. https://support.microsoft.com/en-us/kb/938204

Maybe someone out there has recently experienced this?
We are using the latest version of BigFix.

We run a baseline with 20 or so patches.
The server goes into Pending Restart.
The server is unattended at all times.
We created an action that does:
restart 180
The relevance is: the server belongs to a group and is in Pending Restart.
We schedule the action for a day and hour in the future.
The action does not run. Is that just because the server is in Pending Restart?

I can get it to work if, instead of restart 180, the action reads DOS SHUTDOWN /R /F /T 0 /d P:0:0

Is the conclusion, then, that restart 180 won’t work if the server is actually in Pending Restart?

thanks.

An action’s “Pending Restart” state can mean anything from the action having just noticed that it needs a restart to it having already asked the user to restart the box and the user not replying. So it depends on how the action is written and how it was taken.

Doing a restart 180 while an existing pending restart is waiting on the user will most likely wait for the first one to complete, so you are in a bad state. Calling the OS to shut down the system directly is not recommended, as it hits the agent a bit harder, and that action will also “fail” because the agent is shut down before the action completes.

So the question is: what parameters were given to the baseline when it was taken? If a restart was requested, did the operator choose to eventually force it?
