VMware Tools baseline - false positive failures for pending restart

A few years ago, we found that if you simply apply the latest VMware Tools update, a small percentage of servers will have their NIC configuration wiped out.

While we’re not 100% sure of the true root cause(s), we did identify 2 critical steps that, once in place, basically eliminated this issue.

  1. Identify whether the server is in a pending restart state from patches that were staged but not applied, and reboot before attempting the VMware Tools upgrade.
  2. The VMware Tools installer and its prerequisites often require a restart, after which you have to run the install a second time to complete the upgrade. This is documented right in the VMware KBs.

So we have been using a baseline that looks like the following. The key takeaway here is that we check for pending reboots, reboot when needed, and then try again. Of course, if the first try succeeds, relevance prevents any further attempts. The upgrade fixlets are standard HCL content, while the install fixlets are HCL content with tweaks to identify missing VMware Tools (same actionscript).

Also note that all restart fixlets have the relevant fixlet box unchecked in the baseline.
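
To make the check, reboot, retry flow above concrete, here is a toy Python model of it. This is only a sketch under assumptions: it does not use any BigFix API, the component order is a guess at what the baseline looks like, and every name in it (`Server`, `installs_needed`, `run_baseline`, and so on) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    pending_restart: bool = False   # patches staged but not yet applied
    tools_current: bool = False     # VMware Tools already at the target version
    installs_needed: int = 1        # hypothetical: install attempts needed to finish
    log: list = field(default_factory=list)


def restart_if_needed(server):
    """Restart task: does nothing unless a restart is pending."""
    if server.pending_restart:
        server.pending_restart = False
        server.log.append("restart")


def upgrade_tools(server):
    """Upgrade fixlet: skipped once VMware Tools is current."""
    if server.tools_current:
        return  # relevance is false, so the component is skipped
    server.log.append("upgrade attempt")
    server.installs_needed -= 1
    if server.installs_needed <= 0:
        server.tools_current = True
    else:
        server.pending_restart = True  # installer or a prerequisite asked for a reboot


def run_baseline(server):
    # Assumed order: check/reboot, try, check/reboot, retry.
    for step in (restart_if_needed, upgrade_tools, restart_if_needed, upgrade_tools):
        step(server)
    return server.log


# A server whose installer needs two passes:
print(run_baseline(Server(installs_needed=2)))
# → ['upgrade attempt', 'restart', 'upgrade attempt']
```

In this model, a server that succeeds on the first attempt never reaches the second restart or the second upgrade, which is the "relevance prevents further attempts" behavior described above.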

So that brings me to the issue…

When we patch servers, we apply multiple baselines at the same time: OS, Office, applications, etc. The client runs whichever action it decides to run first. What we are finding is that when the server reboots in the middle of this baseline, the BigFix client will sometimes stop running it and begin running another baseline instead. When this occurs in combination with other baselines and reboots, a small percentage of the VMware Tools baseline actions report a false positive failure on one of the restart fixlets.

  1. I believe this is expected behavior, correct?
  2. Doubtful, but is there any client config that can be applied to ensure the client returns to the same baseline?
  3. I imagine the only fix is custom relevance to chain baselines or to leverage Server Automation?
  4. Other suggestions?

P.S. Now that we have been pushing out C++ runtimes more regularly, we theorize that the VMware Tools prerequisites in the installer may not be as much of an issue as in the past, but it’s hard to be sure without more research.

Here are some example failures

Just verifying that you are using a task and not a fixlet?

Good question.

The “Restart Needed…” components are all TASKS.

The Install/Upgrade VMware Tools components are all FIXLETS.

The last item (service health check) is a FIXLET, and it’s just there to force start the service if anyone disables or stops it.

Can you check whether the restart is actually happening, maybe by using an uptime property so you don't have to log in to the server remotely? Most likely BigFix is reporting a false positive, and changing the success criteria to "All lines of the action script have completed successfully" may resolve it.

Sorry, my question is about the restart task. A fixlet requires success criteria, but a task will just report as complete if all the lines in the action script actually ran.

Can you go to the task and open "View Action Info"?

Like this...

Restarts definitely do occur in this baseline, but from what I can gather from reviewing the logs, the issue is that the server may switch the active baseline action following a reboot. At some point during the multiple baseline action executions, relevance for “pending restart” changes between true and false, and the BigFix client sets the action status to Failed.

Here are a couple of examples from the Restart Needed task.

Here are the action settings.

This gave me an idea actually… I wonder if I should try setting the task “Success Criteria” to “applicability relevance evaluates to false” or maybe just “False”.

Task success criteria right now is…

There are a few ways to resolve this. One is to change the success criteria.

I found this; maybe you can use pause while...

waithidden restart 60
pause while {pending restart}