Symptoms of a Client Out of Sync?

I don’t fully understand this, which is why I’m posting this. All BES components are at 9.5.5.

From time to time, I find endpoints that show a never-ending “Evaluating” status. Looking at the endpoint, I see several open actions also showing this status.

My understanding is that this happens when the first action is hung in some way, which leaves the others in a holding pattern. Another symptom is endpoints shown as relevant for a task, but after deploying the task, the endpoint reports back as Not Relevant.

I also understand (or think I do) that an endpoint can sometimes fall out of sync and that this can be corrected by sending a blank action via the Master Operator using the “Select Target” and NOT dynamic target. The unique process triggers the endpoint to download the mailboxsite, which then puts it in sync with up level relays…

This morning I took a handful of endpoints that had these symptoms and deployed a blank action as described above. The action should return a status of Complete immediately, but instead the endpoints stayed in the Evaluating status indefinitely. I manually restarted the BESClient service and they then completed. In addition, those with other open actions that had been stuck at Evaluating now showed Complete, and tasks where they had been listed as relevant no longer showed them.

My Questions:

  1. Am I even close in my understanding?
  2. If manual action such as I performed here isn’t done, will endpoints never recover?
  3. Is there any way to more effectively identify these endpoints and take corrective action?

The client will only report “changes” in state and, once reported, will not report again unless a request for a full report has been received. This can result in a difference between what the DB says and what the client knows (hence a “relevant” fixlet where the action taken on it comes back as “Not relevant”).

Missing reports could cause this, but the server should be asking for these missing reports. Sending an action down will not cause other items to be reported on specifically but may “make” the client do a report.
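To make the reporting behaviour described above concrete, here is a toy sketch in Python. It is not BigFix code: the class names (`ToyClient`, `ToyServer`) and the fixlet id are made up for illustration, and it only assumes what was said above, namely that the client sends changes only and that a full report resends everything.

```python
# Toy model only (hypothetical names, not BigFix internals): shows how
# change-only reporting leaves the server stale when one report is lost.

class ToyClient:
    def __init__(self):
        self.state = {}          # fixlet id -> relevant? (what the client knows)
        self.last_reported = {}  # what the client believes it already sent

    def evaluate(self, fixlet_id, relevant):
        self.state[fixlet_id] = relevant

    def delta_report(self):
        """Return only the entries that changed since the last report."""
        changes = {k: v for k, v in self.state.items()
                   if self.last_reported.get(k) != v}
        self.last_reported.update(changes)
        return changes

    def full_report(self):
        """Return everything, as a forced refresh would."""
        self.last_reported = dict(self.state)
        return dict(self.state)


class ToyServer:
    def __init__(self):
        self.db = {}  # what the console would show

    def receive(self, report):
        self.db.update(report)


def simulate():
    client, server = ToyClient(), ToyServer()
    client.evaluate("fixlet-1", True)
    server.receive(client.delta_report())   # server now shows "relevant"
    client.evaluate("fixlet-1", False)
    client.delta_report()                   # this report never reaches the server
    stale = server.db["fixlet-1"]           # server still believes "relevant"
    nothing_new = client.delta_report()     # client has no further changes to send
    server.receive(client.full_report())    # a full report repairs the view
    return stale, nothing_new, server.db["fixlet-1"]
```

In this sketch the server keeps showing the fixlet as relevant until the full report arrives, which mirrors the “relevant in the console, Not Relevant on the endpoint” symptom.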

Sending the message via right click on the endpoint in the console, or via the actionscript command (notify client ForceRefresh), would be the only way to clean that up, but be very wary of doing that too much, as the server takes a very heavy load from the full reports.

Clients at 9.5 and later will also do this automatically at startup if their actionsite was deleted.

All that being said, the fact that you had to “restart” the client to get this to work suggests you might not be receiving UDP messages on the endpoint.

UDP is working as expected. The action is received as relevant on the endpoint (as seen in the logs) but does nothing at that point. The endpoints seem to perform their normal processes, but in this case, the action wasn’t kicked off until I restarted the service.

After a bit of research, I think that restarting the BESClient service OR sending a blank action as MO (which is less intrusive) will accomplish what I was speaking of: updating to the latest mailboxsite.

Restarting shouldn’t kick off any action unless the action has some unusual relevance. The only thing that would stop a “relevant” action from executing is some kind of condition that is required.

Are you sure that you don’t have a stuck action running on the client? A “command started” with no “command completed” or another action sitting at running?

Restarting the client would cause it to “forget” about the previous action being stuck and immediately start a new action.
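One rough way to check for a start with no matching finish is to scan the client log for it. The Python sketch below is only that, a sketch: the helper name `find_unfinished_commands` is made up, and the marker strings are assumptions about the log wording, so adjust them to whatever your client logs actually contain.

```python
def find_unfinished_commands(lines,
                             start_marker="Command started - ",
                             end_markers=("Command succeeded", "Command failed")):
    """Return log lines for commands that started but have no finish line.

    The marker strings are assumed wording; pass your own if they differ.
    """
    pending = []
    for line in lines:
        text = line.strip()
        if start_marker in text:
            pending.append(text)      # remember the start until a finish shows up
        elif pending and any(marker in text for marker in end_markers):
            pending.pop()             # the most recent start is now accounted for
    return pending


# Made-up sample lines, for illustration only.
sample_log = [
    "   Command started - waithidden cscript.exe scrub.vbs",
    "   Command succeeded (Exit Code=0) waithidden cscript.exe scrub.vbs",
    "   Command started - waithidden cscript.exe scrub2.vbs",
]
unfinished = find_unfinished_commands(sample_log)
```

With the sample above, `unfinished` holds only the `scrub2.vbs` start line, which is the sort of entry that would point to a stuck action.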

Bill

When looking at the history for the various machines, the common open actions stuck at Evaluating are native BigFix actions, such as the policy for the BFI scanner; perhaps that is hanging on its scan? What would be the best logging to reference to rule that in or out?

I think this has something to do with the “Run Capacity Scan and Upload Results” process, as it seems to be a common thread. I restarted the agent on another system and the “Evaluating” items began to process, but looking at the log, I see this flood (only a short list is shown here, but it goes on for over 800 lines).

Another common item I’m seeing is “Cannot empty _Download directory”, despite those directories being empty.

12:37:29 -0400;True;27 days, 17:19:54.713;7
AbtSvcHost_.exe;Sun, 23 Apr 2017 10:07:12 -0400;Tue, 20 Jun 2017 12:37:11 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;54 days, 01:45:23.537;17
AppleMobileDeviceService.exe;Mon, 31 Oct 2016 22:14:26 -0400;Tue, 20 Jun 2017 12:37:11 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;227 days, 14:03:23;48
BESClient.exe;Mon, 31 Oct 2016 22:14:26 -0400;Tue, 20 Jun 2017 12:37:11 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;227 days, 14:04:07.975;47
BESClientUI.exe;Sun, 23 Apr 2017 10:08:23 -0400;Tue, 20 Jun 2017 12:37:19 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;56 days, 17:00:46.556;13
BTPlayerCtrl.exe;Sun, 23 Apr 2017 10:08:23 -0400;Tue, 20 Jun 2017 12:37:11 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;56 days, 17:11:55.708;13
BleServicesCtrl.exe;Mon, 31 Oct 2016 22:14:26 -0400;Tue, 20 Jun 2017 12:37:11 -0400;Tue, 20 Jun 2017 12:37:29 -0400;True;217 days,

This typically means that there is a stuck action.

That may be something to match up. We have some Office VBS scripts (those scrubbers) and, for whatever reason, some seem to complete on the endpoint but remain running in BigFix. Like the script isn’t sending an exit code or something.