Client PCs not reporting

(imported topic written by SystemAdmin)

We frequently come across PCs that haven’t reported in to BigFix for a few hours. The PC is certainly up, and if you check the BES client log, it doesn’t look like anything is going on. Sending it a refresh works every time and the PC reports back in.

Seems to happen with both the BES 6 (6.0.21.5) and BES 7 (7.0.1.376) clients.

For now, I’ve enabled verbose logging on a few PCs. If it happens on one of those again, I’m hoping the verbose log will show a little more detail.

Anyone also see this? For some reason the clients sometimes just won’t report back in.

Paul

(imported comment written by Steve91)

Hi Paul

Did you happen to resolve this?

We are seeing the same issue, we’re running version 7.0.1.376.

The PCs are on and the client is running fine, but they aren’t checking in on a regular basis; the heartbeat is set to 15 mins.

The only way to get them to check back in is to either take an action against them or send a refresh.
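To get a feel for how many clients are affected, something like the following could flag machines that have been silent for several heartbeat intervals. This is only a rough sketch: it assumes the last report times have already been exported from the console into a dictionary, and the function name, the `missed` threshold, and the sample data are all made up for illustration.

```python
from datetime import datetime, timedelta


def overdue_clients(last_reports, now, heartbeat=timedelta(minutes=15), missed=4):
    """Return (name, silence) pairs for clients silent longer than
    `missed` heartbeat intervals, longest silence first."""
    cutoff = heartbeat * missed
    late = [
        (name, now - seen)
        for name, seen in last_reports.items()
        if now - seen > cutoff
    ]
    return sorted(late, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: computer name -> last report time.
    now = datetime(2020, 1, 1, 12, 0)
    sample = {
        "PC-001": now - timedelta(minutes=10),
        "PC-002": now - timedelta(hours=3),
    }
    for name, silence in overdue_clients(sample, now):
        print(name, "silent for", silence)
```

With a 15 minute heartbeat and `missed=4`, anything quiet for more than an hour is listed.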

Any thoughts, chaps?

Cheers

Steve

(imported comment written by BenKus)

Hey Paul / Steve,

This could happen if you have overloaded your agents with many actions or properties or Fixlets that take a very long time to evaluate. An easy way to do this in 6.0 is to make lots of huge baselines or multi-action groups. When you do this, the agent spends a lot of time evaluating those and less time evaluating everything else… The agent also wants to make an “evaluation pass” where it evaluates all the Fixlets/actions before reporting the heartbeat.

In BES 7.0, the "efficient mime" helps this a lot by making the agent much more efficient with big action groups (but the old action groups will eventually need to be stopped). You can stop old actions to try to make this better.

Also, if you have Fixlets that take a very long time to evaluate, they will cause the same sort of issue. If this issue started after you added some Fixlets/properties, they might be the issue. The advanced emsg log (at 10,000 level of detail) will tell you how long it takes to evaluate each Fixlet and you can look through it to see if there is anything with very large values.
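Once the evaluation times are out of the log, ranking them makes the outliers easy to spot. Here is a minimal sketch, assuming you have already extracted (Fixlet name, seconds) pairs yourself; the log layout is not shown in this thread, so the parsing step is deliberately left out, and the function name and sample values are hypothetical.

```python
from collections import defaultdict


def slowest_fixlets(timings, top=5):
    """Sum evaluation time per Fixlet and return the `top` most expensive
    as (name, total_seconds) pairs, most expensive first."""
    totals = defaultdict(float)
    for name, seconds in timings:
        totals[name] += seconds
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    # Hypothetical timings pulled from a client log by hand or by script.
    sample = [("Baseline A", 0.5), ("Custom property B", 12.0), ("Baseline A", 0.25)]
    for name, total in slowest_fixlets(sample):
        print(name, total)
```

Summing per Fixlet rather than looking at single entries also catches items that are only moderately slow but get evaluated many times.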

Ben

(imported comment written by Steve91)

We do have a lot of big baselines, although we always have had and have never experienced this problem before.

Maybe we’ve just added one too many.

I’m in the middle of performing some housekeeping and will trim these baselines down.

If this resolves the issue I’ll post back

Thanks for the help Ben

Cheers

Steve

(imported comment written by Steve91)

Still having problems with this.

I’ve removed some old baselines, trimmed the remaining ones down, and deleted the majority of expired and stopped actions.

We’re still only seeing clients report in once every few hours.

I’ve enabled the emsg logging on a few clients to see what info I can get from them.

We also have some failures in the deployment health checks dashboard:

  1. Most of our relays have gather errors, average 10-20, the main BES server has 70.

  2. The “Total Stopped And Expired Actions” is also at 19372 and stays at that number no matter how many I delete.

  3. When I go into the BES admin tool on the BES server “usePre70ClientCompatibleMIME” is set to true, but when I look in the deployment health check it reports “Efficient MIME is not in use and not all BES Clients are version 7.0.”
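Since the health check complains that not all clients are version 7.0, a quick way to list the stragglers might look like this. It is a sketch only: it assumes the client versions have been exported as dotted version strings keyed by computer name, and the helper names and sample machines are invented for the example.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.0.1.376' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def clients_below(versions, minimum="7.0"):
    """Return the sorted names of clients whose version is lower than `minimum`."""
    floor = parse_version(minimum)
    return sorted(
        name for name, version in versions.items()
        if parse_version(version) < floor
    )


if __name__ == "__main__":
    # Hypothetical export: computer name -> client version.
    sample = {"PC-001": "6.0.21.5", "PC-002": "7.0.1.376"}
    print(clients_below(sample))
```

Comparing tuples of integers avoids the trap of comparing version strings as text, where "10.0" would sort before "7.0".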

Additionally, one of our operators has over 10 policies running, I’m wondering if that’s a contributing factor.

I’ll analyse the logs tomorrow and hopefully get some useful info.

Does anyone have any other suggestions in the meantime?

By the way,

We’re running version 7.0.1.376 server, relays and consoles.

The clients are a mixed bag of 6s and 7s, and a handful of them are on a higher version 7 than the relays and server.

Thanks guys

Steve

(imported comment written by BenKus)

19372 actions is quite a lot of actions, but really the open actions are the key… You might try to clear your cache if the number seems inconsistent…

The next step in troubleshooting this is to look through some client logs to try to figure out what they are doing… It is hard to do that through forum posts, so you might want to submit an emsg log and a client diagnostics file to support and see if they can spot what the issue is…

Ben