I am looking for a way to check on the client whether the BES Agent/Client is working correctly. Just having the agent running is not enough to say it is healthy. We have used the BigFix UI to report on the health of other agents, but I am trying to find a way to have BigFix report on its own health.
I am thinking about parsing the log files to look for confirmation that the report has posted successfully, but thought I would check in with the forum before developing a new solution.
Any thoughts or relevance that folks have already worked on?
Here is a client healthcheck tool I wrote for Windows –
It writes to the Windows Event Log, which you can analyze from your log manager or query via BigFix Relevance.
The big challenge of course is how does an unhealthy agent report its unhealthy status? You kinda need an external mechanism – so that was my approach here.
It essentially checks the following on Windows:
Does the BESClient Service Exist, is it set to Automatic, and is it currently Running
Do the BigFix Registry Keys exist in HKLM
Does the EnterpriseClientFolder value and ComputerId value exist in the Registry
Does the EnterpriseClientFolder value point to a real folder on the disk
Do the following folders exist: “__BESData”, “__BESData\__Global”, “__BESData\__Global\Logs”, “__BESData\actionsite”
Do the following files exist: “besclient.exe”, “__BESData\actionsite\ActionSite.afxm”
Is there a log file in “__BESData\__Global\Logs” from today
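For anyone who wants to see the shape of the folder and file checks listed above, here is a rough sketch in Python. It is not the tool itself: the function name check_client_folder is made up for illustration, the service and registry checks are left out, and "a log file from today" is assumed to mean a file in the Logs folder whose modification date is today. The client folder is passed in rather than read from the EnterpriseClientFolder registry value.

```python
import datetime
import os

# Folders and files named in the checklist, relative to the client folder.
REQUIRED_DIRS = [
    "__BESData",
    os.path.join("__BESData", "__Global"),
    os.path.join("__BESData", "__Global", "Logs"),
    os.path.join("__BESData", "actionsite"),
]
REQUIRED_FILES = [
    "besclient.exe",
    os.path.join("__BESData", "actionsite", "ActionSite.afxm"),
]


def check_client_folder(client_folder, today=None):
    """Return a list of failed checks; an empty list means all checks passed.

    Hypothetical helper: client_folder is assumed to be the path stored in
    the EnterpriseClientFolder registry value, supplied by the caller.
    """
    today = today or datetime.date.today()
    if not os.path.isdir(client_folder):
        return ["client folder missing: " + client_folder]

    failures = []
    for rel in REQUIRED_DIRS:
        if not os.path.isdir(os.path.join(client_folder, rel)):
            failures.append("missing folder: " + rel)
    for rel in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(client_folder, rel)):
            failures.append("missing file: " + rel)

    # Assumption: "from today" is judged by file modification date.
    logs_dir = os.path.join(client_folder, "__BESData", "__Global", "Logs")
    if os.path.isdir(logs_dir):
        fresh = any(
            datetime.date.fromtimestamp(entry.stat().st_mtime) == today
            for entry in os.scandir(logs_dir)
            if entry.is_file()
        )
        if not fresh:
            failures.append("no log file from today")
    return failures
```

A scheduled task could call this with the client folder path, then write a success or failure event depending on whether the returned list is empty.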
If any of the checks fail, the script backs up the __BESData directory, runs the BESClient cleaner, and reinstalls the client fresh.
You can of course just remove the cleaner/install step and perform manual remediation.
The actual script runs as a scheduled task and is pushed out via Group Policy in most places I’ve used it.
Because the script generates different Event IDs on success and failure, you can set up a report in your log manager to detect devices with continuously failing checks. You can also use the Last Success date from the analysis to set up a Web Report for clients that have not reported a Last Success in X days, or something similar.
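On the log manager side, the "no success in X days" report boils down to something like the Python sketch below. Everything here is illustrative: the event ID 1000 is a made-up placeholder (use whatever ID the script actually writes on success), and the events are assumed to have already been exported as (host, event_id, timestamp) tuples.

```python
import datetime

# Hypothetical placeholder; substitute the real success Event ID.
SUCCESS_EVENT_ID = 1000


def hosts_without_recent_success(events, now, max_age_days=3):
    """Return hosts whose latest success event is older than max_age_days.

    events is an iterable of (host, event_id, timestamp) tuples. Hosts that
    only ever logged failures are included as well.
    """
    cutoff = now - datetime.timedelta(days=max_age_days)
    last_success = {}
    seen = set()
    for host, event_id, when in events:
        seen.add(host)
        if event_id == SUCCESS_EVENT_ID:
            # Keep only the most recent success per host.
            if host not in last_success or when > last_success[host]:
                last_success[host] = when
    return sorted(
        host
        for host in seen
        if host not in last_success or last_success[host] < cutoff
    )
```

Note that a device whose scheduled task never runs at all will not appear in the events, so it is worth comparing the result against a full inventory of machines too.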
I know this is an old thread but I thought of asking my questions here as they’re on the same topic. So, here are my questions.
Does the latest version of BigFix have any new enhancements around this, or is what you’ve suggested still the best way to do this?
Is there a script that can help detect clients that are unable to talk with their assigned local relays or are facing issues downloading content (action sites or actual package/updates content) and push those errors also to the Event Log?