Hello,
I am working toward correcting a problem that crept into our environment during installations. We had rolled certain information into our installer to avoid some network load, and that ultimately led to problems. We were seeing a lot of "<not reported>" under things as simple as the BES Client version, and the full operating system name and service pack were also coming up as "<not reported>". The fix ultimately came to be creating a job that deleted the _BESData folder on the system, after which the client re-downloads the needed information. My problem lies in collecting and automatically fixing those units. I need help finding a way to automatically group my results of the retrieved properties mentioned above, and then apply a never-ending fixlet to that automatic group, so that as broken systems show up, they also get the repair. Any guidance is appreciated.
I've been seeing something similar in our environment for a few months now. I've created a task that just clears out the "__results" file and then restarts the client, and it then reports normally, but I have to run it manually. I'm not sure why it does this, but since the "<not reported>" is happening on the console side, setting up relevance to put these clients in a specific group may not be possible, since the evaluation of which groups to join happens on the client.
Sure have; it doesn't seem to have an effect, however. I am thinking that it is due to the items we tried to roll into the installer. After we clear out the _BESData folder, the client checks back in and the items correct themselves and report the way they are supposed to. I am about to try what jmaple suggested about the "__results" file and at least see if that corrects the issue with a smaller amount of traffic. I am not sure why the "send refresh" doesn't do the trick…
I believe the Send Refresh only works if the client gets UDP notifications.
When you send refresh, do you see in the client logs that a refresh command was received?
It should also eventually show in the log that it submitted a full report successfully. Once that happens, it means its relay has the full report, and it should get passed up the chain to the root and be reflected in the console. The process from the client submitting the full report to it showing up in the console could take a minute or more, depending on the complexity of your network and infrastructure. Also, the time it takes the client to calculate everything required for a full report can vary depending on how many properties it has to report.
When we send a refresh against the system, it does get it and does post a full report; however, the status of the two mentioned properties stays the same.
I've attempted the same thing. The only thing that worked for sure was clearing the "__results" file (possibly corruption of the file, but I'm unsure how to determine that through analysis) or deleting the whole _BESData folder and restarting the client.
As a test, I made a copy of a "bad" "__results" file before deleting it and compared the contents to the newly created one, and everything it contains looks to be exactly the same. The only odd thing is that the bad file is 4 bytes bigger than the good file.
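If it helps anyone trying the same comparison: a quick sketch for finding exactly where two nearly identical files diverge, rather than eyeballing the contents. The file paths are placeholders; point them at your saved "bad" copy and the regenerated file.

```python
def first_difference(path_a, path_b):
    """Return (offset, bytes_from_a, bytes_from_b) at the first divergence,
    or None if the files are byte-for-byte identical."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    # Scan the common prefix for a differing byte.
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset, a[offset:offset + 16], b[offset:offset + 16]
    # Prefixes match; any size difference (e.g. 4 extra bytes) is at the tail.
    if len(a) != len(b):
        cut = min(len(a), len(b))
        return cut, a[cut:cut + 16], b[cut:cut + 16]
    return None

if __name__ == "__main__":
    # Placeholder paths — substitute the real __results copies.
    diff = first_difference("results_bad.bin", "results_good.bin")
    print("identical" if diff is None else "first difference at offset %d" % diff[0])
```

If the only output is a tail difference at the end of the shorter file, the extra 4 bytes are pure trailing data, which would at least tell you whether the divergence is a truncation/append issue rather than corruption in the middle.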
There doesn't seem to be any way to tell why the bad file is different from the original, nor is there a way to know why the client won't read the bad file anymore; but as soon as it's recreated, the client begins reporting normally.
EDIT: "The client reading the file" was probably not the right way to say it. When the results this file contains are sent off, they don't seem to be interpreted correctly when they are ingested into the database?
This sounds like the same issue discussed here: Searching Relevance for <not reported>. That other thread explains why you can't use automatic groups or dynamically target such endpoints, and describes the suggested approach. To automate it, you would have to use the REST API to deploy the actions automatically based on what the server sees.
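As a rough illustration of that REST API approach: query the server with session relevance for the affected computers, then post an action sourced from your existing repair task against them. This is only a sketch — the server URL, site name, task ID, and especially the session-relevance expression are placeholders you would adapt to your deployment, and authentication is omitted.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape

SERVER = "https://bigfix.example.com:52311"  # placeholder root server

# Illustrative only: adjust this to however "<not reported>" manifests
# for your retrieved properties.
RELEVANCE = ('names of bes computers whose '
             '(not exists values of results (bes property "OS", it))')

def query_url(server, relevance):
    """Build the /api/query URL for a session-relevance expression."""
    return server + "/api/query?relevance=" + urllib.parse.quote(relevance)

def action_xml(site, task_id, computer_names):
    """Build a BES SourcedFixletAction targeting the listed computers by name."""
    targets = "".join("<ComputerName>%s</ComputerName>" % escape(n)
                      for n in computer_names)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<BES xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
            "<SourcedFixletAction>"
            "<SourceFixlet>"
            "<Sitename>%s</Sitename><FixletID>%d</FixletID>"
            "</SourceFixlet>"
            "<Target>%s</Target>"
            "</SourcedFixletAction></BES>" % (escape(site), task_id, targets))

if __name__ == "__main__":
    # The actual POST (run on a schedule) would look roughly like this;
    # credentials and error handling are left out of the sketch.
    body = action_xml("CustomRepairSite", 123, ["PC-001"]).encode()
    req = urllib.request.Request(SERVER + "/api/actions", data=body, method="POST")
    # urllib.request.urlopen(req)  # requires credentials and a reachable server
```

Run on a schedule (cron, Task Scheduler), this effectively gives you the "never-ending" behavior the OP wanted, but driven from the server side, where the "<not reported>" state is actually visible.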
I'm confused as to why this requires automation, though. If you've identified the problem, I would assume you're not deploying the bad package anymore, so why do you expect to see problem systems continuing to show up in the console?
@steve Automation of this problem would be nice because, while I'm not able to determine the root cause of why this happens, I get 4 to 5 machines that do this every week. Most of them are PCs, and my suspicion is that they were decommissioned and are being recommissioned without being cleaned up properly. I don't run our Site Support team who commissions them, so I don't have an effective way to test; but at the moment these machines need intervention, and I'd rather not have to run the task manually if I can avoid it for such a simple procedure.
Yeah, for sure, but are there any markers on the bad one that would make it stand out (something I could scrape) in order to identify that a system has a bad __results file?
That's what I've been trying to determine, and from all the tests I've done, there doesn't seem to be any difference between the files. They look to be exactly the same.
The _results file is a copy of the property results the agent has already reported. The agent uses this file to tell which property results have changed since it last sent a report. When you delete this file, it causes the agent to behave as if the server does not have previous values for any of the "globally activated" properties, and so the agent reports the values whether they have changed or not.
I would try to figure out why your agent's understanding of what it has reported to the server differs from what the console shows. Some possible areas to investigate: the console cache may not have been filled properly; FillDB may not have successfully placed the values in the database; or the server may have discarded the prior results when you edited the property expression, yet the result was the same on the agent, so it didn't report a value.
@AgentGuy I find it unlikely this is an issue with the console, because as soon as the client generates a new _results file, all properties start reporting correctly. I can only assume that the file is slightly corrupted when the client sends it to the core. If it weren't the file, I would think deleting it wouldn't change what we see in the console.
What @AgentGuy is saying is that deleting the results file is the same as telling the client to send a full report, so the file isn't the cause; it is a mismatch between what the client believes it has reported and what the server has. While there may be some unique cases, I would expect most (if not all) of these cases could be resolved using a SendRefresh or (if UDP is not reaching the endpoint) a "notify client forcerefresh" action.
There was an issue with the BESComputerRemover that could cause this to occur, where fully deleted endpoints were not being marked properly for refresh if they came back online. If you haven't updated your BESComputerRemover in several months, you might check for our latest version to address this.
My question about the need for automation was really directed at the OP, who had an identified cause that was no longer occurring. For other intermittent occurrences, the BESComputerRemover should provide sufficient automation, as it is capable of removing "Not Reported" computers and marking them for a full refresh on the next registration.