Query to List the missing patches on endpoint - slow response?

Hello, below is the query I am running to get a list of all missing patches for an endpoint managed by a BigFix server (managing thousands of endpoints). Sometimes the response is quick, while at other times it takes more than an hour. Is this expected? Is there any way this query can be optimized? Thank you!

Query Resource="(source severity of it | "Unspecified" & " - " & display name of site of it & " - " & name of it) of relevant fixlets whose ((display name of site of it starts with "Patch" OR display name of site of it starts with "Updates" OR display name of site of it = "Client Manager for Endpoint Protection")) of bes computer whose ( id of it =)

Seems to run very quickly for me.

@JasonWalker , any suggestions?

Hm. The way that query is written should be very efficient. There aren’t any tuple groupings that would cause the cross-product/nested-looping that is usually a concern.

I assume you are filling in the id for a bes computer when you actually run it?

It’s also missing the last closing quote - is this really the whole query, nothing lost in copy/paste?
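For illustration, filling in the computer ID and quoting the relevance safely could be sketched in Python as below. This is only a sketch, not the poster's actual client: the host name, the `/api/query` path, the `relevance` parameter name, and the helper names `build_relevance` and `build_query_url` are all assumptions made for the example.

```python
from urllib.parse import quote

# Hypothetical endpoint: host, port, path and parameter name are assumptions.
BASE_URL = "https://bigfix.example.com:52311/api/query"

# Site filter copied from the query in the original post.
SITE_FILTER = (
    'display name of site of it starts with "Patch" '
    'OR display name of site of it starts with "Updates" '
    'OR display name of site of it = "Client Manager for Endpoint Protection"'
)


def build_relevance(computer_id: int) -> str:
    """Return the relevance from the post with the computer ID filled in."""
    return (
        '(source severity of it | "Unspecified" & " - " & '
        'display name of site of it & " - " & name of it) '
        f'of relevant fixlets whose ({SITE_FILTER}) '
        f'of bes computer whose (id of it = {int(computer_id)})'
    )


def build_query_url(computer_id: int) -> str:
    """Percent-encode the relevance so quotes, spaces and '&' survive the URL."""
    encoded = quote(build_relevance(computer_id), safe="")
    return f"{BASE_URL}?relevance={encoded}"
```

Building the string in one place makes it easy to check that every quote and parenthesis is closed before the request is sent.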

One thing I’d note is that ‘Query’ API calls are actually serviced by Web Reports. If Web Reports is offline or overloaded you might get a delayed response (but I’d expect to actually get an error rather than a valid response that takes too long).

How frequently are you running a query like this? I did see a case where a customer was blowing up their Web Reports server by having each client repeatedly run a query like this, retrieving info for itself. Having a hundred thousand computers try to run this query every minute was just too much for the poor server.

Hi @JasonWalker,
The response is fast most of the time, within a few ms. However, a few times, at random, no response is received within the 1-hour timeout period. And yes, the computer ID is dynamically filled in before the query is sent.

Thanks!

Ah, ok, so it is not that you get an answer after a long time; it’s that the connection times out after a long time and you never receive a response at all, right?
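If the response can simply go missing, one possible client-side mitigation (not something suggested in this thread, and not a fix for the root cause) is to use a much shorter per-request timeout with a bounded number of retries, so a lost response is noticed in seconds rather than after an hour. A minimal Python sketch follows; the helper name `call_with_retries` and its default values are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[float], T],
    timeout_s: float = 30.0,
    attempts: int = 3,
    backoff_s: float = 2.0,
) -> T:
    """Call request(timeout_s) up to `attempts` times.

    `request` is any callable that performs the HTTP call with the given
    timeout and raises OSError (which includes TimeoutError) on failure.
    The last error is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[OSError] = None
    for attempt in range(attempts):
        try:
            return request(timeout_s)
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Wait a little longer after each failed attempt.
                time.sleep(backoff_s * (attempt + 1))
    assert last_error is not None
    raise last_error
```

Each failed attempt also gives a precise timestamp that can be matched against the server-side logs.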

You might open a support ticket. This is certainly not normal behavior.

I’d start by trying to correlate logs between the application, BESRootServer, and Web Reports server to identify whether there are any network issues, services going offline, or other error messages from either of the servers.

You’d also want to check whether there are any firewalls, proxies, or switch ACLs between the root and Web Reports server that might be dropping connections.

With an issue like that, I think it is unlikely to be in BigFix itself; this sounds much more like a network interruption.

Hi @JasonWalker ,
Thanks for your quick response. Yes, this does not appear to be normal behavior. Just to clarify, this is an asynchronous HTTP REST API request/response. In the logs, I didn’t see any response for a few of the requests. Also, there was no network interruption (such as a broken pipe or any HTTP error code) during that period.
I don’t have access to the customer’s BigFix environment. Could you tell me what information to request from the BigFix server logs? Where are those stored on the BigFix server? Thanks!

I think you’ll need to open a support ticket; they can step you through the logging you’d need. You’ll need to get both the server logs and the Web Reports logs, and probably enable debug logging on both.

Thank you very much @JasonWalker . Will do!

Hi @JasonWalker,
The customer provided these BigFix logs for the timeframe when the response was not received (screenshot attached). Do they indicate an issue with BigFix server-to-relay communication?

Thanks!

Yes, those messages imply that the server or parent relay was not able to connect to those child relays. The connection would have been using tcp/52311 from the parent to the child.
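A quick way to test that path is to try opening a TCP connection to port 52311 on each child relay from the parent. A minimal Python sketch, where the function name `can_connect` and the relay host name in the usage line are hypothetical:

```python
import socket


def can_connect(host: str, port: int = 52311, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run on the parent as, for example, `can_connect("child-relay.example.com")`. A `False` result only shows that the TCP handshake failed at that moment; it does not say whether a firewall, an ACL, or a stopped relay service is responsible.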

Thank you very much @JasonWalker !