Performance of BESclient

system · December 7, 2009, 11:18pm

(imported topic written by SY57_Jim_Montgomery)

I know that the client regularly evaluates all the fixlets and actions from every site it is subscribed to, including admin sites.

Are all these processed serially, in order? That is, is the relevance for each of these fixlets run in order? Does the client cache any data to help run that relevance? If a client changes in such a way that a fixlet becomes relevant, what is the maximum amount of time before the client figures that out and reports the fixlet as relevant again?

I guess what I’m asking is what is the algorithm the client uses to figure out which fixlet/action to process next?

I’d like to have some way of measuring the load on the client, to verify that some fixlets we’re planning to load up aren’t taking so long for the client to process that they keep other fixlets from being run and analyzed.

–Jim

BenKus · December 8, 2009, 12:53pm

(imported comment written by BenKus)

Hey Jim,

The Agent Fixlet engine is highly optimized, with all sorts of caching schemes and optimized algorithms, and you will note that we are generally obsessed with making relevance queries as fast as possible…

You can’t control the order of Fixlet evaluation… it has a basic set of rules where it evaluates through all the Fixlets, starting with the actionsites and moving through the sites (usually in alphabetical order), but you can’t generally rely on the order because the agent has lots of work to do.

Usually on a healthy agent, it will take less than 20 minutes to evaluate everything in a single pass at less than 1% CPU… But it will vary based on how many Fixlets/properties that you have… You might enable the agent profiler to make sure that there isn’t any specific Fixlet that is taking up a lot of time…

Ben

system · December 9, 2009, 6:46pm

(imported comment written by SY57_Jim_Montgomery)

Thanks Ben,

I have noticed the obsession with making fast relevance. I think that spawned my obsession with keeping as light a load on the client as possible (whether it’s warranted or not).

We’ve been thinking about ways to deal with subsets of patches that are internally deemed important, in a 90 day rolling window. So, say, we are concerned with 5 of the last 8 MS patches. There are a couple of sides to this: managing the list of patches, not burdening the client on initial client reports and during patch time, and reporting on this rolling window with webreports.

So we’ve been tossing around the pros/cons of copying the special fixlets into baselines/multiple baselines/a custom site. We always get hung up on two questions: what’s easiest to manage, and what’s the lightest load on the clients.

I did get the profiler up and running yesterday, and I’ll be poring over those logs, but I’m sure you’ll agree there is a difference between showing only the top long-running fixlets (I did crank it up to the top 10000) and seeing the process time for all the fixlets.

How would you measure that it takes less than 20 minutes for a single pass? By looking at a 20 minute profile log?

Another question I’m trying to have answered before the higher-ups ask is: how long does it take for SCM content to be evaluated?

I know it’s constantly evaluating, and it’s not a fair question for this product (our last product was WAAAY different), but telling them “it evaluates everything every 20 minutes” is understandable to them. I’d like a method to generate an accurate answer for my environment.

BenKus · December 10, 2009, 4:43am

(imported comment written by BenKus)

Hey Jim,

Looking at the emsg logs like you are doing is a good way to figure out the total time for an eval loop (although it certainly takes some patience to parse through the log…) Basically you will want to look and see how long until it starts re-evaluating the same Fixlets (assuming it hasn’t been interrupted by an action or a gather or any other task).
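The "how long until it starts re-evaluating the same Fixlets" check can be sketched in a few lines of Python. This is only a rough illustration: it assumes you have already pulled the evaluation entries out of the log into `(timestamp, fixlet_key)` pairs in log order. The tuple layout and the name `estimate_loop_time` are made up for this example, and parsing the actual log lines is left out because the log format isn't shown in this thread.

```python
from datetime import datetime, timedelta

def estimate_loop_time(events):
    """Estimate one evaluation pass from ordered (timestamp, fixlet_key) pairs.

    Returns the time between the first evaluation of a Fixlet and the first
    time any already-seen Fixlet is evaluated again, or None if nothing in
    the input repeats. The input layout is hypothetical, not the real log.
    """
    first_seen = {}
    for ts, key in events:
        if key in first_seen:
            # The agent has come back around to a Fixlet it already evaluated.
            return ts - first_seen[key]
        first_seen[key] = ts
    return None

# Made-up sample data: three Fixlets, then the first one shows up again.
t0 = datetime(2009, 12, 10, 8, 0, 0)
sample = [
    (t0, "siteA/101"),
    (t0 + timedelta(minutes=5), "siteA/102"),
    (t0 + timedelta(minutes=12), "siteB/7"),
    (t0 + timedelta(minutes=20), "siteA/101"),
]
loop_time = estimate_loop_time(sample)
```

The result is only meaningful for a stretch of log where the agent wasn't interrupted by an action or a gather, so it is probably worth running it over several stretches and comparing.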

For your SCM question:

It seems it would be safe to say that the agent should do a complete evaluation every hour. If it takes longer than an hour, then I would expect something is wrong and needs to be fixed.

Ben

system · December 14, 2009, 11:04pm

(imported comment written by SY57_Jim_Montgomery)

FYI, after looking through our usage profiler logs, it looks like a full loop takes somewhere between 3 and 3 and a half hours. Obviously this is a function of what type of hardware, and what the content of the sites is per computer. I was kind of surprised at this number. Luckily if there are any issued actions, they can interrupt this loop to get stuff done.

Pro tip: If you are examining the usage profiler logs, be VERY careful to know when the computer is going to sleep, because the log shows process time for a fixlet using the start and end times, without subtracting any sleep in between, so you may see some fixlets with REALLY long times.
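One way to keep sleep gaps from skewing the numbers is to set aside any entry whose start-to-end time is implausibly long before totalling things up. A minimal sketch, assuming the entries are already parsed into `(fixlet_key, start, end)` tuples; that layout, the function name, and the 5 minute cutoff are all assumptions for illustration, so pick a cutoff that fits your own environment.

```python
from datetime import datetime, timedelta

def split_suspect(records, max_plausible=timedelta(minutes=5)):
    """Separate entries whose duration probably spans a sleep.

    records: iterable of (fixlet_key, start, end) tuples (hypothetical layout).
    Returns (normal, suspect), each a list of (fixlet_key, duration).
    """
    normal, suspect = [], []
    for key, start, end in records:
        duration = end - start
        if duration > max_plausible:
            # Too long to be believable; likely includes time spent asleep.
            suspect.append((key, duration))
        else:
            normal.append((key, duration))
    return normal, suspect

# Made-up sample: the second entry straddles a machine sleep of several hours.
t0 = datetime(2009, 12, 14, 22, 0, 0)
records = [
    ("siteA/101", t0, t0 + timedelta(seconds=2)),
    ("siteA/102", t0 + timedelta(seconds=2), t0 + timedelta(hours=6)),
    ("siteB/7", t0 + timedelta(hours=6), t0 + timedelta(hours=6, seconds=1)),
]
normal, suspect = split_suspect(records)
total_normal = sum((d for _, d in normal), timedelta())
```

The suspect list is worth eyeballing rather than discarding outright, since a genuinely slow fixlet would land there too.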

(I was working with always on machines)

–Jim