Statistic range suddenly empty

system · January 12, 2009, 4:29pm

(imported topic written by MartinZ91)

I’m using relevance to analyse the statistic bins of properties. Those properties are defined in an analysis with “KeepStatistics=true”, and the relevance is executed against the SOAP API using PHP.

This all went very well for several months, but it stopped working on Wednesday, 7 January. I traced the cause using the presentation debugger and found that there seems to be an effect on the number of bins returned or summarized.

I used the following relevance:

mean computer counts of totals (6*hour) of statistic range of bes properties whose (name of it as lowercase contains "number of relevant fixlets enterprise security")

This returns quite fast, both in the presentation debugger and from a custom SOAP request (7 milliseconds reported by the presentation debugger), but the output window now stays blank and no error message is printed. I then adjusted the number of hours to summarize and, surprisingly, got the expected results when I used 24 or any multiple of 24. The results stay blank if I use a number that is NOT a multiple of 24.

I know that there is a maximum number of bins kept and that the summarized totals start on specific boundaries according to the reporting intervals. But until now I thought that excess bins would simply be discarded, oldest first.

What could be the reason, and more importantly, what could I change to get the results again? Does anyone else have similar problems?

Martin

BenKus · January 13, 2009, 2:35am

(imported comment written by BenKus)

Hey Martin,

I am impressed that you have gotten so far. The statistical inspectors were created a little while back by one of our lead architects, who has a PhD in Math from Berkeley, and it seems like you need quite a background in statistics to figure them out (for instance, we have an inspector “logarithm kurtosis of”… want to know more about kurtosis? See here: http://en.wikipedia.org/wiki/Kurtosis)

To answer your question (or at least try):

The goal of the statistical inspectors is to provide historical information on a property without requiring an ever-expanding amount of storage. For instance, in your example you are storing the number of relevant Fixlets every 6 hours for all computers. Rather than keep every value for every computer over time (which would grow and grow and get computationally more expensive to process), we “aggregate” the data in “statistical bins” that let you get the data you are looking for (like the total, mean, max/min bound, standard deviation, kurtosis, etc.).

One of the ideas in the statistics is that you can consolidate the statistical bins over time to save space/computation. So you may care about the values every 6 hours for the last week, but you don’t really care about the values from 1 year ago on a 6-hour interval. So the system will compact the data to let you get the data on a 24-hour basis over the last year.
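To illustrate why consolidation loses resolution but not the summary statistics, here is a minimal Python sketch of a mergeable bin. It is not the BigFix implementation: the `Bin` class, its field names, and the sample values are all hypothetical. It only assumes that each bin stores a count, a sum, a sum of squares, and the min/max bounds, which is enough to recover the total, mean, and standard deviation after merging.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass
class Bin:
    """Hypothetical statistical bin: running sums instead of raw values."""
    count: int
    total: float
    total_sq: float  # sum of squared values, needed for the standard deviation
    lo: float
    hi: float

    @classmethod
    def from_values(cls, values):
        values = list(values)
        return cls(len(values), sum(values), sum(v * v for v in values),
                   min(values), max(values))

    def merge(self, other):
        # Merging only adds the running sums; the raw values are never needed.
        return Bin(self.count + other.count,
                   self.total + other.total,
                   self.total_sq + other.total_sq,
                   min(self.lo, other.lo),
                   max(self.hi, other.hi))

    @property
    def mean(self):
        return self.total / self.count

    @property
    def std_dev(self):
        # Population standard deviation from the running sums.
        variance = self.total_sq / self.count - self.mean ** 2
        return math.sqrt(max(variance, 0.0))


# Four made-up 6-hour bins consolidated into one 24-hour bin.
six_hour_bins = [Bin.from_values(v) for v in ([3, 5], [4, 4], [6, 2], [5, 7])]
daily = reduce(Bin.merge, six_hour_bins)
print(daily.count, daily.total, daily.mean, daily.std_dev)
```

After the merge, the daily bin still answers questions about the whole day, but the four 6-hour values can no longer be told apart.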

So my guess is that the data was aggregated to 24-hour intervals, which is why you can see the info if you use the multiples of 24 but not 6.

I will double-check with a developer to make sure I am giving you the proper info and let you know if I find anything new.

Does that help?

Ben

BenKus · January 13, 2009, 2:55am

(imported comment written by BenKus)

From our developer:

He’s probably just gone over the three-month mark, at which point the oldest data moves to 24-hour bins. He can still get 6-hour bins for the more recent data, however, if he limits the range:

mean computer counts of totals (6*hour) of range ((now - 30*day) & now) of statistic range of bes properties whose (name of it as lowercase contains "number of relevant fixlets enterprise security")
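The behaviour described here can be mimicked with a small toy model in Python. This is only a sketch of the idea, not how the inspectors work internally: the `totals` function, the tuple layout, and the rule "return nothing when the requested interval is not a multiple of every selected bin's width" are assumptions modelled on what is reported in this thread.

```python
from datetime import datetime, timedelta


def totals(bins, interval, start=None, end=None):
    """Sum bin totals per `interval`.

    bins: list of (start_time, width, total) tuples, oldest first.
    Returns [] when any selected bin cannot be regrouped into `interval`
    (assumed rule: interval must be a multiple of the bin width).
    """
    selected = [b for b in bins
                if (start is None or b[0] >= start)
                and (end is None or b[0] + b[1] <= end)]
    if not selected:
        return []
    if any(interval % width for _, width, _ in selected):
        return []  # e.g. 6-hour totals requested over 24-hour bins
    origin = selected[0][0]
    grouped = {}
    for bin_start, _, total in selected:
        key = (bin_start - origin) // interval
        grouped[key] = grouped.get(key, 0) + total
    return [grouped[k] for k in sorted(grouped)]


now = datetime(2009, 1, 12)
six_hours, one_day = timedelta(hours=6), timedelta(days=1)

# Made-up history: 10 old daily bins, then 30 days of 6-hour bins.
history = [(now - timedelta(days=40) + i * one_day, one_day, 4) for i in range(10)]
history += [(now - timedelta(days=30) + i * six_hours, six_hours, 1) for i in range(120)]

print(len(totals(history, six_hours)))   # whole range: old daily bins get in the way
print(len(totals(history, one_day)))     # 24-hour totals work over the whole range
print(len(totals(history, six_hours, start=now - timedelta(days=30))))  # limited range
```

In this model the unrestricted 6-hour query comes back empty, while both the 24-hour query and the 6-hour query limited to the last 30 days return data, which matches the symptoms Martin described.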

Ben

system · January 13, 2009, 10:51am

(imported comment written by jessewk)

For the record as of 6.0:

We keep a maximum of 2048 statistical bins of 5 minute duration, 2048 bins of 1 hour duration, and 2048 bins of 1 day duration. This is equivalent to about a week’s worth of 5 minute bins, three months’ worth of hour bins, and 5.5 years of day bins. The bins of a given property will never overlap and always form a contiguous range.
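For anyone who wants to check the arithmetic, here is a short Python sketch that converts the bin limits above into time spans. The constant and function names are mine, chosen for illustration; only the 2048 limit and the three bin widths come from the figures above.

```python
from datetime import timedelta

MAX_BINS = 2048  # per-resolution limit quoted above (as of 6.0)

# Bin widths quoted above.
BIN_WIDTHS = {
    "5 minute": timedelta(minutes=5),
    "1 hour": timedelta(hours=1),
    "1 day": timedelta(days=1),
}


def coverage(width, max_bins=MAX_BINS):
    """Time span covered by `max_bins` contiguous bins of the given width."""
    return width * max_bins


for name, width in BIN_WIDTHS.items():
    days = coverage(width) / timedelta(days=1)
    print(f"{name} bins cover {days:.1f} days ({days / 365.25:.2f} years)")
```

The spans work out to roughly 7 days, 85 days, and 5.6 years, in line with the rounded figures above.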

system · January 13, 2009, 6:27pm

(imported comment written by MartinZ91)

Thanks Ben, Jesse - I’ll let you know whether I can resolve the problem.

Regarding my knowledge of statistics: I’ve been working as an experimental physicist, collecting and analysing data from particle physics detectors.

system · January 13, 2009, 8:28pm

(imported comment written by jessewk)

Hi Martin,

If you need further help with the statistical inspectors, there is more documentation in the Session Inspector guide. Of course also feel free to ask questions here. It’s really cool to see you using these advanced inspectors!

Jesse