Efficiency pendulum... where should it stop?

(Calling out @jgstew right away as this is definitely in his wheelhouse…)

I am currently testing the use of a larger-than-James-would-recommend Site file to distribute 8 pieces of information from an external system to be stored in individual Client Settings per computer. The idea is that each Client looks through the file for the line that begins with its BES Client ID then parses the other pieces of information from the remainder of the line. It all works very nicely, and our network pipes are plenty fat enough, but many Clients are outside of our network, so I’ve been considering alternatives to sending the 1.5MB file to 15,000 Clients.
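For illustration, the lookup each Client does amounts to something like the sketch below. It’s written in PowerShell only to show the logic (the Client actually does this in relevance/action script), and the pipe-delimited “ClientID|field1|…|field8” layout is just an assumed example, not necessarily the real format:

```powershell
# Sketch only -- the real lookup happens in client relevance against the site file.
# Assumes a pipe-delimited "<ClientID>|field1|...|field8" layout (an assumption).
$clientId = '1234567'                      # hypothetical BES Client ID
$siteFile = 'C:\temp\external-data.txt'    # hypothetical local copy of the site file

$myLine = Get-Content $siteFile |
    Where-Object { $_.StartsWith("$clientId|") } |
    Select-Object -First 1

if ($myLine) {
    # Everything after the ID becomes the 8 per-computer values to store in Client Settings
    $values = ($myLine -split '\|')[1..8]
    $values
}
```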

One such “middle of the pendulum swing” alternative would be to split the file into 10 smaller files based on BES Client ID mod 10, uploaded to one of 10 sites that have the appropriate Clients subscribed. More setup on my part (but just once) and 10 API calls instead of 1 to upload the files.
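The split itself is simple; something like this sketch (the site names and the REST endpoint below are placeholders, and the exact upload path and authentication would need to be checked against the REST API reference):

```powershell
# Mod-10 split of the same "<ClientID>|..." file, one custom site per bucket.
$sourceFile = 'C:\temp\external-data.txt'
$server     = 'rootserver.example.com'     # hypothetical root server
$cred       = Get-Credential               # REST API account

# Bucket every line by BES Client ID mod 10
$buckets = Get-Content $sourceFile | Group-Object { [int64]($_ -split '\|')[0] % 10 }

foreach ($bucket in $buckets) {
    $outFile  = "C:\temp\external-data-mod$($bucket.Name).txt"
    $bucket.Group | Set-Content $outFile

    $siteName = "ExternalData-Mod$($bucket.Name)"   # hypothetical custom site per bucket

    # One API call per bucket (10 total). The endpoint path here is an assumption;
    # verify it against the REST API documentation before relying on it.
    Invoke-RestMethod -Uri "https://${server}:52311/api/site/custom/$siteName/file" `
        -Method Post -Credential $cred -InFile $outFile
}
```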

The other end of the pendulum swing is 15,000 API calls to upload one-line files to each computer individually. :open_mouth:

This process currently runs “fully” once per day at midnight, collecting and uploading the “one big file” to a Site subscribed to by “All Computers”. Then, a “delta” run is processed every 30 minutes, uploading a much smaller “delta” file (still sent to all computers) that contains only the computers whose information has changed. Any of the other options would likely operate on this same premise: an overnight “full” run plus periodic “delta” runs.
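The 30-minute delta run boils down to comparing the current export against the previous one and keeping only the changed lines; roughly this (paths and line layout assumed):

```powershell
# Sketch of the delta run: keep only the lines (computers) whose data has changed.
$previousFile = 'C:\bigfix-feed\full-previous.txt'   # hypothetical paths
$currentFile  = 'C:\bigfix-feed\full-current.txt'
$deltaFile    = 'C:\bigfix-feed\delta.txt'

# Index the previous run by client ID for quick lookups
$previous = @{}
Get-Content $previousFile | ForEach-Object { $previous[($_ -split '\|')[0]] = $_ }

# A line goes into the delta if it is new or differs from last time
Get-Content $currentFile |
    Where-Object { $previous[($_ -split '\|')[0]] -ne $_ } |
    Set-Content $deltaFile
```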

So, where can I get the most bang for the buck? The current process takes one second to upload the file, which then shows up on my test Client only a few minutes later. The processing time to find the Client’s information and record it in Client Settings is about 5 seconds. There is a small security concern with each computer having (non-sensitive) information about every other computer, which would be resolved by the mailbox solution, but at what processing cost? Is the “middle” solution even worth the effort since the “one big file” solution takes so little time already, and only the “mailbox” solution completely avoids the security concern? I may be able to do some parallel processing, but 15,000 API calls are gonna leave a mark on the root server, won’t they? If I throttled it to one call per second, it’d be doing that for over 4 hours! :grimacing:

Would love to hear anyone’s thoughts on this. Thanks! :slight_smile:

1 Like

I built myself a CMDB integration with an old system a few years ago, very much in the mould of what you are doing, and would say the short answer is “it depends on your use case”. Why do I say that? Because it entirely depends on the data: how frequently does the data change? How many machines would you expect it to change on? How up to date do you want to keep the data? How big is each endpoint’s data versus the total data? How concerned are you about the data security issue you mentioned (i.e. how sensitive is the data)? And so on.

In my use case the data changed daily for roughly a few thousand endpoints out of a total of ~20k, and the total data was about 150 MB, so sending a 150 MB file to all machines every day just so 2-3k of them could update their values seemed ridiculous to me! A file that big also carries evaluation overhead (lines whose (…) of file “…” would still read every single line of the file to get to the one it really needs)! Given that, it was clear to me that I should break the data into 1 file PER machine and only redistribute the files that have changed!

So what I did was write a script that does all the parsing and runs on the Platform root server (ours had enough resources to handle it, and you can conveniently place the folder under BES Server\UploadManagerData\Uploads so that it is directly accessible via a BigFix URL). My script was written in PowerShell. It would grab the full data from the external source, “explode” it into a separate file for each machine, and then generate a “manifest.file” containing the sizes, hashes, etc. of all the small files.

That manifest.file I would put in all the mailboxes, so clients always had the latest download metadata IF they needed to grab a file, but the actual relevance as to whether a client needs to update I controlled from within the script: further down it would load the per-machine data from the previous run and compare it to the current run. IF the data had changed (it doesn’t matter whether one field or all of them), it would overwrite the individual file (and the corresponding entry in manifest.file).

Then I had a fixlet that compares the SHA1 of the file that exists locally against the SHA1 listed in the updated manifest.file. IF the SHA1s mismatch, it’s a new file and the fixlet is relevant (all the fixlet does is download and overwrite the local file, which was only around 10 KB and pulled conditionally); IF the SHA1s match, the file hasn’t changed and the policy remains not relevant.
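In outline it was something like this (a simplified sketch, not the real script; the paths, file naming, and manifest layout here are just examples):

```powershell
# Explode the external export into one small file per machine and build a manifest
# of name|size|sha1 entries that the client-side fixlet compares against.
$exportDir   = 'D:\BES Server\UploadManagerData\Uploads\cmdb'   # example location
$manifest    = Join-Path $exportDir 'manifest.file'
$currentData = Import-Csv 'D:\feeds\cmdb-export.csv'            # hypothetical source export

$entries = foreach ($row in $currentData) {
    $perMachine = Join-Path $exportDir "$($row.ComputerID).txt"   # one small file per machine
    $newContent = $row.PSObject.Properties.Value -join '|'

    # Only rewrite the per-machine file if its content actually changed
    $oldContent = if (Test-Path $perMachine) { (Get-Content $perMachine -Raw).TrimEnd() } else { $null }
    if ($oldContent -ne $newContent) {
        Set-Content -Path $perMachine -Value $newContent
    }

    # Manifest entry: name, size, sha1
    $hash = (Get-FileHash $perMachine -Algorithm SHA1).Hash.ToLower()
    $size = (Get-Item $perMachine).Length
    "$($row.ComputerID).txt|$size|$hash"
}
$entries | Set-Content $manifest
# manifest.file then goes out to all the mailboxes via the API, and only computers
# whose local sha1 no longer matches pull their ~10 KB file.
```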

By designing it in this way I tried to keep the load OFF the client and on a centralized location: comparing data sets, even big ones, is quite cheap (quick, and it releases the resources soon after) in good programming languages! It also kept the number of POST API calls to a minimum (one against a site/all endpoints, with a comparatively small amount of data), and from there file distribution was really negligible (an under-10 KB file sent only to the endpoints that had changed, so no waste there at all).

Hope it is of help to you.

1 Like

That’s a very interesting angle, having the computer reach out for the file versus using the Site/mailbox files. I’ll have to think about that one. My data lines are fairly short (less than 150 characters or so, most much shorter), so a 40-character SHA1 would save some space but not a lot; it would, however, solve the security issue.

Hmm…

1 Like

Agreed. As I said it is all down to the use case!

Definitely wouldn’t do that (the 15,000 individual API calls).

There is nothing wrong with this; I would just typically prefer to do it through a generated BigFix action that uses a prefetch to download the file, rather than a site file. But honestly, 1.5 MB is small enough that it is probably good enough. You just might not want it in the master actionsite.
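For example, generating the prefetch line for such an action could look like this (the URL and hosting location are just placeholders):

```powershell
# Compute the hashes/size needed for an ActionScript prefetch of the data file.
$file   = 'C:\feeds\external-data.txt'
$url    = 'http://rootserver.example.com:52311/Uploads/external-data.txt'   # hypothetical
$sha1   = (Get-FileHash $file -Algorithm SHA1).Hash.ToLower()
$sha256 = (Get-FileHash $file -Algorithm SHA256).Hash.ToLower()
$size   = (Get-Item $file).Length

# Prefetch line to drop into the generated action
"prefetch external-data.txt sha1:$sha1 size:$size $url sha256:$sha256"
```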

If it seems to be working fine as is, then don’t worry about it.

I like this idea in concept, but I don’t like as much having 10 sites created just for this purpose.

If you already had sites like:

Org/Windows
Org/Windows/Desktop
Org/Windows/Server
Org/Linux
Org/Mac

then maybe I would break it up that way, but that sounds challenging since you only have BigFix Client ID to go on.

1 Like

Another approach here is to deal with it as delta files.

One giant file that only changes once a year, with all of the EoY results in it.

12 medium files published at every EoM (the monthly file takes precedence over the EoY file), cleared out when the EoY file is updated.

31 tiny files published for the daily delta; the newest of these is the “winning” entry for any given endpoint. Cleared out when the EoM file is published.

Use sets to build the “most current” result starting with EoY, then overlay EoM (in reverse order), then overlay EoD (again in reverse order), to generate the final, up-to-date set.
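Conceptually, the overlay works like this (sketched in PowerShell just to illustrate the merge order; on the client it would be done with sets in relevance, and the file names here are examples):

```powershell
# Merge EoY -> EoM -> EoD so that newer layers overwrite older ones per computer.
# Assumes files named eoy.txt, eom-01.txt..eom-12.txt, eod-01.txt..eod-31.txt,
# each holding "<ClientID>|..." lines (all hypothetical names).
$merged = @{}

$layers = @(Get-Item 'C:\feeds\eoy.txt') +
          @(Get-ChildItem 'C:\feeds\eom-*.txt' | Sort-Object Name) +
          @(Get-ChildItem 'C:\feeds\eod-*.txt' | Sort-Object Name)

foreach ($file in $layers) {
    # Later (newer) layers overwrite earlier ones, so EoD beats EoM beats EoY
    Get-Content $file.FullName | ForEach-Object { $merged[($_ -split '\|')[0]] = $_ }
}

# $merged now holds the most current line per computer
```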

2 Likes

This requires smart processing on the automation side of things but this is probably the best approach.

1 Like

That’s definitely a possibility, but my concern is that I’m also updating the effective date of the settings that store the data. There’d be no difference between the look of a computer that has had no changes in a few months and a computer that (for whatever reason) stopped processing the script a few months ago.

I suppose that if we defined what “too stale” looked like (say, 30 days old), we could go with an EoM/EoD model, watching for anything where the effective date of these values was older than 30 days as needing investigation.

However, there’s also the situation where the external data has changed and I need to get that data to the client ASAP, which is why my process runs every 30 minutes with the intra-day delta file uploaded at that frequency.

This is ALL great information and is giving me plenty to think about… thank you!!

1 Like

You could have the computer write the values it finds to the settings every time the update action runs, even if the values are unchanged…

The idea would be that you only change the intra-day file if the data has changed; otherwise you wouldn’t touch it. If you did change it, then it would propagate.

1 Like

What if the script were to perform a monthly “full” process split into 10 mod-10-based groups, so that only BES Computer IDs ending in 1 got a full update on the first of the month, 2s on the second, etc.? Instead of 15,000 API calls, there would only be 1,500 a day over 10 days (along with the delta calls). Would using computer mailboxes possibly make sense then? I could even spread it out over the first 20 or 25 days of the month instead of 10.

So, a given computer would get a file uploaded to its mailboxsite at least once per month, with all computers spread out more-or-less evenly over 25 days, with additional files being uploaded to mailboxsites for computers when their information has changed, within 30 minutes of the change.
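On the script side, picking who gets their full refresh on a given day could be as simple as this (the mod-25 mapping is just one way to spread it; names and paths are placeholders):

```powershell
# A computer gets its monthly "full" mailbox push when (ID mod 25) + 1 equals today's day.
$dayOfMonth = (Get-Date).Day
$allRows    = Get-Content 'C:\bigfix-feed\full-current.txt'   # "<ClientID>|..." lines

$dueToday = $allRows | Where-Object {
    $id = [int64]($_ -split '\|')[0]
    ($id % 25) + 1 -eq $dayOfMonth
}

# $dueToday would then be uploaded to each computer's mailbox (one API call per line),
# alongside the usual 30-minute delta uploads for anything that changed.
```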

Hmmm… :thinking:

1 Like