The Decompress utility will decompress (and optionally decrypt) the documents that end up in the FillDB buffer directory. These documents are usually compressed (and possibly encrypted). Run this tool to turn them into a human-readable format for support purposes.
Download - Version 1.0.0.1 (362 KB)
I am running into some constraints with the FillDB overload and would like to extract one of these big files coming back to FillDB to be processed. I would like to know exactly what the content is.
I downloaded it and tried to run it a few different ways, but it isn't doing anything. Neither decompress.exe --help nor /? is working…
Yeah, it’s a single file with a bunch of reports from different clients. I just saw them…
We are already evaluating a new HW design … it turned out that having the MSSQL and the IEM data in the same array is a nightmare …
You mean you have both the FillDB and the MSSQL DB on the same RAID array?
That is definitely not a good idea. A larger/slower RAID 5 array for bulk storage of the download cache is fine. Something like the Intel S3700 or similar PCI Express storage, with frequent backups to an internal RAID array that then gets shipped off elsewhere, is going to give you the best performance. Short of that, multiple RAID 1 volumes of SSDs for FillDB, SQL, etc. are a good idea.
How many endpoints in this environment? How many console operators?
Yeah, I totally agree with that post … planning to do this new configuration when the new box arrives.
My main issue is with the I/O… I was just curious to find out why the FillDB files were coming in at 1 MB, which was explained when I extracted one and saw the multiple messages from multiple clients.
We are running with almost 40K endpoints and a couple of consoles.
You'll often see 1 MB files come in from relays because that is the size limit they are allowed to post, by default. So if a relay has 1 MB or more of reports from lower-level clients, it will post 1 MB at a time up to the server. If you see this all the time, or a lot, it means that the server is not keeping up with the incoming report load (as you suspected), and thus the relays are filling up and posting the max size often.
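If you want a rough sense of how often that is happening, a quick Python sketch like the one below can count how many files in the buffer directory are sitting at or near the 1 MB cap. The path is just the usual default install location (an assumption on my part); adjust it for your server.

```python
import os

# Assumed default BufferDir location -- adjust for your actual install path.
BUFFER_DIR = r"C:\Program Files (x86)\BigFix Enterprise\BES Server\FillDBData\BufferDir"
CAP_BYTES = 1 * 1024 * 1024  # default relay post size limit (~1 MB)

total = 0
near_cap = 0
for root, _, files in os.walk(BUFFER_DIR):
    for name in files:
        try:
            size = os.path.getsize(os.path.join(root, name))
        except OSError:
            continue  # FillDB may consume a file between listing and stat
        total += 1
        # Treat anything within ~5% of the cap as a max-size relay post.
        if size >= CAP_BYTES * 0.95:
            near_cap += 1

print(f"{near_cap} of {total} buffered files are at/near the 1 MB relay post cap")
```

If a large fraction of the buffered files are at the cap most of the time, the relays are almost certainly backed up behind the root.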
When you run the decompress utility, it uncompresses the reports coming from relays and breaks them down into .0, .1, .2 and .3 files. The .3 files are the individual reports from clients. If you see a lot of larger .3 files (50 KB+), then clients are sending up more data than expected, or full reports, which could be a cause of isolated bufferdir backlogs.
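Once you have run the utility over one of the extracted batches, something like this sketch can flag the oversized .3 files for you. The extract directory is just an example, and the 50 KB threshold is the rule of thumb from above:

```python
import os

# Example directory where the Decompress utility output was written -- adjust as needed.
EXTRACT_DIR = r"C:\temp\decompressed_reports"
THRESHOLD = 50 * 1024  # rough cutoff for "larger than expected" client reports

large = []
for root, _, files in os.walk(EXTRACT_DIR):
    for name in files:
        if name.endswith(".3"):
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size >= THRESHOLD:
                large.append((size, path))

for size, path in sorted(large, reverse=True):
    print(f"{size / 1024:.0f} KB  {path}")
print(f"{len(large)} individual client reports over 50 KB")
```

That at least tells you which reports to open up and inspect if full reports are part of the problem.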
If you have a single top-level relay in front of your root that all other relays connect to, you might be able to increase the timeout, the max size, and the max number of reports that the relay will take before sending them along to the root. This would allow the root to process the reports in larger batches, which might let it process them more quickly because it would be more sequential I/O instead of smaller, more random I/O.
I have never actually tried this, so I don’t know how much it would help, but you should be able to tell if it does just by seeing how much the FillDB buffer dir gets backed up after implementing it.
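One easy way to watch that is to sample the bufferdir periodically and see whether the backlog trends down after the change. A rough sketch, using the same assumed default path as above (the 60-second sample interval is arbitrary):

```python
import os
import time

# Assumed default BufferDir location -- adjust for your actual install path.
BUFFER_DIR = r"C:\Program Files (x86)\BigFix Enterprise\BES Server\FillDBData\BufferDir"
INTERVAL = 60  # seconds between samples

while True:
    count, total_bytes = 0, 0
    for root, _, files in os.walk(BUFFER_DIR):
        for name in files:
            try:
                total_bytes += os.path.getsize(os.path.join(root, name))
            except OSError:
                continue  # file was consumed by FillDB between listing and stat
            count += 1
    print(f"{time.strftime('%H:%M:%S')}  {count} files, {total_bytes / (1024 * 1024):.1f} MB waiting")
    time.sleep(INTERVAL)
```

If the file count keeps climbing during business hours and never drains, the root still can't keep up; if it stays near zero, you're in good shape.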
You could also set the minimum report interval to be something like 120 seconds.
All of this will have the effect of reducing the load on the root server, but it will also mean that the data in the console will not be as fresh, and it will take much longer for the status of actions to be reflected in the console.
Of course, it may not actually be longer than it is taking now, because of things being backed up. It could be the case that by making things report less often, it removes the backlog and you actually get things showing up in the console FASTER and not slower.
Given the state of things, I would also increase the minimum console refresh interval to at least 60 seconds to reduce some load on the root. This is especially true for a large number of simultaneous console operators. This will likely affect console performance more than FillDB ingestion speed, but it could help.
Yeah, we are already adjusting the minimum report interval for the non-critical servers so we can delay a little of the load coming to the bufferdir. Also, the console refresh is already at a good number: 10 min.
The main issue is having a single RAID card controlling a single disk array, with everything on it together.
This config came before me, so there isn't too much to do until the next HW arrives and we rebuild it in a better layout… splitting the disks and managing the arrays with more than one RAID controller.
With as many endpoints as you have, I wouldn’t recommend spinning disks at all for the FillDB buffer dir or the SQL db. You should be using 2 SSDs in RAID1 for both of these. Potentially use larger / slower disks for the rest of the storage.