WebUI and FillDB bogging down

Just curious: at what time did the ETL complete? For example, if the ETL took 60 minutes, that would explain why FillDB was stuck long enough for agents to be assumed to be not reporting…

Here is the complete section of that log… it still hasn’t completed.

Thu, 04 May 2017 06:50:26 GMT bf:bfetl:debug Running cleanup
Thu, 04 May 2017 06:59:59 GMT bf:bfetl:debug Running analyze with a threshold of 1000
Thu, 04 May 2017 06:59:59 GMT bf:bfetl:debug Updating statistics on ACTIONS
Thu, 04 May 2017 07:00:04 GMT bf:bfetl:debug Updating statistics on ACTION_TARGET_STATIC
Thu, 04 May 2017 07:05:34 GMT bf:bfetl:debug Updating statistics on COMPUTER_ACTIONS
Thu, 04 May 2017 07:10:31 GMT bf:bfetl:debug Updating statistics on COMPUTER_ANALYSES
Thu, 04 May 2017 07:24:26 GMT bf:bfetl:debug Updating statistics on COMPUTER_GROUPS
Thu, 04 May 2017 07:25:25 GMT bf:bfetl:debug Updating statistics on COMPUTER_PROPERTY_INFO
Thu, 04 May 2017 08:07:23 GMT bf:bfetl:debug Updating statistics on COMPUTER_PROPERTY_TEXT
Thu, 04 May 2017 10:38:55 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLET_ACTIONS
Thu, 04 May 2017 10:39:12 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLET_ACTION_TRANSLATIONS
Thu, 04 May 2017 10:40:49 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLET_FIELDS
Thu, 04 May 2017 10:41:11 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLET_RELEVANCE
Thu, 04 May 2017 10:41:33 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLET_TRANSLATIONS
Thu, 04 May 2017 10:49:49 GMT bf:bfetl:debug Updating statistics on COMPUTERS
Thu, 04 May 2017 10:49:50 GMT bf:bfetl:debug Updating statistics on COMPUTER_BASELINES
Thu, 04 May 2017 10:51:45 GMT bf:bfetl:debug Updating statistics on COMPUTER_FIXLETS
Thu, 04 May 2017 13:56:20 GMT bf:bfetl:debug Updating statistics on COMPUTER_ROLES
Thu, 04 May 2017 13:56:51 GMT bf:bfetl:debug Updating statistics on COMPUTER_SITES
Thu, 04 May 2017 15:36:53 GMT bf:bfetl:debug Updating statistics on COMPUTER_USERS
Thu, 04 May 2017 15:38:09 GMT bf:bfetl:debug Updating statistics on EXTERNAL_FIXLETS
Thu, 04 May 2017 15:38:20 GMT bf:bfetl:debug Updating statistics on SITE_USERS
Thu, 04 May 2017 15:38:20 GMT bf:bfetl:debug Sending request for 66 tables
Thu, 04 May 2017 15:38:20 GMT bf:bfetl:debug Received request for 66 tables
Thu, 04 May 2017 15:38:21 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/actions?sequence=2027215536
Thu, 04 May 2017 15:39:41 GMT bf:bfetl:debug Updated ACTIONS 1939 rows in 79.712 seconds (24 rows per second)
Thu, 04 May 2017 15:40:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/action-fields?sequence=2027215537
Thu, 04 May 2017 15:40:57 GMT bf:bfetl:debug Updated ACTION_FIELDS 0 rows in 1.953 seconds
Thu, 04 May 2017 15:40:57 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/action-parameters?sequence=2027215537
Thu, 04 May 2017 15:40:59 GMT bf:bfetl:debug Updated ACTION_PARAMETERS 0 rows in 1.938 seconds
Thu, 04 May 2017 15:40:59 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/action-target-names?sequence=2027215537
Thu, 04 May 2017 15:41:01 GMT bf:bfetl:debug Updated ACTION_TARGET_NAMES 0 rows in 1.859 seconds
Thu, 04 May 2017 15:41:01 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/action-target-static?sequence=2027215537
Thu, 04 May 2017 15:41:26 GMT bf:bfetl:debug Updated ACTION_TARGET_STATIC 20985 rows in 25.491 seconds (823 rows per second)
Thu, 04 May 2017 15:41:27 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/action-user-groups?sequence=2027215540
Thu, 04 May 2017 15:41:29 GMT bf:bfetl:debug Updated ACTION_USER_GROUPS 0 rows in 1.907 seconds
Thu, 04 May 2017 15:41:29 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/analysis-activations?sequence=2027215540
Thu, 04 May 2017 15:41:29 GMT bf:bfetl:debug Updated ANALYSIS_ACTIVATIONS 0 rows in 0.047 seconds
Thu, 04 May 2017 15:41:29 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-actions?sequence=2027215540
Thu, 04 May 2017 15:41:56 GMT bf:bfetl:debug Updated COMPUTER_ACTIONS 9779 rows in 27.033 seconds (361 rows per second)
Thu, 04 May 2017 15:41:56 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-analyses?sequence=2027215540
Thu, 04 May 2017 15:42:37 GMT bf:bfetl:debug Updated COMPUTER_ANALYSES 8062 rows in 40.238 seconds (200 rows per second)
Thu, 04 May 2017 15:42:37 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-groups?sequence=2027215547
Thu, 04 May 2017 15:42:54 GMT bf:bfetl:debug Updated COMPUTER_GROUPS 1518 rows in 17.314 seconds (87 rows per second)
Thu, 04 May 2017 15:42:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-property-info?sequence=2027215547
Thu, 04 May 2017 15:45:08 GMT bf:bfetl:debug Failed to update COMPUTER_PROPERTY_INFO: Curl failed: Transferred a partial file
Thu, 04 May 2017 15:45:08 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-property-text?sequence=2027215972
Thu, 04 May 2017 16:21:21 GMT bf:bfetl:debug Updated COMPUTER_PROPERTY_TEXT 963973 rows in 2172.744 seconds (443 rows per second)
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-analyses?sequence=2027216669
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_ANALYSES 0 rows in 0.031 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-analysis-fields?sequence=2027216669
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_ANALYSIS_FIELDS 0 rows in 0.063 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-analysis-properties?sequence=2027216677
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_ANALYSIS_PROPERTIES 0 rows in 0.031 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-analysis-relevance?sequence=2027216677
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_ANALYSIS_RELEVANCE 0 rows in 0.031 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-action-settings?sequence=2027216677
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_ACTION_SETTINGS 0 rows in 0.094 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-action-settings-user-groups?sequence=2027216677
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_ACTION_SETTINGS_USER_GROUPS 0 rows in 0.094 seconds
Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-components?sequence=2027216677
Thu, 04 May 2017 16:38:51 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_COMPONENTS 35 rows in 5.562 seconds (6 rows per second)
Thu, 04 May 2017 16:38:51 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-component-actions?sequence=2027216677
Thu, 04 May 2017 16:38:51 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_COMPONENT_ACTIONS 35 rows in 0.609 seconds (57 rows per second)
Thu, 04 May 2017 16:38:51 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-component-action-success?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_COMPONENT_ACTION_SUCCESS 35 rows in 0.235 seconds (148 rows per second)
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-component-groups?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_COMPONENT_GROUPS 5 rows in 0.140 seconds (35 rows per second)
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-fields?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_FIELDS 0 rows in 0.110 seconds
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-baseline-relevance?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_BASELINE_RELEVANCE 4 rows in 0.250 seconds (16 rows per second)
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-actions?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_ACTIONS 2 rows in 0.156 seconds (12 rows per second)
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-action-settings?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_ACTION_SETTINGS 0 rows in 0.031 seconds
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-action-settings-user-groups?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_ACTION_SETTINGS_USER_GROUPS 0 rows in 0.032 seconds
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-action-success?sequence=2027216677
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_ACTION_SUCCESS 1 rows in 0.062 seconds (16 rows per second)
Thu, 04 May 2017 16:38:52 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-fields?sequence=2027216677
Thu, 04 May 2017 16:38:53 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_FIELDS 1 rows in 0.235 seconds (4 rows per second)
Thu, 04 May 2017 16:38:53 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/custom-fixlet-relevance?sequence=2027216677
Thu, 04 May 2017 16:38:53 GMT bf:bfetl:debug Updated CUSTOM_FIXLET_RELEVANCE 2 rows in 0.093 seconds (21 rows per second)
Thu, 04 May 2017 16:38:53 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/dashboard-data?sequence=2027216677
Thu, 04 May 2017 16:38:54 GMT bf:bfetl:debug Updated DASHBOARD_DATA 13 rows in 1.813 seconds (7 rows per second)
Thu, 04 May 2017 16:38:54 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analyses?sequence=2027216677
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSES 23 rows in 0.062 seconds (370 rows per second)
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analysis-fields?sequence=2027216677
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSIS_FIELDS 51 rows in 0.125 seconds (408 rows per second)
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analysis-properties?sequence=2027216677
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSIS_PROPERTIES 58 rows in 0.297 seconds (195 rows per second)
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analysis-property-translations?sequence=2027216677
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSIS_PROPERTY_TRANSLATIONS 570 rows in 0.469 seconds (1215 rows per second)
Thu, 04 May 2017 16:38:55 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analysis-relevance?sequence=2027216677
Thu, 04 May 2017 16:38:56 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSIS_RELEVANCE 85 rows in 0.203 seconds (418 rows per second)
Thu, 04 May 2017 16:38:56 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-analysis-translations?sequence=2027216677
Thu, 04 May 2017 16:38:57 GMT bf:bfetl:debug Updated EXTERNAL_ANALYSIS_TRANSLATIONS 220 rows in 1.000 seconds (220 rows per second)
Thu, 04 May 2017 16:38:57 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-actions?sequence=2027216677
Thu, 04 May 2017 16:41:14 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_ACTIONS 44309 rows in 137.735 seconds (321 rows per second)
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-action-settings?sequence=2027216677
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_ACTION_SETTINGS 0 rows in 0.031 seconds
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-action-settings-user-groups?sequence=2027216677
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_ACTION_SETTINGS_USER_GROUPS 0 rows in 0.032 seconds
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-action-success?sequence=2027216677
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_ACTION_SUCCESS 0 rows in 0.031 seconds
Thu, 04 May 2017 16:41:27 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-action-translations?sequence=2027216677
Thu, 04 May 2017 16:44:26 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_ACTION_TRANSLATIONS 321510 rows in 178.610 seconds (1800 rows per second)
Thu, 04 May 2017 16:44:33 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-fields?sequence=2027216677
Thu, 04 May 2017 16:45:33 GMT bf:bfetl:debug Updated EXTERNAL_FIXLET_FIELDS 121340 rows in 60.475 seconds (2006 rows per second)
Thu, 04 May 2017 16:45:37 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-relevance?sequence=2027216677

I’m seeing similar behavior in my v9.5.4 environment. We have 45k endpoints and my WebUI is installed on a different server.

Occasionally, the server will just “stall”. There could be 450 - 700 client check in files in the BufferDir folder.

Sometimes it resolves itself after 45 minutes to an hour; other times I just reboot the server to keep information flowing.

When I opened a PMR, the first thing they asked me was whether I had any overloaded relays (I have two relays that are known for accepting way too many clients, around 4K each). I’m ordering new computers to act as additional Relays at the location that’s acting up.

I’m hoping that a scheduled upgrade to v9.5.5 with the ability to multi thread FillDB will help resolve the situation in combination with the new Relays. One thing to consider is that the Multi-Threaded features in FillDB are not active until you apply the “ParallelismEnabled” setting on the server and restart FillDB.

Tim… I’m on 9.5.5 with ParallelismEnabled set to 1, along with the other settings from the article.
We have a much smaller deployment here with 8K clients and 9 relays… One relay periodically gets over 2000 clients, and another 2 have 1000+, so that may be related… I may bring up another relay in the same location as the 2K one and see if that helps.

The root cause of the FillDB slowdown could be this specific ETL step, the COMPUTER_PROPERTY_TEXT update (963973 rows in 2172.744 seconds), which takes about 40 minutes to complete. During these 40 minutes, FillDB is very likely stuck.
There are two anomalies here:

  1. the number of transferred rows is very high for a deployment with 8K endpoints
  2. the insertion rate (443 rows per second) is very low

About the insertion rate: in our test environments (we use SSD disks), and in other customer environments, we observed insertion rates of 35K to 40K rows per second, so roughly 80 to 90 times higher than the 443 rows per second in your environment; your insertion rate is unexpectedly low.

The high number of transferred rows (bullet 1) could be due to the fact that the WebUI takes a long time to complete an ETL cycle in your environment. Every time the ETL runs, a lot of changes have occurred on the server and have to be replicated in the WebUI DB.
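
To compare tables side by side, the "Updated" lines in the ETL log can be ranked by elapsed time. The following is a minimal Python sketch, not part of BigFix; the function name slowest_transfers and the regular expression are assumptions based only on the line format visible in the log above.

```python
import re

# Matches the "Updated <TABLE> <N> rows in <S> seconds" part of an ETL log line.
# The pattern is an assumption derived from the log excerpt in this thread.
UPDATED = re.compile(r"Updated (\w+) (\d+) rows in ([\d.]+) seconds")


def slowest_transfers(log_lines, top=3):
    """Return (table, rows, seconds, rows_per_second) tuples, slowest first."""
    results = []
    for line in log_lines:
        match = UPDATED.search(line)
        if not match:
            continue  # GET lines, failures and statistics lines are skipped
        table = match.group(1)
        rows = int(match.group(2))
        seconds = float(match.group(3))
        rate = rows / seconds if seconds > 0 else 0.0
        results.append((table, rows, seconds, rate))
    results.sort(key=lambda item: item[2], reverse=True)
    return results[:top]


sample = [
    "Thu, 04 May 2017 15:41:26 GMT bf:bfetl:debug Updated ACTION_TARGET_STATIC 20985 rows in 25.491 seconds (823 rows per second)",
    "Thu, 04 May 2017 16:21:21 GMT bf:bfetl:debug Updated COMPUTER_PROPERTY_TEXT 963973 rows in 2172.744 seconds (443 rows per second)",
    "Thu, 04 May 2017 16:38:45 GMT bf:bfetl:debug Updated CUSTOM_ANALYSES 0 rows in 0.031 seconds",
]

for table, rows, seconds, rate in slowest_transfers(sample):
    print(f"{table}: {rows} rows in {seconds / 60:.1f} min ({rate:.0f} rows/s)")
```

Run against the full log, a list like this makes it easier to see which tables dominate the ETL cycle before opening a PMR.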

Again looking at the log I can see:

Thu, 04 May 2017 10:49:50 GMT bf:bfetl:debug Updating statistics on COMPUTER_BASELINES
Thu, 04 May 2017 10:51:45 GMT bf:bfetl:debug Updating statistics on COMPUTER_FIXLETS
Thu, 04 May 2017 13:56:20 GMT bf:bfetl:debug Updating statistics on COMPUTER_ROLES
Thu, 04 May 2017 13:56:51 GMT bf:bfetl:debug Updating statistics on COMPUTER_SITES

The update of the statistics on the COMPUTER_FIXLETS table takes more than 3 hours, and this is the main contributor to the long time required by the ETL cycle. While the cardinality of COMPUTER_FIXLETS depends on the number of computers and the number of fixlets, in a deployment with 8K computers I do not expect it to be very large. Even in larger environments, we have never seen a statistics update take this long.
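
A rough way to find slow statistics updates is to measure the gap between consecutive log lines. This Python sketch assumes that each update ends when the next log line is written, which the log does not state explicitly; statistics_durations is a hypothetical helper name, not a BigFix tool.

```python
from email.utils import parsedate_to_datetime

SEPARATOR = " bf:bfetl:debug "
PREFIX = "Updating statistics on "


def statistics_durations(log_lines):
    """Map table name -> time elapsed until the next log line.

    Assumption: an update is finished when the following line is logged,
    so the last line of the input never gets a duration.
    """
    parsed = []
    for line in log_lines:
        stamp, sep, message = line.strip().partition(SEPARATOR)
        if sep:
            # Timestamps such as "Thu, 04 May 2017 10:51:45 GMT" are RFC 2822 dates.
            parsed.append((parsedate_to_datetime(stamp), message))

    durations = {}
    for (start, message), (end, _next_message) in zip(parsed, parsed[1:]):
        if message.startswith(PREFIX):
            durations[message[len(PREFIX):]] = end - start
    return durations


sample = [
    "Thu, 04 May 2017 10:49:50 GMT bf:bfetl:debug Updating statistics on COMPUTER_BASELINES",
    "Thu, 04 May 2017 10:51:45 GMT bf:bfetl:debug Updating statistics on COMPUTER_FIXLETS",
    "Thu, 04 May 2017 13:56:20 GMT bf:bfetl:debug Updating statistics on COMPUTER_ROLES",
    "Thu, 04 May 2017 13:56:51 GMT bf:bfetl:debug Updating statistics on COMPUTER_SITES",
]

for table, elapsed in sorted(statistics_durations(sample).items(), key=lambda kv: kv[1], reverse=True):
    print(f"{table}: {elapsed}")
```

For the four lines quoted above, COMPUTER_FIXLETS comes out at 3:04:35, which matches the "more than 3 hours" reading of the log.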

There can be many different causes for this low performance of the WebUI DB. The disk subsystem could be the root cause, but so could a high number of WebUI users (not sure how many operators use WebUI in your environment).

I would suggest opening a PMR so that we can collect the performance data we usually need to troubleshoot issues like this one and help pinpoint the root cause of the slow insertion rate.

This was my thought as well. Likely the storage on the WebUI server where the WebUI cache is stored is too slow. It could also be that the storage behind the database the root server uses is slow, or something else entirely.

I don’t think the number of endpoints is related to the issue unless the majority of them are talking to the root server directly.

Did these “overloaded” relays get backed up with reports often? 4000 clients on a single relay should not be an issue if the relay is dedicated to being a relay and has fast enough storage / networking / etc. A lower number of clients per relay may help if the relay is getting backed up, but otherwise it will probably not help.

I never understood the concern with the Relays.

My concern was the stalled FillDB. Rebooting the server allows FillDB to process the waiting check-in files with no issues. Something is causing the FillDB service to stall out.

It doesn’t make sense if FillDB has a bunch of pending reports to process and they are not getting cleared out at all and the same ones are sticking around.

It might make more sense if FillDB is consistently backed up but still working through reports, with more coming in as fast as they are processed, but this doesn’t seem to be what you are describing.

If you have a bunch of overloaded relays, they could be sending up lots of reports and never getting through them all quickly enough, causing things to back up behind them. This could make it seem like FillDB is backed up as well when these relays send up lots of reports at once. It still shouldn’t be an issue if FillDB is processing reports fast enough, other than for the clients connected to the problem relays.

The drives on the server are all SSD, and even with 43k endpoints, there are rarely more than 20-30 files in the BufferDir folder. Surges might get up to ~100.

When FillDB stalls, there will be 600-800 files waiting to be processed.

If I try to stop the FillDB service, it fails to stop. Rebooting the server clears whatever is causing the hang-up, and shortly after rebooting, the BufferDir folder clears out.

Short-term, I’m planning to solve the Relay issue by ordering some new hardware and deploying 10+ new dedicated Relays.

This is exactly the same as I am seeing.
Typically a max of 20-30 files, with the occasional peak. Stopping FillDB fails. Stopping the WebUI service releases the buffer folder without the need for a server restart or a root server service restart.

While I don’t have SSDs, I do have fast drives on the server, with lots of RAM and cores.

The symptoms you describe do not sound like storage IO issues, but I would say there is no such thing as a “fast” spinning drive. The maximum IOPS of a spinning drive is around 200, while NVMe SSDs are over 1000 times faster at 200000+ IOPS. See: Disk Raid and IOPS Calculator - Expedient

I’m starting to suspect Database conflicts. Something seems to lock a record on the server and FillDB doesn’t like it. WebUI?

One thing you may also want to check is whether any of your operators are abusing the right-click Send Refresh functionality in the console. See the following article for an explanation and the settings to avoid the problem:

http://www-01.ibm.com/support/docview.wss?uid=swg21688336

In our setup, I deliberately turn off the right click to many…
I’m also not seeing any notify client forcerefresh actions…

Touch wood, no issues today… I’m also noticing that the loginTimeoutSeconds configuration doesn’t appear to be working for LDAP console operators, but it is for “local” operators. It did in the previous version.
It also appears to work for WebUI users… I have 23 Console users.

That seems likely. Not sure if the WebUI ETL would cause that or not, or if something else would cause that.

The procedure we usually use to verify if the WebUI is slowing down FillDB, is to correlate the WebUI ETL log with the FillDB performance log. If you do not have the FillDB performance log enabled, I would suggest enabling it and waiting for the issue to reoccur.
FillDB writes very frequently in the Performance log, unless it is stuck waiting for some lock to be released. So the procedure to troubleshoot the issue is:

  • as soon as you see reports piling up in the FillDB buffer directory, check the FillDB performance log and verify whether FillDB is still writing logs or is really stuck
  • check the WebUI ETL log and verify if any data transfer is occurring between the WebUI and the server. You should see a line like:
    bf:bfetl:debug GET <url>
    This means the WebUI has started requesting data and is processing it
  • wait for the WebUI to complete this ETL call. You should see in the log something like:
    bf:bfetl:debug Updated <table_name> <rows_number> rows in <seconds> seconds (<rate> rows per second)
  • check again the FillDB performance log: if FillDB has restarted writing logs soon after the ETL completed, it means that the WebUI ETL had locked some data and prevented FillDB from updating them.
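
The ETL side of this correlation can be sketched in Python. The sketch pairs each GET line with the Updated or Failed line that follows it and reports which ETL call was open at a given moment. The FillDB performance log format is not shown in this thread, so the stall time below is a made-up value that you would replace with the time FillDB stopped writing; etl_windows and active_at are hypothetical names.

```python
import re
from email.utils import parsedate_to_datetime

# Splits "<timestamp> GMT bf:bfetl:debug <message>"; based on the log format in this thread.
LINE = re.compile(r"^(.*? GMT) bf:bfetl:debug (.*)$")


def etl_windows(log_lines):
    """Pair each GET with the result line that follows it.

    Returns (start, end, url, outcome) tuples. end and outcome are None when
    the log stops before the request finished.
    """
    windows = []
    pending = None  # (start, url) of the GET still waiting for a result
    for line in log_lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        stamp = parsedate_to_datetime(match.group(1))
        message = match.group(2)
        if message.startswith("GET "):
            if pending:
                windows.append((pending[0], None, pending[1], None))
            pending = (stamp, message[4:])
        elif message.startswith(("Updated ", "Failed to update ")) and pending:
            windows.append((pending[0], stamp, pending[1], message))
            pending = None
    if pending:
        windows.append((pending[0], None, pending[1], None))
    return windows


def active_at(windows, when):
    """Return the ETL calls that were open at the given time."""
    return [w for w in windows if w[0] <= when and (w[1] is None or when <= w[1])]


sample = [
    "Thu, 04 May 2017 15:45:08 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/computer-property-text?sequence=2027215972",
    "Thu, 04 May 2017 16:21:21 GMT bf:bfetl:debug Updated COMPUTER_PROPERTY_TEXT 963973 rows in 2172.744 seconds (443 rows per second)",
    "Thu, 04 May 2017 16:45:37 GMT bf:bfetl:debug GET https://servername.xx.vv.zzz:52703/api/etl/external-fixlet-relevance?sequence=2027216677",
]

# Hypothetical time at which FillDB stopped writing to its performance log.
stall = parsedate_to_datetime("Thu, 04 May 2017 16:00:00 GMT")
for start, end, url, outcome in active_at(etl_windows(sample), stall):
    print(f"ETL call open during stall: {url} (started {start}, ended {end})")
```

If the stall time consistently falls inside a long ETL window and FillDB resumes shortly after that window ends, that supports the locking explanation described in the last step.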

We have seen FillDB slowdowns related to the WebUI ETL in the past, and we are working to solve them in a future update.

OK… it’s bogged down again…
I’m going to open a PMR for this… I may have to shut down WebUI totally.

Apparently there is an issue with ParallelismEnabled, particularly if the number of FillDB threads exceeds the CPU cores available, which could cause FillDB to stall.

I would try disabling it and look to enable it with more conservative thread counts once your issue is resolved.

How many CPU cores does your root server have? Does the root server have a local or remote database?

@jgstew
The server has 2 processors with 16 cores and 32 logical processors… 32 GB RAM… local SQL 2014 database.

I have just changed the registry entry for ParallelismEnabled to 0.

I will restart all the services and see what happens.

Well, turning off ParallelismEnabled didn’t fix it… After about 6 hours of waiting for the WebUI to initialize, the FillDBData BufferDir filled up with 722 files and stopped. All 8K machines in the master console greyed out… Stopping the WebUI service then flushed the BufferDir and everything burst back into life.