Does the IEM console (on disk) cache come into play when the console is open and running, or does it only come into play when closing and opening the console?
Or more generally, what is the difference between the console's RAM usage and the on-disk cache? Does the on-disk cache ever contain things that are not currently in the console's RAM while it is open, which would be used instead of pulling from the root server?
Does Globally hiding Fixlets/Tasks lighten the load on non-Master Operator consoles?
Good question. I’m going to try to answer it as simply as I can, but caching is an often-complicated topic so, you know, here be dragons.
The primary purpose of the on-disk cache is to speed up initial Console loads when bandwidth between the Console and Root Server is limited. The cache also serves to lessen load on the Root Server when the Console is starting up. While the Console is running, it refreshes data from the Root Server directly into memory and then writes it to the cache, either in the background or on Console exit (depending on the object type).
In some situations the Console on-disk cache can contain data that is not in memory. For instance, it could contain property results that have not yet been viewed in the Console and would potentially be lazy-loaded from the cache when, for example, a user adds a column to a computer view. This also applies to action results. After enough usage during a Console session, though, generally everything from the cache will be loaded into memory.
Yes, globally hiding fixlets and tasks can lighten the load for non-master operator consoles, although different deployments exhibit different memory patterns, so it would depend on how large a percentage of memory usage is taken up by fixlets/tasks. We usually see analysis/property results take up the largest percentage of memory, although lots of actions and action results, or fixlets and fixlet results, can do it, too. One thing to keep in mind is that the memory usage for a fixlet or action consists of a fixed cost for the definition plus a cost per device that reports a result. So, limiting NMOs to fewer administered devices can also help memory usage (although I know that's not always possible).
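To make that cost model concrete, here is a small sketch. The fixed and per-result costs are hypothetical placeholder numbers chosen only to illustrate the shape of the calculation, not measured figures from any real deployment:

```python
# Hypothetical cost model for Console memory usage per fixlet/action:
# total = objects * (fixed definition cost + per-device result cost).
# Both constants below are illustrative assumptions, not measured values.
FIXED_COST_KB = 50        # assumed cost of one fixlet/action definition
PER_RESULT_COST_KB = 1    # assumed cost per device reporting a result

def estimated_memory_kb(num_objects: int, devices_per_object: int) -> int:
    """Estimate memory as fixed definition cost plus per-device result cost."""
    return num_objects * (FIXED_COST_KB + devices_per_object * PER_RESULT_COST_KB)

# An NMO scoped to 500 devices vs. an operator seeing 5,000 devices,
# with the same 10,000 visible fixlets/actions:
print(estimated_memory_kb(10_000, 500))    # → 5500000 (KB)
print(estimated_memory_kb(10_000, 5_000))  # → 50500000 (KB)
```

The point of the sketch is that both levers mentioned above help: hiding fixlets/tasks reduces `num_objects`, while narrowing an NMO's administered scope reduces `devices_per_object`.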
In our case, we want to maximize console performance while minimizing root server load. I’d say at this point improving console performance is more of a concern than the root server load.
We are not particularly concerned with the load or speed at which the console opens or closes, only while it is in use.
The console is only used on a single dedicated terminal server with a 10-gigabit link and plenty of RAM and CPU to spare. We can have as many as 100 console sessions open at once, and performance can be quite poor; it also seems to have gotten worse since our upgrade from 9.0.x to 9.1.x (we currently have a PMR open about this).
It sounds like the on-disk Console cache does help somewhat, but only in certain cases. We currently have the console cache set to “Keep full cache on disk” and placed on fast SSDs.
I am curious whether there are any settings we should be tweaking related to the console or the OS of the terminal server, or whether we would see any improvement by adding terminal servers so that each hosts fewer console sessions (or would that be worse?).
You might want to try setting the Console to minimal cache settings, since you have fast access to the root server. There may be disk contention from all the terminal server sessions due to the caches, even though it’s on fast SSDs. In general, under conditions where there is tons of bandwidth between the Console and Root Server, we’ve seen better performance when setting cache to minimal.
Splitting up the Consoles across multiple terminal servers shouldn’t hurt performance at all. It may help if resource contention on the terminal server is an issue.
I’m going to attempt to summon @Aram and @steve to this thread, because they have more experience with large terminal server environments than I do.
I’m currently running my console with the cache turned completely off and I don’t see much improvement, but the point about disk contention is interesting, as is the idea that minimal caching can sometimes beat a full cache.
I switched over to minimal caching and I’ll see what that is like.
If you don’t see much improvement with the cache disabled, then the bottleneck is likely the root server/database or local CPU/RAM resources. For 100 simultaneous users, you could be using up to 100 cores at any given time, depending on the console refresh interval, and 100–200 GB of RAM. Are you sure you have enough resources on your terminal server? Assuming you have 64 or fewer cores, you would benefit from another terminal server to spread the CPU load.
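The worst-case sizing above works out like this (using the per-session figures from this reply, roughly one core and 1–2 GB of RAM per actively refreshing Console session, as rough planning assumptions rather than hard requirements):

```python
# Back-of-envelope terminal-server sizing for concurrent Console sessions.
# Per-session figures (1 core, 1-2 GB RAM) are the rough assumptions
# stated in the thread, not guaranteed per-session requirements.
def sizing(sessions: int, ram_low_gb: int = 1, ram_high_gb: int = 2):
    """Return (cores, min RAM GB, max RAM GB) for the worst case,
    where every session is refreshing at the same time."""
    cores_needed = sessions
    return cores_needed, sessions * ram_low_gb, sessions * ram_high_gb

cores, ram_lo, ram_hi = sizing(100)
print(cores, ram_lo, ram_hi)  # → 100 100 200
```

In practice sessions rarely all refresh at once, so actual usage sits well below this ceiling; the refresh interval controls how far below.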
Beyond that, I would look at DB performance and CPU usage on the root server, and possibly adjusting your minimum console refresh interval. What is your console refresh rate and is that set for all operators globally?
That refresh rate is definitely helping. The CPU usage suggests you never have more than 6 or 7 consoles refreshing (or doing any other CPU intensive operation like loading a dashboard or large analysis, or deploying an action) at any given time, which is possible given your refresh rate, but seems surprising with 100 simultaneous users. It sounds like a lot of people just have the console open, but aren’t necessarily doing a lot within it other than getting updated results.
You’ll have to focus on the server and DB at this point.
Thanks for the feedback @steve , that definitely points us in the right direction.
It is definitely the case that the users who use the console throughout the day tend to leave it open 100% of the time.
We have just switched to a 10 min refresh and disabled all the Linux Patching sites that no one seemed to be using, then rebooted the console terminal server. Things seem much faster today.