A lot of it depends on how your environment is configured.
My environment supports 50k+ endpoints. I have 40 dedicated Relays supporting user workstations and another 30+ Relays dedicated to servers. We have areas of the network segmented off for certain types of servers for security reasons, and I keep two active Relays in each area. As a result, the server Relays usually support fewer clients than the user Relays. I should also point out that we use Automatic Relay Selection for all computers other than the Relays themselves. To minimize network traffic while clients look for an available Relay, I use Relay Affiliation Groups, which endpoints assign themselves to based on geographic location. I also use the MOD function on the Computer ID to further spread endpoints between groups.
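
To make the MOD idea concrete, here is a minimal Python sketch of the arithmetic only. It is not BigFix relevance and not my actual configuration: the function name, the `Location-N` naming scheme, and the bucket count of 4 are all assumptions chosen for illustration.

```python
from collections import Counter


def affiliation_group(location: str, computer_id: int, buckets: int = 4) -> str:
    """Build a hypothetical affiliation group name such as 'Chicago-3'.

    The location part narrows the endpoint to nearby Relays; the
    computer_id % buckets part splits endpoints in that location into
    `buckets` roughly equal sub-groups.
    """
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    return f"{location}-{computer_id % buckets}"


def group_sizes(location: str, computer_ids: list[int], buckets: int = 4) -> Counter:
    """Count how many endpoints land in each group, to check the spread."""
    return Counter(affiliation_group(location, cid, buckets) for cid in computer_ids)
```

For example, `affiliation_group("Chicago", 1234567)` gives `"Chicago-3"`, and a run of consecutive IDs divides evenly across the four groups. Real Computer IDs are not consecutive, so the real spread will only be approximately even.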
For me, I tend to check the following:
- Are the Relays all reporting? (I have a Web Report that sends me an email when a Relay hasn’t reported in 4 hours.) ALL of my Relays are dedicated Relays; they serve no other role.
- Are the clients evenly spread across the Relays?
- Is any Relay serving a larger or smaller number of clients than expected?
- I check the number of Stopped and Expired actions. For performance reasons I only keep 3 months' worth of Stopped/Expired actions. After that, I delete them. Depending on the size of your environment, you may be able to keep more.
- Are clients checking into the system as expected? I can usually count on 20% of the endpoints being “offline” at any given time. On Monday mornings it can be higher until everyone shows up at work and lab equipment that was powered off over the weekend is powered back on.
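
If you want to script a couple of these checks against exported data, something along these lines could work. This is a rough sketch, not how my Web Report is built: the input dictionaries, function names, and the 50% tolerance are assumptions, and you would need to populate the inputs from your own export. Only the 4-hour threshold comes from what I described above.

```python
from datetime import datetime, timedelta


def stale_relays(last_reports: dict[str, datetime], now: datetime,
                 max_age: timedelta = timedelta(hours=4)) -> list[str]:
    """Return the Relays whose last report is older than max_age."""
    return sorted(name for name, seen in last_reports.items()
                  if now - seen > max_age)


def unbalanced_relays(client_counts: dict[str, int],
                      tolerance: float = 0.5) -> list[str]:
    """Return Relays whose client count differs from the mean of the
    group by more than `tolerance` (as a fraction of the mean)."""
    if not client_counts:
        return []
    mean = sum(client_counts.values()) / len(client_counts)
    return sorted(name for name, count in client_counts.items()
                  if abs(count - mean) > tolerance * mean)


def offline_fraction(total_endpoints: int, offline_endpoints: int) -> float:
    """Fraction of endpoints currently not checking in."""
    if total_endpoints <= 0:
        raise ValueError("total_endpoints must be positive")
    return offline_endpoints / total_endpoints
```

Run `unbalanced_relays` separately for the user Relays and the server Relays. Server Relays are expected to carry fewer clients, so mixing the two groups would flag them for no reason.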