Regarding relays, the current guidance is a 1000:1 client-to-relay ratio; the roadmap presented at Think 2018 indicates we’ll see updates supporting up to 5000:1.
For those deploying relays on different platforms, do you see different practical limits on the effective workload?
Our core server is Windows, but most of our clients connect to CentOS 7-based relays. We’re finding that the 1000:1 ratio is a firm limit on Linux; when relays approach that limit, “weird” things start happening, such as a machine appearing offline in the console when it’s known to be online. Often sending a refresh will make it reappear, but sometimes not.
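For anyone wanting to sanity-check how close a relay is to that ratio, here’s a rough sketch that counts established TCP connections on the relay port by reading /proc directly. It assumes the default BES port of 52311 (change RELAY_PORT if your deployment uses a custom port), and an instantaneous connection count is only a loose proxy for registered client load, since clients don’t hold persistent connections:

```python
#!/usr/bin/env python
"""Rough count of established client connections to a relay.

Assumes the relay listens on the default BES port 52311; adjust
RELAY_PORT if your deployment uses a custom port. A sketch for
eyeballing load, not an official sizing tool.
"""

RELAY_PORT = 52311   # default BigFix port; change if customized
ESTABLISHED = "01"   # TCP state code for ESTABLISHED in /proc/net/tcp

def count_established(path):
    count = 0
    try:
        with open(path) as f:
            next(f)  # skip the header line
            for line in f:
                fields = line.split()
                local, state = fields[1], fields[3]
                port = int(local.rsplit(":", 1)[1], 16)  # port is hex
                if port == RELAY_PORT and state == ESTABLISHED:
                    count += 1
    except IOError:
        pass  # e.g. /proc/net/tcp6 absent when IPv6 is disabled
    return count

total = count_established("/proc/net/tcp") + count_established("/proc/net/tcp6")
print("Established connections on port %d: %d" % (RELAY_PORT, total))
print("Fraction of the 1000:1 guideline: %.0f%%" % (total / 1000.0 * 100))
```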
Is this limit in fact constrained by the root user’s ulimit?
http://www-01.ibm.com/support/docview.wss?uid=swg22006914
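One way to check: a daemon’s limits are fixed when it starts and can differ from what `ulimit -n` reports in a root shell, so it’s worth reading them straight from /proc for the relay process itself. A minimal sketch, assuming the relay daemon is named BESRelay (verify the name with ps on your own boxes):

```python
#!/usr/bin/env python
"""Inspect the open-file limit of the running relay process.

Reads /proc/<pid>/limits rather than trusting the shell's ulimit,
since a daemon inherits its limits at startup. Assumes the relay
process is named "BESRelay"; adjust PROC_NAME if yours differs.
"""
import os

PROC_NAME = "BESRelay"  # assumed relay daemon name

def find_pids(name):
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                if f.read().strip() == name:
                    pids.append(entry)
        except IOError:
            continue  # process exited while we were scanning
    return pids

for pid in find_pids(PROC_NAME):
    with open("/proc/%s/limits" % pid) as f:
        for line in f:
            if line.startswith("Max open files"):
                print("pid %s: %s" % (pid, line.strip()))
```

If the number printed there is lower than what the docview article recommends, that would be consistent with the symptoms above.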
I’m reminded of another product I administered on Solaris and later Linux, which published guidance on kernel parameters for given sizing scenarios. I seem to recall these were things like open-file limits and memory handles, often scaled well beyond what was normal for a process running in userspace. Perhaps IBM should publish something similar for dedicated relays.
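Something like the check below is the shape I have in mind. The floor values here are purely illustrative placeholders, not IBM-published figures; a published table like this tuned for dedicated relays would be the useful part:

```python
#!/usr/bin/env python
"""Compare a few kernel parameters against illustrative floor values.

The thresholds are placeholders for the kind of numbers such a tuning
guide might contain; they are NOT IBM guidance for relays.
"""

# sysctl key -> illustrative minimum for a dedicated relay (assumed values)
CHECKS = {
    "fs.file-max": 100000,
    "net.core.somaxconn": 1024,
}

def read_sysctl(key):
    # sysctl keys map directly onto the /proc/sys tree
    path = "/proc/sys/" + key.replace(".", "/")
    with open(path) as f:
        return int(f.read().split()[0])

for key, floor in sorted(CHECKS.items()):
    current = read_sysctl(key)
    verdict = "OK" if current >= floor else "below suggested floor"
    print("%-22s current=%-8d floor=%-8d %s" % (key, current, floor, verdict))
```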