Relay MaxChildCount Issue

techadmin · May 20, 2016, 12:31pm

I have added the below client setting to a relay.

_Enterprise Server_ClientRegister_MaxChildCount = 100

Clients use automatic relay selection in our environment, and I noticed that this relay, which has MaxChildCount set to 100, still accepts clients; the count is now at 150 and rising.

Am I missing something? Or should I do something apart from this setting?

Aram · May 20, 2016, 1:27pm

This setting only applies to Clients that have registered to the given Relay within the last 24 hours.

Note that in general, I recommend avoiding the use of this approach to limit the endpoints registering with a given Relay as it can lead to unexpected behavior given the time requirement described above. There are many different configuration settings and strategies that can be leveraged to ensure that the appropriate Clients are registering with the appropriate Relays.

techadmin · May 20, 2016, 1:31pm

Thanks @Aram for the clarification. When you get time, can you suggest the best approach for getting clients to choose the closest relay?

Aram · May 20, 2016, 1:35pm

Relay Affiliation ( https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Endpoint%20Manager/page/Relay%20Affiliation ) is among the better methods to guide and better control the automatic Relay selection process, and can usually be adapted to meet various requirements.

Could you provide a bit more detail around what you are attempting to achieve and/or the challenges you are running into so that we might provide more specific suggestions?

techadmin · May 20, 2016, 1:48pm

@Aram I will look into relay affiliation. Here is the scenario: we have 1 Root Server, 2 TLRs, and 50 SLRs. There are over 20k endpoints connecting to the 50 SLRs, and this was set up manually over a period of time.

Now we are planning to automate the relay selection process so that every endpoint points to its nearest relay, and to make sure that endpoints do not connect to the 2 TLRs. They should all be mapped to the 50 SLRs.

So, how do I do it in the best possible way?

TLR - Top Level Relays
SLR - Second Level Relays

Aram · May 20, 2016, 1:58pm

Here are some high level thoughts:

the Relays themselves should leverage manual Relay selection - the SLRs would have assigned primary/secondary TLRs
the TLRs would be ‘excluded’ from the automatic selection process by leveraging Affiliation, and configuring their advertisement list to not include *
the Clients would have appropriate seeklists configured to ensure they select from the appropriate list of SLRs, and then fail over to other entries if none of those SLRs are available. This might be as simple as including all the SLRs in the seeklist, or can be more granular depending on your network and requirements
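A rough way to picture how a seeklist and an advertisement list interact is sketched below in Python. This is an illustrative model only, not BigFix code: the relay names, group names, and the `candidate_relays` helper are all hypothetical, and the matching rule (the first seeklist group that any relay advertises wins, with `*` standing for the default group) is a simplification of the behavior described in the Relay Affiliation documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Relay:
    name: str
    # Groups this relay advertises; "*" stands for the default group.
    advertisement_list: list[str] = field(default_factory=lambda: ["*"])


def candidate_relays(seeklist: list[str], relays: list[Relay]) -> list[str]:
    """Return relay names for the first seeklist group that any relay advertises."""
    for group in seeklist:
        matches = [r.name for r in relays if group in r.advertisement_list]
        if matches:
            return matches
    return []


# Hypothetical topology: TLRs advertise only their own group, never "*".
relays = [
    Relay("TLR-1", ["toplevel"]),
    Relay("TLR-2", ["toplevel"]),
    Relay("SLR-01", ["site-a", "*"]),
    Relay("SLR-02", ["site-b", "*"]),
]

# A client at site A prefers its local SLR.
print(candidate_relays(["site-a", "*"], relays))  # ['SLR-01']
# A client whose site has no relay falls through to "*" and sees only SLRs.
print(candidate_relays(["site-c", "*"], relays))  # ['SLR-01', 'SLR-02']
```

In this model the TLRs never show up for ordinary clients because no client seeklist contains `toplevel` and the TLRs do not advertise `*`. Choosing the nearest relay among the candidates is left out of the sketch.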

techadmin · May 20, 2016, 2:04pm

I’ll work on this and will get back to you for any other queries related to this.
Thanks again for your support.

TimRice · May 21, 2016, 3:14am

@Aram, what about using

_BESRelay_Selection_AutoSelectableRelay

on the TLRs to prevent Endpoints from selecting them?

Aram · May 23, 2016, 12:59pm

Sure, that can be done too! If Relay Affiliation is already being leveraged however, and the TLR’s advertisement list doesn’t contain *, then the _BESRelay_Selection_AutoSelectableRelay setting essentially becomes redundant.

techadmin · May 23, 2016, 1:05pm

Nice find @TimRice! I have added this setting to the TLR.
But I’m also looking for the best way to do it through relay affiliation!

TimRice · May 23, 2016, 1:16pm

The official list of client settings is available from IBM.

@Aram, I’m more a “Belt & Suspenders” kind of Admin.

Pete_F · January 3, 2017, 2:27pm

Sorry to dig this one up again, but which spelling of this setting is correct?

_Enterprise Server_ClientRegister_MaxChildCount

OR _Enterprise ServerClientRegister_MaxChildCount

I’m seeing both mentioned, and am using the former set to 1000, but I have 1200 and 1500 clients on two relays that are set to 1000.

Aram · January 3, 2017, 3:36pm

The setting is:

_Enterprise Server_ClientRegister_MaxChildCount

As noted earlier in this thread, this setting is based on the number of clients that have registered to the given Relay within the last 24 hours. As such, the number of devices reporting that they are connected to the given Relay can exceed the value of this setting (if some of them are offline, for instance).

I generally recommend against leveraging this setting as it can lead to unexpected (and even undesirable) behavior, and would suggest instead a number of other Relay selection/configuration strategies.

jgstew · January 3, 2017, 7:23pm

Just FYI, there is no reason to limit a relay to exactly 1000 endpoints if the relay is dedicated to the task and has enough resources. Windows Relays have some default OS settings that mean that more than 2048 endpoints could be a problem without adjusting those settings, and even then I’ve seen 3000 endpoints on a single Windows Relay with no significant issues. Linux-based Relays don’t have the same limitation and could have many more endpoints connected at once.

Pete_F · January 3, 2017, 8:36pm

Thanks all… I see the underscore got removed in the post hence the confusion…
I was trying to reduce the load on two of the relays, but as they are dedicated and have some decent horsepower, I won’t worry too much…
I have restricted by adjusting the firewall to block a specific subnet in the past too… that worked pretty well

jgstew · January 3, 2017, 9:01pm

If you have a relay that is on dedicated hardware, then generally the only limitation is the number of simultaneous TCP connections that the OS can handle and not the hardware, especially if using SSD storage. The other bottleneck could be the network card if it only has 1 gig and you have tons of downloads going at once across the LAN, but that can be solved with a 10gig NIC.

Relays don’t tend to need a lot of RAM or CPU. I think 4 cores and 8GB of RAM is more than needed in most cases. A lot of BigFix processing tends to be single threaded so you are usually better off with fewer but faster CPU cores than many slower cores.

I like to see fairly large relay caches for top level relays and relays behind slower WAN links. I also think that consumer level SSDs like the Samsung 850 Pro work well since the caches tend to be write once read many. You can get a good size SSD these days for fairly low cost, sometimes cheaper than 15k SCSI disks.

Sujay · December 8, 2017, 8:04am

This thread is very helpful.

Could someone please explain how often the Relay checks for inactive connections and drops them so it can accept more connections?

For instance, say _Enterprise Server_ClientRegister_MaxChildCount = 1000 and all connections are currently active, and 30 minutes later 300 workstations go offline. How does the Relay decide to drop those 300 workstations and allow 300 new ones to connect?

Aram · January 2, 2018, 8:34pm

The timing calculation is based on when a given Client registers with the Relay in question. A given registration is counted towards the MaxChildCount so long as it was within the last 24 hours.
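To make the 24-hour window concrete, here is a minimal Python sketch of the counting rule described above, applied to the 1000/300 scenario from the question. It is a simplified model under stated assumptions, not the Relay's actual implementation: the `RelayModel` class, its method names, the accept/reject behavior, and the 12-hour re-registration of online clients are all hypothetical. It only illustrates that a registration keeps counting toward the limit until it is more than 24 hours old, whether or not the Client is still online.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)


class RelayModel:
    """Toy model: counts registrations seen within the last 24 hours."""

    def __init__(self, max_child_count: int):
        self.max_child_count = max_child_count
        self.last_registration: dict[str, datetime] = {}

    def counted_children(self, now: datetime) -> int:
        # Online/offline state is never consulted; only registration age matters.
        return sum(1 for t in self.last_registration.values() if now - t < WINDOW)

    def try_register(self, client_id: str, now: datetime) -> bool:
        last = self.last_registration.get(client_id)
        already_counted = last is not None and now - last < WINDOW
        # Assumption: a client that is already counted may always re-register.
        if not already_counted and self.counted_children(now) >= self.max_child_count:
            return False
        self.last_registration[client_id] = now
        return True


start = datetime(2017, 12, 8, 8, 0)
relay = RelayModel(max_child_count=1000)
for i in range(1000):
    relay.try_register(f"client-{i}", start)

# 30 minutes later 300 clients are offline, but their registrations still count.
later = start + timedelta(minutes=30)
print(relay.counted_children(later))            # 1000
print(relay.try_register("new-client", later))  # False

# Assume the 700 online clients re-register 12 hours in; the 300 offline ones do not.
for i in range(700):
    relay.try_register(f"client-{i}", start + timedelta(hours=12))

next_day = start + timedelta(hours=24, minutes=1)
print(relay.counted_children(next_day))            # 700
print(relay.try_register("new-client", next_day))  # True
```

In this model the 300 slots free up only once those clients' last registrations are more than 24 hours old, not when the workstations go offline.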