Relay Autoselection step by step details

(imported topic written by SY57_Jim_Montgomery)

Hey folks, I’m trying to understand relay autoselection and what exactly happens. I don’t think the entire process is included in a document, so I’ve pulled the following information from various places. Please correct anything that is not accurate below.

In this scenario there is a BES server and 100 relays, each at a different remote WAN location, and I’m setting up a new BigFix agent at a remote WAN location. The agent has the _BESClient_RelaySelect_MinRetryIntervalSeconds setting configured to 60, the default. Let’s also say that the agent is 3 hops away from its nearest relay. These guys have a wacky looking network, okay?

First, the client connects to the BES server, registers, and retrieves the list of all relays.

Next, the client sends a ping with TTL 0 to every relay in that list. Note this is NOT a broadcast ping: he’s pinging 100 devices with a TTL 0 ICMP packet here.

If there is no response, the client waits _BESClient_RelaySelect_MinRetryIntervalSeconds seconds (1 minute).

Next, the client sends a ping with TTL 1 to every relay in that list.

If there is no response, the client waits DOUBLE _BESClient_RelaySelect_MinRetryIntervalSeconds seconds (2 minutes).

Next, the client sends a ping with TTL 2 to every relay in that list.

If there is no response, the client waits QUADRUPLE _BESClient_RelaySelect_MinRetryIntervalSeconds seconds (4 minutes).

Next, the client sends a ping with TTL 3 to every relay in that list.

Response from a relay.

The client keeps track of which relay he is using, but he does NOT keep track of anything else, such as the results of the other pings.

The next time autoselection happens (after 6 hours, or if he can’t connect to the relay):

He’ll check his current relay and, if it is still there, keep using it.

Otherwise, he goes through the previous process again: pinging everything with increasing TTL and doubling the wait between each round of pings.
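To make the TTL rounds concrete, here is a minimal Python sketch of the expanding search as I understand it from the steps above. It is only a simulation: it sends no real ICMP packets, it ignores the waits between rounds, and the hop counts are supplied by hand. The function name, the relay names, and the tie-breaking rule (fewest hops, then name) are all made up for illustration and are not taken from the agent.

```python
from typing import Dict, Optional, Tuple


def select_relay(
    hops_to_relay: Dict[str, int],
    start_ttl: int = 0,
    max_ttl: int = 255,
) -> Optional[Tuple[str, int]]:
    """Simulate the rounds of pings with increasing TTL.

    hops_to_relay maps a (hypothetical) relay name to its hop distance
    from the client. Returns (relay, ttl) for the first round in which
    any relay would answer, or None if no round gets a response.
    """
    for ttl in range(start_ttl, max_ttl + 1):
        # In this model a relay answers only if the ping's TTL is large
        # enough to cover the hop distance, as in the scenario above
        # (a relay 3 hops away first answers in the TTL 3 round).
        responders = [
            (hops, name) for name, hops in hops_to_relay.items() if hops <= ttl
        ]
        if responders:
            # Tie-breaking is an assumption: fewest hops, then name.
            _, name = min(responders)
            return name, ttl
    return None


# Scenario from this post: the nearest relay is 3 hops away.
relays = {"relay-a": 5, "relay-b": 3, "relay-c": 7}
result = select_relay(relays)
print(result)
```

With the nearest relay 3 hops out, the rounds for TTL 0, 1, and 2 find nothing and the TTL 3 round picks `relay-b`. If no relay is within `max_ttl` hops, the function returns `None`, which corresponds to the "failed to select at all" case.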

I appreciate your help in making sure I understand this clearly.

Thanks guys,

–Jim

(imported comment written by BenKus)

Hey Jim,

That is all correct, except that the setting “_BESClient_RelaySelect_MinRetryIntervalSeconds” only applies if the agent fails to select a relay at all (that is, it goes through its rounds of pinging and doesn’t get a response from anyone). In that case it triggers another relay selection (I think the default is 10 minutes later), and the interval then doubles for every failure.

The amount of time it waits between rounds of pings is built into the agent, but I believe it is a pretty short amount of time.
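As a rough illustration of the doubling described above, here is a small Python sketch that computes the wait before each retry after a completely failed relay selection. The function name is hypothetical, the base interval is a parameter because the default is not settled in this thread, and the optional cap is purely an assumption of the sketch (nothing here says whether the doubling is bounded).

```python
from typing import List, Optional


def retry_delays(
    min_retry_seconds: int,
    failures: int,
    cap_seconds: Optional[int] = None,
) -> List[int]:
    """Return the wait (in seconds) before each retry of relay selection.

    min_retry_seconds plays the role of
    _BESClient_RelaySelect_MinRetryIntervalSeconds; the wait doubles
    after every failed selection. cap_seconds is an assumed upper bound
    and is ignored when left as None.
    """
    delays = []
    for attempt in range(failures):
        delay = min_retry_seconds * 2 ** attempt
        if cap_seconds is not None:
            delay = min(delay, cap_seconds)
        delays.append(delay)
    return delays


# Four consecutive failures starting from a 10 minute base interval.
schedule = retry_delays(600, 4)
print(schedule)
```

Starting from 600 seconds, four failures in a row give waits of 10, 20, 40, and 80 minutes. Note that these are the waits between whole selection attempts, not between the individual rounds of pings inside one attempt.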

Ben

(imported comment written by Shlomi91)

Hey,

Just a quick question:

Does the client ping the DNS name of the relays in the list or their IP address?

The reason I’m asking is that we don’t have DNS resolution between our WAN sites.

Is there a “best practice” for working without DNS resolution?

Also: what is the “last step” if no relay in the list returns a response? Does the client register with the BES root server? (Or in other words, in what case does the client register with the root server?)

thanks,

Shlomi

(imported comment written by jessewk)

Hi Shlomi,

It uses the DNS name by default, but you can switch to IP. This is very common among our large customers. See this KB article:

http://support.bigfix.com/cgi-bin/kbdirect.pl?id=182

For a complete description of the BES Relay selection behavior, start here:

http://support.bigfix.com/bes/misc/besrelays.html

Jesse