Relay SeekList Process

We have multiple SeekList prefixes (INT, DMZ, PCI) set up on our devices for automatic relay selection. When we move devices from our internal network (INT) to our DMZ network (DMZ), they disappear from our console. Their client logs show that they cannot communicate with our root server. That part is expected, since only the DMZ relays have open ports back to our internal network. However, I would expect that once the INT relays can no longer be found, the device would move on to the next prefix in line, DMZ, and locate one of our DMZ relays.

I am able to set the relay for these devices manually via their registry settings, but do not want to do this if auto select can be made to work. Can someone explain this process to me or am I correct and just have an issue I need to troubleshoot further?

This is indeed a “field of rakes”.

The client’s Affiliation SeekList and the relay’s Affiliation AdvertisementList values work only with Automatic Relay Selection. So be sure that the client relay selection is indeed set to Automatic, and that the DMZ Relays have their AdvertisementList value set.

The next thing is to consider how Automatic Relay Selection works. The client will attempt to ping each relay with ICMP. The Affiliation SeekList tells the client to ping the relays in each of its Affiliation Groups first, in order, before pinging relays outside those groups. Once a relay replies to the ICMP ping, the client will try to connect to it on tcp/52311 for BigFix traffic.

That’s crucial. If your DMZ Relay can’t be pinged, the client will not autoselect to it, regardless of affiliation groups.

Once a client has tried to ping every relay (progressing through its affiliation groups, if set) without success, it gives up on selecting a relay.

By default, after failing to find any relay the client will try to connect to the root server, as defined in the masthead file. This last round skips the ICMP Pings and just tries to connect with tcp/52311. So, if ICMP is blocked on all of your relays, then all of your automatic-select clients end up falling back to the root server.
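The selection order described above can be sketched roughly as follows. This is only a simplified model of the description in this post, not the client's actual implementation. The function name, the `ping` and `connect` callbacks, and the host names are all hypothetical stand-ins for an ICMP ping and a tcp/52311 connection attempt.

```python
from typing import Callable, Dict, List, Optional


def select_relay(
    seek_list: List[str],
    relays_by_group: Dict[str, List[str]],
    ping: Callable[[str], bool],
    connect: Callable[[str], bool],
    root_server: str,
) -> Optional[str]:
    """Return the first host that passes the checks, or None if nothing works."""
    # Affiliation groups are tried in SeekList order, then any remaining relays.
    other_groups = [g for g in relays_by_group if g not in seek_list]
    for group in list(seek_list) + other_groups:
        for relay in relays_by_group.get(group, []):
            # A relay that does not answer the ping is never tried on tcp/52311.
            if ping(relay) and connect(relay):
                return relay
    # Last round: skip the ping and try the root server directly.
    return root_server if connect(root_server) else None


# Hypothetical scenario: a client moved to the DMZ, where only the DMZ relay
# is reachable and the root server is not.
relays = {"INT": ["int-relay-1"], "DMZ": ["dmz-relay-1"]}
reachable = {"dmz-relay-1"}

# ICMP allowed: the client walks past INT and lands on the DMZ relay.
print(select_relay(["INT", "DMZ"], relays,
                   ping=lambda h: h in reachable,
                   connect=lambda h: h in reachable,
                   root_server="root.example.com"))  # dmz-relay-1

# ICMP blocked everywhere: no relay is selected and the root is unreachable.
print(select_relay(["INT", "DMZ"], relays,
                   ping=lambda h: False,
                   connect=lambda h: h in reachable,
                   root_server="root.example.com"))  # None
```

The second call mirrors the symptom in the question: the DMZ relay would accept the connection, but because it never answers a ping, the client ends up trying only the root server.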

You can change this failover behavior in several different ways. Most are described at https://www.ibm.com/support/knowledgecenter/SSQL82_9.5.0/com.ibm.bigfix.doc/Platform/Config/r_client_set.html#r_client_set__umgr

_BESClient_RelaySelect_FailoverRelay - the client will try this server (skipping ICMP) before trying the root server. Example values could include TopLevelRelayName.mydomain.com or 192.168.1.1

_BESClient_RelaySelect_FailoverRelayList - semicolon-delimited list of relay names or IP addresses to try before failing over to the root server. Examples could include toplevelrelay.mydomain.com;dmz-relay.mydomain.com;192.168.1.1
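If you generate these values from an inventory of relays, a small helper keeps the semicolon-delimited format consistent. This is a minimal sketch; the helper names are made up for illustration, and it only builds and splits the string. It does not write the setting to a client.

```python
from typing import Iterable, List


def build_failover_relay_list(relays: Iterable[str]) -> str:
    """Join relay names or IP addresses into a semicolon-delimited value."""
    seen = set()
    ordered = []
    for relay in relays:
        name = relay.strip()
        # Drop blanks and case-insensitive duplicates, keeping the first one.
        if name and name.lower() not in seen:
            seen.add(name.lower())
            ordered.append(name)
    return ";".join(ordered)


def parse_failover_relay_list(value: str) -> List[str]:
    """Split a semicolon-delimited value back into its entries, in order."""
    return [part.strip() for part in value.split(";") if part.strip()]


value = build_failover_relay_list([
    "toplevelrelay.mydomain.com",
    "dmz-relay.mydomain.com",
    "192.168.1.1",
])
print(value)  # toplevelrelay.mydomain.com;dmz-relay.mydomain.com;192.168.1.1
```

The list order is preserved on the assumption that it matters to you, for example to put top-level relays ahead of DMZ relays.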

There is also a newer option (as of 9.5.10, I think) to place a “Failover Relay” value in the masthead file itself, using the BESAdminTool. This can replace the name of your root server with a name of your choice. That’s handy if you want to do new client installs without using a clientsettings.cfg to configure one of these FailoverRelay values. For instance, we have a tool to update the MSI install package with the updated masthead file, but the MSI package doesn’t have any other built-in options to customize client settings.

External to BigFix, we have seen success with DNS trickery. The “False-Root” architecture uses a DNS alias (CNAME) record as the server name in the masthead file. Instead of pointing to the root server, this CNAME points to a top-level relay in your internal DNS and, potentially, to a DMZ relay address in your public DNS records. So a client inside your network that thinks it is connecting to the root server at bigfix.mycompany.com is actually registering with a top-level relay, and the same client, once it moves to the Internet, still thinks it is connecting to bigfix.mycompany.com but is actually talking to your DMZ relay.
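To make the False-Root idea concrete, here is a toy model of the two DNS views. It performs no real DNS lookups, and the two target host names are invented for illustration. Only bigfix.mycompany.com comes from the example above.

```python
# Hypothetical DNS views: the same masthead name resolves differently
# depending on which DNS servers the client is using.
INTERNAL_DNS = {"bigfix.mycompany.com": "toplevel-relay.internal.example"}
PUBLIC_DNS = {"bigfix.mycompany.com": "dmz-relay.public.example"}


def resolve_masthead_name(name: str, on_internal_network: bool) -> str:
    """Return the host the client really reaches for the masthead name."""
    view = INTERNAL_DNS if on_internal_network else PUBLIC_DNS
    return view[name]


masthead_server = "bigfix.mycompany.com"
print(resolve_masthead_name(masthead_server, on_internal_network=True))
print(resolve_masthead_name(masthead_server, on_internal_network=False))
```

In both cases the client believes it is talking to its root server, so the root-server fallback lands on a relay that is reachable from wherever the client happens to be.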


So, those are a lot of options, and they have evolved over time. What I’d actually recommend is to keep the clients set to Automatic Relay Select and keep the Affiliation Groups (they’re useful inside the network when moving a machine from site to site), but also configure FailoverRelayList values that include your top-level relays and DMZ relays, since Automatic Select is prone to fail in DMZ scenarios, especially if ICMP is blocked somewhere.

If you have any significant Internet/DMZ usage, then you should probably also check the new features for Persistent Connections and PeerNest that are available in 9.5.12 and higher. Those can be very helpful when the Relay is unable to send UDP notifications to the client that there is new content available, and tend to make your Internet clients much more responsive to new actions/fixlets.
