Has anyone tried using an F5 VIP to act as a proxy to multiple BigFix Relays?

My organization has started moving to F5s in place of a traditional DMZ.

To that end, I’m looking at how to best make what used to be my DMZ Relays available to the outside world.
It would be great if I could use a SINGLE F5 VIP configured to direct traffic in a “sticky” round robin to a collection of 5-6 Relays.

The clients themselves are configured for Command Polling, so the Relays don’t need to be able to reach out to the clients (not that UDP is routed over the Internet!), they just need to accept the Reports and pass them along and act as download staging locations.

How will the Relays react if I do something like this?

I’ve seen that work well, and I’ve seen it fail miserably.

Officially, it’s not a supported configuration - which does not mean it won’t work, but if anything breaks and we think it’s due to the F5, you’re a bit on your own.

A lot will depend on your relationship with the F5 admins and how much cooperation you’ll get.

Some considerations to note - the balancing algorithm needs to be very sticky, but I forget their terminology for that. The Internet client performs the Relay Authentication only during Registration.

If the client gets switched to a different Relay after it registers it will get 403 Forbidden errors from the new relay when it tries to gather sites or post reports. Since each gather/post is a separate HTTP session the F5 might switch the client to a different relay for each session depending on the algorithm. The load-balance algorithm needs to be based on the client IP address, not session or port numbers.
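To make the point about client-IP-based balancing concrete, here is a minimal Python sketch. It is not how an F5 implements persistence (the pool names, the hash choice, and the function names are all hypothetical); it only illustrates why a choice keyed on the source IP stays on one relay across separate HTTP sessions, while plain round robin does not.

```python
import hashlib
from itertools import cycle

# Hypothetical relay pool behind the single VIP.
RELAYS = ["relay-1", "relay-2", "relay-3", "relay-4", "relay-5"]


def pick_by_source_ip(client_ip, pool):
    """Pick a pool member from the client IP only (no port, no session)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


def simulate(client_ip, sessions, pool):
    """Return the relay chosen per session for both strategies."""
    round_robin = cycle(pool)
    by_ip = [pick_by_source_ip(client_ip, pool) for _ in range(sessions)]
    by_rr = [next(round_robin) for _ in range(sessions)]
    return by_ip, by_rr


# Four separate gather/post sessions from one client (example address).
by_ip, by_rr = simulate("203.0.113.25", 4, RELAYS)
print("source-IP affinity:", by_ip)  # same relay every time
print("round robin:       ", by_rr)  # a different relay each session
```

Note that in this sketch a client whose public IP changes, or a pool whose membership changes, would still land on a different relay, so the client would need to re-register.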

Ideally you could approach it slowly - set up the F5 balancer to the relays, but still allow clients to also talk directly to the relays. Switch only a few clients, manually, to use the F5 name or IP as their relay, and watch their logs over several days.

When you’re satisfied with testing you could configure a Relay Name Override on your DMZ relays so they advertise a name that maps to the F5, and add FailoverRelayList options on your clients for that name as well.

Some of the health-check content will not work as expected, since it is based on the clients reporting the name of their relay, and they will all report the F5 as their relay. The health checks will display that as an overloaded relay, thinking it has more clients than are healthy. I’m working on custom content based on the RelayChain info under the client’s __BESData/__Global/RelayChain logs, which show the real relay name and computer ID in that case. This requires 9.5.13 or higher on clients and relays.

That’s what I have off the top of my head. I’d love to see more feedback from anyone else using the F5 configuration and their experiences with it.

Going slow sounds like good advice.

I’ll give it a try and report back what we see. I do have a fairly good relationship with our F5 guys.

I currently have several 1:1 F5 VIPs already in use for some Relays.

It’s unsupported, but we have it set up for some specific use cases. One gotcha we found: the F5 acts as a NAT, so the parent relay sees the F5’s address as the source IP rather than the actual client’s, and it can’t send UDP back to the client. The workaround was to enable persistent connections on the clients and relays.

I’m aware of some 250K-client deployments that have used F5s in front of the BigFix relays for many years. Overall, the F5 serves an important load-balancing purpose, but it should be used sparingly. While the F5 does a good job of spreading the load, even with the so-called sticky settings checked and double-checked by the F5 admins, we still see clients that are registered with one relay connecting to another relay. This creates a problem where clients get confused when the relays don’t have the same site versions, which is especially common with mailbox sites. Mailbox sites are used for encrypted actions, and also whenever you target specific clients by name or by selecting them in an action targeting dialog. Because of this, I do not recommend relying on F5s exclusively. They are great for Top Level Relays (TLRs) and DMZ/internet relays when there are a lot of clients, but it’s important to balance them with a relay affiliation setup that migrates clients to local relays as soon as possible.
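One way to check whether clients really stay on one relay is to look for changes of the direct parent in a client’s RelayChain log. The Python sketch below is only an illustration: it assumes lines in the " - "-separated form quoted later in this thread, with relay hops written as r:ID(name), and the sample lines and function names are made up for the example.

```python
import re

# Matches server ("s:") and relay ("r:") hops such as r:13805242(leaf-relay).
PARENT = re.compile(r"\b([sr]):(\d+)\(([^)]*)\)")


def direct_parent(line):
    """Return the name of the last s:/r: hop on the line, or None."""
    hops = PARENT.findall(line)
    return hops[-1][2] if hops else None


def find_switches(lines):
    """Return (line_index, old_parent, new_parent) for every parent change."""
    switches = []
    previous = None
    for index, line in enumerate(lines):
        parent = direct_parent(line)
        if parent is None:
            continue  # line without a recognizable hop
        if previous is not None and parent != previous:
            switches.append((index, previous, parent))
        previous = parent
    return switches


# Hypothetical log excerpt for one client, oldest first.
sample = [
    "At 09:00:00 -0500 - S - s:1(root) - r:10(relay-a) - c:99(client)",
    "At 10:00:00 -0500 - S - s:1(root) - r:10(relay-a) - c:99(client)",
    "At 11:00:00 -0500 - S - s:1(root) - r:11(relay-b) - c:99(client)",
]
switches = find_switches(sample)
print(switches)
```

A non-empty result for a client that should be pinned to one relay suggests the persistence settings are not holding for that client.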

Another thing to be aware of with F5s: There is a deep packet inspection feature. If it’s turned on for port 52311, then it will break client authentication with the relays, because it will try to inject a certificate that’s outside of BigFix’s control.

Thanks for the feedback.

I’ve been using 1:1 F5 VIPs with Relays for about a year now. I was hoping to cluster a few Relays behind a single VIP and point the Fall Back Masthead setting at it, but I think I’ll just stick with the 1:1 model for now.

Maybe later, I’ll screw up my courage and try clustering a couple of Relays behind a VIP.

If you have a pool of relays behind an F5, another downside is that you can’t simply report in Web Reports (WR) or the console on how many clients are using Relay “x”, because the client-to-Relay mapping is no longer 1:1.

What I did is come up with a way to ask the Relay how many clients registered in the last 6 hours and then I just run a report on that property when viewing Relays.

if (exists relay service) then (number of rows of statement "select * from COMPUTER_REGISTRATIONS WHERE RegistrationTime >= strftime('%25s', 'now', '-6 hour')" of sqlite database of file (if windows of operating system then (pathname of parent folder of regapp "BESRelay.exe" & "\ClientRegisterData\registrationlist.db") else "/var/opt/BESRelay/ClientRegisterData/registrationlist.db")) else nothing
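For anyone who wants to try the same count outside of relevance, here is a rough Python equivalent using the standard sqlite3 module. The table and column names come from the relevance above. Everything else is an assumption: that RegistrationTime holds Unix epoch seconds (which the strftime('%s', ...) comparison suggests), and the ComputerID column and in-memory demo rows, which are invented so the sketch runs on its own.

```python
import sqlite3
import time


def count_recent_registrations(conn, hours=6):
    """Count rows registered within the last `hours` hours."""
    cutoff = int(time.time()) - hours * 3600
    row = conn.execute(
        "SELECT COUNT(*) FROM COMPUTER_REGISTRATIONS "
        "WHERE RegistrationTime >= ?",
        (cutoff,),
    ).fetchone()
    return row[0]


# Demo database; the schema here is a guess, not the real relay schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE COMPUTER_REGISTRATIONS "
    "(ComputerID INTEGER, RegistrationTime INTEGER)"
)
now = int(time.time())
conn.executemany(
    "INSERT INTO COMPUTER_REGISTRATIONS VALUES (?, ?)",
    [(1, now - 600), (2, now - 3 * 3600), (3, now - 12 * 3600)],
)
print(count_recent_registrations(conn))  # two of the three demo rows
```

Against a real relay you would point sqlite3.connect at registrationlist.db instead, ideally read-only (for example a file: URI with mode=ro and uri=True) or on a copy, so the relay’s own database is not disturbed.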

As @JasonWalker mentioned, we could look in the __BESData/__Global/RelayChain logs to determine the Relay a client last connected with and report that in a property.

I can retrieve the last line of the most recent RelayChain log file with …

Line ((number of lines of File ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of Folder "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\__Global\RelayChain")) as string & ".txt") of Folder "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\__Global\RelayChain") as Integer) of File ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of Folder "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\__Global\RelayChain")) as string & ".txt") of Folder "C:\Program Files (x86)\BigFix Enterprise\BES Client\__BESData\__Global\RelayChain"

I’m sure someone can clean up my relevance to make it neater and faster.

I like this…a lot. It’s a different approach than what I was taking and is quite efficient.
Here’s a slight simplification on it.

Q: line (number of lines of it) of files ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of it)) as string & ".txt") of Folders "__Global/RelayChain" of data folder of client
A: At 09:20:41 -0500 - S - s:1080572761(bes-root) - c:1615062694(SERVER-TEST1)

I also have some things to split it into properties:

// All Properties, as tuple string items
Q: (tuple string item 0 of it, following text of first " " of tuple string item 1 of it, tuple string item 2 of it, tuple string item 3 of it | "none", unique value of concatenation ";" of tuple string items whose (it starts with "r:") of it | "none", tuple string item (number of tuple string items of it - 1) of it ) of (tuple string of (name of it; substrings separated by " - " of lines (maximum of line numbers of lines of it) of it)) of files ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of it)) as string & ".txt") of Folders "__Global/RelayChain" of data folder of client

A: 20210402.txt, 07:43:30 -0500, S, s:10566764(root-server), r:538926728(top-level-relay);r:13805242(leaf-relay), c:3372983(client-name)


// Current Relay Chain
Q: tuple string of tuple string items whose (it starts with "r:" or it starts with "s:") of tuple string of substrings separated by " - " of lines (maximum of line numbers of lines of it) of files ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of it)) as string & ".txt") of Folders "__Global/RelayChain" of data folder of client

A: s:10566764(root-server), r:538926728(top-level-relay), r:13805242(leaf-relay)

// Direct Parent Relay
Q: tuple string items (number of tuple string items of it - 1) of tuple string of tuple string items whose (it starts with "r:" or it starts with "s:") of tuple string of substrings separated by " - " of lines (maximum of line numbers of lines of it) of files ((unique value of maxima of (((substrings before ".txt" of Names of it) as Integer) of Files of it)) as string & ".txt") of Folders "__Global/RelayChain" of data folder of client

A: r:13805242(leaf-relay)
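If the raw RelayChain lines are collected somewhere for offline analysis, the same split can be done outside of relevance. This Python sketch assumes the " - "-separated layout seen in the sample answers above. The multi-relay line in the usage example is reconstructed from those answers, not copied from a real log, and the function and field names are hypothetical.

```python
import re
from typing import NamedTuple

# One hop: kind (s = server, r = relay, c = client), computer ID, name.
HOP = re.compile(r"^([src]):(\d+)\((.*)\)$")


class Hop(NamedTuple):
    kind: str
    computer_id: int
    name: str


def parse_relay_chain_line(line):
    """Split one RelayChain line into time, hop chain, and direct parent."""
    fields = line.strip().split(" - ")
    timestamp = fields[0][3:] if fields[0].startswith("At ") else fields[0]
    chain = []
    for field in fields[1:]:
        match = HOP.match(field)
        if match:  # skips non-hop fields such as the bare "S"
            chain.append(Hop(match.group(1), int(match.group(2)), match.group(3)))
    parents = [hop for hop in chain if hop.kind in ("s", "r")]
    return {
        "time": timestamp,
        "chain": chain,
        "direct_parent": parents[-1] if parents else None,
    }


direct = parse_relay_chain_line(
    "At 09:20:41 -0500 - S - s:1080572761(bes-root) - c:1615062694(SERVER-TEST1)"
)
via_relays = parse_relay_chain_line(
    "At 07:43:30 -0500 - S - s:10566764(root-server) - "
    "r:538926728(top-level-relay) - r:13805242(leaf-relay) - c:3372983(client-name)"
)
print(direct["direct_parent"].name, via_relays["direct_parent"].name)
```

Grouping the direct_parent values across clients would give the per-relay client counts that the console can no longer show when every client reports the F5 name.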

edit: Forgot to “generalize” the data folder of client path. No need to hard-code C:\Program Files (x86). Now it’ll work for BigFix client installed in non-default paths, and it’ll also work with Linux/UNIX/Mac client platforms.
