Hello all - I learned a little bit more about relays and top level relays through the recent hosted events. I couldn’t attend the second one unfortunately but the first was helpful.
Our current relay design is not very efficient and is probably beating up the core BigFix server a little more than it should. Here’s the implementation that I inherited and am looking to improve. Essentially, we have 6 relay servers across 4 different geographical locations, all pointing directly to the core BigFix server. Our main office has 3 relay servers for some reason, but we’re probably only serving a total of ~400 endpoints. Our remote locations have one relay each. It should be noted that main office relay server 3 is also a bare metal imaging server / PXE server for operating system deployment.
At this point, I think the remote site relays are fine; they are never going to serve enough endpoints to require more than a single relay. What I need to figure out is how to redesign the main office relay servers. I’m thinking I can retire two of them, and I need to convert one of them into a top level relay. Example:
my laundry list of questions:
is this poor design? would you do anything differently?
should there be a second top level relay for redundancy? how would you configure this, if so?
in the past we’ve manually assigned our relays to point to the bigfix server directly. if we were wanting to use top level relays, would we just do the same change and point them to the top level relay instead, or are there other changes required as well?
Can an OSD/PXE server/bare metal deployment server also be a top level relay, or is this a poor choice, and should it be kept separate?
lastly, we potentially have a project that will add roughly 1000 new geographical locations with a few endpoints at each. are there any considerations/changes you’d make for relays with this in mind? They will be able to communicate back to the primary location, but have varying connection speeds, with the worst offenders being low end DSL lines.
How many total clients are we talking? A top level relay is best practice but might not be necessary if you are serving less than a couple thousand clients.
is this poor design? would you do anything differently?
Depends entirely on your client count – 6 relays pointing to the BigFix server is perfectly fine.
should there be a second top level relay for redundancy? how would you configure this, if so?
You could just configure your remote relays to point to the server and the top level relay.
in the past we’ve manually assigned our relays to point to the bigfix server directly. if we were wanting to use top level relays, would we just do the same change and point them to the top level relay instead, or are there other changes required as well?
That’s all you’ll have to do.
Can an OSD/PXE server/bare metal deployment server also be a top level relay, or is this a poor choice, and should it be kept separate?
Depends on the size of your deployment, but this should be fine. If you are performance constrained, then you should split them out; otherwise, unless you’re currently experiencing performance issues, you can leave them on the same server.
If you have fewer than 4000 clients total, then you might not need any relays at all, except for the purposes of caching behind slow WAN links at remote sites.
Assuming you have enough clients to matter, then I would recommend 1 top level relay that ALL other relays talk to. No relay should talk directly to the root server except for the top level relay, and this top level relay should have an extra large cache. This will take processing and downloading load off of the root server.
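As a rough illustration of that rule, here is a minimal Python sketch that checks a planned relay hierarchy for relays that still point straight at the root server. The relay names, the `ROOT` label, and the `relays_bypassing_top_level` helper are all made up for the example; this is not a BigFix API, just a way to sanity-check a design on paper.

```python
# Hypothetical planning check: names and structure are assumptions, not BigFix APIs.
ROOT = "root-server"


def relays_bypassing_top_level(parents, top_level):
    """Return relays (other than the top level relay) whose parent is the root server.

    parents maps each relay name to the name of the server it reports to.
    """
    return sorted(
        relay
        for relay, parent in parents.items()
        if parent == ROOT and relay != top_level
    )


# Example plan: one top level relay, with one remote relay still misconfigured.
plan = {
    "top-relay": ROOT,
    "main-office-relay": "top-relay",
    "remote-relay-a": "top-relay",
    "remote-relay-b": ROOT,  # should point at top-relay instead
}

offenders = relays_bypassing_top_level(plan, "top-relay")
print(offenders)
```

An empty list means every relay other than the top level relay reports to something other than the root server, which is the layout described above.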
You could have another relay between the top level relay and the main office relays, but it doesn’t seem like you have enough clients in the main office to justify having multiple relays for this purpose.
In general, I think your 2nd diagram makes sense, but if you are going to add 1000 remote sites, I would probably have at least 2 relays that the remote sites use for redundancy.
Again, a lot of this depends on how many endpoints total we are talking about.
I don’t remember the details of how relays select other relays offhand, but some of the remote sites would have Relay1 for remote sites as their primary and Relay2 as their secondary, while the others would use Relay2 for remote sites as their primary and Relay1 as their secondary. This would divide up the load a bit, while also having redundancy.
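To make the primary/secondary split concrete, here is a small Python sketch that alternates the primary relay across a list of remote sites. The site names and the `assign_relays` helper are hypothetical, and how the result gets applied to each site (manual relay assignment, client settings, etc.) is left out on purpose; check the BigFix documentation for the actual mechanism.

```python
# Hypothetical planning helper: site names and relay labels are assumptions.
def assign_relays(sites, relays=("Relay1", "Relay2")):
    """Alternate the primary relay across sites; the other relay becomes the secondary."""
    assignments = {}
    for index, site in enumerate(sorted(sites)):
        primary = relays[index % 2]
        secondary = relays[(index + 1) % 2]
        assignments[site] = {"primary": primary, "secondary": secondary}
    return assignments


sites = ["site-001", "site-002", "site-003", "site-004"]
assignments = assign_relays(sites)

for site, relay_pair in assignments.items():
    print(site, relay_pair["primary"], relay_pair["secondary"])
```

Alternating by sorted site name is only one way to split the load; grouping by region or by link speed would work just as well as long as each site ends up with both relays in opposite roles.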
When we are planning these we often will allocate an internet relay with client authentication enabled. No clients have the internet relay as their primary relay – instead we set the internet relay as the failover relay for all of our clients. This way – as long as a client is online, there is at least one relay it can talk to.
I also like the idea of using a failover relay that uses a different port, like 443, in case outgoing 52311 is blocked for some reason. I haven’t tried this specifically, but I keep meaning to.
I didn’t realize you could host the relay service on a different port. For some reason I thought the port had to be consistent across the installation, but I guess it makes sense that you should just be able to provide a different port number when setting the relay?
The only thing I’d add to the above is that I generally have connectivity over a WAN or to a slow offsite office occur relay to relay, and have the clients at the remote site connect to the relay placed at the remote site. This way, you have one connection over the slow or WAN link, and faster connectivity between client & relay over a local connection.
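A quick back-of-the-envelope sketch in Python of why this matters on a slow link. The payload size, link speed, and client count below are invented numbers purely for illustration, and the calculation ignores protocol overhead and any other traffic on the line.

```python
# Illustrative arithmetic only: all figures below are assumptions, not measurements.
def wan_transfer_seconds(payload_mb, link_mbps, copies):
    """Seconds needed to pull `copies` copies of a payload over a link (MB to megabits is x8)."""
    return copies * payload_mb * 8 / link_mbps


payload_mb = 100   # assumed size of one patch download
link_mbps = 1.0    # assumed low end DSL speed
clients = 5        # assumed endpoints at the remote site

without_local_relay = wan_transfer_seconds(payload_mb, link_mbps, copies=clients)
with_local_relay = wan_transfer_seconds(payload_mb, link_mbps, copies=1)

print(without_local_relay, with_local_relay)
```

With a relay at the site, the payload crosses the WAN once and the clients fetch it locally, so the WAN cost stays flat no matter how many endpoints the site has.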