Bandwidth Issue Sev 2

mbartosh · December 3, 2025, 8:20pm

BigFix version 11.0.5.203 and client 11.0.5.204.

I had a Sev 2 incident this morning caused by a BigFix download. There is a local relay 1 hop away, but relay selection chose a head relay 18 hops away. Throttling was apparently 0. The download was 3.8 GB (the monthly update for Windows 11). My question is: why wasn’t the local relay selected? We have relay affiliation, but that fixlet didn’t run until more than an hour later.

Sequence of Events in Client Log

Relay Select: 8:56 am, selected relay06 (18 hops away); the local relay is 1 hop away
Download monthly update: 8:58 am, from relay06, 3.8 GB
Gap in log: 8:58 am -> 9:47 am
Relay Affiliation: 10:43 am
Bandwidth Throttle: 10:44 am
Evaluation Cycle: 90

JasonWalker · December 4, 2025, 1:34am

Most often that occurs because the client cannot ping the relay (ICMP). With Automatic Relay Selection (even with Affiliation), the client will not select a relay that it cannot ping. The ping responses are also used to determine how many hops away each relay is, so that the closest one can be chosen.

Ensure that the ping traffic to your expected relay is not blocked. If the ping traffic is going through ok, then I'd check into other issues like any network or DNS outages at that specific time when the relay selection chose the distant relay at 8:56 am.
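
To illustrate why a blocked ping can push the client to a distant relay, here is a rough Python sketch of hop-based selection. It is only a conceptual model and not the client's actual algorithm; the function name, the relay name "local-relay", and the hop values are assumptions made up for the example.

```python
from typing import Optional


def pick_closest_relay(hops_by_relay: dict[str, Optional[int]]) -> Optional[str]:
    """Return the reachable relay with the fewest hops, or None if none answer.

    A hop count of None models a relay that did not answer ping; such a
    relay is never considered, no matter how close it really is.
    """
    reachable = {name: hops for name, hops in hops_by_relay.items() if hops is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


# Hypothetical measurements: the local relay is 1 hop away, but ICMP is blocked.
ping_blocked = {"local-relay": None, "relay06": 18}
ping_allowed = {"local-relay": 1, "relay06": 18}

print(pick_closest_relay(ping_blocked))  # relay06
print(pick_closest_relay(ping_allowed))  # local-relay
```

In this model the 1-hop relay loses to the 18-hop relay only because it never answers, which is the situation worth ruling out first.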

mbartosh · December 4, 2025, 1:53pm

We have our _BESClient_RelaySelect_ResistFailureIntervalSeconds set to 6 hours. Would that have anything to do with it? The default is 10 minutes.

Another question: if the client starts a download from one relay, will that download stay on that relay, or will it switch to another relay when a relay select is performed and a new relay is chosen?

JasonWalker · December 4, 2025, 2:03pm

ResistFailureIntervalSeconds would cause the client to not try to select another relay when the current one is offline, and instead it would wait up to six hours before selecting another relay. If its existing relay comes back online the client should again report, but otherwise if the current relay fails or is removed then the client would stay offline for six hours before trying to select another relay.
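
As a simplified illustration of that waiting behavior, the Python sketch below compares the 10-minute default mentioned above with the 6-hour value. The function and constant names are hypothetical and the real client logic may differ; the only point is how long a client would hold off after its relay fails.

```python
# Values taken from the thread, expressed in seconds.
DEFAULT_RESIST_SECONDS = 10 * 60          # 600 (10 minutes)
CONFIGURED_RESIST_SECONDS = 6 * 60 * 60   # 21600 (6 hours)


def should_reselect(seconds_since_relay_failed: int, resist_interval_seconds: int) -> bool:
    """Simplified model: only look for a new relay once the current relay
    has been unreachable for at least the resist-failure interval."""
    return seconds_since_relay_failed >= resist_interval_seconds


# One hour after the current relay went offline:
one_hour = 60 * 60
print(should_reselect(one_hour, DEFAULT_RESIST_SECONDS))     # True
print(should_reselect(one_hour, CONFIGURED_RESIST_SECONDS))  # False
```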

For the downloads, if the original relay gets a download request then it will send the request upstream through its relay chain. If the client that made the request goes away, that is irrelevant to the original relay - that relay will continue the download from its parent until the download is complete and staged on the relay.

The client does not begin the download from the relay until the download has been completely cached on the relay. If the client is still connected to the original relay when the cache is completed, the client will start the download from that original relay.

If the client switches to another relay while its download from the relay is in progress, the client will attempt to resume the download from the new relay at the point it had reached (it should not have to restart the download from the beginning). But the new target relay may itself have to request the download from its own parent, and the client cannot resume the download until the new target relay has fully cached the file.
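
The download behavior described above can be modeled with a small Python sketch. This is a toy model under stated assumptions, not BigFix code: the `Relay` and `ClientDownload` classes, the relay name "local-relay", the file ID, and the sizes are all hypothetical, and `ensure_cached` stands in for the relay pulling the whole file from its parent.

```python
from dataclasses import dataclass, field


@dataclass
class Relay:
    name: str
    cached: set[str] = field(default_factory=set)

    def ensure_cached(self, file_id: str) -> None:
        # Stand-in for the relay fetching the complete file from its parent.
        self.cached.add(file_id)


@dataclass
class ClientDownload:
    file_id: str
    size: int
    offset: int = 0  # bytes already received by the client

    def fetch(self, relay: Relay, max_bytes: int) -> int:
        """Pull up to max_bytes from the relay, continuing at the current offset.

        Returns 0 while the relay has not fully cached the file.
        """
        if self.file_id not in relay.cached:
            return 0
        chunk = min(max_bytes, self.size - self.offset)
        self.offset += chunk
        return chunk


relay_a = Relay("relay06")
relay_b = Relay("local-relay")
download = ClientDownload(file_id="monthly-update", size=3800)

relay_a.ensure_cached(download.file_id)
download.fetch(relay_a, 1000)          # client receives the first 1000 units
print(download.offset)                 # 1000

print(download.fetch(relay_b, 1000))   # 0: new relay has not cached the file yet
relay_b.ensure_cached(download.file_id)
download.fetch(relay_b, 5000)          # resumes at offset 1000, not from 0
print(download.offset)                 # 3800
```

Note that in this model the switch does not discard what the client already has, but the new relay still has to obtain the full file before it can serve the remainder.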