Sigh… we had an Operator inadvertently install the BES Relay on about 800 clients. We’ve uninstalled most of them, but some other clients are still looking for those accidental relays:
At 13:25:22 -0400 - RegisterOnce: Attempting secure registration with 'accidental relay #1...
At 13:27:32 -0400 - RegisterOnce: GetURL failed - General transport failure. - BAD SERVERNAME (winsock error 4294967290 - registration url - accidental relay #1...
etc...
What can we do to purge/replace the relay lists for these computers so that they communicate with a “real” relay?
What can we do (aside from removing access to the BES Support site) to prevent this in the future (i.e. lock down the published relay list)?
Are you using automatic relay selection?
I think affiliation lists could prevent clients from selecting relays that are not part of the lists, but I am not 100% sure on that.
That’s what I’m considering as a preventative measure for the future: a “Production” Affiliation Group assigned to all clients and the “real” Relays so that, if a rogue Relay is stood up, it’ll be ignored.
For now, though, I need to reset our relays.dat so that only “real” Relays are in it instead of the 800+ that were created last week.
I think an updated version of the actionsite will update relays.dat. You can try sending a blank action and forcing a relay selection; if you have a big deployment, do it in batches.
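For the "do it in batches" part, here is a minimal Python sketch of splitting a target list into fixed-size groups so each group can get its own action. The computer names and the batch size are made-up placeholders, and how you actually target each batch in the console is up to you.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    # islice pulls the next `size` items; an empty list means we are done.
    while chunk := list(islice(it, size)):
        yield chunk


if __name__ == "__main__":
    # Hypothetical computer names, for illustration only.
    computers = [f"PC-{n:04d}" for n in range(1, 11)]
    for number, group in enumerate(batches(computers, 4), start=1):
        print(f"batch {number}: {group}")
```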
The problem is that we have computers not connecting to the server because they’re stuck with the 800+ relays version of relays.dat and are slowly churning through them, never getting through the list to an actual relay. May have to go hands on with 200+ computers to remedy.
Restart BESClient service and see if that works. It should.
If you already have a failover relay in your masthead, all clients should go back to that if they cannot reach their primary/secondary relay. If you don’t have that set in the masthead, you can set “_BESClient_RelaySelect_FailoverRelayList” to a list of the relay(s) to try in the event the client can’t connect to its primary/secondary. I use both the masthead and the property. IIRC there is also a _BESClient_RelaySelect_FailoverRelay property, but I prefer the list so I can specify one or more relays.
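A minimal sketch of how the failover list setting could be pushed out, written as a small Python helper that builds the ActionScript line. Assumptions to check against the documentation for your version: that the value is a semicolon-delimited list, and that each entry takes the form shown in the example. The relay names are placeholders.

```python
SETTING_NAME = "_BESClient_RelaySelect_FailoverRelayList"


def failover_setting_action(relays: list[str]) -> str:
    """Build an ActionScript 'setting' line for the failover relay list.

    Assumes the setting takes a semicolon-delimited list of relays.
    """
    cleaned = [r.strip() for r in relays if r.strip()]
    if not cleaned:
        raise ValueError("at least one relay is required")
    value = ";".join(cleaned)
    # Standard relevance substitution for the setting's effective date.
    effective = '{parameter "action issue date" of action}'
    return f'setting "{SETTING_NAME}"="{value}" on "{effective}" for client'


if __name__ == "__main__":
    # Placeholder relay entries; the exact entry format is an assumption.
    print(failover_setting_action([
        "http://relay-a.example.com:52311/bfmirror/downloads/",
        "http://relay-b.example.com:52311/bfmirror/downloads/",
    ]))
```

Note that this only helps clients that can still receive actions, so it is a preventative measure rather than a fix for the ones already stuck.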
Restarting just starts the autoselection process over again. We do have a failover relay defined, but the affected clients are still working from the last relays.dat they received from the server (with 800+ options) at a rate of around 1 per minute.
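To put a number on that, here is a rough back-of-the-envelope in Python. It assumes the client tries the entries one at a time, that the rate of about one per minute holds, and that the first working relay sits at the very end of the list (worst case). None of that is confirmed client behaviour; it is just arithmetic on the figures above.

```python
def worst_case_minutes(relay_count: int, seconds_per_attempt: float = 60.0) -> float:
    """Minutes needed to try every entry once at a fixed rate per attempt."""
    return relay_count * seconds_per_attempt / 60.0


if __name__ == "__main__":
    minutes = worst_case_minutes(800)  # 800 accidental relays, ~1 attempt per minute
    print(f"{minutes:.0f} minutes (~{minutes / 60:.1f} hours)")
    # prints: 800 minutes (~13.3 hours)
```

So a client that restarts, sleeps, or gets rebooted before the list is exhausted may never reach a real relay, which matches what we are seeing.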
Oh dang, I see the issue. A quick solution would be to delete the __BESData folder. The client wouldn’t register a new ID, but it would redownload all site content and lose its logs, of course.
Outside of that, I could see playing with the registry to force manual selection with a specific relay, but then you’ll have to set it back to automatic afterwards.
I wonder if supplying a new relays.dat would also work?
Supplying a new relays.dat won’t work - the client will just replace it with the current copy from the actionsite.
If you have to reach out and touch the client, it’s easiest just to configure the registry for manual relay selection. You may also need to remove the HostSelector registry value so it starts a new clean selection instead of trying to reconnect to the last used relay.
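If it helps, here is a minimal Python sketch that generates a .reg file for the manual relay selection part. Assumptions to verify before using it: the settings key path (shown for a 64-bit Windows client), the setting names __RelaySelect_Automatic, __RelayServer1 and __RelayServer2, and the relay URL format. The relay hostname is a placeholder. It deliberately does not touch HostSelector, since I am not certain of that value's exact location.

```python
# Assumed location of client settings on 64-bit Windows; verify on your clients.
SETTINGS_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\BigFix\EnterpriseClient\Settings\Client"
)


def manual_relay_reg(primary: str, secondary: str = "") -> str:
    """Return .reg file text that switches a client to manual relay selection."""
    settings = {
        "__RelaySelect_Automatic": "0",  # 0 = manual selection (assumed)
        "__RelayServer1": primary,
    }
    if secondary:
        settings["__RelayServer2"] = secondary

    lines = ["Windows Registry Editor Version 5.00", ""]
    for name, value in settings.items():
        # Each client setting is its own subkey holding a string named "value".
        lines.append(f"[{SETTINGS_KEY}\\{name}]")
        lines.append(f'"value"="{value}"')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Placeholder relay URL; replace with one of your real relays.
    print(manual_relay_reg("http://relay.example.com:52311/bfmirror/downloads/"))
```

After importing the file, restart the BESClient service so it picks up the change, and remember to set __RelaySelect_Automatic back to automatic once relays.dat is clean again.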