Relay server error: buffer directory overload check semaphore

(imported topic written by SystemAdmin)

I was upgrading some Windows 2003 servers to SP2 and then applying Microsoft security patches on 6 target servers.

I had a 2-hour window to complete the work. Usually this isn’t an issue, but yesterday I was not able to finish in time.

I deployed SP2 the night before, as I always do, and all devices reported “Pending Restart”, as expected.

My window started at 2:00pm and ended at 4:00pm. All these devices reported to a specific relay server (rly1).

Before the window (at 11:00am), I restarted the relay services on all relay servers, as a BigFix tech had recommended over the phone to ensure performance. I verified that the services were up and running and that the relay logs showed no problems.

I rebooted the target servers at 2:00pm. After they came back up, they reported only 11 relevant security patches after 20min. I deployed those patches and rebooted the target servers again. 20min later, no more patches showed as relevant.

I ran “refresh” from the management console on the target servers several times afterward to make sure no more patches were needed.

At 3:57pm, I ran a final refresh request.

The target servers went “grey” in the console, then popped back to black. Afterward, they showed 45 relevant security patches. I was out of time and management was not happy.

I looked at the relay log on the relay server. At 2:50pm and 3:00pm, it showed the following error:

“Error: Timed out while waiting on buffer directory overload check semaphore.”
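For anyone wanting to check how often this message appears on a relay, here is a minimal sketch in Python that scans a log file for the error text quoted above. The function names and the idea of passing the log path in yourself are assumptions for illustration; the relay log location and line format are not given in this thread, so the sketch only does a plain substring match and reports the matching lines.

```python
# Sketch only: looks for the error text quoted in this thread.
# The log path is supplied by the caller; no BigFix-specific path or
# log format is assumed here.

ERROR_TEXT = "Timed out while waiting on buffer directory overload check semaphore"


def find_semaphore_timeouts(log_lines):
    """Return (line_number, text) for every line containing the error text."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if ERROR_TEXT in line
    ]


def scan_log(path):
    """Open a log file (hypothetical path chosen by the caller) and scan it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_semaphore_timeouts(handle)
```

Comparing the matching lines against the times of the refresh requests in the client logs may show whether the timeouts line up with the delayed reports.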

The target servers’ client logs all showed refresh requests throughout the process.

How can I fix this?

-thanks

(imported comment written by SystemAdmin)

We have the same error message on some relays here. If you find the fix, please update the thread. Thanks!

(imported comment written by BenKus)

Hey Charles,

It sounds like you have some issues that need to be looked into… You shouldn’t need to restart relays and you shouldn’t need to send refreshes…

I am wondering if your relay bufferdir was full temporarily and it delayed your reports… however, that would be pretty rare unless you had a very underpowered relay, an awful lot of agents reporting to the relay, or a server that was very behind.

Ben
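One rough way to test the "bufferdir was full" theory is to watch how many files are sitting in the relay's buffer directory during a patch window. The sketch below is a generic directory counter in Python, not a BigFix tool: the function name is made up for illustration, and the directory path is an assumption you would have to supply for your own relay install.

```python
import os


def directory_backlog(path):
    """Return (file_count, total_bytes) for all files under `path`.

    `path` is whatever directory the caller believes is the relay's
    buffer directory; this sketch does not know where that is.
    """
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
                count += 1
            except OSError:
                # The file was removed between listing and stat; skip it.
                continue
    return count, total
```

Running this every minute or so and logging the result would show whether the file count climbs around the times the semaphore error is logged.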

(imported comment written by rdamours91)

Could you see something like this if you had horrible disk I/O on the BigFix server/relays? I think I saw this in the past, before I switched to a RAID 10 setup.