While this message is being written repeatedly, the relay appears to stop functioning, and we frequently see “winsock error -10” in the client logs.
Our relay runs on AIX 7.1/VIOC and serves 200 - 700 clients.
The netstat command shows many sockets in CLOSE_WAIT, but there are 200 - 300 of them at most, so I think that might be normal.
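To track that CLOSE_WAIT count over time, a small helper along these lines may be useful. It is only a sketch: the function name `count_close_wait` is made up for illustration, and it assumes the socket state is the last column of the `netstat -an` output and that addresses end in `.52311` (AIX style) or `:52311`. It counts a socket if either its local or its foreign address uses the port.

```sh
# Count CLOSE_WAIT sockets on a given port from `netstat -an` output on stdin.
# Hypothetical helper; the port defaults to 52311, the relay port in this thread.
count_close_wait() {
    port="${1:-52311}"
    awk -v port="$port" '
        # Only look at lines whose last column is the CLOSE_WAIT state.
        $NF == "CLOSE_WAIT" {
            # Count the line once if any address field ends in .PORT or :PORT.
            for (i = 1; i < NF; i++) {
                if ($i ~ ("[.:]" port "$")) { n++; break }
            }
        }
        END { print n + 0 }
    '
}

# Example use on the relay:
#   netstat -an | count_close_wait 52311
```

Running it from cron every few minutes and logging the number would show whether the count grows before the relay stops responding.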
A PumpSocket message does not, in general, indicate a problem; it simply means the agent has closed the connection on its side … the number of CLOSE_WAIT sockets should be normal as well …
The only thing I’d worry about is that the relay ‘looks like not functioning’ …
As a test, you can check from the agent whether http://relayhostname:52311/rd answers correctly (using both the IP address and the FQDN) … DNS issue? Firewall? Proxy?
If these checks do not suggest anything, it is better to open a PMR so that the support team can look at the relay logs and the ‘client diagnostic’ and go deeper into the issue.
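The reachability check described above can be scripted from a client machine, roughly as follows. This is a sketch only: `check_relay` is a hypothetical helper, it assumes `curl` is installed on the client, and the host name and IP address in the usage comment are placeholders. It reports whether any HTTP response came back from the /rd URL on port 52311; it does not inspect the page content, so a browser check is still worthwhile.

```sh
# Try the relay's /rd URL for every host name or IP address given as an argument.
# Hypothetical helper; returns non-zero if at least one target did not answer.
check_relay() {
    port=52311
    failed=0
    for host in "$@"; do
        # -s: quiet, -o /dev/null: discard the body, --max-time: give up after 10 s
        if curl -s -o /dev/null --max-time 10 "http://${host}:${port}/rd"; then
            echo "OK   ${host}"
        else
            echo "FAIL ${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Example use from an agent, once with the FQDN and once with the IP:
#   check_relay relayhostname.example.com 192.0.2.10
```

If the IP address works but the name does not, that points towards DNS; if neither works from the agent but both work on the relay itself, a firewall or proxy in between is the more likely suspect.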
We occasionally see this on some AIX relays as well. I normally notice it when clients stop reporting and/or move to another relay.
I haven’t found the cause or a real solution. Restarting the client and relay services seems to get things working normally again, though sometimes it takes a few restarts.
I’ve also had success just killing the CLOSE_WAIT sockets using the command below:
for i in $(netstat -Aan | grep 52311 | grep CLOSE | awk '{print $1}'); do rmsock "$i" tcpcb; done
I found that when this error occurs, CPU utilization on our relay server drops to around 1 - 2%, even though it is always over 10% when the relay is working normally.
So I think a slowdown of the relay, for whatever reason, causes this PumpSocket error, but I have no idea what the root cause is …