Time_wait

(imported topic written by SystemAdmin)

Our BES server (v6) and relays end up with a large number of connections in the TIME_WAIT state. I know what that state means in general, so my question is: should I count those connections when monitoring the state of the server and relays? Do those TIME_WAIT connections prevent other clients from connecting, or do they get dropped as clients need to create new connections?

Thanks,

Jim

(imported comment written by SystemAdmin)

Hi Jim,

We need to be careful about defining what a ‘large’ number is, but it seems like you have a lot of experience monitoring network connections. Several hundred TIME_WAIT connections aren’t a concern, while 10-20 thousand may mean the OS is running into its maximum number of connections and indicate a server that is being overwhelmed by massive numbers of incoming connections. It is possible that the server will begin refusing connections once you reach its limits, and this can usually be seen in the BES Client logs: you’ll see the BES Client occasionally fail to gather data but succeed on retrying a few minutes later.
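For monitoring, one way to track this is to count connections per TCP state from `netstat -an` output. The following is only a rough sketch: the sample output is made up, and the `watch_at` and `concern_at` thresholds are assumptions loosely based on the figures above (several hundred is fine, 10-20 thousand is a concern), not official BES guidance.

```python
from collections import Counter

# TCP state names as printed by netstat on Windows and most Unix systems.
TCP_STATES = {
    "LISTENING", "LISTEN", "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT",
    "SYN_SENT", "SYN_RECEIVED", "SYN_RECV", "FIN_WAIT_1", "FIN_WAIT_2",
    "FIN_WAIT1", "FIN_WAIT2", "LAST_ACK", "CLOSING", "CLOSED",
}


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in the text printed by `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and non-TCP rows (e.g. UDP).
        if not fields or not fields[0].lower().startswith("tcp"):
            continue
        # The state column position varies by platform, so search for it.
        for field in fields[1:]:
            if field in TCP_STATES:
                counts[field] += 1
                break
    return counts


def classify_time_wait(count: int, watch_at: int = 1000,
                       concern_at: int = 10000) -> str:
    """Map a TIME_WAIT count to a rough label (thresholds are assumptions)."""
    if count >= concern_at:
        return "concern"
    if count >= watch_at:
        return "watch"
    return "ok"


# Made-up sample in the Windows netstat layout, for illustration only.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:52311         0.0.0.0:0              LISTENING
  TCP    10.0.0.5:52311         10.0.1.20:3412         TIME_WAIT
  TCP    10.0.0.5:52311         10.0.1.21:4100         TIME_WAIT
  TCP    10.0.0.5:52311         10.0.1.22:1187         ESTABLISHED
"""

if __name__ == "__main__":
    states = count_tcp_states(SAMPLE)
    print(states["TIME_WAIT"], classify_time_wait(states["TIME_WAIT"]))
```

To run it against a live machine, the sample text could be replaced with `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`.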

Do you know what the OS limit on the total number of connections is?

Anyway, we typically set the following Windows setting to 30 seconds to help turn over TIME_WAIT connections faster.

http://technet2.microsoft.com/WindowsServer/en/library/38b8bf76-b7d3-473c-84e8-e657c0c619d11033.mspx?mfr=true
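As a sketch of what that change might look like, assuming the setting in question is the TcpTimedWaitDelay registry value (the name that comes up later in this thread), the snippet below renders a .reg file that sets it to 30 seconds. The function name `render_timed_wait_reg` is made up for illustration; the registry path and the 30-300 second range are the documented ones for this value, but please check them against the Microsoft reference for your Windows version before importing anything.

```python
# Registry key that holds the Windows TCP/IP stack parameters.
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)


def render_timed_wait_reg(delay_seconds: int = 30) -> str:
    """Return .reg file text that sets TcpTimedWaitDelay (a REG_DWORD, in seconds)."""
    # Reject values outside the documented 30-300 second range.
    if not 30 <= delay_seconds <= 300:
        raise ValueError("TcpTimedWaitDelay must be between 30 and 300 seconds")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{TCPIP_PARAMETERS_KEY}]\n"
        # .reg files store DWORD values as 8 hexadecimal digits.
        f'"TcpTimedWaitDelay"=dword:{delay_seconds:08x}\n'
    )


if __name__ == "__main__":
    print(render_timed_wait_reg(30))
```

The output can be saved to a .reg file and reviewed before it is imported. TCP/IP parameter changes of this kind generally need a reboot to take effect.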

Technically the OS is supposed to be able to clear TIME_WAIT connections even if the connection hasn’t reached the expiration period, but we have seen that lowering the expiration period seems to help. When a server is at this load level, it is usually good to attack the problem from several directions: try to increase its ability to handle the connections while at the same time finding ways to reduce the number of incoming connections.
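To see why both levers help, here is a back-of-the-envelope estimate. It assumes every closed connection leaves one TIME_WAIT entry on the server for the full expiration period, so the steady-state count is roughly the connection rate times the delay. The rate of 25 connections per second is a made-up figure, and a real server may clear entries earlier, as noted above.

```python
def estimated_time_wait(connections_per_second: float,
                        timed_wait_delay_s: float) -> float:
    """Rough steady-state TIME_WAIT count: arrival rate times time spent in the state."""
    return connections_per_second * timed_wait_delay_s


if __name__ == "__main__":
    rate = 25.0  # hypothetical incoming connections per second
    # Lowering the expiration period shrinks the estimate proportionally.
    for delay in (240, 60, 30):
        print(f"delay={delay:3d}s  rate={rate}/s  ~{estimated_time_wait(rate, delay):.0f}")
    # Halving the incoming rate (more relays, less frequent reports) does too.
    print(f"delay= 30s  rate={rate / 2}/s  ~{estimated_time_wait(rate / 2, 30):.0f}")
```

Under these assumptions, cutting the delay and cutting the connection rate multiply together, which is why it pays to work on both.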

Here are some ways to help reduce the number of incoming connections:

  1. Add more BES Relays.

  2. Add a Failover BES Relay ( _BESClient_RelaySelect_FailoverRelay, see http://support.bigfix.com/bes/misc/besconfigsettings.html )

  3. Change BES Client behavior to report less frequently ( _BESClient_Report_MinimumInterval )

(imported comment written by SystemAdmin)

Thank you, Tyler. We’re in the 1000-2000 range, so I don’t think it’s a huge issue.

Since we’re running this on Windows 2000, it will wait until it has reached the maximum and then close TIME_WAIT connections that are 60 seconds old. It looks like by default the ports available for a single IP address are 1024 to 5000, or 3977 outbound connections, which the MaxFreeTWTcbs parameter limits to about 1000 connections. http://smallvoid.com/tweak/winnt/network.html has a lot of good information on this; take a look at #23 and below. It also appears that Win2003 handles TIME_WAIT connections a bit better than Win2K.
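For anyone checking the arithmetic, here is a tiny sketch using only the figures quoted above. The function names are made up, and treating the MaxFreeTWTcbs value of about 1000 as a simple cap is my reading of that page rather than a statement about how the Windows TCP stack is implemented.

```python
def port_count(first_port: int, last_port: int) -> int:
    """Number of ports in an inclusive range."""
    return last_port - first_port + 1


def time_wait_ceiling(available_ports: int, max_free_tw_tcbs: int) -> int:
    """Whichever limit is hit first bounds the TIME_WAIT connections kept around."""
    return min(available_ports, max_free_tw_tcbs)


if __name__ == "__main__":
    ports = port_count(1024, 5000)      # default range quoted above
    ceiling = time_wait_ceiling(ports, 1000)  # approximate MaxFreeTWTcbs figure
    print(ports, ceiling)
```

The inclusive range 1024 to 5000 gives the 3977 figure, and with the cap applied the lower number of about 1000 is the one that matters.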

I had been considering changing the default TcpTimedWaitDelay value, so I think I will try that first and hope to avoid tweaking much further. :)

Thanks again,

Jim