TIME_WAIT is an integral part of the TCP/IP stack: it lets a closed connection linger so that the close can complete cleanly on both ends. However, in some cases the client does not close the connection properly or efficiently, and connections then persist in the TIME_WAIT state until the operating system purges them. On a busy system, this can amount to a denial of service on the host, because all available connections get tied up in the TIME_WAIT state.
Having worked with many large-scale and high-performance systems through the years, I've seen this scenario play out many times. Fortunately, each operating system has its own way of optimizing for this scenario to minimize the impact.
To determine whether this is a problem on your host, track the number of connections in the TIME_WAIT state. For example, on UNIX/Linux and macOS systems, you can count these connections with netstat:
$ netstat -an|grep -c TIME_WAIT
45564
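A single count does not show how TIME_WAIT compares with the other states. As a rough sketch, the helper below tallies every TCP state in netstat output. The function name tcp_state_summary is made up for illustration, and the sketch assumes the state is the last field on each line that starts with "tcp", which is how netstat -an prints it on Linux and macOS.

```shell
# Summarize TCP connection states from `netstat -an` output read on stdin.
# Assumption: the state is the last field of every line whose first field
# starts with "tcp" (tcp, tcp4, tcp6). Other lines are ignored.
tcp_state_summary() {
  awk '$1 ~ /^tcp/ { count[$NF]++ }
       END { for (s in count) printf "%s %d\n", s, count[s] }' |
    sort -k2,2nr -k1,1   # highest count first, then by state name
}
```

Run it as `netstat -an | tcp_state_summary`. If TIME_WAIT dominates the list by a wide margin, the tunings below are worth a look.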
Here are TCP tunings per operating system that I have used to mitigate this issue:
Red Hat/Oracle Linux 8: /etc/sysctl.conf
net.netfilter.nf_conntrack_tcp_timeout_time_wait=1
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_fin_timeout=1
Red Hat/Oracle Linux 7: /etc/sysctl.conf
net.netfilter.nf_conntrack_tcp_timeout_time_wait=1
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_fin_timeout=1
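On either Linux release, entries in /etc/sysctl.conf are not live until they are loaded, typically by running `sysctl -p` as root or by rebooting. The sketch below is one way to confirm what the kernel is actually using. The function names sysctl_path and show_running_values are hypothetical helpers, and the sketch relies on the standard mapping of sysctl keys to files under /proc/sys.

```shell
# Map a sysctl key to its /proc/sys path (dots become slashes).
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print the running value of each key given, or a note if the key is absent.
# The nf_conntrack keys, for example, exist only while the conntrack
# kernel module is loaded.
show_running_values() {
  for key in "$@"; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
      printf '%s = %s\n' "$key" "$(cat "$path")"
    else
      printf '%s is not available on this host\n' "$key"
    fi
  done
}
```

For example, `show_running_values net.ipv4.tcp_tw_reuse net.ipv4.tcp_fin_timeout net.netfilter.nf_conntrack_tcp_timeout_time_wait` should echo back the values set in the file once they have been loaded.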
Solaris 10/11 (the value is in milliseconds, so 30000 is 30 seconds):
ndd -set /dev/tcp tcp_time_wait_interval 30000
Windows:
Reduce TcpTimedWaitDelay from the default of 2 minutes (120 seconds) to 40 seconds (hexadecimal 0x28, the value shown below):
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
-> TcpTimedWaitDelay: dword:00000028
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
-> StrictTimeWaitSeqCheck: dword:00000001
Notes for Windows:
· Changing these values requires a reboot. Schedule it outside production hours.
· TcpTimedWaitDelay is 2 minutes by default, even if the value is not present in the registry.
· You must set StrictTimeWaitSeqCheck to 0x1, or the TcpTimedWaitDelay value will have no effect.
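Registry dword values written in the dword:XXXXXXXX form are hexadecimal, which is easy to misread as decimal seconds. As a small sketch, the two helpers below convert in both directions. The names dword_to_seconds and seconds_to_dword are made up for illustration and are not part of any Windows tooling.

```shell
# Convert the eight hex digits of a .reg-style dword to decimal seconds.
dword_to_seconds() {
  printf '%d\n' "0x$1"
}

# Convert decimal seconds to a .reg-style dword string.
seconds_to_dword() {
  printf 'dword:%08x\n' "$1"
}
```

With these, `dword_to_seconds 00000028` prints 40, so the TcpTimedWaitDelay value above is 40 seconds, and `seconds_to_dword 120` prints dword:00000078, the 2-minute default.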
References:
TcpTimedWaitDelay - https://technet.microsoft.com/en-us/library/cc938217.aspx
https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/cc731521(v=ws.10)#BKMK_setdynamicportrange
https://support.microsoft.com/en-in/help/929851/the-default-dynamic-port-range-for-tcp-ip-has-changed-in-windows