问题
- 最近重构一个以前写的FTP Server,压测时,服务器的backlog是20,然后client总是有一些链接,已经
new Socket()
成功(根据测试,这意味着三次握手完成),但是就是收不到welcome信息,服务端的log看到的accept的链接数目少于client打开的链接数目(服务端的accept也没有报错),少的数量刚好是client端hang住在读welcome信息那些链接的数目。后来尝试了调整ulimit
无果,调整了backlog后就好了。看起来就像是一些链接被悄无声息(没有返回FIN或RST之类的报文)地从 等待被应用accept的队列 移除。 - 我抽查的十几个hang住的端口对应的报文都是三次握手成功,然后还hang住时是没有新的报文的,client一关闭,发送了FIN后,一种情况是,server端就发送数据给client,另一种情况是server端对client的FIN报文返回RST。
报文如下
第一种情况的报文,这种是client hang住没多久就关掉整个client进程时出现的情况
1
2
3
4
5
6
7
8117890 271.658989117 127.0.0.1 127.0.0.1 TCP 74 [TCP Port numbers reused] 40900 → 8083 [SYN, ECN, CWR] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298647083 TSecr=0 WS=128
201498 272.673187012 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 40900 → 8083 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298648097 TSecr=0 WS=128
201591 272.673232077 127.0.0.1 127.0.0.1 TCP 74 8083 → 40900 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298648068 TSecr=298648097 WS=128
201643 272.673288084 127.0.0.1 127.0.0.1 TCP 66 40900 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298648097 TSecr=298648068
203783 297.152188039 127.0.0.1 127.0.0.1 TCP 66 40900 → 8083 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298672576 TSecr=298648068
203823 297.152451323 127.0.0.1 127.0.0.1 TCP 73 8083 → 40900 [PSH, ACK] Seq=1 Ack=2 Win=29312 Len=7 TSval=298672576 TSecr=298672576
0000 68 65 6c 6c 6f 0d 0a hello..
203825 297.152464118 127.0.0.1 127.0.0.1 TCP 54 40900 → 8083 [RST] Seq=2 Win=0 Len=0第二种情况的报文,这种是client hang住很久才关掉整个client进程时出现的情况
1
2
3
4
5
621004 108.400081512 127.0.0.1 127.0.0.1 TCP 74 46538 → 8083 [SYN, ECN, CWR] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298483826 TSecr=0 WS=128
92060 109.409186851 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 46538 → 8083 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298484836 TSecr=0 WS=128
92108 109.409228584 127.0.0.1 127.0.0.1 TCP 74 8083 → 46538 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298484804 TSecr=298484836 WS=128
92155 109.409268850 127.0.0.1 127.0.0.1 TCP 66 46538 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298484836 TSecr=298484804
112908 254.340209126 127.0.0.1 127.0.0.1 TCP 66 46538 → 8083 [FIN, ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298629764 TSecr=298484804
112909 254.340214301 127.0.0.1 127.0.0.1 TCP 54 8083 → 46538 [RST] Seq=1 Win=0 Len=0
复现
server
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
public class TestBacklogServer {
public static void main(String[] args) throws IOException {
Executor executor = Executors.newFixedThreadPool(128);
ServerSocket socket = new ServerSocket(8083, 10);
int cnt = 0;
while (true) {
final Socket client = socket.accept();
System.out.println(cnt + ": accept one: " + client.getRemoteSocketAddress());
cnt++;
executor.execute(() -> {
try {
client.getOutputStream().write("hello\r\n".getBytes());
client.close();
} catch (IOException e) {
e.printStackTrace();
}
});
}
}
}client。之所以设置了timeout,是为了拿到hang住的端口号,测试时可以不设置timeout,然后在没有新的输出时,threadDump,可以看到有多个链接hang在read那里
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.HashSet;
import java.util.Vector;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestBacklogClient {
private static final LongAdder adder = new LongAdder();
public static void main(String[] args) throws IOException, InterruptedException {
Vector<Integer> timeoutPorts = new Vector<>(1240);
ExecutorService pool = Executors.newFixedThreadPool(128);
for (int i = 0; i < 10240; i++) {
int finalI = i;
pool.execute(() -> {
try {
Socket socket = new Socket("localhost", 8083);
socket.setSoTimeout(1024 * 10);
System.out.println("connect Success one: " + finalI);
String r = readLine(socket.getInputStream());
assert r == null || "hello\r\n".equals(r);
if (r == null) {
timeoutPorts.add(socket.getLocalPort());
}
System.out.println("read success one: " + finalI);
} catch (IOException e) {
e.printStackTrace();
}
});
}
pool.shutdown();
pool.awaitTermination(1024, TimeUnit.DAYS);
System.out.println(adder.sum());
System.out.println(timeoutPorts);
}
private static String readLine(InputStream in) throws IOException {
byte[] buf = new byte[1024];
StringBuilder builder = new StringBuilder();
try {
while (true) {
int t = in.read(buf);
if (t < 0) {
break;
}
builder.append(new String(buf, 0, t));
}
return builder.toString();
} catch (SocketTimeoutException ex) {
adder.add(1);
System.out.println("time out, read: " + builder.length()
+ ", " + builder.toString()
+ ", hasCRLF: " + (builder.lastIndexOf("\r\n") >= 0));
}
return null;
}
}localhost的系统配置信息
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20$ uname -r
4.18.0-13-generic
$ /bin/cat /proc/sys/net/ipv4/tcp_max_syn_backlog
512
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.10
Release: 18.10
Codename: cosmic
$ java -version
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
$ /bin/cat /proc/sys/net/core/somaxconn
128vps的配置信息
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20root@vultr:~# ./jdk-11.0.2/bin/java -version
java version "11.0.2" 2019-01-15 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.2+9-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.2+9-LTS, mixed mode)
root@vultr:~# uname -r
4.18.0-10-generic
root@vultr:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.10
Release: 18.10
Codename: cosmic
root@vultr:~# /bin/cat /proc/sys/net/ipv4/tcp_max_syn_backlog
128
root@vultr:~# /bin/cat /proc/sys/net/core/somaxconn
128复现过程出现过这个现象:client端
ctrl-c
关掉后,server端突然就接收到了剩下的所有链接。这个在server运行在localhost时并且 一旦所有其他没有被hang住的链接完成后马上就ctrl-c
时出现过,但是在server运行在vps上没出现过。
猜测原因一
按照TCP/IP详解卷二的说法,如果listen的那个端点的队列满了(已经完成三次握手的队列,其大小由backlog确定),那么就不回复syn。但是,有个问题,如果多个syn同时到达,那么即使这时候队列未满,也可能接受了这些syn的一部分后队列就满了,那么linux系统如何处理——是否会存在有些链接三次握手完成,但是放不进队列里。
按照这篇文章(http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html)的说法,如果遇到这种情况,服务器会不断发送
syn/ack
报文(如果/proc/sys/net/ipv4/tcp_abort_on_overflow
是1
,则发送RST)我自己也抓到了这种报文,其中8083是server,server的backlog只有10,然后有10240个client企图链接该server。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15No. Time Source Destination Protocol Length Info
117974 271.659812650 127.0.0.1 127.0.0.1 TCP 74 [TCP Port numbers reused] 40970 → 8083 [SYN, ECN, CWR] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298647084 TSecr=0 WS=128
117976 271.659818367 127.0.0.1 127.0.0.1 TCP 74 8083 → 40970 [SYN, ACK, ECN] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298647084 TSecr=298647084 WS=128
117977 271.659825016 127.0.0.1 127.0.0.1 TCP 66 40970 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298647084 TSecr=298647084
201482 272.673178662 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 8083 → 40970 [SYN, ACK, ECN] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298648097 TSecr=298647084 WS=128
201563 272.673220860 127.0.0.1 127.0.0.1 TCP 66 [TCP Dup ACK 117977#1] 40970 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298648097 TSecr=298647084
201993 274.693205982 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 8083 → 40970 [SYN, ACK, ECN] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=298650117 TSecr=298648097 WS=128
201995 274.693247830 127.0.0.1 127.0.0.1 TCP 66 [TCP Dup ACK 117977#2] 40970 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=298650117 TSecr=298647084
201997 274.693527487 127.0.0.1 127.0.0.1 TCP 73 8083 → 40970 [PSH, ACK] Seq=1 Ack=1 Win=43776 Len=7 TSval=298650117 TSecr=298650117
0000 68 65 6c 6c 6f 0d 0a hello..
201999 274.693556287 127.0.0.1 127.0.0.1 TCP 66 40970 → 8083 [ACK] Seq=1 Ack=8 Win=43776 Len=0 TSval=298650117 TSecr=298650117
202001 274.695599421 127.0.0.1 127.0.0.1 TCP 66 8083 → 40970 [FIN, ACK] Seq=8 Ack=1 Win=43776 Len=0 TSval=298650119 TSecr=298650117
202009 274.737168658 127.0.0.1 127.0.0.1 TCP 66 40970 → 8083 [ACK] Seq=1 Ack=9 Win=43776 Len=0 TSval=298650161 TSecr=298650119
203731 297.151676910 127.0.0.1 127.0.0.1 TCP 66 40970 → 8083 [FIN, ACK] Seq=1 Ack=9 Win=43776 Len=0 TSval=298672575 TSecr=298650119
203732 297.151682855 127.0.0.1 127.0.0.1 TCP 66 8083 → 40970 [ACK] Seq=9 Ack=2 Win=43776 Len=0 TSval=298672575 TSecr=298672575在
/proc/sys/net/ipv4/tcp_abort_on_overflow
设为1
之后,抓到了如下报文。需要注意,那个RST并不是服务器关闭了,服务器后面还对其他链接响应了很多报文1
2
3
4
5
6
7No. Time Source Destination Protocol Length Info
5904 9.174057671 127.0.0.1 127.0.0.1 TCP 74 54448 → 8083 [SYN, ECN, CWR] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=300065429 TSecr=0 WS=128
92075 10.190560916 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 54448 → 8083 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=300066446 TSecr=0 WS=128
92209 12.206628025 127.0.0.1 127.0.0.1 TCP 74 [TCP Retransmission] 54448 → 8083 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=300068462 TSecr=0 WS=128
92258 12.206814283 127.0.0.1 127.0.0.1 TCP 74 8083 → 54448 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=300068462 TSecr=300068462 WS=128
92319 12.207072233 127.0.0.1 127.0.0.1 TCP 66 54448 → 8083 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=300068462 TSecr=300068462
92345 12.207466000 127.0.0.1 127.0.0.1 TCP 54 8083 → 54448 [RST] Seq=1 Win=0 Len=0需要注意,那个RST并不是服务器关闭了,服务器后面还对其他链接响应了很多报文——如下面所示,最后的那个No.为
92345
后面还有很多报文,都是8083端口的server与client的通信1
2
3
4
5
6No. Time Source Destination Protocol Length Info
92345 12.207466000 127.0.0.1 127.0.0.1 TCP 54 8083 → 54448 [RST] Seq=1 Win=0 Len=0
92346 12.207985091 127.0.0.1 127.0.0.1 TCP 73 8083 → 54168 [PSH, ACK] Seq=1 Ack=1 Win=43776 Len=7 TSval=300068463 TSecr=300068462
92347 12.208004676 127.0.0.1 127.0.0.1 TCP 66 54168 → 8083 [ACK] Seq=1 Ack=8 Win=43776 Len=0 TSval=300068463 TSecr=300068463
92348 12.208041224 127.0.0.1 127.0.0.1 TCP 66 8083 → 54168 [FIN, ACK] Seq=8 Ack=1 Win=43776 Len=0 TSval=300068463 TSecr=300068463
92349 12.208052138 127.0.0.1 127.0.0.1 TCP 73 8083 → 54172 [PSH, ACK] Seq=1 Ack=1 Win=43776 Len=7 TSval=300068463 TSecr=300068462然而,这个并不是上述问题的原因,因为打印出来的hang住的端口号,跟发生重传
syn/ack
的端口号对不上
猜测原因二
- 网上搜到这篇文章http://www.10tiao.com/html/749/201411/201005717/1.html
- 案例分析(二)那里描述的现象刚好跟跟上述问题非常相似——也是client握手成功但是读不到信息,作者分析的原因是,第三次握手的ACK因为AcceptQueue溢出而被丢弃,导致client进入了ESTABLISHED状态,但是server只是在
SYN_RECV
状态,并且因为net.ipv4.tcp_synack_retries=1
(我的测试机器的这个参数是5),所以如果Server重传的SYN+ACK报文对应的ACK还是被丢弃,那么就会进入client以为连接成功,但是server并没有连接成功的状态 - 然而,我把server架设到另一个机子上,然后使用
watch -n 0.5 "netcat -n | grep 8083 | grep -i "syn""
检查server那里是否有链接是一直处于syn_recv
状态,但是整个过程中看到的处于SYN_RECV
状态的链接端口号都是在不断变动(说明它们不是hang住的链接),并且client端所有没有hang住的链接都完成后,只剩下hang住的链接时,server端是看不到处于syn_recv
状态的链接的 - 所以我认为可能也不是这个原因(欢迎指正!)
结尾
- 这个问题并没有解决,如果有大佬知道原因,或者有一些想法,欢迎与我交流,谢谢!