Myricom Benchmarking
We received two Myricom NICs for evaluation. They are reported on the PCI bus as:
03:00.0 Ethernet controller: MYRICOM Inc. Myri-10G Dual-Protocol NIC (10G-PCIE-8A)
These netperf tests were run (always started on h3 with d01 as its target):
netperf -H 192.168.6.1 -t TCP_STREAM -C -c -l 60
netperf -H 192.168.6.1 -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.27.19-intel-atlas-generic
netperf -H 192.168.6.1 -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
netperf -t TCP_RR -l 60 -H 192.168.6.1 -C -c -f m -v 2 -- -b 4 -r 64K,64K
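The four invocations above can be bundled into a small wrapper so that every configuration is measured in exactly the same way. This is only a sketch: the function name `run_suite` and the variables `TARGET`, `DURATION`, `SENDFILE` and `NETPERF` are hypothetical and not part of the original setup; the netperf arguments themselves are copied unchanged from the commands above.

```shell
#!/bin/sh
# Sketch of a wrapper around the four netperf runs listed above.
# All variable and function names here are made up for illustration.
TARGET=${TARGET:-192.168.6.1}      # address of the receiver used above
DURATION=${DURATION:-60}           # seconds per test
SENDFILE=${SENDFILE:-/boot/vmlinuz-2.6.27.19-intel-atlas-generic}
NETPERF=${NETPERF:-netperf}        # set NETPERF=echo for a dry run

run_suite() {
    # Bulk TCP transfer with local (-c) and remote (-C) CPU measurement
    $NETPERF -H "$TARGET" -t TCP_STREAM -C -c -l "$DURATION"
    # Same, but sending a file via sendfile()
    $NETPERF -H "$TARGET" -t TCP_SENDFILE -l "$DURATION" -C -c -F "$SENDFILE"
    # UDP with 8972-byte messages and 1M socket buffers on both ends
    $NETPERF -H "$TARGET" -t UDP_STREAM -l "$DURATION" -C -c -- -m 8972 -s 1M -S 1M
    # Request/response with 64K requests and responses, burst of 4
    $NETPERF -t TCP_RR -l "$DURATION" -H "$TARGET" -C -c -f m -v 2 -- -b 4 -r 64K,64K
}
```

After sourcing this on h3, `run_suite | tee results.txt` would log one complete run (the file name is arbitrary).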
Results are as follows (first the summary table, then the long verbatim output for each configuration A to D):
|      |               | with firmware 1.4.31 |          |          | with firmware 1.4.38 |          |          |
| Part | Netperf Test  | BW (10^6 bits/s)     | TX_CPU % | RX_CPU % | BW (10^6 bits/s)     | TX_CPU % | RX_CPU % |
| D    | TCP_RR        | 8458                 | 14.89    | 14.55    | 8245                 | 14.91    | 14.71    |
| A    | TCP_STREAM    | 6147                 | 8.38     | 32.99    |                      |          |          |
| A    | TCP_SENDFILE  | 6156                 | 5.41     | 12.42    |                      |          |          |
| A    | UDP_STREAM_TX | 6177                 | 13.65    |          |                      |          |          |
| A    | UDP_STREAM_RX | 6153                 |          | 15.68    |                      |          |          |
| A    | TCP_RR        | 7687                 | 12.96    | 38.15    |                      |          |          |
| B    | TCP_STREAM    | 5886                 | 8.61     | 16.20    |                      |          |          |
| B    | TCP_SENDFILE  | 5872                 | 7.38     | 15.76    |                      |          |          |
| B    | UDP_STREAM_TX | 3979                 | 16.36    |          |                      |          |          |
| B    | UDP_STREAM_RX | 3949                 |          | 46.80    |                      |          |          |
| B    | TCP_RR        | 7301                 | 14.45    | 27.63    |                      |          |          |
| C    | TCP_STREAM    | 6148                 | 8.77     | 30.50    |                      |          |          |
| C    | TCP_SENDFILE  | 6155                 | 5.36     | 26.36    |                      |          |          |
| C    | UDP_STREAM_TX | 6176                 | 12.63    |          |                      |          |          |
| C    | UDP_STREAM_RX | 6175                 |          | 26.67    |                      |          |          |
| C    | TCP_RR        | 7638                 | 12.79    | 33.77    |                      |          |          |
| D    | TCP_STREAM    | 6132                 | 10.14    | 34.59    | 6127                 | 10.27    | 20.03    |
| D    | TCP_SENDFILE  | 6131                 | 6.63     | 35.66    | 6133                 | 6.89     | 14.85    |
| D    | UDP_STREAM_TX | 6157                 | 13.53    |          | 6157                 | 13.01    |          |
| D    | UDP_STREAM_RX | 6151                 |          | 15.23    | 6139                 |          | 12.88    |
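To put the firmware comparison for configuration D into numbers, the relative change between two cells of the summary can be computed as sketched below. The helper name `pct_change` is hypothetical; the sample values are the RX_CPU % figures for D / TCP_STREAM taken from the summary.

```shell
# pct_change OLD NEW: relative change from OLD to NEW in percent.
# Hypothetical helper, not part of the original measurements.
pct_change() {
    awk -v o="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (n - o) / o * 100 }'
}

# RX_CPU % for D / TCP_STREAM: 34.59 (firmware 1.4.31) vs 20.03 (1.4.38)
pct_change 34.59 20.03    # prints -42.1
```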
(A) No tuning, default Linux kernel module, MTU 9000
- Driver shipped with the Linux kernel (version 1.4.3-1.358)
- No tuning whatsoever; the two NICs were connected directly to each other.
- MTU 9000
h3:~# netperf -H 192.168.6.1 -t TCP_STREAM -C -c -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : demo
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 60.01 6147.38 8.38 32.99 0.894 3.517
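The BW, TX_CPU % and RX_CPU % values in the summary correspond to the throughput, local utilization and remote utilization fields of this report, with the throughput rounded to an integer. A minimal sketch of extracting them from a saved TCP_STREAM or TCP_SENDFILE report follows; the function name `summarize_stream` is hypothetical, and the filter assumes the data line is the only 9-field line that starts with a number.

```shell
# Read a TCP_STREAM / TCP_SENDFILE report on stdin and print
# "BW TX_CPU RX_CPU". Assumption: only the data line has exactly nine
# whitespace-separated fields with a purely numeric first field.
summarize_stream() {
    awk 'NF == 9 && $1 ~ /^[0-9]+$/ { printf "%.0f %s %s\n", $5, $6, $7 }'
}

# Example with the data line of the run above
echo '87380 16384 16384 60.01 6147.38 8.38 32.99 0.894 3.517' | summarize_stream
# prints: 6147 8.38 32.99
```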
h3:~# netperf -H 192.168.6.1 -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.27.19-intel-atlas-generic
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : demo
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 60.00 6155.85 5.41 12.42 0.576 1.323
h3:~# netperf -H 192.168.6.1 -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : demo
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
262142 8972 60.00 5163311 0 6176.7 13.65 1.454
262142 60.00 5143321 6152.8 15.68 1.671
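In the UDP report the first result line belongs to the sender and the second to the receiver. Both throughput figures are consistent with message count times message size times 8, divided by the elapsed time, which the following sketch reproduces. The helper name `udp_mbps` is hypothetical; the inputs are taken from the two lines above.

```shell
# udp_mbps MESSAGES MESSAGE_BYTES SECONDS: throughput in 10^6 bits/s.
# Hypothetical helper that re-derives the figures netperf printed.
udp_mbps() {
    awk -v msgs="$1" -v bytes="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", msgs * bytes * 8 / secs / 1e6 }'
}

udp_mbps 5163311 8972 60.00    # sender line:   prints 6176.7
udp_mbps 5143321 8972 60.00    # receiver line: prints 6152.8
```

The gap between the two message counts is the number of datagrams that were sent but not counted by the receiver.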
h3:~# netperf -t TCP_RR -l 60 -H 192.168.6.1 -C -c -f m -v 2 -- -b 4 -r 64K,64K
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : first burst 4
Local /Remote
Socket Size Request Resp. Elapsed Tput CPU CPU S.dem S.dem
Send Recv Size Size Time 10^6bits local remote local remote
bytes bytes bytes bytes secs. per sec % S % S us/Tr us/Tr
16384 87380 65536 65536 60.00 7687.24 12.96 38.15 141.434 416.361
16384 87380
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
8 0 0 0 682.023 7331.127 3843.622 3843.622
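The verbose TCP_RR figures above are consistent with each other: the per-direction throughput matches transaction rate times 64K times 8, and the reported round-trip latency matches (burst + 1) transactions in flight divided by the transaction rate. The sketch below re-derives both; the helper name `rr_derive` is hypothetical, and the latency relation is inferred from the numbers in this report rather than taken from the netperf documentation.

```shell
# rr_derive RATE_PER_SEC BYTES_PER_DIRECTION BURST
# Prints "<10^6 bits/s per direction> <usec per transaction>".
# Hypothetical helper; the latency formula is an inference from the data.
rr_derive() {
    awk -v rate="$1" -v bytes="$2" -v burst="$3" 'BEGIN {
        mbps = rate * bytes * 8 / 1e6       # payload in one direction
        rtt  = (burst + 1) * 1e6 / rate     # burst + 1 transactions in flight
        printf "%.1f %.1f\n", mbps, rtt
    }'
}

rr_derive 7331.127 65536 4    # prints 3843.6 682.0
```

Twice the per-direction figure gives the headline throughput of about 7687 in the first result line.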
(B) No tuning, default Linux kernel module, MTU 1500
- Driver shipped with the Linux kernel (version 1.4.3-1.358)
- No tuning whatsoever; the two NICs were connected directly to each other.
- MTU 1500
h3:~# netperf -H 192.168.6.1 -t TCP_STREAM -C -c -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 60.00 5886.46 8.61 16.20 0.958 1.804
h3:~# netperf -H 192.168.6.1 -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.27.19-intel-atlas-generic
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 60.00 5873.82 7.38 15.76 0.824 1.758
h3:~# netperf -H 192.168.6.1 -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
262142 8972 60.00 3325997 0 3978.8 16.36 2.716
262142 60.00 3300758 3948.6 46.80 7.768
h3:~# netperf -t TCP_RR -l 60 -H 192.168.6.1 -C -c -f m -v 2 -- -b 4 -r 64K,64K
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : first burst 4
Local /Remote
Socket Size Request Resp. Elapsed Tput CPU CPU S.dem S.dem
Send Recv Size Size Time 10^6bits local remote local remote
bytes bytes bytes bytes secs. per sec % S % S us/Tr us/Tr
16384 87380 65536 65536 60.00 7300.62 14.45 27.63 166.085 317.446
16384 87380
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
8 0 0 0 718.142 6962.416 3650.311 3650.311
(C) sysctl tuning, default Linux kernel module, MTU 9000
- Driver shipped with the Linux kernel (version 1.4.3-1.358)
- net.core.rmem_max = 16777216
- net.core.wmem_max = 16777216
- net.ipv4.tcp_rmem = 4096 87380 16777216
- net.ipv4.tcp_wmem = 4096 65536 16777216
- net.core.netdev_max_backlog = 250000
- MTU 9000
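The settings listed above could be applied at runtime roughly as follows. This is a sketch under assumptions: the interface name `eth4` is borrowed from the ethtool line of configuration D and may differ per host, and the names `run`, `apply_tuning`, `IFACE` and `APPLY` are hypothetical. By default the commands are only printed; setting `APPLY=1` (as root) would execute them.

```shell
IFACE=${IFACE:-eth4}    # assumption: interface name, adjust per host

# Print the command unless APPLY=1, in which case run it.
# Note: the printed form drops the quotes around multi-value settings.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

apply_tuning() {
    run sysctl -w net.core.rmem_max=16777216
    run sysctl -w net.core.wmem_max=16777216
    run sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    run sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
    run sysctl -w net.core.netdev_max_backlog=250000
    run ip link set dev "$IFACE" mtu 9000
}
```

Values set with `sysctl -w` do not survive a reboot, so they would need to be reapplied before each measurement series.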
h3:~# netperf -H 192.168.6.1 -t TCP_STREAM -C -c -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 60.02 6147.76 8.77 30.50 0.935 3.251
h3:~# netperf -H 192.168.6.1 -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.27.19-intel-atlas-generic
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 60.02 6154.67 5.36 26.36 0.570 2.807
h3:~# netperf -H 192.168.6.1 -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
2097152 8972 60.00 5162688 0 6175.9 12.63 1.340
2097152 60.00 5162194 6175.3 26.67 2.830
h3:~# netperf -t TCP_RR -l 60 -H 192.168.6.1 -C -c -f m -v 2 -- -b 4 -r 64K,64K
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : first burst 4
Local /Remote
Socket Size Request Resp. Elapsed Tput CPU CPU S.dem S.dem
Send Recv Size Size Time 10^6bits local remote local remote
bytes bytes bytes bytes secs. per sec % S % S us/Tr us/Tr
65536 87380 65536 65536 60.00 7637.70 12.79 33.77 140.450 370.890
65536 87380
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
8 0 0 0 686.447 7283.881 3818.851 3818.851
(D) sysctl + interrupt coalescing tuning, default Linux kernel module, MTU 9000
- Driver shipped with the Linux kernel (version 1.4.3-1.358)
- net.core.rmem_max = 16777216
- net.core.wmem_max = 16777216
- net.ipv4.tcp_rmem = 4096 87380 16777216
- net.ipv4.tcp_wmem = 4096 65536 16777216
- net.core.netdev_max_backlog = 250000
- MTU 9000
- ethtool -C eth4 rx-usecs 25
h3:~# netperf -H 192.168.6.1 -t TCP_STREAM -C -c -l 60
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 60.02 6131.50 10.14 34.59 1.084 3.697
h3:~# netperf -H 192.168.6.1 -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinuz-2.6.27.19-intel-atlas-generic
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 65536 65536 60.02 6131.31 6.63 35.66 0.708 3.812
h3:~# netperf -H 192.168.6.1 -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
2097152 8972 60.00 5146619 0 6156.7 13.53 1.441
2097152 60.00 5141397 6150.5 15.23 1.623
h3:~# netperf -t TCP_RR -l 60 -H 192.168.6.1 -C -c -f m -v 2 -- -b 4 -r 64K,64K
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.6.1 (192.168.6.1) port 0 AF_INET : first burst 4
Local /Remote
Socket Size Request Resp. Elapsed Tput CPU CPU S.dem S.dem
Send Recv Size Size Time 10^6bits local remote local remote
bytes bytes bytes bytes secs. per sec % S % S us/Tr us/Tr
65536 87380 65536 65536 60.00 8458.40 14.89 14.55 147.711 144.303
65536 87380
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
8 0 0 0 619.843 8066.561 4229.201 4229.201
--
CarstenAulbert - 30 Mar 2009