Introduction
I am building an RDMA-based distributed system, so I bought two second-hand InfiniBand NICs for development. This post records how I tested them.
- NIC model: Mellanox ConnectX-2 MHQH29B Dual Port 4x QDR PCIe 2.0 x8
- Machine environment: Ubuntu 14.04, Ubuntu 12.04
The two NICs can be connected directly, without a switch, by running OpenSM as the subnet manager.
Installation steps
- Install the NICs and connect the cable (with no driver loaded, the NIC LEDs stay off)
- Download and install the Mellanox driver: http://www.mellanox.com/page/software_overview_ib
- Adding the --force option is recommended during installation
- After installation, the installer checks the PCIe configuration and will warn, for example, that a card is sitting in an x4 slot
- Reboot the machine (the LED of a connected port will light up)
- Configure a static IP for each NIC, for example:
auto ib1
iface ib1 inet static
address 10.0.0.1
netmask 255.255.255.0
Testing
Viewing NIC information
- The ibnodes command shows the nodes and their connected ports:
mlx@m04:~$ ibnodes
Ca : 0x0002c903000ae254 ports 2 "up75 HCA-1"
Ca : 0x0002c903000ec606 ports 2 "m04 HCA-1"
- ifconfig shows the IB interfaces:
ib0 Link encap:UNSPEC HWaddr A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00
UP BROADCAST MULTICAST MTU:4092 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
ib1 Link encap:UNSPEC HWaddr A0-00-03-00-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:10.0.0.1 Bcast:10.0.0.255 Mask:255.255.255.0
inet6 addr: fe80::202:c903:e:c608/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:54575 errors:0 dropped:0 overruns:0 frame:0
TX packets:67623 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:3174514 (3.1 MB) TX bytes:891903946 (891.9 MB)
- ibstatus shows the card's status; in the output below, port 2 has negotiated a 4X QDR link (40 Gb/s):
mlx@m04:~$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000e:c607
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)
link_layer: InfiniBand
Infiniband device 'mlx4_0' port 2 status:
default gid: fe80:0000:0000:0000:0002:c903:000e:c608
base lid: 0x1
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 40 Gb/sec (4X QDR)
link_layer: InfiniBand
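If you want to automate this health check, the ibstatus fields are easy to scrape. Below is a small Python sketch (my own helper, not part of the Mellanox tools) that parses ibstatus output into per-port dictionaries; the sample text is taken from the port 2 output above.

```python
import re

def parse_ibstatus(text):
    """Parse `ibstatus` output into {(device, port): {field: value}}."""
    ports = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"Infiniband device '(\S+)' port (\d+) status:", line.strip())
        if m:
            current = {}
            ports[(m.group(1), int(m.group(2)))] = current
        elif current is not None and ':' in line:
            # partition on the first ':' so GIDs keep their colons
            key, _, val = line.strip().partition(':')
            current[key.strip()] = val.strip()
    return ports

sample = """\
Infiniband device 'mlx4_0' port 2 status:
        default gid:     fe80:0000:0000:0000:0002:c903:000e:c608
        base lid:        0x1
        sm lid:          0x1
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      InfiniBand
"""
ports = parse_ibstatus(sample)
print(ports[('mlx4_0', 2)]['state'])  # 4: ACTIVE
print(ports[('mlx4_0', 2)]['rate'])   # 40 Gb/sec (4X QDR)
```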
Connecting two machines without a switch
Run opensm (requires root):
mlx@m04:~$ sudo opensm
[sudo] password for mlx:
-------------------------------------------------
OpenSM 4.7.0.MLNX20160523.25f7c7a
Command Line Arguments:
Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 4.7.0.MLNX20160523.25f7c7a
Using default GUID 0x2c903000ec608
Entering DISCOVERING state
Entering MASTER state
The two machines can now ping each other:
mlx@m04:~$ ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.294 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.155 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.151 ms
64 bytes from 10.0.0.2: icmp_seq=4 ttl=64 time=0.155 ms
^C
--- 10.0.0.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.151/0.188/0.294/0.063 ms
Bandwidth test
- Start opensm on one machine (requires root) and use ib_send_bw
- Run one machine as the server:
mlx@m04:~$ ib_send_bw -a -c UD -d mlx4_0 -i 2
************************************
* Waiting for client to connect... *
************************************
- Run the other machine as the client. Because up75's NIC sits in a PCIe 2.0 x4 slot, throughput tops out at the x4 limit instead of reaching 40 Gb/s:
mlx@up75:~$ ib_send_bw -a -c UD -d mlx4_0 -i 2 10.0.0.1
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 4096[B]
Link type : IB
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x0238 PSN 0xf162c2
remote address: LID 0x01 QPN 0x021a PSN 0xbc213c
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 1000 5.72 5.20 2.727911
4 1000 11.49 11.34 2.972020
8 1000 22.99 22.61 2.963387
16 1000 45.98 45.31 2.969666
32 1000 91.70 90.55 2.967229
64 1000 183.14 180.77 2.961664
128 1000 366.79 361.35 2.960143
256 1000 727.44 718.16 2.941597
512 1000 1088.50 1044.70 2.139549
1024 1000 1264.96 1263.29 1.293610
2048 1000 1407.22 1406.43 0.720094
4096 1000 1492.93 1492.75 0.382143
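A back-of-envelope calculation shows why the x4 slot is the bottleneck. PCIe 2.0 runs at 5 GT/s per lane with 8b/10b encoding, so four lanes carry 2000 MB/s of raw data; TLP headers and flow control typically leave roughly 75-85% of that for payload, which matches the measured 1492 MB/s peak:

```python
# PCIe 2.0 x4 bandwidth ceiling vs. the measured ib_send_bw peak.
GT_PER_LANE = 5.0e9   # PCIe 2.0: 5 GT/s per lane
ENCODING = 8 / 10     # 8b/10b line encoding
LANES = 4             # the card is in an x4 slot

raw_mb_per_sec = GT_PER_LANE * ENCODING / 8 * LANES / 1e6
print(raw_mb_per_sec)                  # 2000.0 MB/s raw

# The measured 1492.93 MB/s peak is ~75% of raw, within the usual
# range once TLP header and flow-control overhead are subtracted.
efficiency = 1492.93 / raw_mb_per_sec
print(round(efficiency, 3))            # 0.746
```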
Latency test
- Start opensm on one machine (requires root) and use ib_send_lat
- Run one machine as the server:
mlx@m04:~$ ib_send_lat -a -c UD -d mlx4_0 -i 2
************************************
* Waiting for client to connect... *
************************************
- Run the other machine as the client:
mlx@up75:~$ ib_send_lat -a -c UD -d mlx4_0 -i 2 10.0.0.1
Max msg size in UD is MTU 4096
Changing to this MTU
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : UD Using SRQ : OFF
TX depth : 1
Mtu : 4096[B]
Link type : IB
Max inline data : 188[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0x02 QPN 0x0239 PSN 0x29d370
remote address: LID 0x01 QPN 0x021b PSN 0xbc98c4
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 1.25 14.72 1.34
4 1000 1.24 88.94 1.27
8 1000 1.20 77.49 1.22
16 1000 1.21 66.69 1.23
32 1000 1.23 61.58 1.25
64 1000 1.27 12.92 1.30
128 1000 1.42 6.98 1.44
256 1000 1.94 173.62 1.97
512 1000 2.22 41.65 2.25
1024 1000 2.79 37.47 2.81
2048 1000 3.91 18.85 3.94
4096 1000 6.16 38.06 6.20
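The latency numbers also explain why this test cannot reach the bandwidth-test throughput: with TX depth 1 there is only one message in flight, so throughput is bounded by message size divided by latency. A quick check (assuming t_typical is the one-way latency, which is how I read the perftest output):

```python
# Throughput bound for the latency test: one 4096-byte message in flight.
# t_typical is taken from the last row of the table above.
size_bytes = 4096
t_typical_us = 6.20

mb_per_sec = size_bytes / (t_typical_us * 1e-6) / 1e6
print(round(mb_per_sec, 1))  # about 660 MB/s, well below the 1492 MB/s peak
```

The bandwidth test pipelines many messages (TX depth 128), which is why it can saturate the PCIe x4 link while the latency test cannot.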
---------------------------------------------------------------------------------------
