目前准備通過
netstat -s ethtool -S cat /proc/net/dev cat /proc/net/snmp
cat /sys/class/net/<NIC>/statistics/
查看drop 統計
同時通過sar -n DEV 1 5 查看流量
tcpdump 抓包分析 報文特征
已經交給前場提取數據
其實想用dropwatch分析但是 現場不支持

dropwatch has two requirements: 1. the kernel must be 2.6.30 or later; 2. the kernel must be built with CONFIG_NET_DROP_MONITOR=y.
To be dealt with tomorrow.
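The two dropwatch prerequisites can be checked on a target machine before anyone tries to install the tool. Below is a minimal sketch, assuming the `uname -r` string and the kernel config text (for example from /boot/config-<release>) have already been fetched. The function names are made up for illustration, and accepting `=m` as well as `=y` is my own assumption.

```python
import re

def kernel_at_least(release, minimum=(2, 6, 30)):
    """Compare the numeric part of a `uname -r` string against `minimum`."""
    # "2.6.32-431.el6.x86_64" -> "2.6.32" -> [2, 6, 32]
    nums = [int(n) for n in re.findall(r"\d+", release.split("-")[0])][:3]
    nums += [0] * (3 - len(nums))  # pad short versions such as "4.19"
    return tuple(nums) >= minimum

def drop_monitor_enabled(config_text):
    """True if the config text enables NET_DROP_MONITOR.

    Accepting "=m" (module) in addition to "=y" is an assumption of this sketch.
    """
    return re.search(r"^CONFIG_NET_DROP_MONITOR=[ym]$", config_text, re.M) is not None

def dropwatch_possible(release, config_text):
    """Both prerequisites from the notes above must hold."""
    return kernel_at_least(release) and drop_monitor_enabled(config_text)

if __name__ == "__main__":
    print(dropwatch_possible("2.6.32-431.el6.x86_64",
                             "CONFIG_NET_DROP_MONITOR=y\n"))
```

If either check fails, falling back to the counter-based approach in these notes is the only option.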
The most useful items in the information collected so far are these two counters from eth2:
rx_no_buffer_count: 180972127
rx_missed_errors: 127669376
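Picking these counters out of a long `ethtool -S` dump by eye is error-prone. The following is a rough sketch that keeps only the non-zero counters whose names look like error or drop counters. The keyword list in `SUSPECT` is my own guess and not anything defined by ethtool, and the sample text is a shortened copy of the eth2 figures.

```python
import re

# Hypothetical keyword list; adjust it to the driver's counter names.
SUSPECT = re.compile(r"(err|drop|miss|no_buffer|fifo|fail)")

def parse_ethtool_stats(text):
    """Turn `ethtool -S` output ("name: value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # The "NIC statistics:" heading has no numeric value and is skipped.
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats

def nonzero_suspects(stats):
    """Keep counters that are non-zero and whose name matches SUSPECT."""
    return {k: v for k, v in stats.items() if v and SUSPECT.search(k)}

SAMPLE = """NIC statistics:
     rx_packets: 1261353932
     rx_errors: 0
     rx_no_buffer_count: 180972127
     rx_missed_errors: 127669376
     rx_csum_offload_errors: 10
     tx_flow_control_xoff: 0
"""

if __name__ == "__main__":
    for name, value in nonzero_suspects(parse_ethtool_stats(SAMPLE)).items():
        print(name, value)
```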

root@localhost / # cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 2 64 43266916579 0 58 0 0 0 43266916521 39496151188 19 1096 0 0 0 0 687465 0 1577650
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 10609 83 10226 217 0 0 0 151 15 0 0 0 0 43171445 0 43171273 0 0 0 0 26 146 0 0 0 0
IcmpMsg: InType0 InType3 InType8 InType11 OutType0 OutType3 OutType8
IcmpMsg: 15 10226 151 217 146 43171273 26
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 638924225 792069897 51631303 76843627 105 43070509878 47284276280 4631429 0 60909693
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 195179897 532 1998 218221001 0 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
UdpLite: 0 0 0 0 0 0

root@localhost / # netstat -s
Ip:
    317249965 total packets received
    58 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    317249907 incoming packets delivered
    841450781 requests sent out
    19 outgoing packets dropped
    1096 dropped because of missing route
    687465 fragments received ok
    1577650 fragments created
Icmp:
    10609 ICMP messages received
    83 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 10226
        timeout in transit: 217
        echo requests: 151
        echo replies: 15
    43171452 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 43171280
        echo request: 26
        echo replies: 146
IcmpMsg:
        InType0: 15
        InType3: 10226
        InType8: 151
        InType11: 217
        OutType0: 146
        OutType3: 43171280
        OutType8: 26
Tcp:
    638924267 active connections openings
    792070011 passive connection openings
    51631325 failed connection attempts
    76843631 connection resets received
    98 connections established
    120843208 segments received
    39642604 segments send out
    4631433 segments retransmited
    0 bad segments received.
    60909716 resets sent
Udp:
    195179954 packets received
    532 packets to unknown port received.
    1998 packet receive errors
    218221066 packets sent
    0 receive buffer errors
    0 send buffer errors
UdpLite:
TcpExt:
    947 resets received for embryonic SYN_RECV sockets
    148679 packets pruned from receive queue because of socket buffer overrun
    531225716 TCP sockets finished time wait in fast timer
    347738973 delayed acks sent
    2698538 delayed acks further delayed because of locked socket
    Quick ack mode was activated 2595825 times
    28481169 packets directly queued to recvmsg prequeue.
    6717820 bytes directly in process context from backlog
    1033332179 bytes directly received in process context from prequeue
    1010153350 packet headers predicted
    14758935 packets header predicted and directly queued to user
    3242078974 acknowledgments not containing data payload received
    3880126246 predicted acknowledgments
    1536 times recovered from packet loss due to fast retransmit
    13263 times recovered from packet loss by selective acknowledgements
    1733 congestion windows recovered without slow start by DSACK
    136942 congestion windows recovered without slow start after partial ack
    16400 TCP data loss events
    TCPLostRetransmit: 639
    1574 timeouts after reno fast retransmit
    14245 timeouts after SACK recovery
    78 timeouts in loss state
    50060 fast retransmits
    4432 forward retransmits
    16193 retransmits in slow start
    4159092 other TCP timeouts
    86 classic Reno fast retransmits failed
    552 SACK retransmits failed
    22768427 packets collapsed in receive queue due to low socket buffer
    3110543 DSACKs sent for old packets
    57636 DSACKs sent for out of order packets
    34337 DSACKs received
    9 DSACKs for out of order packets received
    4438261 connections reset due to unexpected data
    4890686 connections reset due to early user close
    21201 connections aborted due to timeout
    TCPSACKDiscard: 2
    TCPDSACKIgnoredOld: 22762
    TCPDSACKIgnoredNoUndo: 7791
    TCPSpuriousRTOs: 1396
    TCPSackShifted: 42109
    TCPSackMerged: 43323
    TCPSackShiftFallback: 63448
    TCPBacklogDrop: 1435
    TCPDeferAcceptDrop: 139642
    TCPTimeWaitOverflow: 171130
IpExt:
    InMcastPkts: 5
    InBcastPkts: 1213607
    InOctets: 1673780726
    OutOctets: -1198119745
    InMcastOctets: 140
    InBcastOctets: 104766229

root@localhost / # ethtool -S eth1
NIC statistics:
     rx_packets: 27862085998
     tx_packets: 1262711762
     rx_bytes: 2039732992330
     tx_bytes: 125303834249
     rx_broadcast: 35401078
     tx_broadcast: 26016705
     rx_multicast: 54576108
     tx_multicast: 808459
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 54576108
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 1571897
     rx_missed_errors: 1390
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 220291
     tx_flow_control_xoff: 221680
     rx_long_byte_count: 2039732992330
     rx_csum_offload_good: 27739152135
     rx_csum_offload_errors: 0
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0

root@localhost / # ethtool -S eth2
NIC statistics:
     rx_packets: 1261353932
     tx_packets: 27862955528
     rx_bytes: 125372925112
     tx_bytes: 2040275968171
     rx_broadcast: 25614225
     tx_broadcast: 35423396
     rx_multicast: 808438
     tx_multicast: 54576132
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 808438
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 180972127
     rx_missed_errors: 127669376
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 125372925112
     rx_csum_offload_good: 1228845122
     rx_csum_offload_errors: 10
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0

root@localhost / # cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo: 384657885396 1478036246 0 0 0 0 0 0 384657885396 1478036246 0 0 0 0 0 0
 bond0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  eth0: 1820898757 15863793 0 0 0 0 0 0 26863219776 242854331 0 0 0 0 0 0
 teql0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  sit0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  eth1: 2039734404414 27862095641 0 1390 0 0 0 54576108 125317855145 1262724651 0 0 0 0 0 0
  eth2: 125385953071 1261365428 0 127669376 0 0 0 808463 2040277248858 27862964224 0 0 0 0 0 0
  eth3: 27566182128841 397592489315 0 0 0 0 0 769327 70776340796697 437380843982 0 0 0 0 0 0
  eth4: 71536212472219 436926157198 0 14415212551 0 0 0 54574179 27741862259957 401183191271 0 0 0 0 0 0
  eth5: 39944246780346 43763462640 0 2456 0 0 0 277 35696516892117 36611633993 0 0 0 0 0 0
  eth6: 35697179600486 36611633978 0 0 0 0 0 76995504 39944845468824 43763451889 0 0 0 0 0 0
eth1.1: 0 0 0 0 0 0 0 0 1026 13 0 0 0 0 0 0
eth2.1: 0 0 0 0 0 0 0 0 936 12 0 0 0 0 0 0
eth3.1: 0 0 0 0 0 0 0 0 936 12 0 0 0 0 0 0
eth4.1: 0 0 0 0 0 0 0 0 558 7 0 0 0 0 0 0
eth5.1: 0 0 0 0 0 0 0 0 1494 19 0 0 0 0 0 0
eth6.1: 0 0 0 0 0 0 0 0 468 6 0 0 0 0 0 0
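To compare interfaces in the /proc/net/dev dump at a glance, it helps to express the RX drop column relative to the RX packet column. This is a small sketch under two assumptions: the ratio is defined here simply as drop / packets, and whether dropped frames are also counted in "packets" depends on the driver, so the number is only a rough indicator. The sample rows are copied from the eth1 and eth2 lines.

```python
def parse_proc_net_dev(text):
    """Map interface name -> (rx_packets, rx_drop) from /proc/net/dev text."""
    result = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16 or not fields[0].isdigit():
            continue
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        result[name.strip()] = (int(fields[1]), int(fields[3]))
    return result

def rx_drop_ratio(packets, drop):
    """Dropped frames relative to counted packets (assumed definition)."""
    return drop / packets if packets else 0.0

SAMPLE = """Inter-|   Receive                                                |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
  eth1: 2039734404414 27862095641 0 1390 0 0 0 54576108 125317855145 1262724651 0 0 0 0 0 0
  eth2: 125385953071 1261365428 0 127669376 0 0 0 808463 2040277248858 27862964224 0 0 0 0 0 0
"""

if __name__ == "__main__":
    for nic, (packets, drop) in parse_proc_net_dev(SAMPLE).items():
        print(f"{nic}: drop ratio {rx_drop_ratio(packets, drop):.4%}")
```

On these figures eth2 loses roughly one frame for every ten it counts, while eth1 is negligible, which points at the eth2 receive path rather than the whole box.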
Let's look at the driver code to see where rx_no_buffer_count and rx_missed_errors come from:
    /* ethtool statistics table */
    IGB_STAT("rx_missed_errors", stats.mpc),
    IGB_STAT("rx_no_buffer_count", stats.rnbc),

    /* statistics update */
    mpc = E1000_READ_REG(hw, E1000_MPC);
    adapter->stats.mpc += mpc;
    adapter->stats.rnbc += E1000_READ_REG(hw, E1000_RNBC);
Both counters are read straight from chip registers.
Roughly:
rx_no_buffer_count = E1000_RNBC
rx_missed_errors = E1000_MPC
Searching online for RNBC and MPC turned up the following description:
Missed Packets Count – MPC
Counts the number of missed packets. Packets are missed when the receive FIFO has insufficient space to store the incoming packet. This can be caused because of too few buffers allocated, or because there is insufficient bandwidth on the PCI bus.
Events setting this counter causes ICR.Rx Miss, the Receiver Overrun Interrupt, to be set. This register does not increment if receives are not enabled.
These packets are also counted in the Total Packets Received register as well as in Total Octets Received.
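rx_missed_errors is related to hard interrupts: after a DMA transfer completes and before the hard interrupt is raised, the NIC's FIFO is already full, so newly received data has to be dropped on the spot. In principle, tuning the RX side with ethtool -G ethX rx N should help. Note that -G resizes the RX descriptor ring in host memory, not the on-chip FIFO itself; a larger ring only gives the FIFO more room to drain into.
This understanding may well be wrong.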
Receive No Buffers Count – RNBC
This register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal).
The packet is still received if there is space in the FIFO. This register only increments if receives are enabled (RCTL.RXEN is set). This register does not increment when flow control packets are received.
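rx_no_buffer_count presumably means that when the NIC tries to DMA a frame from the device FIFO into the buffer tracked by rx_buffer_info (the memory behind skb->data), the corresponding rx_buffer_info has not been unmapped yet, i.e. the driver has not processed and recycled it, so the frame cannot be moved into host memory.
In other words, it depends on how fast softirq processing runs: is the system consuming packets too slowly?
I am not sure whether this understanding is correct.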
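To make the difference between the two counters concrete, here is a toy model of the receive path as the two datasheet excerpts describe it. It is only a sketch: the class and attribute names are invented, the FIFO is measured in whole frames, and bumping RNBC for every frame that arrives with no free descriptor (including frames that are then missed) is my own simplifying assumption, not something the excerpt states.

```python
class ToyNic:
    """Toy receive path: wire -> on-chip FIFO -> DMA into host ring buffers."""

    def __init__(self, ring_size, fifo_slots):
        self.ring_size = ring_size         # RX descriptors (what ethtool -G changes)
        self.free_descriptors = ring_size  # buffers the host has made available
        self.fifo_slots = fifo_slots       # frames the on-chip FIFO can hold
        self.fifo = 0                      # frames currently waiting in the FIFO
        self.dma_done = 0                  # frames copied into host memory
        self.rnbc = 0                      # modelled Receive No Buffers Count
        self.mpc = 0                       # modelled Missed Packets Count

    def _drain(self):
        # DMA frames out of the FIFO while free host buffers remain.
        while self.fifo and self.free_descriptors:
            self.fifo -= 1
            self.free_descriptors -= 1
            self.dma_done += 1

    def receive(self):
        """One frame arrives from the wire."""
        if self.free_descriptors == 0:
            self.rnbc += 1  # no host buffer available (assumed counting rule)
        if self.fifo >= self.fifo_slots:
            self.mpc += 1   # FIFO full: the frame is lost
            return
        self.fifo += 1      # frame is kept, at least inside the FIFO
        self._drain()

    def refill(self, count):
        """Host side (softirq) recycles `count` buffers back to the ring."""
        self.free_descriptors = min(self.ring_size, self.free_descriptors + count)
        self._drain()

if __name__ == "__main__":
    nic = ToyNic(ring_size=2, fifo_slots=2)
    for _ in range(6):      # burst of 6 frames while the host is not refilling
        nic.receive()
    print("dma_done", nic.dma_done, "rnbc", nic.rnbc, "mpc", nic.mpc)
```

In this model RNBC starts rising as soon as the host falls behind, while MPC only rises once the FIFO is also exhausted, so a growing RNBC with a flat MPC would be an early warning and a growing MPC means frames are really being lost. A larger `ring_size` absorbs a longer burst before either counter moves.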
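Reference: https://lp007819.wordpress.com/2013/05/
In my own environment, a 10G optical port currently shows the following:
the drop count in ifconfig and the drop count in ethtool are not equal.
They used to be the same. So what does each of these two drop counters mean? And what about port.drop?
From a rough read of the driver: rx_dropped seems to reflect the NIC's RNBC problem, and it does look as if enlarging the RX ring resolves it?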
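What do the fields shown by ifconfig ethX mean?
RX dropped: presumably packets discarded by the kernel protocol stack plus packets dropped because the NIC FIFO was too small.
RX overruns: rx_fifo_errors? Probably much like rx_missed_errors, i.e. the NIC FIFO was too small, though the exact value depends on the NIC driver.
RX frame: not sure.
In the end the values differ from driver to driver, so each case has to be analyzed on its own.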
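ifconfig takes these RX fields from /proc/net/dev. As far as I can tell from the kernel code that prints that file, the drop column is rx_dropped + rx_missed_errors, the fifo column (ifconfig "overruns") is rx_fifo_errors, and the frame column is rx_length_errors + rx_over_errors + rx_crc_errors + rx_frame_errors. This is worth confirming against the kernel version in use. The dump above is consistent with it: eth2's drop column (127669376) equals its rx_missed_errors from ethtool -S. The sketch below recomputes the three columns from the per-counter files under /sys/class/net/<NIC>/statistics/. The function name and the `base` parameter are made up for illustration.

```python
import os

# Column -> sysfs counters summed into it (assumed from the kernel's
# /proc/net/dev output code; verify on the target kernel).
RX_COLUMNS = {
    "drop": ("rx_dropped", "rx_missed_errors"),
    "fifo": ("rx_fifo_errors",),
    "frame": ("rx_length_errors", "rx_over_errors",
              "rx_crc_errors", "rx_frame_errors"),
}

def proc_net_dev_rx_columns(nic, base="/sys/class/net"):
    """Rebuild the drop/fifo/frame RX columns from sysfs statistics files."""
    stats_dir = os.path.join(base, nic, "statistics")
    columns = {}
    for column, counters in RX_COLUMNS.items():
        total = 0
        for counter in counters:
            with open(os.path.join(stats_dir, counter)) as f:
                total += int(f.read().strip())
        columns[column] = total
    return columns

if __name__ == "__main__":
    print(proc_net_dev_rx_columns("lo"))
```

This would also explain why the ifconfig drop count and a single ethtool counter can disagree: ifconfig shows a sum, while ethtool -S shows each driver counter separately.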
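When investigating packet loss, the ethtool command is very handy.
The main options are ethtool -S, -g, -G, -i, -a and so on.
Typically you check that the link mode is sane (Speed, Duplex), whether CRC errors and overruns keep growing, and view or change the NIC's buffer (ring) sizes.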
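For the ring-size check, `ethtool -g` prints the pre-set maximums and the current settings, and `ethtool -G` changes them. Here is a minimal sketch that parses saved `ethtool -g` output and proposes a command when the RX ring is below its maximum. The sample values (4096 and 256) are invented for illustration and do not come from the customer system, and jumping straight to the maximum is just one possible policy.

```python
def parse_ring_params(text):
    """Parse `ethtool -g` output into {"max": {...}, "current": {...}}."""
    rings = {"max": {}, "current": {}}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():
                rings[section][key.strip()] = int(value)
    return rings

def suggest_rx_ring(nic, rings):
    """Return an `ethtool -G` command string, or None if RX is already at max."""
    current = rings["current"].get("RX")
    maximum = rings["max"].get("RX")
    if current is None or maximum is None or current >= maximum:
        return None
    return f"ethtool -G {nic} rx {maximum}"

# Invented sample values, for illustration only.
SAMPLE = """Ring parameters for eth2:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256
"""

if __name__ == "__main__":
    print(suggest_rx_ring("eth2", parse_ring_params(SAMPLE)))
```

The function only builds the command string; actually running it is left to whoever is on site, since resizing the ring can briefly reset the link on some drivers.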
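So what is the actual problem at the customer site?
A NIC problem? Is the packet rate really too high? Or something else?
To be continued next week.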
rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_no_buffer_count: 180972127 rx_missed_errors: 127669376
For these counters (fifo error, missed error, no buffer), download the chip datasheet and read it in detail.