Production issue: ifconfig shows dropped packets


The current plan is to check the drop statistics with:

netstat -s    ethtool -S <NIC>    cat /proc/net/dev    cat /proc/net/snmp

cat /sys/class/net/<NIC>/statistics/*
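A minimal sketch of how the per-interface counters under /sys/class/net/<NIC>/statistics/ could be collected in one pass and filtered down to the drop-related ones. The function names and the keyword filter are my own assumptions for illustration; only the sysfs path comes from the notes above.

```python
import os


def read_nic_statistics(nic, root="/sys/class/net"):
    """Return {counter_name: int} for every file in <root>/<nic>/statistics/."""
    stats_dir = os.path.join(root, nic, "statistics")
    counters = {}
    for name in sorted(os.listdir(stats_dir)):
        try:
            with open(os.path.join(stats_dir, name)) as f:
                counters[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Skip entries that cannot be read or are not plain integers.
            continue
    return counters


def suspicious_counters(counters):
    """Keep only non-zero counters whose name hints at loss or errors.

    The keyword list is a guess at what is interesting, not a kernel-defined set.
    """
    keywords = ("drop", "miss", "fifo", "over", "err")
    return {
        name: value
        for name, value in counters.items()
        if value and any(k in name for k in keywords)
    }
```

On the affected host this would be called as suspicious_counters(read_nic_statistics("eth2")); the root argument exists only so the function can be pointed at a copied directory tree.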

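Also use sar -n DEV 1 5 to watch the traffic rate,

and tcpdump to capture packets and analyze their characteristics.

This has been handed to the on-site team to collect the data.

I actually wanted to analyze it with dropwatch, but the customer environment does not support it.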

dropwatch requirements:
1. The kernel must be version 2.6.30 or later.
2. The kernel must be built with "CONFIG_NET_DROP_MONITOR=y".
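The two prerequisites can be checked mechanically before asking the site to install dropwatch. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the kernel release string looks like the output of uname -r and that the kernel config is available as text (its location varies by distribution).

```python
import re


def kernel_at_least(release, minimum=(2, 6, 30)):
    """Compare a `uname -r` style string against a (major, minor, patch) tuple."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognized kernel release: %r" % release)
    version = tuple(int(part) if part else 0 for part in m.groups())
    return version >= minimum


def drop_monitor_enabled(config_text):
    """True if the config text enables the drop monitor built-in (=y) or as a module (=m)."""
    wanted = ("CONFIG_NET_DROP_MONITOR=y", "CONFIG_NET_DROP_MONITOR=m")
    return any(line.strip() in wanted for line in config_text.splitlines())
```

For example, kernel_at_least("2.6.18-308.el5") is False, so dropwatch would not be an option on such a host regardless of the config.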

To be continued tomorrow.

 

Of the information collected so far, the most useful part is:

 rx_no_buffer_count: 180972127 rx_missed_errors: 127669376
root@localhost / # cat /proc/net/snmp        
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 2 64 43266916579 0 58 0 0 0 43266916521 39496151188 19 1096 0 0 0 0 687465 0 1577650
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 10609 83 10226 217 0 0 0 151 15 0 0 0 0 43171445 0 43171273 0 0 0 0 26 146 0 0 0 0
IcmpMsg: InType0 InType3 InType8 InType11 OutType0 OutType3 OutType8
IcmpMsg: 15 10226 151 217 146 43171273 26
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 1 200 120000 -1 638924225 792069897 51631303 76843627 105 43070509878 47284276280 4631429 0 60909693
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 195179897 532 1998 218221001 0 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
UdpLite: 0 0 0 0 0 0
root@localhost / # netstat -s
Ip:
    317249965 total packets received
    58 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    317249907 incoming packets delivered
    841450781 requests sent out
    19 outgoing packets dropped
    1096 dropped because of missing route
    687465 fragments received ok
    1577650 fragments created
Icmp:
    10609 ICMP messages received
    83 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 10226
        timeout in transit: 217
        echo requests: 151
        echo replies: 15
    43171452 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 43171280
        echo request: 26
        echo replies: 146
IcmpMsg:
        InType0: 15
        InType3: 10226
        InType8: 151
        InType11: 217
        OutType0: 146
        OutType3: 43171280
        OutType8: 26
Tcp:
    638924267 active connections openings
    792070011 passive connection openings
    51631325 failed connection attempts
    76843631 connection resets received
    98 connections established
    120843208 segments received
    39642604 segments send out
    4631433 segments retransmited
    0 bad segments received.
    60909716 resets sent
Udp:
    195179954 packets received
    532 packets to unknown port received.
    1998 packet receive errors
    218221066 packets sent
    0 receive buffer errors
    0 send buffer errors
UdpLite:
TcpExt:
    947 resets received for embryonic SYN_RECV sockets
    148679 packets pruned from receive queue because of socket buffer overrun
    531225716 TCP sockets finished time wait in fast timer
    347738973 delayed acks sent
    2698538 delayed acks further delayed because of locked socket
    Quick ack mode was activated 2595825 times
    28481169 packets directly queued to recvmsg prequeue.
    6717820 bytes directly in process context from backlog
    1033332179 bytes directly received in process context from prequeue
    1010153350 packet headers predicted
    14758935 packets header predicted and directly queued to user
    3242078974 acknowledgments not containing data payload received
    3880126246 predicted acknowledgments
    1536 times recovered from packet loss due to fast retransmit
    13263 times recovered from packet loss by selective acknowledgements
    1733 congestion windows recovered without slow start by DSACK
    136942 congestion windows recovered without slow start after partial ack
    16400 TCP data loss events
    TCPLostRetransmit: 639
    1574 timeouts after reno fast retransmit
    14245 timeouts after SACK recovery
    78 timeouts in loss state
    50060 fast retransmits
    4432 forward retransmits
    16193 retransmits in slow start
    4159092 other TCP timeouts
    86 classic Reno fast retransmits failed
    552 SACK retransmits failed
    22768427 packets collapsed in receive queue due to low socket buffer
    3110543 DSACKs sent for old packets
    57636 DSACKs sent for out of order packets
    34337 DSACKs received
    9 DSACKs for out of order packets received
    4438261 connections reset due to unexpected data
    4890686 connections reset due to early user close
    21201 connections aborted due to timeout
    TCPSACKDiscard: 2
    TCPDSACKIgnoredOld: 22762
    TCPDSACKIgnoredNoUndo: 7791
    TCPSpuriousRTOs: 1396
    TCPSackShifted: 42109
    TCPSackMerged: 43323
    TCPSackShiftFallback: 63448
    TCPBacklogDrop: 1435
    TCPDeferAcceptDrop: 139642
    TCPTimeWaitOverflow: 171130
IpExt:
    InMcastPkts: 5
    InBcastPkts: 1213607
    InOctets: 1673780726
    OutOctets: -1198119745
    InMcastOctets: 140
    InBcastOctets: 104766229
root@localhost / # ethtool -S eth1
NIC statistics:
     rx_packets: 27862085998
     tx_packets: 1262711762
     rx_bytes: 2039732992330
     tx_bytes: 125303834249
     rx_broadcast: 35401078
     tx_broadcast: 26016705
     rx_multicast: 54576108
     tx_multicast: 808459
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 54576108
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 1571897
     rx_missed_errors: 1390
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 220291
     tx_flow_control_xoff: 221680
     rx_long_byte_count: 2039732992330
     rx_csum_offload_good: 27739152135
     rx_csum_offload_errors: 0
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0
root@localhost / # ethtool -S eth2
NIC statistics:
     rx_packets: 1261353932
     tx_packets: 27862955528
     rx_bytes: 125372925112
     tx_bytes: 2040275968171
     rx_broadcast: 25614225
     tx_broadcast: 35423396
     rx_multicast: 808438
     tx_multicast: 54576132
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 808438
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 180972127
     rx_missed_errors: 127669376
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 125372925112
     rx_csum_offload_good: 1228845122
     rx_csum_offload_errors: 10
     rx_header_split: 0
     alloc_rx_buff_failed: 0
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     rx_dma_failed: 0
     tx_dma_failed: 0
root@localhost / # cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 384657885396 1478036246    0    0    0     0          0         0 384657885396 1478036246    0    0    0     0       0          0
 bond0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth0: 1820898757 15863793    0    0    0     0          0         0 26863219776 242854331    0    0    0     0       0          0
 teql0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  sit0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth1: 2039734404414 27862095641    0 1390    0     0          0  54576108 125317855145 1262724651    0    0    0     0       0          0
  eth2: 125385953071 1261365428    0 127669376    0     0          0    808463 2040277248858 27862964224    0    0    0     0       0          0
  eth3: 27566182128841 397592489315    0    0    0     0          0    769327 70776340796697 437380843982    0    0    0     0       0          0
  eth4: 71536212472219 436926157198    0 14415212551    0     0          0  54574179 27741862259957 401183191271    0    0    0     0       0          0
  eth5: 39944246780346 43763462640    0 2456    0     0          0       277 35696516892117 36611633993    0    0    0     0       0          0
  eth6: 35697179600486 36611633978    0    0    0     0          0  76995504 39944845468824 43763451889    0    0    0     0       0          0
eth1.1:       0       0    0    0    0     0          0         0     1026      13    0    0    0     0       0          0
eth2.1:       0       0    0    0    0     0          0         0      936      12    0    0    0     0       0          0
eth3.1:       0       0    0    0    0     0          0         0      936      12    0    0    0     0       0          0
eth4.1:       0       0    0    0    0     0          0         0      558       7    0    0    0     0       0          0
eth5.1:       0       0    0    0    0     0          0         0     1494      19    0    0    0     0       0          0
eth6.1:       0       0    0    0    0     0          0         0      468       6    0    0    0     0       0          0
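To pick out the interfaces that actually drop packets from a /proc/net/dev dump like the one above, the columns can be parsed and ranked. This is a rough sketch with hypothetical function names; the sample lines are copied from the output above, and the drop/packets ratio is only a rough indicator since I am not sure whether the packets column already includes the dropped ones.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed")

SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth1: 2039734404414 27862095641    0 1390    0     0          0  54576108 125317855145 1262724651    0    0    0     0       0          0
  eth2: 125385953071 1261365428    0 127669376    0     0          0    808463 2040277248858 27862964224    0    0    0     0       0          0
  eth4: 71536212472219 436926157198    0 14415212551    0     0          0  54574179 27741862259957 401183191271    0    0    0     0       0          0
"""


def parse_proc_net_dev(text):
    """Parse /proc/net/dev text into {iface: {"rx": {...}, "tx": {...}}}."""
    result = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        values = rest.split()
        # The two header lines contain no colon and are skipped here.
        if not sep or len(values) != 16 or not all(v.isdigit() for v in values):
            continue
        numbers = [int(v) for v in values]
        result[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:])),
        }
    return result


def rx_drop_report(stats):
    """Return (iface, rx_drop, drop/packets) tuples, worst ratio first."""
    rows = []
    for name, s in stats.items():
        drop, packets = s["rx"]["drop"], s["rx"]["packets"]
        if drop:
            rows.append((name, drop, drop / packets if packets else float("inf")))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for iface, drop, ratio in rx_drop_report(parse_proc_net_dev(SAMPLE)):
        print("%s rx drop=%d (%.4f%% of rx packets)" % (iface, drop, ratio * 100))
```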

Let's look at the driver code to see where rx_no_buffer_count and rx_missed_errors come from:

mpc = E1000_READ_REG(hw, E1000_MPC);
adapter->stats.mpc += mpc;
adapter->stats.rnbc += E1000_READ_REG(hw, E1000_RNBC);

IGB_STAT("rx_missed_errors", stats.mpc),
IGB_STAT("rx_no_buffer_count", stats.rnbc),

Both counters are read straight from the chip's registers.

Roughly:

rx_no_buffer_count = E1000_RNBC
rx_missed_errors = E1000_MPC
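With that mapping in hand, the two registers can be pulled out of saved ethtool -S text directly. A small sketch, assuming the "name: value" layout shown in the output earlier; the function names are hypothetical and the sample lines are copied from the eth2 output.

```python
# ethtool -S counter name -> hardware register, per the driver snippet above.
COUNTER_TO_REGISTER = {
    "rx_no_buffer_count": "RNBC",
    "rx_missed_errors": "MPC",
}

SAMPLE = """\
NIC statistics:
     rx_packets: 1261353932
     rx_no_buffer_count: 180972127
     rx_missed_errors: 127669376
     rx_csum_offload_errors: 10
"""


def parse_ethtool_stats(text):
    """Parse `ethtool -S` text into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # "NIC statistics:" has an empty value and is skipped.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def register_view(stats):
    """Return {register_name: value} for the counters backed by RNBC and MPC."""
    return {
        reg: stats[name]
        for name, reg in COUNTER_TO_REGISTER.items()
        if name in stats
    }


if __name__ == "__main__":
    print(register_view(parse_ethtool_stats(SAMPLE)))
```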

Searching online for RNBC and MPC turned up the following descriptions:

 Missed Packets Count – MPC 

  Counts the number of missed packets. Packets are missed when the receive FIFO has insufficient space to store the incoming packet. This can be caused because of too few buffers allocated, or because there is insufficient bandwidth on the PCI bus. Events setting this counter causes ICR.Rx Miss, the Receiver Overrun Interrupt, to be set. This register does not increment if receives are not enabled. These packets are also counted in the Total Packets Received register as well as in Total Octets Received.

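rx_missed_errors is related to hard interrupts: the NIC's FIFO buffer is already full before the DMA transfer completes and the hard interrupt is raised, so newly received data has to be dropped immediately. In theory, enlarging the RX ring with ethtool -G ethX rx xxx should help (strictly speaking, -G resizes the RX descriptor ring in host memory, not the on-chip FIFO).

This understanding may also be wrong -----------------------------------------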
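Before changing the ring size it helps to compare the current RX ring with the hardware maximum reported by ethtool -g. The sketch below parses that output and builds the matching ethtool -G command. The function names are hypothetical, and the sample text is an illustrative example of the usual ethtool -g layout, not output captured from the affected host.

```python
# Illustrative ethtool -g output; the numbers are NOT from the customer system.
SAMPLE = """\
Ring parameters for eth2:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             256
RX Mini:        0
RX Jumbo:       0
TX:             256
"""


def parse_ring_params(text):
    """Parse `ethtool -g` text into {"max": {...}, "current": {...}}."""
    sections = {"max": {}, "current": {}}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Pre-set maximums"):
            section = "max"
        elif stripped.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in stripped:
            key, _, value = stripped.partition(":")
            value = value.strip()
            if value.isdigit():  # ignores non-numeric values such as "n/a"
                sections[section][key.strip()] = int(value)
    return sections


def suggest_rx_ring_command(nic, text):
    """Return an `ethtool -G` command raising RX to the maximum, or None."""
    params = parse_ring_params(text)
    current = params["current"].get("RX")
    maximum = params["max"].get("RX")
    if current is None or maximum is None or current >= maximum:
        return None
    return "ethtool -G %s rx %d" % (nic, maximum)


if __name__ == "__main__":
    print(suggest_rx_ring_command("eth2", SAMPLE))
```

Going straight to the maximum is just the simplest choice for a sketch; a smaller step may be enough, and the right value has to be confirmed by watching whether the counters stop growing.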

Receive No Buffers Count – RNBC
  This register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal).
The packet is still received if there is space in the FIFO. This register only increments if receives are enabled (RCTL.RXEN is set). This register does not increment when flow control packets are received.

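rx_no_buffer_count should mean that when the NIC tries to DMA the data (skb->data) from the device FIFO into an rx_buffer_info slot, it finds that the corresponding rx_buffer_info has not been unmapped yet, so the data cannot be delivered to main memory.

In other words, it depends on how fast the softirq processing runs. Is it caused by the system handling data too slowly?

I am not sure whether this understanding is correct.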
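The relationship between the two counters, as I read the datasheet text quoted above, can be illustrated with a toy model: a frame that finds no free host buffer bumps RNBC and is parked in the FIFO if there is room; once the FIFO is also full, the frame is lost and MPC is bumped. Everything below (the function, its parameters, the tick-based timing) is a hypothetical simplification, not how the hardware or the driver is actually implemented.

```python
def simulate_rx(ring_size, fifo_size, arrivals_per_tick, drained_per_tick, ticks):
    """Toy model of RX ring exhaustion.

    ring_size         -- number of host buffers in the RX descriptor ring
    fifo_size         -- number of frames the on-chip FIFO can hold
    arrivals_per_tick -- frames arriving from the wire per tick
    drained_per_tick  -- buffers the driver/softirq recycles per tick
    """
    free = ring_size   # host buffers currently available for DMA
    fifo = 0           # frames waiting in the on-chip FIFO
    delivered = rnbc = mpc = 0

    for _ in range(ticks):
        # The driver processes used buffers and hands them back to the NIC.
        free = min(ring_size, free + drained_per_tick)

        # Frames parked in the FIFO are moved to host memory first.
        moved = min(fifo, free)
        fifo -= moved
        free -= moved
        delivered += moved

        for _ in range(arrivals_per_tick):
            if free > 0:
                free -= 1
                delivered += 1
            else:
                rnbc += 1          # no host buffer available
                if fifo < fifo_size:
                    fifo += 1      # frame survives in the FIFO for now
                else:
                    mpc += 1       # FIFO full as well: frame is missed

    return {"delivered": delivered, "rnbc": rnbc, "mpc": mpc}


if __name__ == "__main__":
    print("keeping up :", simulate_rx(256, 64, 10, 10, 100))
    print("overloaded :", simulate_rx(4, 2, 3, 1, 10))
```

In this model RNBC is always at least as large as MPC, which at least agrees with the eth2 numbers above (180972127 versus 127669376). It also shows why a larger ring only helps with bursts: if the driver drains more slowly than frames arrive for a sustained period, any ring size eventually runs out.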

 

 

 Reference: https://lp007819.wordpress.com/2013/05/

 In my own environment, the 10G optical port currently shows the following behavior:

 

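 The drop count in ifconfig and the drop count in ethtool are not equal?

They used to be the same? So what does each of these two drop counters mean? And what about port.drop?

----- From a rough read of the driver, rx_dropped reflects the NIC's RNBC condition; and indeed, increasing the rx ring solved the problem?

What do the fields shown by ifconfig ethX mean?

 RX dropped: probably packets dropped by the kernel protocol stack plus packets dropped because the NIC FIFO ran out of space.

RX overruns: rx_fifo_errors? Probably similar to rx_missed_errors, meaning the NIC FIFO is too small. The exact value differs from one NIC driver to another.

RX frame: not sure.

Anyway, the values differ between drivers, so each case has to be analyzed on its own.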
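One part of the question can be narrowed down. As far as I know, ifconfig takes its RX dropped / overruns / frame values from the drop / fifo / frame columns of /proc/net/dev, and the kernel builds those columns from several underlying counters. The sketch below encodes my understanding of that composition; treat the formulas as an assumption to verify against the kernel source for the version in use. The eth2 values are taken from the ethtool output above, except rx_dropped, which is not shown there and is assumed to be 0.

```python
def proc_net_dev_rx_columns(stats):
    """Compose the RX drop/fifo/frame columns of /proc/net/dev.

    Assumed composition (to be checked against the running kernel):
      drop  = rx_dropped + rx_missed_errors
      fifo  = rx_fifo_errors
      frame = rx_length_errors + rx_over_errors + rx_crc_errors + rx_frame_errors
    """
    def get(name):
        return stats.get(name, 0)

    return {
        "drop": get("rx_dropped") + get("rx_missed_errors"),
        "fifo": get("rx_fifo_errors"),
        "frame": (get("rx_length_errors") + get("rx_over_errors")
                  + get("rx_crc_errors") + get("rx_frame_errors")),
    }


# ifconfig label -> /proc/net/dev column (assumption, see lead-in).
IFCONFIG_LABELS = {"dropped": "drop", "overruns": "fifo", "frame": "frame"}

ETH2 = {
    "rx_dropped": 0,               # not shown in the ethtool output; assumed
    "rx_missed_errors": 127669376,
    "rx_length_errors": 0,
    "rx_over_errors": 0,
    "rx_crc_errors": 0,
    "rx_frame_errors": 0,
}

if __name__ == "__main__":
    columns = proc_net_dev_rx_columns(ETH2)
    for label, column in IFCONFIG_LABELS.items():
        print("RX %s: %d" % (label, columns[column]))
```

Under this assumption the drop column for eth2 comes out equal to rx_missed_errors, which matches the 127669376 shown for eth2 in the /proc/net/dev output above. What a driver chooses to put into rx_dropped or rx_fifo_errors still varies, which would explain why ifconfig and ethtool can disagree.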

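When investigating packet loss, ethtool turns out to be a handy command.

The main options are ethtool -S, -g, -G, -i, -a and so on.

Typically you check whether the link mode is normal (Speed, Duplex), whether CRC errors and overruns keep increasing, and view or modify the NIC's buffer size.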
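"Keeps increasing" is easiest to judge from two snapshots taken some seconds apart. A minimal sketch, assuming the snapshots are already available as dictionaries of counters; the function name is hypothetical, and the second snapshot below is made-up data for illustration only.

```python
def growing_counters(before, after, interval_seconds=1.0):
    """Return {name: (delta, delta_per_second)} for counters that increased."""
    growth = {}
    for name, old in before.items():
        new = after.get(name)
        if new is not None and new > old:
            delta = new - old
            growth[name] = (delta, delta / interval_seconds)
    return growth


if __name__ == "__main__":
    # First snapshot: eth2 values from the notes. Second snapshot: hypothetical.
    first = {"rx_crc_errors": 0, "rx_no_buffer_count": 180972127, "rx_missed_errors": 127669376}
    second = {"rx_crc_errors": 0, "rx_no_buffer_count": 180973127, "rx_missed_errors": 127669876}
    for name, (delta, rate) in growing_counters(first, second, interval_seconds=5).items():
        print("%s: +%d (%.1f/s)" % (name, delta, rate))
```

A counter that is large but no longer growing points at a past event; one that grows in step with traffic peaks points at the ring or processing-speed questions raised above.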

 

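What exactly is the problem at the customer site?

A NIC problem? Is the packet rate really too high? Or something else?

Will look at it again next week.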

rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 180972127
     rx_missed_errors: 127669376

For these counters (fifo-error, missed-error, no-buffer), download the chip datasheet and read it in detail.

 

 

