I. Check the Zabbix server log
The Zabbix server log contains a large number of "first network error" and "another network error" messages for this host:
12522:20200915:003129.375 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12511:20200915:003211.039 Zabbix agent item "agent.hostname" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12513:20200915:003226.526 Zabbix agent item "system.cpu.util[,system,avg15]" on host "192.168.200.38" failed: another network error, wait for 15 seconds
12526:20200915:003237.372 Zabbix agent item "vm.memory.size[total]" on host "192.168.200.38" failed: another network error, wait for 15 seconds
12526:20200915:003256.377 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12511:20200915:003330.181 Zabbix agent item "system.swap.size[,free]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12526:20200915:003350.383 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12515:20200915:003426.192 Zabbix agent item "system.cpu.util[,system,avg15]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12524:20200915:003441.390 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12515:20200915:003520.333 Zabbix agent item "perf_counter[\2\16]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12520:20200915:003537.209 Zabbix agent item "vm.memory.size[total]" on host "192.168.200.38" failed: another network error, wait for 15 seconds
12526:20200915:003559.396 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12513:20200915:003630.007 Zabbix agent item "system.swap.size[,free]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12526:20200915:003650.401 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12514:20200915:003731.023 Zabbix agent item "system.swap.size[,pfree]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
12524:20200915:003750.409 resuming Zabbix agent checks on host "192.168.200.38": connection restored
12510:20200915:003759.024 Zabbix agent item "net.if.out[Nutanix VirtIO Ethernet Adapter,bytes]" on host "192.168.200.38" failed: first network error, wait for 15 seconds
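To see how often the host flaps between failing and recovering, log lines like the ones above can be tallied per host. The following is a minimal sketch in Python, not a Zabbix tool: the regular expressions are inferred from the sample lines above, and `summarize` and `SAMPLE` are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Patterns inferred from the sample log lines above (an assumption,
# not an official description of the Zabbix log format).
ERROR_RE = re.compile(
    r'Zabbix agent item "(?P<item>.+)" on host "(?P<host>[^"]+)" '
    r'failed: (?P<kind>first|another) network error'
)
RESTORED_RE = re.compile(
    r'resuming Zabbix agent checks on host "(?P<host>[^"]+)": connection restored'
)


def summarize(lines):
    """Count network-error and connection-restored events per host."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            event = match.group("kind") + " network error"
            counts[(match.group("host"), event)] += 1
            continue
        match = RESTORED_RE.search(line)
        if match:
            counts[(match.group("host"), "connection restored")] += 1
    return counts


# Three lines copied from the log excerpt above.
SAMPLE = [
    '12522:20200915:003129.375 resuming Zabbix agent checks on host "192.168.200.38": connection restored',
    '12511:20200915:003211.039 Zabbix agent item "agent.hostname" on host "192.168.200.38" failed: first network error, wait for 15 seconds',
    '12513:20200915:003226.526 Zabbix agent item "system.cpu.util[,system,avg15]" on host "192.168.200.38" failed: another network error, wait for 15 seconds',
]

if __name__ == "__main__":
    for (host, event), count in sorted(summarize(SAMPLE).items()):
        print(host, event, count)
```

A host that alternates between "first network error" and "connection restored" roughly every minute, as in the excerpt, points to intermittent connection failures rather than a stopped agent.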
II. Check the host's TCP connections
A large number of connections are in the TIME_WAIT state.
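One way to quantify this is to count connections by state. The sketch below assumes the Windows `netstat -an` output layout (Proto, Local Address, Foreign Address, State); `count_tcp_states` and `live_tcp_states` are hypothetical helper names, and the sample text uses made-up addresses purely for illustration.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in Windows-style `netstat -an` text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Windows TCP rows have four columns: TCP <local> <foreign> <state>.
        if len(fields) == 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


def live_tcp_states():
    """Run `netstat -an` on the local machine and count TCP states."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)


# Made-up sample in the Windows layout; not taken from the affected host.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING
  TCP    192.168.200.38:10050   192.168.200.1:51234    TIME_WAIT
  TCP    192.168.200.38:10050   192.168.200.1:51240    TIME_WAIT
  UDP    0.0.0.0:123            *:*
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

On the affected host, calling `live_tcp_states()` several times a few minutes apart would show whether the TIME_WAIT count keeps growing instead of draining.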
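III. Identify the cause (Baidu search)
After 497 days from system startup, all TCP/IP ports that are in a TIME_WAIT status are not closed in Windows Vista, Windows 7, Windows Server 2008, and Windows Server 2008 R2.
In other words, once the system has been up for 497 days, TCP connections in the TIME_WAIT state are never closed. The available TCP ports are gradually used up, and new TCP/IP connections can no longer be created.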
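The 497-day figure matches the time it takes a 32-bit counter of 10 ms ticks to wrap around. That mechanism is an assumption here and is not stated in the text above, so the sketch below only illustrates the arithmetic. `wrap_period_days` and `days_until_wrap` are hypothetical helpers for estimating how long a freshly rebooted host has before it reaches the threshold again.

```python
SECONDS_PER_DAY = 86400

# Assumed values: a 32-bit counter advanced once every 10 ms.
TICK_SECONDS = 0.01
COUNTER_BITS = 32


def wrap_period_days(tick_seconds=TICK_SECONDS, bits=COUNTER_BITS):
    """Days until a counter of the given width wraps at the given tick size."""
    return (2 ** bits) * tick_seconds / SECONDS_PER_DAY


def days_until_wrap(uptime_seconds):
    """Days remaining before the assumed counter wraps, given current uptime."""
    return wrap_period_days() - uptime_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(wrap_period_days(), 2))                         # wrap period in days
    print(round(days_until_wrap(400 * SECONDS_PER_DAY), 2))     # after 400 days of uptime
```

With these assumed values the wrap period is 2^32 × 0.01 s ≈ 42,949,673 s, or about 497.1 days.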
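IV. Solutions
1. Restart the server
Restarting the server resolves the problem temporarily, but it will recur once the server has been running for another 497 days.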
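2. Install the patch
Microsoft's official announcement: https://support.microsoft.com/zh-cn/help/2553549/all-the-tcp-ip-ports-that-are-in-a-time-wait-status-are-not-closed-aft
Because Microsoft has stopped releasing updates, the patch package can no longer be downloaded directly. Use Windows Update to install the patch instead.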