1.
16313:20140109:110809.577 resuming IPMI checks on host [10.1.3.41]: connection restored
16337:20140109:113655.574 IPMI item [Current_1] on host [10.1.3.41] failed: first network error, wait for 15 seconds
16313:20140109:113717.981 IPMI item [Current_1] on host [10.1.3.41] failed: another network error, wait for 15 seconds
16313:20140109:113733.014 IPMI item [Inlet_Temp] on host [10.1.3.41] failed: another network error, wait for 15 seconds
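Symptom: data is sometimes retrieved successfully, and at other times the error above is reported.
Fix: raise the Timeout value in the server configuration from 3s to 10s.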
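As a rough sketch, the Timeout change would look like the fragment below. Timeout is the standard parameter name in the Zabbix server configuration file; the file path in the comment is an assumption, since it varies by installation, and the server normally has to be restarted before the new value takes effect.

```
# Assumed location: /etc/zabbix/zabbix_server.conf (adjust for your install)
# Default is 3 seconds; raised to 10 here
Timeout=10
```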
2.
16315:20140109:140225.429 cannot send list of active checks to [10.192.0.5]: host [10.192.0.5] not found
16317:20140109:140425.532 cannot send list of active checks to [10.192.0.5]: host [10.192.0.5] not found
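This message appeared in the server log after host 10.192.0.5 was deleted directly from the server. The agent on that machine was presumably still running and still requesting its list of active checks.
Stopping the agent on 10.192.0.5 makes the message go away.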
========================================================
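The fix given for the first problem turned out to be wrong: the error still appeared after Timeout was changed. However, after I reduced the check interval from 1800s to 60s and watched for several tens of minutes, the error did not recur. The underlying cause is still unclear.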