Mysql5.7 的錯誤日志中最常見的note級別日志解釋


 
 
 
在使用mysql5.7的時候,發現了不少在mysql5.6上不曾見過的日志,級別為note, 最常見的note日志以下三種,下面我們來逐個解釋。
第一種,Aborted connection . 如上圖,信息如下:
2016-03-17T14:44:24.102542Z 59 [Note] Aborted connection 59 to db: ‘unconnected’ user: ‘mha’ host: ‘197.xx.xx.xx’ (Got an error reading communication packets)
2016-03-17T14:44:31.273897Z 61 [Note] Aborted connection 61 to db: ‘unconnected’ user: ‘mha’ host: ‘197.xx.xx.xx(Got an error reading communication packets)
2016-03-17T14:44:31.273897Z 61 [Note] Aborted connection 61 to db: ‘unconnected’ user: ‘mha’ host: ‘197.xx.xx.xx(Got an error reading communication packets)
 
2016-03-17T14:44:31.273897Z 61 [Note] Aborted connection 61 to db: ‘unconnected’ user: ‘mha’ host: ‘197.xx.xx.xx(Got  timeout  reading communication packets)
 
上面兩個是連接中斷的錯誤信息,原因不一樣, Got an error reading communication packets  的原因是因為網絡等原因導致Got  timeout  reading communication packets 這個錯誤的原因是會話的idle時間達到了數據庫指定的timeout時間。
 
第二種:SLAVE多線程同步的信息

信息如下:
2016-03-14T15:48:26.432150Z 73 [Note]Multi-threaded slave statistics for channel ”: seconds elapsed = 121; eventsassigned = 100374529; worker queues filled over
 overrun level = 0; waited due aWorker queue full = 0; waited due the total size = 0; waited at clock conflicts= 1451875661700 waited (count) when Workers occupied = 3211993 waited whenWorkers occupied = 445032386000
2016-03-14T15:50:28.398745Z 73 [Note]Multi-threaded slave statistics for channel ”: seconds elapsed = 122; eventsassigned = 100500481; worker queues filled over
 overrun level = 0; waited due aWorker queue full = 0; waited due the total size = 0; waited at clock conflicts= 1452001865500 waited (count) when Workers occupied = 3211993 waited whenWorkers occupied = 445032386000
 
我們通過源代碼,找到下面一段,該段實現了上述日志的輸出。
  if ((my_now
 – rli->mts_last_online_stat)>=
           mts_online_stat_period)
        {
         sql_print_information(“Multi-threadedslave
 statistics%s: “
                                “seconds
 elapsed = %lu; “
                                “events
 assigned = %llu; “
                                “worker
 queues filled over overrun level = %lu;”
                                “waited
 due a Worker queue full = %lu; “
                                “waited
 due the total size = %lu; “
                                “waited
 at clock conflicts = %llu “
                               “waited(count)
 when Workers occupied = %lu “
                                “waited
 when Workers occupied = %llu”,
                                rli->get_for_channel_str(),
                                static_cast<unsignedlong>
                                (my_now – rli->mts_last_online_stat),
                                rli->mts_events_assigned,
                                rli->mts_wq_overrun_cnt,
                                rli->mts_wq_overfill_cnt,
                                rli->wq_size_waits_cnt,
                                rli->mts_total_wait_overlap,
                                rli->mts_wq_no_underrun_cnt,
                                rli->mts_total_wait_worker_avail);
          rli->mts_last_online_stat=my_now;  
 
 通過這一句(my_now
 – rli->mts_last_online_stat),  以及最后一句rli->mts_last_online_stat=my_now;   可以得知,
 seconds elapsed 就是上一次統計跟這一次統計的時間間隔。
而mts_online_stat_period =120秒, 硬代碼,因此就是幾乎每隔120秒,就有上述日志的輸出。
 通過進一步查看原代碼,初步了解上述日志信息的含義,如下:
  
events assigned:總共有多少個event被分配執行,計的是總數。
worker queues filled over overrun level:多線程同步中,worker 的私有隊列長度超長的次數,計的是總數。
waited due a Worker queue full :因為worker的隊列超長而產生等待的次數,計的是總數。
waited due the total size :超過最大size的次數,這個由參數slave_pending_jobs_size_max  指定。
waited at clock conflicts :因為邏輯時間產生沖突的等待時間,單位是納秒。
waited (count) when Workers occupied :因為workder被占用而出現等待的次數。(總計值)。
waited when Workersoccupied :因為workder被占用而出現等待的總時間,總計值,單位是納秒。
 
 
第三種:page_cleaner線程的輸出日志
 

如圖,信息如下:
 
2016-03-24T17:45:28.005117Z 0 [Note] InnoDB:page_cleaner: 1000ms intended loop took 4750ms.The
 settings might not be optimal. (flushed=1519 and evicted=0, during the time.)
 
查找源代碼,發現是上面的日志由下面一段代碼輸出:
        if (ret_sleep
 == OS_SYNC_TIME_EXCEEDED) {
            ulint 
 curr_time = ut_time_ms();
 
            if (curr_time
 >next_loop_time + 3000) {
                if (warn_count
 == 0) {
                    ib::info()
 << “page_cleaner: 1000ms”
                        ” intended
 loop took “
                        <<1000 + curr_time
                           – next_loop_time
                        <<“ms. The
 settings might not”
                        ” be optimal.
 (flushed=”
                        <<n_flushed_last
                        <<” and evicted=”
                        <<n_evicted
                        <<“, during
 the time.)”;
                    if (warn_interval
 >300) {
                        warn_interval= 600;
                    }else {
                        warn_interval*= 2;
                    }
 
                    warn_count= warn_interval;
                } else {
                    –warn_count;
                }
            } else {
                /* reset counter */
                warn_interval= 1;
                warn_count= 0;
            }
 
            next_loop_time= curr_time + 1000;
            n_flushed_last= n_evicted = 0;
        }
 
 
通過分析源代碼, 輸出上述日志的條件是curr_time> next_loop_time + 3000 ,即比原定的循環時間next_loop_time多3000毫秒,而next_loop_time的標准時間是1000毫秒,即1秒鍾做一次刷新頁的操作。
loop took 4750ms ,即是這次刷新循環的實際經歷時間
 
后面還有一個 (flushed=1519 and evicted=0,during the time.)這樣的日志, 即對應n_flushed_lastn_evicted 變量,而這兩個變量又由n_flushed_list與n_flushed_lru賦值。
 
./storage/innobase/buf/buf0flu.cc:3322:                 n_flushed_last +=n_flushed_list;
./storage/innobase/buf/buf0flu.cc:3321:                 n_evicted += n_flushed_lru;
 
 
而n_flushed_list與n_flushed_lru賦值函數為pc_wait_finished,如下,我們來看看該函數的注釋。
pc_wait_finished(&n_flushed_lru,&n_flushed_list);
 
 
/**
Wait until all flush requests are finished.
@param n_flushed_lru    numberof pages flushed from the end of the LRU list.
@param n_flushed_list   numberof pages flushed from the end of the
            flush_list.
@return         trueif all flush_list flushing batch were success. */
static
bool
pc_wait_finished(
    ulint*  n_flushed_lru,
    ulint*  n_flushed_list)
{
。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。
}
通過源代碼的注釋,我們獲知如下信息:
n_flushed_lru   number of pages flushed from the end of the LRU list.
n_flushed_lru  這個值表示從lru 列表尾部刷新的頁數,也就是日志中如evicted=0 指標的所表示的值,如果該值不為零,則存在innodb buffer不夠的嫌疑。
n_flushed_list  這個是從刷新列表中刷新的頁數,也就是臟頁數,也就是日志中flushed=1519 的值。
 
 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM