How to fix FastDFS when files stop syncing

Reposted article. 2020-08-16 23:05. Tags: sync / fastdfs / binlog

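I. Background

FastDFS is an open-source distributed file system. I will skip the general introduction; there is plenty of material online for anyone interested.

[Figure: official FastDFS architecture diagram]

A complete write interaction goes as follows:

1. The Client asks the Tracker for an available Storage;

2. The Tracker picks a Storage at random and returns it;

3. The Client sends the write request to that Storage.

A complete read interaction:

1. The Client asks the Tracker for an available Storage;

2. The Tracker picks a Storage at random and returns it;

3. The Client sends the read request to that Storage.

As you can see, the Storage nodes are peers: every Storage holds the full set of files.

A friend's production FastDFS servers, running version 5.11, recently kept reporting "file not exist" errors: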

[2020-08-12 23:16:37] WARNING - file: storage_service.c, line: 6899, client ip: xx.xx.xxx.xxx, logic file: 06/75/xxxx.jpg not exist

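The deployment is as follows:

2 Trackers and 2 Storages.

Every machine reports the error above.

II. How FastDFS synchronization works

Let's first look at how FastDFS replicates files between servers. Each Storage records the files it has uploaded or modified in a binlog and uses that binlog to sync them to its peers. The binlog lives in the data/sync directory under the installation directory, and the files are typically named like binlog.000.

[Screenshot: the data/sync directory on my development machine]

The contents look like this: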

1589182799 C M00/00/00/rBVrTV65AU-ACKi2AAARqXyG2io334.jpg
1589182885 C M00/00/00/rBVrTV65AaWAAYwKAAARqXyG2io765.jpg
1589427410 C M00/00/00/rBVrTV68vNKAbceuAAARqXyG2io657.jpg

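Column 1 is the timestamp, column 2 is the operation type, and column 3 is the file name. Most records in the sample are file creations, hence C. The other types are defined in storage/storage_sync.h: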

 
                 #define STORAGE_OP_TYPE_SOURCE_CREATE_FILE  'C'  //upload file 
                
                 #define STORAGE_OP_TYPE_SOURCE_APPEND_FILE  'A'  //append file 
                
                 #define STORAGE_OP_TYPE_SOURCE_DELETE_FILE  'D'  //delete file 
                
                 #define STORAGE_OP_TYPE_SOURCE_UPDATE_FILE  'U'  //for whole file update such as metadata file 
                
                 #define STORAGE_OP_TYPE_SOURCE_MODIFY_FILE  'M'  //for part modify 
                
                 #define STORAGE_OP_TYPE_SOURCE_TRUNCATE_FILE  'T'  //truncate file 
                
                 #define STORAGE_OP_TYPE_SOURCE_CREATE_LINK  'L'  //create symbol link
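To make the record layout concrete, here is a minimal sketch in C that splits one binlog line into its three fields. The names BinlogRecord and parse_binlog_line are hypothetical and are not part of the FastDFS source; the sketch only assumes the space-separated layout seen in the sample records above.

```c
#include <stdio.h>

/* Hypothetical record type for illustration; not the FastDFS struct. */
typedef struct {
    long timestamp;      /* column 1: Unix timestamp */
    char op_type;        /* column 2: operation type, e.g. 'C' */
    char filename[128];  /* column 3: logical file name */
} BinlogRecord;

/* Parse "<timestamp> <op> <filename>".
 * Returns 0 on success, -1 if the line does not have three fields. */
int parse_binlog_line(const char *line, BinlogRecord *rec)
{
    if (sscanf(line, "%ld %c %127s",
               &rec->timestamp, &rec->op_type, rec->filename) != 3) {
        return -1;
    }
    return 0;
}
```

Fed the first sample line, this should yield the timestamp 1589182799, the operation type C, and the file name starting with M00/00/00/.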

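Having a binlog only makes it possible for servers to sync data. A real implementation has more to consider:

1. Is each sync full or incremental? If incremental, how is the last synced position recorded, and is that position persisted?

2. How is the binlog made reliable, i.e. does FastDFS flush the binlog to disk (fsync) before replying to the client?

On the first point, FastDFS syncs incrementally. The last position is stored in files with the .mark extension in the data/sync directory under the installation directory, named like this:

172.21.107.236_23000.mark

That is, IP_port.mark.

If the cluster has two Storages, 172.21.104.36 and 172.21.104.35, then 36 has a single mark file, 172.21.104.35_23000.mark, and 35 also has just one:

172.21.104.36_23000.mark.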
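The naming rule can be sketched in a few lines of C. The helper build_mark_filename is a hypothetical name used only for illustration; it simply applies the IP_port.mark pattern described above and is not how the FastDFS source builds the name.

```c
#include <stdio.h>

/* Build "<ip>_<port>.mark" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int build_mark_filename(const char *ip, int port, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s_%d.mark", ip, port);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

On the Storage 172.21.104.36, calling it with the peer address 172.21.104.35 and port 23000 gives the single mark file name expected there.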

The mark file contents look like this:

binlog_index=0
binlog_offset=3422
need_sync_old=1
sync_old_done=1
until_timestamp=1596511256
scan_row_count=118
sync_row_count=62

The key parameter is binlog_offset, the offset in the binlog up to which sync has succeeded. It is updated after each file is synced.
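When troubleshooting, it helps to read binlog_offset out of a mark file programmatically. The following is a minimal sketch that looks up an integer key in the key=value text shown above; mark_get_int is a hypothetical helper, not a FastDFS function.

```c
#include <stdlib.h>
#include <string.h>

/* Look up an integer "key=value" entry in mark-file text.
 * Returns 0 and stores the value on success, -1 if the key is missing. */
int mark_get_int(const char *content, const char *key, long long *value)
{
    size_t key_len = strlen(key);
    const char *line = content;

    while (line != NULL && *line != '\0') {
        /* The key must start the line and be followed directly by '='. */
        if (strncmp(line, key, key_len) == 0 && line[key_len] == '=') {
            *value = strtoll(line + key_len + 1, NULL, 10);
            return 0;
        }
        line = strchr(line, '\n');   /* find the end of this line */
        if (line != NULL) {
            line++;                  /* step past the newline */
        }
    }
    return -1;
}
```

Applied to the sample mark file, looking up binlog_offset yields 3422.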

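So is the binlog pushed to the other Storages synchronously or asynchronously? Asynchronously. See the function storage_sync_thread_entrance, which is the thread entry point. FastDFS starts this thread at startup to do the syncing: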

int storage_sync_thread_start(const FDFSStorageBrief *pStorage)
{
  int result;
  pthread_attr_t pattr;
  pthread_t tid;

  //non-essential code omitted

  /*
  //printf("start storage ip_addr: %s, g_storage_sync_thread_count=%d\n", 
      pStorage->ip_addr, g_storage_sync_thread_count);
  */

  if ((result=pthread_create(&tid, &pattr, storage_sync_thread_entrance, \
    (void *)pStorage)) != 0)
  {
    logError("file: "__FILE__", line: %d, " \
      "create thread failed, errno: %d, " \
      "error info: %s", \
      __LINE__, result, STRERROR(result));

    pthread_attr_destroy(&pattr);
    return result;
  }

  //non-essential code omitted
}

Inside this thread, the binlog is read periodically and then synced to the other Storages:

 
                 while  
                 (g_continue_flag && (!g_sync_part_time || \ 
                
                 (current_time >= start_time && \ 
                
                 current_time <= end_time)) && \ 
                
                 (pStorage->status == FDFS_STORAGE_STATUS_ACTIVE || \ 
                
                 pStorage->status == FDFS_STORAGE_STATUS_SYNCING)) 
                
                 { 
                
                 //讀取binlog 
                
                 read_result = storage_binlog_read(&reader, \ 
                
                 &record, &record_len); 
                
                 //省略非關鍵代碼 
                
                 } 
                
                 if  
                 (read_result != 0) 
                
                 { 
                
                 //省略非關鍵代碼 
                
                 } 
                
                 else  
                 if  
                 ((sync_result=storage_sync_data(&reader, \ 
                
                 &storage_server, &record)) != 0) 
                
                 { 
                
                 //上面就是就binlog同步到其它Storage 
                
                 logDebug( 
                 "file: " 
                 __FILE__ 
                 ", line: %d, "  
                 \ 
                
                 "binlog index: %d, current record "  
                 \ 
                
                 "offset: %" 
                 PRId64 
                 ", next "  
                 \ 
                
                 "record offset: %" 
                 PRId64, \ 
                
                 __LINE__, reader.binlog_index, \ 
                
                 reader.binlog_offset, \ 
                
                 reader.binlog_offset + record_len); 
                
                 if  
                 (rewind_to_prev_rec_end(&reader) != 0) 
                
                 { 
                
                 logCrit( 
                 "file: " 
                 __FILE__ 
                 ", line: %d, "  
                 \ 
                
                 "rewind_to_prev_rec_end fail, " 
                 \ 
                
                 "program exit!" 
                 , __LINE__); 
                
                 g_continue_flag =  
                 false 
                 ; 
                
                 } 
                
                 break 
                 ; 
                
                 } 
                
                 if  
                 (reader.last_scan_rows != reader.scan_row_count) 
                
                 { 
                
                 //定稿mark文件 
                
                 if  
                 (storage_write_to_mark_file(&reader) != 0) 
                
                 { 
                
                 logCrit( 
                 "file: " 
                 __FILE__ 
                 ", line: %d, "  
                 \ 
                
                 "storage_write_to_mark_file fail, "  
                 \ 
                
                 "program exit!" 
                 , __LINE__); 
                
                 g_continue_flag =  
                 false 
                 ; 
                
                 break 
                 ; 
                
                 } 
                
                 }

As you can see, the thread repeatedly calls storage_binlog_read to read the binlog, then storage_sync_data to sync the record to the other Storage, and then storage_write_to_mark_file to persist the mark file to disk.
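The read, sync, mark cycle can be modelled with a small in-memory sketch. Everything here (SimReader, SyncFn, sync_pass, reject_empty_record) is hypothetical and heavily simplified: it assumes the position is persisted after every record and treats a failed sync as "stop and retry this record on the next pass", which roughly mirrors the rewind-and-break branch in the excerpt.

```c
#include <stddef.h>

/* Hypothetical in-memory model of a binlog reader; not the FastDFS struct. */
typedef struct {
    const char **records;  /* the binlog records */
    size_t count;          /* total number of records */
    size_t next;           /* index of the next record to read */
    size_t marked;         /* position last persisted to the "mark file" */
} SimReader;

typedef int (*SyncFn)(const char *record);  /* returns 0 on success */

/* Stand-in for a peer that rejects empty records. */
int reject_empty_record(const char *record)
{
    return record[0] == '\0' ? -1 : 0;
}

/* Run one pass: read a record, sync it, then persist the position.
 * Stops at the first failed sync so that record is retried next pass.
 * Returns the number of records synced in this pass. */
size_t sync_pass(SimReader *reader, SyncFn sync_record)
{
    size_t synced = 0;

    while (reader->next < reader->count) {
        const char *record = reader->records[reader->next]; /* like storage_binlog_read */
        if (sync_record(record) != 0) {                     /* like storage_sync_data */
            break;  /* position is not advanced, so the record is retried */
        }
        reader->next++;
        reader->marked = reader->next;                      /* like storage_write_to_mark_file */
        synced++;
    }
    return synced;
}
```

The point of the model is that the persisted position never moves past a record that failed to sync, so a restart resumes from the last record known to be replicated.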

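From this analysis we can conclude that FastDFS can lose data. Because the binlog is replicated to the other Storages asynchronously, files that have not been synced yet are lost if the machine dies and cannot be brought back.

In addition, the binlog is not flushed to disk on every write. A parameter controls this, in seconds:

sync_binlog_buff_interval

It sets how often the binlog buffer is flushed to disk. If it is greater than 0, data can likewise be lost.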

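III. Solution

Back to the problem itself: why was the data out of sync? When FastDFS was set up, the ops engineer copied it straight over from another server, including the entire data directory and therefore the sync directory under it. That easily leaves the offsets in the mark files inaccurate.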
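One rough way to spot a copied or stale mark file is to compare its binlog_offset with the actual size of the local binlog file. The sketch below is only an illustration under a simplifying assumption: the mark file and the binlog being compared share the same binlog_index. The function name binlog_backlog is hypothetical.

```c
/* Bytes of the current binlog file not yet replicated to a peer.
 * Returns -1 when the offset cannot belong to this binlog
 * (negative, or beyond the end of the file). */
long long binlog_backlog(long long binlog_size, long long binlog_offset)
{
    if (binlog_offset < 0 || binlog_offset > binlog_size) {
        return -1;
    }
    return binlog_size - binlog_offset;
}
```

A result of -1 suggests that the mark file was not produced against this binlog. Note that an offset inside the file can still be wrong, so this check can only confirm a problem, not rule one out.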

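The fix is to edit the mark file by hand and set binlog_offset to 0, so that FastDFS syncs the files from the beginning. Files that already exist on the peer are skipped, as this log from my development machine shows: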
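For clusters with many mark files, the edit can be scripted. Here is a minimal sketch that rewrites mark-file text with binlog_offset set to 0 and leaves every other line untouched. The function reset_binlog_offset is hypothetical, and reading and writing the actual file is left out. It is presumably safest to stop the Storage process before changing the file, since the sync thread writes the mark file itself.

```c
#include <stdio.h>
#include <string.h>

/* Copy mark-file text from src to dst, replacing the binlog_offset line
 * with "binlog_offset=0". Every output line ends with a newline.
 * Returns 0 on success, -1 if dst is too small. */
int reset_binlog_offset(const char *src, char *dst, size_t dst_size)
{
    static const char key[] = "binlog_offset=";
    size_t used = 0;

    if (dst_size == 0) {
        return -1;
    }
    dst[0] = '\0';

    while (*src != '\0') {
        const char *eol = strchr(src, '\n');
        size_t line_len = (eol != NULL) ? (size_t)(eol - src) : strlen(src);
        int n;

        if (strncmp(src, key, sizeof(key) - 1) == 0) {
            n = snprintf(dst + used, dst_size - used, "%s0\n", key);
        } else {
            n = snprintf(dst + used, dst_size - used, "%.*s\n",
                         (int)line_len, src);
        }
        if (n < 0 || (size_t)n >= dst_size - used) {
            return -1;  /* output would be truncated */
        }
        used += (size_t)n;
        src += line_len + (eol != NULL ? 1 : 0);
    }
    return 0;
}
```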

[2020-08-11 20:27:36] DEBUG - file: storage_sync.c, line: 143, sync data file, logic file: M00/00/00/rBVrTV8yZl6ATOQyAAAJTMk6Vgo7337.md on dest server xx.xx.xx.xx:23000 already exists, and same as mine, ignore it

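Of course, this requires the log level to be set to DEBUG.

PS:

The source code logs nothing when a file is synced successfully or when the mark file is written successfully. To make debugging easier, we added debug logs for both.

To log mark file writes, add an info log in the function storage_write_to_mark_file: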

  if ((result=storage_write_to_fd(pReader->mark_fd, \
    get_mark_filename_by_reader, pReader, buff, len)) == 0)
  {
    pReader->last_scan_rows = pReader->scan_row_count;
    pReader->last_sync_rows = pReader->sync_row_count;

    //added for debugging; binlog_offset is 64-bit, so print it with PRId64 rather than %d
    logInfo("file: "__FILE__", line: %d, " \
      "write server: %s mark file success, offset: %"PRId64, \
      __LINE__, pReader->storage_id, pReader->binlog_offset);
  }
