MinIO 4*4 Cluster Failure Testing


One of our MinIO clusters had a failure (file writes behaving abnormally), so this is a test of cluster fault tolerance based on the official theory.

A calculation rule

In 4*4 mode the default erasure-code stripe size is 16 (the official approach takes the maximum; it can be adjusted on the calculator page, but for MinIO this is automatic). By that rule the cluster should tolerate one server (that is, 4 drives) failing, so a single node failure should not make the service unavailable.
The erasure-code parity count can also be set yourself; in a typical cluster environment it is 4 (EC:4).
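
A minimal sketch of both ways to set the parity count, assuming the default minio/minio123 credentials and the port mapping from the compose file below; the myminio alias name is just an example:

# Option 1: environment variable, set before the servers start
# (would go into the environment: section of each minio service in the compose file)
MINIO_STORAGE_CLASS_STANDARD=EC:4

# Option 2: at runtime through the admin API with the mc client
mc alias set myminio http://localhost:9001 minio minio123
mc admin config set myminio storage_class standard=EC:4
mc admin service restart myminio   # the new parity applies after the restart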

Environment preparation

  • docker-compose file
 
version: '3.7'
services:
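  # sidekick: lightweight load balancer that fronts the four minio nodes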
  sidekick:
    image: minio/sidekick:v1.2.0
    tty: true
    ports:
    - "80:80"
    command: --health-path=/minio/health/ready --address :80 http://minio{1...4}:9000
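  # gateway: MinIO S3 gateway plus console, using the sidekick entry point as its backend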
  gateway:
    image: minio/minio:RELEASE.2022-03-26T06-49-28Z
    command: gateway s3 http://sidekick --console-address ":19000"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    ports:
    - "9000:9000"
    - "19000:19000"
  minio1:
    image: minio/minio:RELEASE.2022-03-26T06-49-28Z
    volumes:
      - data1-1:/data1
      - data1-2:/data2
      - data1-3:/data3
      - data1-4:/data4
    ports:
      - "9001:9000"
      - "19001:19001"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    command: server http://minio{1...4}/data{1...4} --console-address ":19001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
  minio2:
    image: minio/minio:RELEASE.2022-03-26T06-49-28Z
    volumes:
      - data2-1:/data1
      - data2-2:/data2
      - data2-3:/data3
      - data2-4:/data4
    ports:
      - "9002:9000"
      - "19002:19002"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    command: server http://minio{1...4}/data{1...4} --console-address ":19002"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
  minio3:
    image: minio/minio:RELEASE.2022-03-26T06-49-28Z
    volumes:
      - data3-1:/data1
      - data3-2:/data2
      - data3-3:/data3
      - data3-4:/data4
    ports:
      - "9003:9000"
      - "19003:19003"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    command: server http://minio{1...4}/data{1...4} --console-address ":19003"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
  minio4:
    image: minio/minio:RELEASE.2022-03-26T06-49-28Z
    volumes:
      - data4-1:/data1
      - data4-2:/data2
      - data4-3:/data3
      - data4-4:/data4
    ports:
      - "9004:9000"
      - "19004:19004"
    environment:
      MINIO_ACCESS_KEY: minio
      MINIO_SECRET_KEY: minio123
    command: server http://minio{1...4}/data{1...4} --console-address ":19004"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
volumes:
  data1-1:
  data1-2:
  data1-3:
  data1-4:
  data2-1:
  data2-2:
  data2-3:
  data2-4:
  data3-1:
  data3-2:
  data3-3:
  data3-4:
  data4-1:
  data4-2:
  data4-3:
  data4-4:
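
To bring the environment up and confirm the topology, something like the following should work (the myminio alias is an example; host port 9001 maps to minio1 per the compose file):

# start sidekick, the gateway and the four minio nodes
docker-compose up -d

# point mc at one of the cluster nodes and check that 4 servers * 4 drives = 16 drives are online
mc alias set myminio http://localhost:9001 minio minio123
mc admin info myminio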

Simple failure test

Stop one server node directly (4 drives will go offline), for example docker-compose stop minio1, then upload files through the gateway and the sidekick entry point to test reads and writes, and after restarting the node check how the data is recovered.
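
A rough sketch of the test steps with the mc client; the entry alias, the test bucket and the file name are made up for illustration:

# stop one server node; its 4 drives go offline
docker-compose stop minio1

# upload through the sidekick entry point (port 80) while the node is down
mc alias set entry http://localhost:80 minio minio123
mc mb entry/test
mc cp ./some-file.txt entry/test/

# read the object back to confirm the remaining 12 drives still serve data
mc cat entry/test/some-file.txt

# bring the node back and let the cluster heal
docker-compose start minio1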

Server node logs

You can look at minio4 and see the relevant messages in its logs.
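
For example, following the logs of one of the surviving nodes:

docker-compose logs -f minio4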


Check minio1's file recovery
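
One way to check, assuming the myminio alias from above; the /data1 path matches the volume layout in the compose file:

# list the object layout on minio1's first drive after it rejoins
docker-compose exec minio1 ls -lR /data1

# or inspect/trigger healing through the admin API
mc admin heal -r myminio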


Notes

After the simple test above, MinIO's clustering is in fact reliable; the core issue is more likely in the nginx entry-point configuration, which should be adjusted later. The above only covers the actual problem we hit, as a way of verifying whether the theory holds.

References

https://github.com/minio/minio/blob/master/docs/distributed/SIZING.md
https://min.io/product/erasure-code-calculator?number_of_servers=8&drives_per_server=16&drive_capacity=8&stripe_size=16&parity_count=4
https://github.com/rongfengliang/minio-cluster-sidekick-learning

