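GlusterFS is a distributed file system that by default uses a decentralized, fully peer-to-peer architecture. It is very simple to set up, maintain, and use, which makes it a popular choice among distributed file systems.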
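The official site is https://www.gluster.org/. It lists Gluster 5 as the latest release, but clicking through shows that it targets CentOS 8, which is a little ahead of its time, although the first RHEL 8 beta has already been released. So we will stick with the long-term-support release, Gluster 4.1.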
1. Preparing the base environment
Run the following on all three machines:
[root@g1 ~]# yum install -y centos-release-gluster glusterfs-server glusterfs-fuse glusterfs-rdma glusterfs
[root@g1 ~]# ls /etc/yum.repos.d/
CentOS7-Base-163.repo CentOS-Base.repo CentOS-CR.repo CentOS-Debuginfo.repo CentOS-fasttrack.repo CentOS-Gluster-4.1.repo CentOS-Media.repo CentOS-Sources.repo CentOS-Storage-common.repo CentOS-Vault.repo epel.repo salt-latest.repo
[root@g1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.56.11 g1
192.168.56.12 g2
192.168.56.13 g3
[root@g1 ~]# mkdir -p /glusterfs/data{1,2,3} # I use these three directories directly because this is a virtual machine
[root@g1 ~]# systemctl start glusterd.service
[root@g1 ~]# systemctl enable glusterd.service
In a proper setup, format each disk and mount it first:
[root@g1 ~]# mkfs.xfs -i size=512 /dev/sdb
[root@g1 ~]# echo "/dev/sdb /glusterfs/data1 xfs defaults 1 2" >>/etc/fstab
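Because the peers are later probed by host name, every node needs entries for g1, g2 and g3. Below is a minimal sketch of a check for that. The function name `missing_hosts` is made up for illustration, and it only inspects a hosts-format file; it does not query DNS.

```shell
# Hypothetical helper: print every name from the argument list that has no
# entry in the given hosts-format file. Only the file is inspected, not DNS.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after
        # the address column.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file" || echo "$name"
    done
}

# Example: list the names that still need a line in /etc/hosts.
# missing_hosts /etc/hosts g1 g2 g3
```

An empty result means all listed names are present in the file.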
2. Pools
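We treat each machine as one member of the storage pool. Each member later contributes bricks (a piece of disk space each), and several members together make up a GlusterFS cluster.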
On any one of the machines:
[root@g1 ~]# gluster pool list # only this node so far
UUID                                    Hostname    State
ce160a74-6f02-4760-862c-1daf6cfa4300    localhost   Connected
[root@g1 ~]# gluster peer status # no peers yet
Number of Peers: 0
[root@g1 ~]# gluster peer probe g2 # add g2; the host name must resolve, though probing by IP also works
peer probe: success.
[root@g1 ~]# gluster peer probe g3 # add g3; note that a node does not need to probe itself
peer probe: success.
[root@g1 ~]# gluster peer status # the cluster now has 2 peers
Number of Peers: 2

Hostname: g2
Uuid: 1da148ac-0c81-4434-830e-ab3f66c046ea
State: Peer in Cluster (Connected)

Hostname: g3
Uuid: aff7232f-a731-4def-b996-db3723c1fc97
State: Peer in Cluster (Connected)
[root@g1 ~]# gluster pool list # the pool now has 3 members
UUID                                    Hostname    State
1da148ac-0c81-4434-830e-ab3f66c046ea    g2          Connected
aff7232f-a731-4def-b996-db3723c1fc97    g3          Connected
ce160a74-6f02-4760-862c-1daf6cfa4300    localhost   Connected
[root@g1 ~]# gluster peer detach g3 # remove a member from the cluster
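When scripting the steps above, it can help to confirm that every probed peer is actually connected before creating volumes. The sketch below counts the relevant lines in `gluster peer status` output. The function name `connected_peers` is an assumption, and the parsing relies on the output format shown in this post.

```shell
# Hypothetical helper: read `gluster peer status` output on stdin and print
# how many peers are in the "Peer in Cluster (Connected)" state.
connected_peers() {
    awk '/^State: Peer in Cluster \(Connected\)/ { n++ } END { print n + 0 }'
}

# Example (on a cluster node):
# gluster peer status | connected_peers
```

For the three-node setup in this post, the expected count on any node is 2, since a node does not list itself as a peer.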
3. Volumes
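The pool is an internal cluster resource and cannot be offered to external clients directly. We combine bricks from the pool into a volume, and the volume is what clients mount. One cluster can provide multiple volumes.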
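1) Distributed volume: files are spread by a hash algorithm, with each file stored whole on a single server node.

2) Replicated volume: each file is copied to replica x nodes.

3) Striped volume: each file is cut into chunks that are stored across stripe x nodes, similar to RAID 0. The idea is appealing, since spreading a large file over several disks raises I/O throughput, but in practice chunks are occasionally lost, so no stripe-based volume type should be used in production!

4) Distributed striped volume: a combination of distributed and striped. Not used in production.

5) Distributed replicated volume: a combination of distributed and replicated. This is the type we use most often!

6) Striped replicated volume: a combination of striped and replicated. Not used in production.

7) Mixed volume: a combination of all three modes. Not used in production.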
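For the distributed replicated type, the order of the bricks on the command line matters: consecutive bricks form one replica set, and files are then distributed across the sets. The sketch below only illustrates that grouping. The function name `replica_sets` is made up, and the brick names follow the hosts and directories from the environment section.

```shell
# Sketch: show how a brick list is grouped into replica sets.
# Usage: replica_sets <replica-count> <brick>...
# Consecutive bricks, in the order given, form one replica set.
replica_sets() {
    replica=$1
    shift
    if [ $(( $# % $replica )) -ne 0 ]; then
        echo "number of bricks is not a multiple of replica count" >&2
        return 1
    fi
    set_no=1
    while [ $# -gt 0 ]; do
        members=""
        i=0
        # Take the next <replica-count> bricks as one set.
        while [ "$i" -lt "$replica" ]; do
            members="$members $1"
            shift
            i=$((i + 1))
        done
        echo "set $set_no:$members"
        set_no=$((set_no + 1))
    done
}

replica_sets 2 g1:/glusterfs/data1 g2:/glusterfs/data1 g3:/glusterfs/data1 g1:/glusterfs/data2
```

With these four bricks the first set is g1 and g2, and the second is g3 together with g1's second directory. When ordering bricks, it is worth making sure the members of one set sit on different hosts, otherwise a single host failure takes out every copy in that set.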

[root@g1 ~]# gluster volume list # the cluster has no volumes yet
No volumes present in cluster
[root@g1 ~]# gluster volume create test replica 2 g1:/glusterfs/data1 g2:/glusterfs/data1 g3:/glusterfs/data1 g1:/glusterfs/data2 g2:/glusterfs/data2 # create a distributed replicated volume from 5 bricks in the pool, with a replica count of 2
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue?
 (y/n) y
number of bricks is not a multiple of replica count

Usage:
volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT> [arbiter <COUNT>]] [disperse [<COUNT>]] [disperse-data <COUNT>] [redundancy <COUNT>] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK>?<vg_name>... [force]
# Two problems came up. The first message warns that a replica count of 2 can lead to split-brain,
# recommends three copies, and asks whether to continue. Production should use 3, but here I continue.
# The second says the number of bricks is not a multiple of the replica count. A replica-2 volume
# therefore needs 2, 4, 6, 8... bricks, a replica-3 volume needs 3, 6, 9..., and so on.
[root@g1 ~]# gluster volume create test replica 2 g1:/glusterfs/data1 g2:/glusterfs/data1 g3:/glusterfs/data1 g1:/glusterfs/data2
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue?
 (y/n) y
volume create: test: failed: The brick g1:/glusterfs/data1 is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.
# The first message is the same warning. The second appears because I used directories under /
# rather than mounted disks; properly mounted disks do not trigger this error.
[root@g1 ~]# gluster volume create test replica 2 g1:/glusterfs/data1 g2:/glusterfs/data1 g3:/glusterfs/data1 g1:/glusterfs/data2 force # force it
volume create: test: success: please start the volume to access data
[root@g1 ~]# gluster volume list # the volume now exists in the cluster
test
[root@g1 ~]# gluster volume info test # show the volume's information

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
[root@g1 ~]# gluster volume start test # start the volume
volume start: test: success
[root@g1 ~]# gluster volume status test # check the volume's status
Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick g1:/glusterfs/data1                   49152     0          Y       81315
Brick g2:/glusterfs/data1                   49152     0          Y       72882
Brick g3:/glusterfs/data1                   49152     0          Y       72834
Brick g1:/glusterfs/data2                   49153     0          Y       81337
Self-heal Daemon on localhost               N/A       N/A        Y       81360
Self-heal Daemon on g2                      N/A       N/A        Y       72905
Self-heal Daemon on g3                      N/A       N/A        Y       72857

Task Status of Volume test
------------------------------------------------------------------------------
There are no active volume tasks
[root@g1 ~]# netstat -tpln|grep gluster
tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 81315/glusterfsd # port opened for a brick
tcp 0 0 0.0.0.0:49153 0.0.0.0:* LISTEN 81337/glusterfsd # port opened for a brick
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN 7381/glusterd # the management service itself
[root@g1 ~]# ps -ef|grep 81315
root 81315 1 0 17:12 ? 00:00:00 /usr/sbin/glusterfsd -s g1 --volfile-id test.g1.glusterfs-data1 -p /var/run/gluster/vols/test/g1-glusterfs-data1.pid -S /var/run/gluster/643869d38d7edd70.socket --brick-name /glusterfs/data1 -l /var/log/glusterfs/bricks/glusterfs-data1.log --xlator-option *-posix.glusterd-uuid=ce160a74-6f02-4760-862c-1daf6cfa4300 --process-name brick --brick-port 49152 --xlator-option test-server.listen-port=49152
root 81569 96704 0 17:14 pts/1 00:00:00 grep --color=auto 81315

By default a volume already has many tuning options enabled on the server side; we only need to adjust a few values to match the server's own resources.
[root@g1 ~]# gluster volume set test performance.cache-size 1GB # read cache size of 1 GB; if this exceeds the machine's memory, mounting will fail
volume set: success
[root@g1 ~]# gluster volume set test performance.io-thread-count 32 # set the number of I/O threads
volume set: success
[root@g1 ~]# gluster volume quota test enable # enable quotas; usually not needed in an internal data center
quota command failed : Quota is already enabled
[root@g1 ~]# gluster volume quota test limit-usage / 10GB # set the maximum usable space
volume quota : success
[root@g1 ~]# gluster volume set test performance.cache-refresh-timeout 5 # set the cache refresh interval
volume set: success
[root@g1 ~]# gluster volume info test

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
Options Reconfigured:
performance.cache-refresh-timeout: 5
performance.io-thread-count: 32
performance.cache-size: 1GB
features.quota-deem-statfs: on
nfs.disable: on
features.inode-quota: on
features.quota: on

3. Mounting
We test mounting from a machine that can resolve the host names.
[root@c4 /]# yum install -y glusterfs glusterfs-fuse
[root@c4 /]# mount -t glusterfs -o backup-volfile-servers=g2:g3,log-level=WARNING g1:/test /mnt # specify backup servers at mount time
[root@c4 /]# df -h|grep mnt # the quota is clearly in effect
g1:/test 10G 0 10G 0% /mnt
[root@c4 ~]# cd /mnt
[root@c4 mnt]# touch file{1..10}
The files are indeed spread evenly over the 4 bricks of our volume:
[root@g1 glusterfs]# tree .
.
├── data1
│   ├── file10
│   ├── file3
│   ├── file4
│   ├── file7
│   └── file9
├── data2
│   ├── file1
│   ├── file2
│   ├── file5
│   ├── file6
│   └── file8
└── data3
[root@g2 glusterfs]# tree .
.
├── data1
│   ├── file10
│   ├── file3
│   ├── file4
│   ├── file7
│   └── file9
├── data2
└── data3
[root@g3 glusterfs]# tree .
.
├── data1
│   ├── file1
│   ├── file2
│   ├── file5
│   ├── file6
│   └── file8
├── data2
└── data3
NFS and CIFS mounts are also available, but we almost never need them and they are simple to use, so I will not cover them here. The regular native mount is a good choice: it gives higher concurrent performance and transparent failover.
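To keep the native mount across reboots, the same options can go into /etc/fstab. The sketch below only prints a candidate line. The function name `gluster_fstab_line` is made up, and the `defaults,_netdev` options are an assumption about a typical client rather than something from the setup above. Check the result before appending it to /etc/fstab.

```shell
# Hypothetical helper: print an /etc/fstab line for a native (FUSE) mount.
# Usage: gluster_fstab_line <server> <volume> <mountpoint> [backup-server...]
gluster_fstab_line() {
    server=$1
    volume=$2
    mountpoint=$3
    shift 3
    # _netdev delays the mount until the network is up (assumed default here).
    opts="defaults,_netdev"
    if [ $# -gt 0 ]; then
        # Backup volfile servers are joined with ':' as in the mount command.
        backups=$(printf '%s:' "$@")
        opts="$opts,backup-volfile-servers=${backups%:}"
    fi
    echo "$server:/$volume $mountpoint glusterfs $opts 0 0"
}

gluster_fstab_line g1 test /mnt g2 g3
```

For the volume in this post this prints a line that mounts g1:/test on /mnt with g2 and g3 as backup volfile servers.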
4. Operations and maintenance
Expanding a volume
[root@g1 ~]# gluster peer probe g3 # add the new node to the cluster; skip this if the machine is already a member
peer probe: success. Host g3 port 24007 already in peer list # this machine was already added
[root@g1 ~]# gluster volume info test # the volume currently has 4 bricks

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
[root@g1 ~]# gluster volume add-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 g3:/glusterfs/data3
volume add-brick: failed: Incorrect number of bricks supplied 5 with count 2
# This is the same brick count versus replica count problem as before. Make sure the number of disks
# in the servers matches the volume's replica count: for a replica-3 volume, if you buy 10 disks,
# one of them cannot be added.
[root@g1 ~]# gluster volume add-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3
volume add-brick: failed: The brick g1:/glusterfs/data3 is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.
# I am still using directories under /, so force is required
[root@g1 ~]# gluster volume add-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 force
volume add-brick: success
[root@g1 ~]# gluster volume info test # the volume now has more bricks

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
Brick5: g2:/glusterfs/data2
Brick6: g3:/glusterfs/data2
Brick7: g1:/glusterfs/data3
Brick8: g2:/glusterfs/data3
[root@g1 ~]# gluster volume rebalance test start # redistribute the existing data evenly
volume rebalance: test: success: Rebalance on test has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: a2f4b603-283a-4303-8ad0-84db00adb5a5
[root@g1 ~]# gluster volume rebalance test status # check the task status; rebalancing large files takes a while
Node       Rebalanced-files  size    scanned  failures  skipped  status     run time in h:m:s
---------  ----------------  ------  -------  --------  -------  ---------  -----------------
localhost  2                 0Bytes  10       0         0        completed  0:00:00
g2         1                 0Bytes  9        0         0        completed  0:00:00
g3         3                 0Bytes  6        0         0        completed  0:00:00
volume rebalance: test: success
[root@g1 ~]# gluster volume rebalance test stop # once every node shows completed, the task can be stopped
Node       Rebalanced-files  size    scanned  failures  skipped  status     run time in h:m:s
---------  ----------------  ------  -------  --------  -------  ---------  -----------------
localhost  2                 0Bytes  10       0         0        completed  0:00:00
g2         1                 0Bytes  9        0         0        completed  0:00:00
g3         3                 0Bytes  6        0         0        completed  0:00:00
volume rebalance: test: success: rebalance process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check rebalance process for completion before doing any further brick related tasks on the volume.

[root@g1 ~]# gluster volume rebalance test status # no rebalance task remains on the volume
volume rebalance: test: failed: Rebalance not started for volume test.
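The brick-count rule behind the failed add-brick call is plain modular arithmetic: only a multiple of the replica count can be added at once. A small sketch, with the made-up function name `addable_bricks`, that prints how many bricks can be added and how many are left over:

```shell
# Sketch: given how many new bricks are available and the volume's replica
# count, print "<addable> <leftover>" for a single add-brick call.
addable_bricks() {
    available=$1
    replica=$2
    leftover=$((available % replica))
    echo "$((available - leftover)) $leftover"
}

addable_bricks 10 3   # replica 3 with ten disks: prints "9 1"
addable_bricks 5 2    # the five-brick attempt on a replica-2 volume: prints "4 1"
```

Running this kind of check before buying disks or building the command line avoids the "Incorrect number of bricks supplied" error.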
Shrinking a volume
[root@g1 ~]# gluster volume remove-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 start # remove bricks in a multiple of the replica count; the data on the removed bricks starts migrating
Running remove-brick with cluster.force-migration enabled can result in data corruption. It is safer to disable this option so that files that receive writes during migration are not migrated.
Files that are not migrated can then be manually copied after the remove-brick commit operation.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: 9b4657c0-ed29-4c75-8bb6-7b8f277f02ec
[root@g1 ~]# gluster volume remove-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 status # check the migration status
Node       Rebalanced-files  size    scanned  failures  skipped  status     run time in h:m:s
---------  ----------------  ------  -------  --------  -------  ---------  -----------------
localhost  0                 0Bytes  10       0         0        completed  0:00:00
g2         0                 0Bytes  5        0         0        completed  0:00:00
g3         0                 0Bytes  5        0         0        completed  0:00:00
[root@g1 ~]# gluster volume remove-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 commit # once completed, remove the bricks from the volume
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
[root@g1 ~]# gluster volume remove-brick test g2:/glusterfs/data2 g3:/glusterfs/data2 g1:/glusterfs/data3 g2:/glusterfs/data3 status # no task remains
volume remove-brick status: failed: remove-brick not started for volume test.
[root@g1 ~]# gluster volume info test # back to the original 4 bricks

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
[root@g1 ~]# ls -a /glusterfs/data3 # delete the hidden directory on every removed brick, otherwise it may interfere when the brick is later added to another volume
. .. .glusterfs
[root@g1 ~]# rm -fr /glusterfs/data3/.glusterfs/
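Since the commit should only happen once every node reports completed, a script can gate it on the status output. The sketch below is one way to do that. The function name `all_completed` is an assumption, and the parsing depends on the column layout shown in this post, where each node row ends with the status followed by a run time.

```shell
# Hypothetical helper: read `gluster volume remove-brick ... status` (or
# `gluster volume rebalance ... status`) output on stdin and succeed only if
# there is at least one node row and every node row reports "completed".
all_completed() {
    awk '
        # Node rows carry a run time such as 0:00:00 in the last column;
        # header, separator and summary lines do not.
        $NF ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9]$/ {
            rows++
            if ($(NF - 1) != "completed") pending++
        }
        END { exit (rows > 0 && pending == 0) ? 0 : 1 }
    '
}

# Example (on a cluster node); <bricks...> stands for the brick list used
# in the start command:
# gluster volume remove-brick test <bricks...> status | all_completed \
#     && echo "safe to commit"
```

The check is deliberately strict: an empty or unparseable status output counts as not completed.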
Replacing a brick
[root@g1 ~]# gluster volume replace-brick test g3:/glusterfs/data2 g3:/glusterfs/data3 commit force # replace g3:/glusterfs/data2 with g3:/glusterfs/data3
volume replace-brick: success: replace-brick commit force operation successful
[root@g1 ~]# gluster volume info test

Volume Name: test
Type: Distributed-Replicate
Volume ID: 92ffe586-ea14-4b7b-9b89-5dfd626cb6d4
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: g1:/glusterfs/data1
Brick2: g2:/glusterfs/data1
Brick3: g3:/glusterfs/data1
Brick4: g1:/glusterfs/data2
Brick5: g2:/glusterfs/data2
Brick6: g3:/glusterfs/data3 # this one was replaced
Brick7: g1:/glusterfs/data3
Brick8: g2:/glusterfs/data3
