Creating a GlusterFS Cluster
I. Introduction
GlusterFS overview
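- GlusterFS is an open-source distributed file system and the core of scale-out storage; it can serve thousands of clients. Compared with traditional solutions, GlusterFS can flexibly combine physical, virtual, and cloud resources to deliver highly available, enterprise-grade storage performance.
- GlusterFS aggregates the storage building blocks of multiple servers over TCP/IP or InfiniBand RDMA network links and manages data, disk, and memory resources under a single global namespace.
- GlusterFS is built on a stackable user-space design and can deliver good performance for different workloads.
- GlusterFS supports standard clients running standard applications on any standard IP network. As shown in Figure 1, users can access application data in a globally unified namespace through standard protocols such as NFS and CIFS.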
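Main features of GlusterFS
- Scalability and high performance
- High availability
- Globally unified namespace
- Elastic hash algorithm
- Elastic volume management
- Based on standard protocols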
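How it works:
1) On the client, the user reads and writes data through the GlusterFS mount point. The cluster is completely transparent to the user, who cannot tell whether the operation runs against the local system or the remote cluster.
2) The operation is handed to the VFS layer of the local Linux system.
3) VFS passes the data to the FUSE kernel file system. Before the GlusterFS client starts, a real file system, FUSE, must be registered with the system. As shown in the figure above, FUSE sits at the same level as ext3: ext3 operates on the physical disk, whereas FUSE passes the data through the /dev/fuse device file to the GlusterFS client. FUSE can therefore be thought of as a proxy.
4) After FUSE hands the data to the GlusterFS client, the client processes it as specified in the client configuration file, which is given when the GlusterFS client is started.
5) At the end of client-side processing, the data is sent over the network to the GlusterFS server, which writes it to the storage devices it controls.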
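Step 3 depends on the FUSE device being available on the client. The following is a minimal sketch of a pre-flight check; the function name fuse_ready is made up for illustration and is not part of GlusterFS.

```shell
#!/bin/sh
# fuse_ready [DEVICE]
# Succeeds when the FUSE character device exists (default: /dev/fuse).
# Hypothetical helper for illustration only.
fuse_ready() {
    dev=${1:-/dev/fuse}
    if [ -c "$dev" ]; then
        echo "FUSE device present: $dev"
    else
        echo "FUSE device missing: $dev (try: modprobe fuse)" >&2
        return 1
    fi
}

# Check the default device; do not abort the script if it is absent.
fuse_ready || true
```

If the device is missing, loading the kernel module with modprobe fuse is usually the first thing to try before mounting a volume.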
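Common volume types
Distributed (distributed)
Replicated (replicate)
Striped (striped)
Basic volumes:
(1) distribute volume: distributed volume
(2) stripe volume: striped volume
(3) replica volume: replicated volume
Composite volumes:
(4) distribute stripe volume: distributed striped volume
(5) distribute replica volume: distributed replicated volume
(6) stripe replica volume: striped replicated volume
(7) distribute stripe replica volume: distributed striped replicated volume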
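To make the difference between the basic types concrete, the sketch below estimates usable capacity from the brick count and brick size. The function usable_gb and its arguments are illustrative and not part of the gluster CLI; it assumes equally sized bricks and ignores file system overhead.

```shell
#!/bin/sh
# usable_gb KIND BRICKS BRICK_GB [REPLICA]
# Rough usable capacity in GB, assuming all bricks have the same size.
# Hypothetical helper for illustration only.
usable_gb() {
    kind=$1 bricks=$2 size=$3 replica=${4:-1}
    case "$kind" in
        distribute|stripe)
            # every brick contributes its full size
            echo $(( bricks * size )) ;;
        replicate)
            # each file is stored REPLICA times
            echo $(( bricks / replica * size )) ;;
        *)
            echo "unknown kind: $kind" >&2
            return 1 ;;
    esac
}

usable_gb distribute 3 2      # three 2G bricks, distributed
usable_gb replicate 3 2 3     # three 2G bricks, replica 3
```

These estimates are consistent with the sizes that df reports on the client later in this document: about 6.0G for the three-brick distributed volume dis_vol and about 2.0G for the three-way replicated volume rep_vol.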
II. Environment planning
Note: node1-node6 are the servers; node7-client is the client.
OS | IP | Hostname | Disks (three per server) |
centos 7.3 | 172.16.2.51 | node1 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.52 | node2 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.53 | node3 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.54 | node4 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.55 | node5 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.56 | node6 | sdb:2G sdc:2G sdd:2G |
centos 7.3 | 172.16.2.57 | node7-client | sda:20G |
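1. Environment preparation (run on node1-node6)
1.1 Add three 2G disks to each of node1-node6. The df output below shows them formatted and mounted under /glusterfs/sdb, /glusterfs/sdc, and /glusterfs/sdd.
[root@node1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on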
/dev/mapper/cl-root 18G 4.2G 14G 24% /
devtmpfs 473M 0 473M 0% /dev
tmpfs 489M 84K 489M 1% /dev/shm
tmpfs 489M 7.1M 482M 2% /run
tmpfs 489M 0 489M 0% /sys/fs/cgroup
/dev/sdd 2.0G 33M 2.0G 2% /glusterfs/sdd
/dev/sdc 2.0G 33M 2.0G 2% /glusterfs/sdc
/dev/sdb 2.0G 33M 2.0G 2% /glusterfs/sdb
/dev/sda1 297M 158M 140M 54% /boot
tmpfs 98M 16K 98M 1% /run/user/42
tmpfs 98M 0 98M 0% /run/user/0
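1.2 Disable the firewall and SELinux, and synchronize time
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
Disable SELinux (the change in /etc/selinux/config takes effect after a reboot; setenforce 0 switches to permissive mode immediately)
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0
Synchronize time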
ntpdate ntp.gwadar.cn
1.3 Host name resolution (/etc/hosts configuration)
[root@node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.2.51 node1
172.16.2.52 node2
172.16.2.53 node3
172.16.2.54 node4
172.16.2.55 node5
172.16.2.56 node6
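Because the node entries follow a regular pattern, they can be generated rather than typed by hand. The sketch below is one way to do it; the function name hosts_entries is made up, and it assumes the 172.16.2.5N addressing from the planning table.

```shell
#!/bin/sh
# hosts_entries FIRST LAST
# Print "IP hostname" lines for nodeFIRST..nodeLAST, following the
# 172.16.2.5N addressing used in the planning table (single-digit N only).
# Hypothetical helper for illustration only.
hosts_entries() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '172.16.2.5%d node%d\n' "$i" "$i"
        i=$(( i + 1 ))
    done
}

hosts_entries 1 6
# To apply on a node (as root): hosts_entries 1 6 >> /etc/hosts
```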
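1.4 Test network connectivity to all storage nodes
for i in {1..6}
do
    # report both reachable and unreachable nodes
    if ping -c3 node$i &> /dev/null; then echo "node$i up"; else echo "node$i down"; fi
done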
1.5 Configure the EPEL repository
yum install http://mirrors.163.com/centos/7.3.1611/extras/x86_64/Packages/epel-release-7-9.noarch.rpm
1.6 Configure the yum repository for GlusterFS (using a network source)
vim /etc/yum.repos.d/gluster.repo
[root@node1 ~]# cat /etc/yum.repos.d/gluster.repo
[gluster]
name=gluster
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.8/
gpgcheck=0
enabled=1
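When the same repository file has to be created on six nodes, writing it non-interactively is easier than editing it by hand. The following is a sketch under that assumption; the function name write_gluster_repo is made up, and the file content is copied from the listing above.

```shell
#!/bin/sh
# write_gluster_repo [DIR]
# Write gluster.repo into DIR (default: /etc/yum.repos.d, which needs root).
# Hypothetical helper for illustration only.
write_gluster_repo() {
    dir=${1:-/etc/yum.repos.d}
    cat > "$dir/gluster.repo" <<'EOF'
[gluster]
name=gluster
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.8/
gpgcheck=0
enabled=1
EOF
}

# Dry run into a scratch directory so nothing under /etc is touched.
tmp=$(mktemp -d)
write_gluster_repo "$tmp" && cat "$tmp/gluster.repo"
```

Calling write_gluster_repo without an argument on each server writes the file to its real location.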
1.7 Install GlusterFS and rpcbind (rpcbind is an RPC service; for NFS shares it tells clients which port the server's NFS service listens on. Put simply, it acts as a broker.)
yum install -y glusterfs-server samba rpcbind
systemctl start glusterd.service
systemctl enable glusterd.service
systemctl start rpcbind    # rpcbind is needed for mounting over NFS
systemctl enable rpcbind
systemctl status rpcbind
Run all of the steps above on node1-node6.
III. Gluster management
1. gluster command help
[root@node1 ~]# gluster peer help
peer detach { <HOSTNAME> | <IP-address> } [force] - detach peer specified by <HOSTNAME>
peer help - Help command for peer
peer probe { <HOSTNAME> | <IP-address> } - probe peer specified by <HOSTNAME>
peer status - list status of peers
pool list - list all the nodes in the pool (including localhost)
2. Add GlusterFS nodes (running this on node1 is sufficient)
[root@node1 ~]# gluster peer probe node2
peer probe: success.
[root@node1 ~]# gluster peer probe node3
peer probe: success.
[root@node1 ~]# gluster peer probe node4
peer probe: success.
Check the status of the added nodes
[root@node1 ~]# gluster peer status
Number of Peers: 3
Hostname: node2
Uuid: 67c60312-a312-43d6-af77-87cbbc29e1aa
State: Peer in Cluster (Connected)
Hostname: node3
Uuid: d79c3a0b-585a-458d-b202-f88ac1439d0d
State: Peer in Cluster (Connected)
Hostname: node4
Uuid: 97094c6e-afc8-4cfb-9d26-616aedc55236
State: Peer in Cluster (Connected)
Remove a node from the storage pool (and add it back afterwards)
[root@node1 ~]# gluster peer detach node2
peer detach: success
[root@node1 ~]# gluster peer probe node2
peer probe: success.
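Probing can also be scripted so that no node is forgotten. The sketch below only prints the commands unless DRY_RUN is set to 0; the function name probe_peers and the DRY_RUN switch are assumptions made for this example.

```shell
#!/bin/sh
# probe_peers NODE...
# Probe each node from the current host. With DRY_RUN=1 (the default here)
# the gluster commands are only printed, not executed.
# Hypothetical helper for illustration only.
DRY_RUN=${DRY_RUN:-1}

probe_peers() {
    for node in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "gluster peer probe $node"
        else
            gluster peer probe "$node" || return 1
        fi
    done
}

probe_peers node2 node3 node4 node5 node6
```

The walkthrough above probes only node2 to node4, and the volume creation in section 3.4 fails because node5 is not yet in the pool. Probing all five peers up front avoids that error.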
3. Create volumes
3.1 Create a distributed volume
[root@node1 ~]# gluster volume create dis_vol \
> node1:/glusterfs/sdb/dv1 \
> node2:/glusterfs/sdb/dv1 \
> node3:/glusterfs/sdb/dv1
volume create: dis_vol: success: please start the volume to access data
View the distributed volume
[root@node1 ~]# gluster volume info dis_vol
3.2 Create a replicated volume
[root@node1 ~]# gluster volume create rep_vol replica 3 \
> node1:/glusterfs/sdb/rv2 \
> node2:/glusterfs/sdb/rv2 \
> node3:/glusterfs/sdb/rv2
volume create: rep_vol: success: please start the volume to access data
View the replicated volume
[root@node1 ~]# gluster volume info rep_vol
3.3 Create a striped volume
[root@node1 ~]# gluster volume create str_vol stripe 3 \
> node1:/glusterfs/sdb/sv3 \
> node2:/glusterfs/sdb/sv3 \
> node3:/glusterfs/sdb/sv3
volume create: str_vol: success: please start the volume to access data
View the striped volume
[root@node1 ~]# gluster volume info str_vol
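3.4 Create a distributed striped volume (the attempt below fails because node5 has not yet been added to the pool with gluster peer probe)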
[root@node1 ~]# gluster volume create dir_str_vol stripe 4 \
> node1:/glusterfs/sdb/dsv4 \
> node2:/glusterfs/sdb/dsv4 \
> node3:/glusterfs/sdb/dsv4 \
> node4:/glusterfs/sdb/dsv4 \
> node5:/glusterfs/sdb/dsv4 \
> node6:/glusterfs/sdb/dsv4 \
> node1:/glusterfs/sdc/dsv4 \
> node2:/glusterfs/sdc/dsv4
volume create: dir_str_vol: failed: Host node5 is not in ' Peer in Cluster' state
3.5 Create a distributed replicated volume
[root@node1 ~]# gluster volume create dir_rep_vol replica 2 \
> node2:/glusterfs/sdb/drv5 \
> node1:/glusterfs/sdb/drv5 \
> node3:/glusterfs/sdb/drv5 \
> node4:/glusterfs/sdb/drv5
volume create: dir_rep_vol: success: please start the volume to access data
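3.6 Create a distributed striped replicated volume (note: with only four bricks, stripe 2 and replica 2 form a single striped replicated set; distribution across sets needs a multiple of four bricks, such as eight)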
[root@node1 ~]# gluster volume create dis_str_rep_vol stri 2 repl 2 \
> node1:/glusterfs/sdb/dsrv6 \
> node2:/glusterfs/sdb/dsrv6 \
> node3:/glusterfs/sdb/drsv6 \
> node4:/glusterfs/sdb/drsv6
volume create: dis_str_rep_vol: success: please start the volume to access data
3.7 Create a striped replicated volume
[root@node1 ~]# gluster volume create str_rep_vol stripe 2 replica 2 \
> node1:/glusterfs/sdb/srv7 \
> node2:/glusterfs/sdb/srv7 \
> node3:/glusterfs/sdb/srv7 \
> node4:/glusterfs/sdb/srv7
volume create: str_rep_vol: success: please start the volume to access data
3.8 Create a dispersed volume (rarely used)
[root@node1 ~]# gluster volume create disperse_vol disperse 4 \
> node1:/glusterfs/sdb/dv8 \
> node2:/glusterfs/sdb/dv8 \
> node3:/glusterfs/sdb/dv8 \
> node4:/glusterfs/sdb/dv8
There isn't an optimal redundancy value for this configuration. Do you want to create the volume with redundancy 1 ? (y/n) y
volume create: disperse_vol: success: please start the volume to access data
View the volume information
[root@node1 ~]# gluster volume info disperse_vol
Volume Name: disperse_vol
Type: Disperse
Volume ID: 8be1cd6f-49f8-4b11-bcfb-ef5a4f22e224
Status: Created
Snapshot Count: 0
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: node1:/glusterfs/sdb/dv8
Brick2: node2:/glusterfs/sdb/dv8
Brick3: node3:/glusterfs/sdb/dv8
Brick4: node4:/glusterfs/sdb/dv8
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
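The "Number of Bricks" line uses the notation subvolumes x (data + redundancy) = total. The sketch below reproduces that arithmetic; the function name disperse_layout is made up, and its validity check is a simple sanity test rather than the CLI's exact rule.

```shell
#!/bin/sh
# disperse_layout BRICKS DISPERSE REDUNDANCY
# Print the layout in the "Number of Bricks" notation of gluster volume info:
# subvolumes x (data + redundancy) = total bricks.
# Hypothetical helper for illustration only.
disperse_layout() {
    bricks=$1 disperse=$2 redundancy=$3
    # sanity check only: bricks must divide evenly, redundancy must be smaller
    if [ $(( bricks % disperse )) -ne 0 ] || [ "$redundancy" -ge "$disperse" ]; then
        echo "invalid disperse layout" >&2
        return 1
    fi
    subvols=$(( bricks / disperse ))
    data=$(( disperse - redundancy ))
    echo "$subvols x ($data + $redundancy) = $bricks"
}

disperse_layout 4 4 1    # the disperse_vol created above
```

With four bricks, disperse 4, and redundancy 1, three bricks' worth of space holds data and one brick's worth holds redundancy information.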
3.9 Create a distributed dispersed volume (rarely used)
[root@node1 ~]# gluster volume create disperse_vol_3 disperse 3 \
> node1:/glusterfs/sdb/d9 \
> node2:/glusterfs/sdb/d9 \
> node3:/glusterfs/sdb/d9 \
> node4:/glusterfs/sdb/d9 \
> node5:/glusterfs/sdb/d9 \
> node6:/glusterfs/sdb/d9
volume create: disperse_vol_3: success: please start the volume to access data
4. View volumes
4.1 View detailed information for a single volume
[root@node1 ~]# gluster volume info disperse_vol_3
Volume Name: disperse_vol_3
Type: Distributed-Disperse
Volume ID: 3065d729-8a4f-4717-b8dc-cd73950d8ef7
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: node1:/glusterfs/sdb/d9
Brick2: node2:/glusterfs/sdb/d9
Brick3: node3:/glusterfs/sdb/d9
Brick4: node4:/glusterfs/sdb/d9
Brick5: node5:/glusterfs/sdb/d9
Brick6: node6:/glusterfs/sdb/d9
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
4.2 View the status of all created volumes
[root@node1 ~]# gluster volume status
Status of volume: dir_rep_vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node2:/glusterfs/sdb/drv5 49152 0 Y 17676
Brick node1:/glusterfs/sdb/drv5 49152 0 Y 16821
Brick node3:/glusterfs/sdb/drv5 49152 0 Y 16643
Brick node4:/glusterfs/sdb/drv5 49152 0 Y 17365
Self-heal Daemon on localhost N/A N/A Y 16841
Self-heal Daemon on node3 N/A N/A Y 16663
Self-heal Daemon on node6 N/A N/A Y 16557
Self-heal Daemon on node2 N/A N/A N N/A
Self-heal Daemon on node5 N/A N/A Y 15374
Self-heal Daemon on node4 N/A N/A Y 17386
Task Status of Volume dir_rep_vol
------------------------------------------------------------------------------
There are no active volume tasks
Volume dis_str_rep_vol is not started
Volume dis_vol is not started
Volume disperse_vol is not started
Volume disperse_vol_3 is not started
Volume rep_vol is not started
Volume str_rep_vol is not started
Volume str_vol is not started
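In a long status listing it is easy to miss a process whose Online column reads N, such as the self-heal daemon on node2 above. The sketch below filters such lines with awk; the function name offline_processes is made up, and it assumes each process occupies a single line (very long brick paths may wrap in real output).

```shell
#!/bin/sh
# offline_processes
# Read gluster volume status output on stdin and print every brick or
# self-heal daemon whose Online column (second-to-last field) is "N".
# Hypothetical helper for illustration only.
offline_processes() {
    awk '/^(Brick|Self-heal Daemon)/ && $(NF-1) == "N" {
        # drop the four trailing columns: TCP port, RDMA port, Online, Pid
        name = $1
        for (i = 2; i <= NF - 4; i++) name = name " " $i
        print name
    }'
}

# Sample lines copied from the output above; on a server node you could run
#   gluster volume status | offline_processes
offline_processes <<'EOF'
Brick node2:/glusterfs/sdb/drv5 49152 0 Y 17676
Self-heal Daemon on node3 N/A N/A Y 16663
Self-heal Daemon on node2 N/A N/A N N/A
EOF
```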
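5. Start, stop, and delete volumes (mamm-volume is a placeholder name; a volume must be stopped before it can be deleted)
$ gluster volume start mamm-volume
$ gluster volume stop mamm-volume
$ gluster volume delete mamm-volume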
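6. Expand and shrink volumes
$ gluster volume add-brick mamm-volume [stripe|replica <count>] brick1...
$ gluster volume remove-brick mamm-volume [replica <count>] brick1...
When expanding or shrinking a volume, the number of bricks added or removed must satisfy the requirements of the volume type (for example, a multiple of the replica or stripe count).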
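The brick-count requirement can be checked before running add-brick or remove-brick. This is a minimal sketch; the function name bricks_ok is made up, and it covers only the multiple-of-count rule, not every constraint the CLI enforces.

```shell
#!/bin/sh
# bricks_ok BRICKS GROUP
# Succeeds when BRICKS is a positive multiple of GROUP, where GROUP is the
# replica count, the stripe count, or their product for a combined volume.
# Hypothetical helper for illustration only.
bricks_ok() {
    [ "$1" -gt 0 ] && [ "$2" -gt 0 ] && [ $(( $1 % $2 )) -eq 0 ]
}

# dir_rep_vol uses replica 2: adding 4 bricks is valid, adding 3 is not.
bricks_ok 4 2 && echo "4 bricks, replica 2: ok"
bricks_ok 3 2 || echo "3 bricks, replica 2: rejected"
```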
Status before expansion
[root@node1 ~]# gluster volume status dir_rep_vol
Status of volume: dir_rep_vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node2:/glusterfs/sdb/drv5 49153 0 Y 2409
Brick node1:/glusterfs/sdb/drv5 49153 0 Y 1162
Brick node3:/glusterfs/sdb/drv5 49153 0 Y 1140
Brick node4:/glusterfs/sdb/drv5 49153 0 Y 1430
Self-heal Daemon on localhost N/A N/A Y 2560
Self-heal Daemon on node3 N/A N/A Y 2507
Self-heal Daemon on node6 N/A N/A Y 2207
Self-heal Daemon on node5 N/A N/A Y 2749
Self-heal Daemon on node4 N/A N/A Y 2787
Self-heal Daemon on node2 N/A N/A Y 2803
Task Status of Volume dir_rep_vol
------------------------------------------------------------------------------
Expand the volume
[root@node1 ~]# gluster volume add-brick dir_rep_vol \
> node1:/glusterfs/sdb/drv6 \
> node2:/glusterfs/sdb/drv6 \
> node3:/glusterfs/sdb/drv6 \
> node4:/glusterfs/sdb/drv6
volume add-brick: success
Status after expansion
[root@node1 ~]# gluster volume status dir_rep_vol
Status of volume: dir_rep_vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node2:/glusterfs/sdb/drv5 49153 0 Y 2409
Brick node1:/glusterfs/sdb/drv5 49153 0 Y 1162
Brick node3:/glusterfs/sdb/drv5 49153 0 Y 1140
Brick node4:/glusterfs/sdb/drv5 49153 0 Y 1430
Brick node1:/glusterfs/sdb/drv6 49154 0 Y 5248
Brick node2:/glusterfs/sdb/drv6 49154 0 Y 5093
Brick node3:/glusterfs/sdb/drv6 49154 0 Y 5017
Brick node4:/glusterfs/sdb/drv6 49154 0 Y 5103
Self-heal Daemon on localhost N/A N/A Y 5268
Self-heal Daemon on node3 N/A N/A Y 5037
Self-heal Daemon on node6 N/A N/A Y 5041
Self-heal Daemon on node5 N/A N/A Y 5055
Self-heal Daemon on node4 N/A N/A Y 5132
Self-heal Daemon on node2 N/A N/A N N/A
Task Status of Volume dir_rep_vol
------------------------------------------------------------------------------
There are no active volume tasks
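Shrink the volume (force removes the bricks immediately without migrating their data, as the warning below indicates)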
[root@node1 ~]# gluster volume remove-brick dir_rep_vol \
> node1:/glusterfs/sdb/drv5 \
> node2:/glusterfs/sdb/drv5 \
> node3:/glusterfs/sdb/drv5 \
> node4:/glusterfs/sdb/drv5 force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: success
[root@node1 ~]# gluster volume status dir_rep_vol
Status of volume: dir_rep_vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node1:/glusterfs/sdb/drv6 49154 0 Y 5248
Brick node2:/glusterfs/sdb/drv6 49154 0 Y 5093
Brick node3:/glusterfs/sdb/drv6 49154 0 Y 5017
Brick node4:/glusterfs/sdb/drv6 49154 0 Y 5103
Self-heal Daemon on localhost N/A N/A Y 5591
Self-heal Daemon on node4 N/A N/A Y 5377
Self-heal Daemon on node6 N/A N/A Y 5291
Self-heal Daemon on node5 N/A N/A Y 5305
Self-heal Daemon on node2 N/A N/A Y 5341
Self-heal Daemon on node3 N/A N/A Y 5282
Task Status of Volume dir_rep_vol
------------------------------------------------------------------------------
There are no active volume tasks
7. Migrate a volume (replace a brick)
volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force}
Example:
[root@node1 ~]# gluster volume replace-brick rep_vol node2:/glusterfs/sdb/rv2 node2:/glusterfs/sdb/rv3 commit force
volume replace-brick: failed: volume: rep_vol is not started
[root@node1 ~]# gluster volume start rep_vol
volume start: rep_vol: success
[root@node1 ~]# gluster volume replace-brick rep_vol node2:/glusterfs/sdb/rv2 node2:/glusterfs/sdb/rv3 commit force
volume replace-brick: success: replace-brick commit force operation successful
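# Note: the usage line above accepts only commit force. The start, pause, abort, status, and commit sub-commands below come from older GlusterFS releases.
# A migration goes through a series of steps. Suppose we want to replace brick3 with brick5 in the mamm volume.
# Start the migration
$ gluster volume replace-brick mamm-volume node3:/exp3 node5:/exp5 start
# Pause the migration
$ gluster volume replace-brick mamm-volume node3:/exp3 node5:/exp5 pause
# Abort the migration
$ gluster volume replace-brick mamm-volume node3:/exp3 node5:/exp5 abort
# Check the migration status
$ gluster volume replace-brick mamm-volume node3:/exp3 node5:/exp5 status
# Commit once the migration has finished
$ gluster volume replace-brick mamm-volume node3:/exp3 node5:/exp5 commit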
IV. Client management
1. Installation
[root@node7-client ~]# yum install glusterfs glusterfs-fuse attr -y
2. Mount with the native glusterfs client
Mount (current session only; the mount point /gfs_test must already exist)
[root@node7 ~]# mount -t glusterfs node2:/rep_vol /gfs_test/
[root@node7 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 18G 4.1G 14G 24% /
devtmpfs 473M 0 473M 0% /dev
tmpfs 489M 84K 489M 1% /dev/shm
tmpfs 489M 7.1M 482M 2% /run
tmpfs 489M 0 489M 0% /sys/fs/cgroup
/dev/sda1 297M 156M 142M 53% /boot
tmpfs 98M 16K 98M 1% /run/user/42
tmpfs 98M 0 98M 0% /run/user/0
node2:/rep_vol 2.0G 33M 2.0G 2% /rep
node2:/rep_vol 2.0G 33M 2.0G 2% /gfs_test
Mount (persistent)
[root@node7 ~]# echo "node2:/rep_vol /gfs_test glusterfs defaults,_netdev 0 0" >> /etc/fstab
[root@node7 ~]# mount -a
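Appending with echo adds a duplicate line every time the command is repeated. The sketch below appends an entry only when it is not already present; the function name add_fstab_entry is made up, and the demonstration writes to a scratch file rather than /etc/fstab.

```shell
#!/bin/sh
# add_fstab_entry LINE [FILE]
# Append LINE to FILE (default: /etc/fstab) unless an identical line exists.
# Hypothetical helper for illustration only.
add_fstab_entry() {
    line=$1 file=${2:-/etc/fstab}
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$line" >> "$file"
        echo "added"
    fi
}

# Demonstrated on a scratch file; drop the second argument to edit /etc/fstab.
scratch=$(mktemp)
entry='node2:/rep_vol /gfs_test glusterfs defaults,_netdev 0 0'
add_fstab_entry "$entry" "$scratch"
add_fstab_entry "$entry" "$scratch"
```

The second call leaves the file unchanged, so the step is safe to rerun.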
3. Mount over NFS
Mount (current session only)
[root@node7 ~]# mount -t nfs node3:/dis_vol gfs_nfs
mount.nfs: requested NFS version or transport protocol is not supported
Solution:
1) Install nfs-utils and rpcbind
2) Enable NFS access on the volume (the volume information below shows nfs.disable: on)
[root@node1 ~]# gluster volume info dis_vol
Volume Name: dis_vol
Type: Distribute
Volume ID: c501f4ad-5a54-4835-b163-f508aa1c07ba
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: node1:/glusterfs/sdb/dv1
Brick2: node2:/glusterfs/sdb/dv1
Brick3: node3:/glusterfs/sdb/dv1
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
[root@node1 ~]# gluster volume set dis_vol nfs.disable off
volume set: success
[root@node7 ~]# mount -t nfs node3:/dis_vol /gfs_nfs
[root@node7 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 18G 4.2G 14G 24% /
devtmpfs 473M 0 473M 0% /dev
tmpfs 489M 84K 489M 1% /dev/shm
tmpfs 489M 7.1M 482M 2% /run
tmpfs 489M 0 489M 0% /sys/fs/cgroup
/dev/sda1 297M 156M 142M 53% /boot
tmpfs 98M 16K 98M 1% /run/user/42
tmpfs 98M 0 98M 0% /run/user/0
node3:/dis_vol 6.0G 97M 5.9G 2% /gfs_nfs
Persistent mount
[root@node7 ~]# echo "node3:/dis_vol /gfs_nfs nfs defaults,_netdev 0 0" >> /etc/fstab
[root@node7 ~]# mount -a
[root@node7 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 18G 4.1G 14G 24% /
devtmpfs 473M 0 473M 0% /dev
tmpfs 489M 84K 489M 1% /dev/shm
tmpfs 489M 7.1M 482M 2% /run
tmpfs 489M 0 489M 0% /sys/fs/cgroup
/dev/sda1 297M 156M 142M 53% /boot
tmpfs 98M 12K 98M 1% /run/user/42
tmpfs 98M 0 98M 0% /run/user/0
node3:/dis_vol 6.0G 97M 5.9G 2% /gfs_nfs