GlusterFS processes and their port numbers:
glusterd      24007    management daemon
glusterfsd    49153    brick process that serves a volume; it exits when the volume is stopped
glusterfs              self-heal daemon (glustershd)
glusterfs              FUSE client process for a mounted volume (see the ps output below)
root 3665 0.0 0.2 577312 9908 ? Ssl 18:47 0:02 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root 19417 0.0 0.0 103300 2020 pts/0 R+ 19:36 0:00 grep gluster
root 122826 0.0 0.3 1273460 13516 ? Ssl 19:24 0:00 /usr/sbin/glusterfsd -s 10.17.55.253 --volfile-id GlusterFs.10.17.55.253.mnt-b02abd665d4e1092d93f950ae499a0fa-glusterfs_storage -p /var/lib/glusterd/vols/GlusterFs/run/10.17.55.253-mnt-b02abd665d4e1092d93f950ae499a0fa-glusterfs_storage.pid -S /var/run/gluster/73372af1394228c2758045d0446e5297.socket --brick-name /mnt/b02abd665d4e1092d93f950ae499a0fa/glusterfs_storage -l /var/log/glusterfs/bricks/mnt-b02abd665d4e1092d93f950ae499a0fa-glusterfs_storage.log --xlator-option *-posix.glusterd-uuid=17a62946-b3f5-432e-9389-13fd6bb95e58 --brick-port 49153 --xlator-option GlusterFs-server.listen-port=49153
root 122878 0.0 0.2 582476 8496 ? Ssl 19:24 0:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/e1fd6db52b653d9cfb69e179e4a5ab93.socket --xlator-option *replicate*.node-uuid=17a62946-b3f5-432e-9389-13fd6bb95e58
root 123248 97.4 0.3 542136 15236 ? Ssl 19:25 11:00 /usr/sbin/glusterfs --volfile-server=127.0.0.1 --volfile-id=/GlusterFs /mnt/d1f561a32ac1bf17cf183f36baac34d4
The gluster command (version 3.4.2):
[root@vClass-v0jTX etc]# gluster help volume info [all|<VOLNAME>] - list information of all volumes volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT>] [device vg] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK> ... [force] - create a new volume of specified type with mentioned bricks volume delete <VOLNAME> - delete volume specified by <VOLNAME> volume start <VOLNAME> [force] - start volume specified by <VOLNAME> volume stop <VOLNAME> [force] - stop volume specified by <VOLNAME> volume add-brick <VOLNAME> [<stripe|replica> <COUNT>] <NEW-BRICK> ... [force] - add brick to volume <VOLNAME> volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ... {start|stop|status|commit|force} - remove brick from volume <VOLNAME> volume rebalance <VOLNAME> [fix-layout] {start|stop|status} [force] - rebalance operations volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> {start [force]|pause|abort|status|commit [force]} - replace-brick operations volume set <VOLNAME> <KEY> <VALUE> - set options for volume <VOLNAME> volume help - display help for the volume command volume log rotate <VOLNAME> [BRICK] - rotate the log file for corresponding volume/brick volume sync <HOSTNAME> [all|<VOLNAME>] - sync the volume information from a peer volume reset <VOLNAME> [option] [force] - reset all the reconfigured options volume geo-replication [<VOLNAME>] [<SLAVE-URL>] {start|stop|config|status|log-rotate} [options...] - Geo-sync operations volume profile <VOLNAME> {start|info|stop} [nfs] - volume profile operations volume quota <VOLNAME> <enable|disable|limit-usage|list|remove> [path] [value] - quota translator specific operations volume top <VOLNAME> {[open|read|write|opendir|readdir [nfs]] |[read-perf|write-perf [nfs|{bs <size> count <count>}]]|[clear [nfs]]} [brick <brick>] [list-cnt <count>] - volume top operations volume status [all | <VOLNAME> [nfs|shd|<BRICK>]] [detail|clients|mem|inode|fd|callpool] - display status of all or specified volume(s)/brick volume heal <VOLNAME> [{full | info {healed | heal-failed | split-brain}}] - self-heal commands on volume specified by <VOLNAME> volume statedump <VOLNAME> [nfs] [all|mem|iobuf|callpool|priv|fd|inode|history]... - perform statedump on bricks volume list - list all volumes in cluster volume clear-locks <VOLNAME> <path> kind {blocked|granted|all}{inode [range]|entry [basename]|posix [range]} - Clear locks held on path peer probe <HOSTNAME> - probe peer specified by <HOSTNAME> peer detach <HOSTNAME> [force] - detach peer specified by <HOSTNAME> peer status - list status of peers peer help - Help command for peer quit - quit help - display command options exit - exit bd help - display help for bd command bd create <volname>:<bd> <size> - create a block device where size can be suffixed with KB, MB etc. Default size is in MB bd delete <volname>:<bd> - Delete a block device bd clone <volname>:<bd> <newbd> - clone device bd snapshot <volname>:<bd> <newbd> <size> - snapshot device where size can be suffixed with KB, MB etc. Default size is in MB
gluster 3.10.2 adds the tier and bitrot commands.
volume info [all|<VOLNAME>] - list information of all volumes volume create <NEW-VOLNAME> [stripe <COUNT>] [replica <COUNT> [arbiter <COUNT>]] [disperse [<COUNT>]] [disperse-data <COUNT>] [redundancy <COUNT>] [transport <tcp|rdma|tcp,rdma>] <NEW-BRICK>?<vg_name>... [force] - create a new volume of specified type with mentioned bricks volume delete <VOLNAME> - delete volume specified by <VOLNAME> volume start <VOLNAME> [force] - start volume specified by <VOLNAME> volume stop <VOLNAME> [force] - stop volume specified by <VOLNAME>
Volume tiering (cache tier):
volume tier <VOLNAME> status
volume tier <VOLNAME> start [force]   // start the cache tier
volume tier <VOLNAME> stop   // stop the cache tier
volume tier <VOLNAME> attach [<replica COUNT>] <NEW-BRICK>... [force]   // create (attach) the cache tier
volume tier <VOLNAME> detach <start|stop|status|commit|[force]> - Tier translator specific operations.
volume attach-tier <VOLNAME> [<replica COUNT>] <NEW-BRICK>... - NOTE: this is old syntax, will be depreciated in next release. Please use gluster volume tier <vol> attach [<replica COUNT>] <NEW-BRICK>...
volume detach-tier <VOLNAME> <start|stop|status|commit|force> - NOTE: this is old syntax, will be depreciated in next release. Please use gluster volume tier <vol> detach {start|stop|commit} [force]
volume add-brick <VOLNAME> [<stripe|replica> <COUNT> [arbiter <COUNT>]] <NEW-BRICK> ... [force] - add brick to volume <VOLNAME>
volume remove-brick <VOLNAME> [replica <COUNT>] <BRICK> ... <start|stop|status|commit|force> - remove brick from volume <VOLNAME>
volume rebalance <VOLNAME> {{fix-layout start} | {start [force]|stop|status}} - rebalance operations
volume replace-brick <VOLNAME> <SOURCE-BRICK> <NEW-BRICK> {commit force} - replace-brick operations
volume set <VOLNAME> <KEY> <VALUE> - set options for volume <VOLNAME>
volume help - display help for the volume command
volume log <VOLNAME> rotate [BRICK] - rotate the log file for corresponding volume/brick
volume log rotate <VOLNAME> [BRICK] - rotate the log file for corresponding volume/brick NOTE: This is an old syntax, will be deprecated from next release.
volume sync <HOSTNAME> [all|<VOLNAME>] - sync the volume information from a peer
volume reset <VOLNAME> [option] [force] - reset all the reconfigured options
volume profile <VOLNAME> {start|info [peek|incremental [peek]|cumulative|clear]|stop} [nfs] - volume profile operations
volume quota <VOLNAME> {enable|disable|list [<path> ...]| list-objects [<path> ...] | remove <path>| remove-objects <path> | default-soft-limit <percent>} | volume quota <VOLNAME> {limit-usage <path> <size> [<percent>]} | volume quota <VOLNAME> {limit-objects <path> <number> [<percent>]} | volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout} {<time>} - quota translator specific operations
volume inode-quota <VOLNAME> enable - quota translator specific operations
volume top <VOLNAME> {open|read|write|opendir|readdir|clear} [nfs|brick <brick>] [list-cnt <value>] | volume top <VOLNAME> {read-perf|write-perf} [bs <size> count <count>] [brick <brick>] [list-cnt <value>] - volume top operations
volume status [all | <VOLNAME> [nfs|shd|<BRICK>|quotad|tierd]] [detail|clients|mem|inode|fd|callpool|tasks] - display status of all or specified volume(s)/brick
volume heal <VOLNAME> [enable | disable | full |statistics [heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain] |split-brain {bigger-file <FILE> | latest-mtime <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]} |granular-entry-heal {enable | disable}] - self-heal commands on volume specified by <VOLNAME>
volume statedump <VOLNAME> [[nfs|quotad] [all|mem|iobuf|callpool|priv|fd|inode|history]... | [client <hostname:process-id>]] - perform statedump on bricks
volume list - list all volumes in cluster
volume clear-locks <VOLNAME> <path> kind {blocked|granted|all}{inode [range]|entry [basename]|posix [range]} - Clear locks held on path
volume barrier <VOLNAME> {enable|disable} - Barrier/unbarrier file operations on a volume
volume get <VOLNAME> <key|all> - Get the value of the all options or given option for volume <VOLNAME>
volume bitrot <VOLNAME> {enable|disable} |   // bitrot (silent corruption) detection
volume bitrot <volname> scrub-throttle {lazy|normal|aggressive} |
volume bitrot <volname> scrub-frequency {hourly|daily|weekly|biweekly|monthly} |
volume bitrot <volname> scrub {pause|resume|status|ondemand} - Bitrot translator specific operation. For more information about bitrot command type 'man gluster'
volume reset-brick <VOLNAME> <SOURCE-BRICK> {{start} | {<NEW-BRICK> commit}} - reset-brick operations
peer probe { <HOSTNAME> | <IP-address> } - probe peer specified by <HOSTNAME>
peer detach { <HOSTNAME> | <IP-address> } [force] - detach peer specified by <HOSTNAME>
peer status - list status of peers
peer help - Help command for peer
pool list - list all the nodes in the pool (including localhost)
quit - quit
help - display command options
exit - exit
snapshot help - display help for snapshot commands
snapshot create <snapname> <volname> [no-timestamp] [description <description>] [force] - Snapshot Create.
snapshot clone <clonename> <snapname> - Snapshot Clone.
snapshot restore <snapname> - Snapshot Restore.
snapshot status [(snapname | volume <volname>)] - Snapshot Status.
snapshot info [(snapname | volume <volname>)] - Snapshot Info.
snapshot list [volname] - Snapshot List.
snapshot config [volname] ([snap-max-hard-limit <count>] [snap-max-soft-limit <percent>]) | ([auto-delete <enable|disable>])| ([activate-on-create <enable|disable>]) - Snapshot Config.
snapshot delete (all | snapname | volume <volname>) - Snapshot Delete.
snapshot activate <snapname> [force] - Activate snapshot volume.
snapshot deactivate <snapname> - Deactivate snapshot volume.
global help - list global commands
nfs-ganesha {enable| disable} - Enable/disable NFS-Ganesha support
get-state [<daemon>] [odir </path/to/output/dir/>] [file <filename>] - Get local state representation of mentioned daemon
1. Server nodes (peers)
To add or remove server nodes in the storage pool, run the command on any one existing node:
# gluster peer status           // show all peers (the local node is not listed)
# gluster peer probe <SERVER>   // add a node
# gluster peer detach <SERVER>  // remove a node; its bricks must be removed first
Note: before detaching a node, remove all bricks hosted on it.
Check the basic status of all peers (the local node is not listed):
# gluster peer status
[root@vClass-DgqxC init.d]# gluster peer probe 10.17.5.247
peer probe: success
[root@vClass-DgqxC init.d]# gluster peer status
Number of Peers: 1

Hostname: 10.17.5.247
Port: 24007
Uuid: 8fd5d483-c883-4e22-a372-1ac058ce10d2
State: Peer in Cluster (Connected)
[root@vClass-DgqxC init.d]#
2. Volume management
2.1. Create / start / stop / delete a volume
# gluster volume create <NEW-VOLNAME> [stripe <COUNT> | replica <COUNT>] [transport [tcp | rdma | tcp,rdma]] <NEW-BRICK1> <NEW-BRICK2> <NEW-BRICK3> <NEW-BRICK4>...
# gluster volume start <VOLNAME>
# gluster volume stop <VOLNAME>
# gluster volume delete <VOLNAME>
Note: a volume must be stopped before it can be deleted.
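As an end-to-end sketch (the host names server1/server2 and the brick paths are assumptions taken from the examples below), a two-brick replicated volume can be created, started, inspected and finally removed like this:
# gluster volume create test-volume replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick
# gluster volume start test-volume
# gluster volume info test-volume      // should show Status: Started
# gluster volume stop test-volume      // must be stopped before deletion
# gluster volume delete test-volume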
<1> Replicated volume
Syntax: gluster volume create NEW-VOLNAME [replica COUNT] [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example 1: gluster volume create test-volume replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick
<2> Striped volume
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example: gluster volume create test-volume stripe 2 transport tcp server1:/exp1/brick server2:/exp2/brick
<3> Distributed volume
Syntax: gluster volume create NEW-VOLNAME [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example 1: gluster volume create test-volume server1:/exp1/brick server2:/exp2/brick
Example 2: gluster volume create test-volume transport rdma server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick
<4> Distributed replicated volume
Syntax: gluster volume create NEW-VOLNAME [replica COUNT] [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example: gluster volume create test-volume replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick
<5> Distributed striped volume
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example: gluster volume create test-volume stripe 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick
<6> Striped replicated volume
Syntax: gluster volume create NEW-VOLNAME [stripe COUNT] [replica COUNT] [transport tcp | rdma | tcp,rdma] NEW-BRICK...
Example: gluster volume create test-volume stripe 2 replica 2 transport tcp server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick server4:/exp4/brick
2.2 Viewing volume information
List all volumes in the cluster:
# gluster volume list /* list all volumes in the cluster */
Show information about the volumes in the cluster:
# gluster volume info [all] /* show volume information */
Show the status of the volumes in the cluster:
# gluster volume status [all] /* show volume status */
# gluster volume status <VOLNAME> [detail| clients | mem | inode | fd]
Show file-system information on the local node:
# df -h [<MOUNT-PATH>]
Show disk information on the local node:
# fdisk -l
gluster> volume list
test-volume
gluster> volume info

Volume Name: test-volume
Type: Replicate
Volume ID: bea9fbb5-6cbb-4fa6-905c-5b4791e53ae2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage
Brick2: 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage
gluster> volume status
Status of volume: test-volume
Gluster process                                                              Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage    49152   Y       24996
Brick 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage   49152   Y       20567
NFS Server on localhost                                                      2049    Y       20579
Self-heal Daemon on localhost                                                N/A     Y       20584
NFS Server on 10.17.5.247                                                    2049    Y       25008
Self-heal Daemon on 10.17.5.247                                              N/A     Y       25012

There are no active volume tasks
gluster> volume status test-volume detail
Status of volume: test-volume
------------------------------------------------------------------------------
Brick                : Brick 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage
Port                 : 49152
Online               : Y
Pid                  : 24996
File System          : xfs
Device               : /dev/mapper/defaultlocal-defaultlocal
Mount Options        : rw,noatime,nodiratime,osyncisdsync,usrquota,grpquota
Inode Size           : 256
Disk Space Free      : 14.8GB
Total Disk Space     : 24.4GB
Inode Count          : 25698304
Free Inodes          : 25698188
------------------------------------------------------------------------------
Brick                : Brick 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage
Port                 : 49152
Online               : Y
Pid                  : 20567
File System          : xfs
Device               : /dev/mapper/defaultlocal-defaultlocal
Mount Options        : rw,noatime,nodiratime,usrquota,grpquota
Inode Size           : 256
Disk Space Free      : 4.4GB
Total Disk Space     : 4.4GB
Inode Count          : 4726784
Free Inodes          : 4726699
gluster> volume status test-volume mem Memory status for volume : test-volume ---------------------------------------------- Brick : 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage Mallinfo -------- Arena : 2760704 Ordblks : 8 Smblks : 1 Hblks : 12 Hblkhd : 17743872 Usmblks : 0 Fsmblks : 80 Uordblks : 2623888 Fordblks : 136816 Keepcost : 134384 Mempool Stats ------------- Name HotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc ---- -------- --------- ------------ ---------- -------- -------- ------------ test-volume-server:fd_t 0 1024 108 0 0 0 0 test-volume-server:dentry_t 0 16384 84 0 0 0 0 test-volume-server:inode_t 1 16383 156 4 2 0 0 test-volume-locks:pl_local_t 0 32 140 3 1 0 0 test-volume-marker:marker_local_t 0 128 316 0 0 0 0 test-volume-server:rpcsvc_request_t 0 512 6116 38 2 0 0 glusterfs:struct saved_frame 0 8 124 2 2 0 0 glusterfs:struct rpc_req 0 8 2236 2 2 0 0 glusterfs:rpcsvc_request_t 1 7 6116 10 1 0 0 glusterfs:data_t 119 16264 52 790 167 0 0 glusterfs:data_pair_t 117 16266 68 752 157 0 0 glusterfs:dict_t 13 4083 140 65 15 0 0 glusterfs:call_stub_t 0 1024 3756 22 2 0 0 glusterfs:call_stack_t 1 1023 1836 33 2 0 0 glusterfs:call_frame_t 0 4096 172 104 10 0 0 ---------------------------------------------- Brick : 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage Mallinfo -------- Arena : 2760704 Ordblks : 10 Smblks : 1 Hblks : 12 Hblkhd : 17743872 Usmblks : 0 Fsmblks : 80 Uordblks : 2624400 Fordblks : 136304 Keepcost : 133952 Mempool Stats ------------- Name HotCount ColdCount PaddedSizeof AllocCount MaxAlloc Misses Max-StdAlloc ---- -------- --------- ------------ ---------- -------- -------- ------------ test-volume-server:fd_t 0 1024 108 4 1 0 0 test-volume-server:dentry_t 0 16384 84 0 0 0 0 test-volume-server:inode_t 2 16382 156 4 2 0 0 test-volume-locks:pl_local_t 0 32 140 5 1 0 0 test-volume-marker:marker_local_t 0 128 316 0 0 0 0 test-volume-server:rpcsvc_request_t 0 512 6116 58 2 0 0 glusterfs:struct saved_frame 0 8 124 3 2 0 0 glusterfs:struct rpc_req 0 8 2236 3 2 0 0 glusterfs:rpcsvc_request_t 1 7 6116 10 1 0 0 glusterfs:data_t 119 16264 52 859 167 0 0 glusterfs:data_pair_t 117 16266 68 796 157 0 0 glusterfs:dict_t 13 4083 140 80 15 0 0 glusterfs:call_stub_t 0 1024 3756 47 2 0 0 glusterfs:call_stack_t 1 1023 1836 54 2 0 0 glusterfs:call_frame_t 0 4096 172 183 13 0 0 ----------------------------------------------
gluster> volume status test-volume clients Client connections for volume test-volume ---------------------------------------------- Brick : 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage Clients connected : 4 Hostname BytesRead BytesWritten -------- --------- ------------ 10.17.5.247:1021 2340 2264 10.17.5.247:1015 3348 2464 10.17.66.112:1019 848 436 10.17.66.112:1012 1104 684 ---------------------------------------------- Brick : 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage Clients connected : 4 Hostname BytesRead BytesWritten -------- --------- ------------ 10.17.66.112:1023 3256 3100 10.17.66.112:1022 4192 2656 10.17.5.247:1020 1228 808 10.17.5.247:1014 2924 2476 ----------------------------------------------
gluster> volume status test-volume inode Inode tables for volume test-volume ---------------------------------------------- Brick : 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage Active inodes: GFID Lookups Ref IA type ---- ------- --- ------- 00000000-0000-0000-0000-000000000001 0 65 D ---------------------------------------------- Brick : 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage Active inodes: GFID Lookups Ref IA type ---- ------- --- ------- 00000000-0000-0000-0000-000000000001 0 127 D LRU inodes: GFID Lookups Ref IA type ---- ------- --- ------- d271a384-01c1-4e29-9171-0b3b58033936 4 0 D ----------------------------------------------
gluster> volume status test-volume fd FD tables for volume test-volume ---------------------------------------------- Brick : 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage Connection 1: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 2: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 3: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 4: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds ---------------------------------------------- Brick : 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage Connection 1: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 2: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 3: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds Connection 4: RefCount = 0 MaxFDs = 128 FirstFree = 0 No open fds ----------------------------------------------
2.3. Configuring a volume
# gluster volume set <VOLNAME> <OPTION> <PARAMETER>
How do you see which options can be configured?
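One way to answer this (a sketch; the output format differs between releases): gluster volume set help prints every settable option together with its default value, and on 3.10.x the volume get command shown in the help output above reports the effective value of each option on a volume.
# gluster volume set help              // list all settable options and their defaults
# gluster volume get <VOLNAME> all     // 3.10.x: show the effective value of every option on a volume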
2.4. Expanding a volume
# gluster volume add-brick <VOLNAME> <NEW-BRICK>
Note: for replicated or striped volumes, the number of bricks added at a time must be a multiple of the replica or stripe count.
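For example (a sketch; server3/server4 and the brick paths are assumed), growing the replica-2 test-volume by one more brick pair and then rebalancing so the new bricks receive data:
# gluster volume add-brick test-volume server3:/exp3/brick server4:/exp4/brick   // 2 bricks = 1 x replica count
# gluster volume rebalance test-volume start
# gluster volume rebalance test-volume status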
2.5. Shrinking a volume
First migrate the data to the other available bricks; remove the brick only after the migration has finished:
# gluster volume remove-brick <VOLNAME> <BRICK> start
After start has been issued, use the status command to watch the removal progress:
# gluster volume remove-brick <VOLNAME> <BRICK> status
To remove the brick directly without migrating its data:
# gluster volume remove-brick <VOLNAME> <BRICK> commit
Note: for replicated or striped volumes, the number of bricks removed at a time must be a multiple of the replica or stripe count.
gluster> volume remove-brick test-volume 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: failed: Removing bricks from replicate configuration is not allowed without reducing replica count explicitly.
gluster> volume remove-brick test-volume replica 1 10.17.66.112:/mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
gluster> volume info

Volume Name: test-volume
Type: Distribute
Volume ID: bea9fbb5-6cbb-4fa6-905c-5b4791e53ae2
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.17.5.247:/mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage
gluster>
2.6. Migrating a volume (replace-brick)
Start the migration with the start command:
# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> start
While data is being migrated, the migration can be paused with the pause command:
# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> pause
While data is being migrated, the migration can be aborted with the abort command:
# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> abort
While data is being migrated, the progress can be checked with the status command:
# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> status
After the data migration has finished, run the commit command to carry out the brick replacement:
# gluster volume replace-brick <VOLNAME> <BRICK> <NEW-BRICK> commit
2.7. Balancing a volume:
Rebalancing the layout is necessary because the layout is static: when new bricks join an existing volume, newly created files are still distributed according to the old bricks, so the layout must be rebalanced for the new bricks to take effect. Fixing the layout only makes the new layout take effect; it does not move old data onto the new layout. If you also want the existing data in the volume redistributed after the new layout takes effect, the data itself must be rebalanced as well.
After expanding or shrinking a volume, the data needs to be rebalanced across the servers. Rebalancing consists of two steps:
1. Fix Layout
Adjust the layout after the expansion or shrink so that files can be stored on the newly added bricks.
2. Migrate Data
Rebalance the existing data after the new bricks have joined.
* Fix Layout and Migrate Data
First fix the layout, then move the existing data (rebalance):
#gluster volume rebalance VOLNAME fix-layout start
#gluster volume rebalance VOLNAME migrate-data start
Both steps can also be combined into a single operation:
#gluster volume rebalance VOLNAME start
#gluster volume rebalance VOLNAME status // check the rebalance progress while it is running
#gluster volume rebalance VOLNAME stop // the rebalance can also be stopped; when started again it resumes from where it left off
2.8. Rebalancing a volume
Fix the layout only, without migrating data:
# gluster volume rebalance <VOLNAME> fix-layout start
# gluster volume rebalance <VOLNAME> start
# gluster volume rebalance <VOLNAME> start force
# gluster volume rebalance <VOLNAME> status
# gluster volume rebalance <VOLNAME> stop
2.9. Viewing I/O statistics:
The profile command provides an interface to view the I/O statistics of every brick in a volume.
#gluster volume profile VOLNAME start // start profiling; I/O statistics can then be queried
#gluster volume profile VOLNAME info // show the I/O statistics of each brick
#gluster volume profile VOLNAME stop // stop profiling when you are finished
2.10. Quotas:
1. Enable/disable quotas
#gluster volume quota VOLNAME enable/disable
2. Set (or reset) a directory quota
#gluster volume quota VOLNAME limit-usage /img limit-value
#gluster volume quota img limit-usage /quota 10GB
This limits the quota subdirectory of the img volume to 10GB. The path is relative to the volume mount point as "/", so /quota is the quota subdirectory under the client mount point.
3. View quotas
#gluster volume quota VOLNAME list
#gluster volume quota VOLNAME list <path>
The two commands above show the quota configuration of a volume: the first lists every quota set on the volume, the second queries a specific directory. They display the quota limit and the current usage; if no usage is shown (not even 0KB), the configured directory is probably wrong (it does not exist).
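Putting the quota commands together (a sketch; the img volume and the /quota directory follow the example above):
# gluster volume quota img enable                    // turn quota accounting on
# gluster volume quota img limit-usage /quota 10GB   // limit the /quota subdirectory to 10GB
# gluster volume quota img list                      // show every configured limit and its usage
# gluster volume quota img list /quota               // query just this directory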
2.11. Geo-replication:
gluster volume geo-replication MASTER SLAVE start/status/stop
// Geo-replication is the built-in disaster-recovery feature; it asynchronously and incrementally replicates all data of a volume to another storage location.
#gluster volume geo-replication img 192.168.10.8:/data1/brick1 start
As above, this starts a task that replicates the entire contents of the img volume to /data1/brick1 on 192.168.10.8. Note that the replication target must not be a brick belonging to the cluster.
2.12. Top monitoring:
The top command lets you view brick performance metrics such as read, write, file open calls, file read calls, file write calls, directory open calls, and directory read calls.
Every query accepts a top count (list-cnt); the default is 100.
#gluster volume top VOLNAME open [brick BRICK-NAME] [list-cnt cnt] // show the most-opened files and currently open fds
#gluster volume top VOLNAME read [brick BRICK-NAME] [list-cnt cnt] // show the files with the most read calls
#gluster volume top VOLNAME write [brick BRICK-NAME] [list-cnt cnt] // show the files with the most write calls
#gluster volume top VOLNAME opendir [brick BRICK-NAME] [list-cnt cnt] // show the directories with the most opendir calls
#gluster volume top VOLNAME readdir [brick BRICK-NAME] [list-cnt cnt] // show the directories with the most readdir calls
#gluster volume top VOLNAME read-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt] // show the read performance of each brick
#gluster volume top VOLNAME write-perf [bs blk-size count count] [brick BRICK-NAME] [list-cnt cnt] // show the write performance of each brick
2.13. Performance tuning options:
#gluster volume set arch-img cluster.min-free-disk // default 10%; warn when free disk space drops below this
#gluster volume set arch-img cluster.min-free-inodes // default 5%; warn when free inodes drop below this
#gluster volume set img performance.read-ahead-page-count 8 // default 4; number of pages to read ahead
#gluster volume set img performance.io-thread-count 16 // default 16; maximum number of I/O threads
#gluster volume set arch-img network.ping-timeout 10 // default 42s
#gluster volume set arch-img performance.cache-size 2GB // default 128MB or 32MB
#gluster volume set arch-img cluster.self-heal-daemon on // enable the index-based self-heal daemon
#gluster volume set arch-img cluster.heal-timeout 300 // self-heal check interval, default 600s (available from 3.4.2)
#gluster volume set arch-img performance.write-behind-window-size 256MB // default 1MB; per-file write-behind buffer size, larger values can improve write performance
# gluster volume set dis-rep features.shard on
# gluster volume set dis-rep features.shard-block-size 16MB
The read-ahead translator (performance/read-ahead) is one of the performance translators; it improves read performance by prefetching data.
Data is fetched before the read operation arrives, which helps applications that access files frequently and sequentially: by the time the application finishes the current block, the next block is already available.
In addition, the read-ahead translator can act as a read aggregator: many small read operations are combined and sent to the server as one large read request.
Read-ahead behaviour is defined by page-size and page-count: page-size sets the size of each prefetched block, and page-count sets how many blocks are prefetched.
The official site notes that this translator is usually unnecessary on ordinary Ethernet, which can normally be saturated anyway; it is mainly useful on IB-verbs or 10G Ethernet.
volume readahead-example
type performance/read-ahead
option page-size 256 # size of each prefetched block
option page-count 4 # number of blocks prefetched at a time
option force-atime-update off # whether to force an atime update on every read; without it atime is slightly inaccurate, since it reflects when the read-ahead translator fetched the data rather than when the application actually read it
subvolumes <subvolume-name>
end-volume
The write-behind translator (performance/write-behind) is another performance translator; it improves write performance by buffering writes in cache before they reach the disk.
Write-behind reduces write latency: it forwards the write to the back-end storage and immediately reports completion to the application while the write is still in progress, so successive write requests can be pipelined. This translator is best used on the client side, to reduce the application-perceived write latency.
Write-behind can also aggregate write requests. If the aggregate-size option is set, consecutive writes are accumulated until they reach that size and are then submitted as a single write. This mode is more suitable on the server side, because it reduces disk head movement when several files are written in parallel.
volume write-behind-example
type performance/write-behind
option cache-size 3MB # buffer size; the actual write is issued once this much data has accumulated
option flush-behind on # helps when close()/flush() is called very frequently, e.g. for large numbers of small files
subvolumes <subvolume-name>
end-volume
The io-threads translator (performance/io-threads) is a performance translator that adds concurrent I/O threads to improve I/O performance.
It increases the server-side process's capacity to handle metadata and file I/O in the background. Because the GlusterFS server stack is otherwise single-threaded, the io-threads translator can improve performance considerably. It is best used on the server side, loaded beneath the server protocol translator.
Read and write operations are dispatched to separate threads; the total number of threads is constant and configurable.
volume iothreads
type performance/io-threads
option thread-count 32 # number of threads to use
subvolumes <subvolume-name>
end-volume
The io-cache translator (performance/io-cache) is a performance translator that caches data that has already been read, to improve I/O performance.
It is useful when many applications access the same data repeatedly and reads far outnumber writes (for example a web-serving environment where large numbers of clients only read files and very few write them).
When the io-cache translator detects a write to a file, it drops that file from the cache.
It also periodically revalidates cached files against their modification time; the revalidation timeout is configurable.
volume iocache
type performance/io-cache
option cache-size 32MB # maximum amount of data to cache
option cache-timeout 1 # revalidation timeout, in seconds
option priority *:0 # file pattern list and its priority
subvolumes <subvolume-name>
end-volume
3. Brick management
3.1 Adding a brick
# gluster volume add-brick test-volume 192.168.1.{151,152}:/mnt/brick2
3.2 Removing a brick
For a replicated volume, the number of bricks removed must be a multiple of the replica count.
#gluster volume remove-brick test-volume 192.168.1.{151,152}:/mnt/brick2 start
After starting the removal, use the status command to check its progress.
#gluster volume remove-brick test-volume 192.168.1.{151,152}:/mnt/brick2 status
Running commit removes the brick directly without migrating data, for cases where data migration is not needed.
#gluster volume remove-brick test-volume 192.168.1.{151,152}:/mnt/brick2 commit
3.3 Replacing a brick
Task: replace 192.168.1.151:/mnt/brick0 with 192.168.1.152:/mnt/brick2
<1> Start the replacement
#gluster volume replace-brick test-volume 192.168.1.:/mnt/brick0 ..152:/mnt/brick2 start
Error message: volume replace-brick: failed: /data/share2 or a prefix of it is already part of a volume
This means /mnt/brick2 used to be a brick. To fix it:
# rm -rf /mnt/brick2/.glusterfs
# setfattr -x trusted.glusterfs.volume-id /mnt/brick2
# setfattr -x trusted.gfid /mnt/brick2
// As above, run the replace-brick start command; once started, the data on the original brick begins migrating to the replacement brick.
<2> Check whether the replacement has finished
#gluster volume replace-brick test-volume 192.168.1.151:/mnt/brick0 ..152:/mnt/brick2 status
<3> While the data is being migrated, the replacement can be aborted with the abort command.
#gluster volume replace-brick test-volume 192.168.1.151:/mnt/brick0 ..152:/mnt/brick2 abort
<4> After the data migration has finished, run the commit command to finish the task and perform the brick replacement. volume info then shows that the brick has been replaced.
#gluster volume replace-brick test-volume 192.168.1.151:/mnt/brick0 .152:/mnt/brick2 commit
# From this point on, data written to /sf/data/vs/gfs/rep2 is replicated to 192.168.1.152:/mnt/brick0 and 192.168.1.152:/mnt/brick2, and no longer to 192.168.1.151:/mnt/brick0.
The glusterfs command
[root@vClass-DgqxC glusterfs]# glusterfs --help Usage: glusterfs [OPTION...] --volfile-server=SERVER [MOUNT-POINT] or: glusterfs [OPTION...] --volfile=VOLFILE [MOUNT-POINT] Basic options: -f, --volfile=VOLFILE File to use as VOLUME_FILE -l, --log-file=LOGFILE File to use for logging [default: /var/log/glusterfs/glusterfs.log] -L, --log-level=LOGLEVEL Logging severity. Valid options are DEBUG, INFO, WARNING, ERROR, CRITICAL and NONE [default: INFO] -s, --volfile-server=SERVER Server to get the volume file from. This option overrides --volfile option --volfile-max-fetch-attempts=MAX-ATTEMPTS Maximum number of connect attempts to server. This option should be provided with --volfile-server option[default: 1] Advanced Options: --acl Mount the filesystem with POSIX ACL support --debug Run in debug mode. This option sets --no-daemon, --log-level to DEBUG and --log-file to console --enable-ino32[=BOOL] Use 32-bit inodes when mounting to workaround broken applicationsthat don't support 64-bit inodes --fopen-keep-cache Do not purge the cache on file open --mac-compat[=BOOL] Provide stubs for attributes needed for seamless operation on Macs [default: "off"] -N, --no-daemon Run in foreground -p, --pid-file=PIDFILE File to use as pid file --read-only Mount the filesystem in 'read-only' mode --selinux Enable SELinux label (extened attributes) support on inodes -S, --socket-file=SOCKFILE File to use as unix-socket --volfile-id=KEY 'key' of the volfile to be fetched from server --volfile-server-port=PORT Listening port number of volfile server --volfile-server-transport=TRANSPORT Transport type to get volfile from server [default: socket] --volume-name=XLATOR-NAME Translator name to be used for MOUNT-POINT [default: top most volume definition in VOLFILE] --worm Mount the filesystem in 'worm' mode --xlator-option=XLATOR-NAME.OPTION=VALUE Add/override an option for a translator in volume file with specified value Fuse options: --attribute-timeout=SECONDS Set attribute timeout to SECONDS for inodes in fuse kernel module [default: 1] --background-qlen=N Set fuse module's background queue length to N [default: 64] --congestion-threshold=N Set fuse module's congestion threshold to N [default: 48] --direct-io-mode[=BOOL] Use direct I/O mode in fuse kernel module [default: "off" if big writes are supported, else "on" for fds not opened with O_RDONLY] --dump-fuse=PATH Dump fuse traffic to PATH --entry-timeout=SECONDS Set entry timeout to SECONDS in fuse kernel module [default: 1] --gid-timeout=SECONDS Set auxilary group list timeout to SECONDS for fuse translator [default: 0] --negative-timeout=SECONDS Set negative timeout to SECONDS in fuse kernel module [default: 0] --use-readdirp[=BOOL] Use readdirp mode in fuse kernel module [default: "off"] --volfile-check Enable strict volume file checking Miscellaneous Options: -?, --help Give this help list --usage Give a short usage message -V, --version Print program version Mandatory or optional arguments to long options are also mandatory or optional for any corresponding short options. Report bugs to <gluster-users@gluster.org>.
glusterfs --version
glusterfs --version
glusterfs 3.4.2 built on Mar 19 2014 02:14:52
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser General Public License, version 3 or any later version (LGPLv3 or later), or the GNU General Public License, version 2 (GPLv2), in all cases as published by the Free Software Foundation.
GlusterFS configuration
Default configuration location: /var/lib/glusterd/
/var/lib/glusterd/glustershd/glustershd-server.vol
Volume configuration files: /var/lib/glusterd/vols/[volname...]
These are the volfiles used by the glusterfsd processes.
Peer configuration files: /var/lib/glusterd/peers/[peer id] (one file per remote host; the local node is not included)
Server-side configuration
# vi /etc/glusterfs/glusterfsd.vol
volume posix
type storage/posix
option directory /sda4
end-volume

volume locks
type features/locks
subvolumes posix
end-volume

volume brick
type performance/io-threads
option thread-count 8
subvolumes locks
end-volume

volume server
type protocol/server
option transport-type tcp
subvolumes brick
option auth.addr.brick.allow *
end-volume
Save the file, then start GlusterFS:
# service glusterfsd start
Client-side configuration
# vi /etc/glusterfs/glusterfs.vol
volume brick1
type protocol/client
option transport-type tcp
end-volume

volume brick2
type protocol/client
option transport-type tcp
option remote-host .12
option remote-subvolume brick
end-volume

volume brick3
type protocol/client
option transport-type tcp
option remote-host .13
option remote-subvolume brick
end-volume

volume brick4
type protocol/client
option transport-type tcp
option remote-host .14
option remote-subvolume brick
end-volume

volume brick5
type protocol/client
option transport-type tcp
option remote-host .15
option remote-subvolume brick
end-volume

volume brick6
type protocol/client
option transport-type tcp
option remote-host .16
option remote-subvolume brick
end-volume

volume afr1
type cluster/replicate
subvolumes brick1 brick2
end-volume

volume afr2
type cluster/replicate
subvolumes brick3 brick4
end-volume

volume afr3
type cluster/replicate
subvolumes brick5 brick6
end-volume

volume unify
type cluster/distribute
subvolumes afr1 afr2 afr3
end-volume
Most of the GlusterFS configuration lives on the client side. The file above groups the 6 servers into 3 replicate volumes and then combines those 3 replicate volumes into one distribute volume for applications to use.
Configuration file:
/etc/glusterfs/glusterd.vol
volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
# option base-port 49152
end-volume
/usr/sbin/glusterfs --volfile-id=test-volume --volfile-server=10.17.5.247 glusterfs-mnt
Given volfile: +------------------------------------------------------------------------------+ 1: volume test-volume-client-0 2: type protocol/client 3: option password ed44ac73-28d1-41a9-be3a-2888738a50b6 4: option username 8b67c943-8c92-4fbd-9cd5-8750b2f513a4 5: option transport-type tcp 6: option remote-subvolume /mnt/ab796251092bf7c1d7657e728940927b/glusterfs_storage 7: option remote-host 10.17.5.247 8: end-volume 9: 10: volume test-volume-client-1 11: type protocol/client 12: option password ed44ac73-28d1-41a9-be3a-2888738a50b6 13: option username 8b67c943-8c92-4fbd-9cd5-8750b2f513a4 14: option transport-type tcp 15: option remote-subvolume /mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage 16: option remote-host 10.17.66.112 17: end-volume 18: 19: volume test-volume-dht 20: type cluster/distribute 21: subvolumes test-volume-client-0 test-volume-client-1 22: end-volume 23: 24: volume test-volume-write-behind 25: type performance/write-behind 26: subvolumes test-volume-dht 27: end-volume 28: 29: volume test-volume-read-ahead 30: type performance/read-ahead 31: subvolumes test-volume-write-behind 32: end-volume 33: 34: volume test-volume-io-cache 35: type performance/io-cache 36: subvolumes test-volume-read-ahead 37: end-volume 38: 39: volume test-volume-quick-read 40: type performance/quick-read 41: subvolumes test-volume-io-cache 42: end-volume 43: 44: volume test-volume-open-behind 45: type performance/open-behind 46: subvolumes test-volume-quick-read 47: end-volume 48: 49: volume test-volume-md-cache 50: type performance/md-cache 51: subvolumes test-volume-open-behind 52: end-volume 53: 54: volume test-volume 55: type debug/io-stats 56: option count-fop-hits off 57: option latency-measurement off 58: subvolumes test-volume-md-cache 59: end-volume
Starting a volume launches two kinds of processes: glusterfs and glusterfsd.
root 55579 0.0 0.5 472788 21640 ? Ssl Jun27 0:03 /usr/sbin/glusterfsd -s 10.17.66.112
--volfile-id test-volume.10.17.66.112.mnt-02615f07eb1ecf582411bc5554a38170-glusterfs_storage
-p /var/lib/glusterd/vols/test-volume/run/10.17.66.112-mnt-02615f07eb1ecf582411bc5554a38170-glusterfs_storage.pid
-S /var/run/50a30acb134418925f349597a9e42150.socket
--brick-name /mnt/02615f07eb1ecf582411bc5554a38170/glusterfs_storage
-l /var/log/glusterfs/bricks/mnt-02615f07eb1ecf582411bc5554a38170-glusterfs_storage.log
--xlator-option *-posix.glusterd-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
--brick-port 49153 --xlator-option test-volume-server.listen-port=49153
root 105463 0.5 1.0 291320 43436 ? Ssl 14:42 0:00 /usr/sbin/glusterfs -s localhost
--volfile-id gluster/nfs
-p /var/lib/glusterd/nfs/run/nfs.pid
-l /var/log/glusterfs/nfs.log
-S /var/run/62d848065f5f7d608d3e2cddf8080ca1.socket
Here localhost is 10.17.66.112.
root 8970 0.1 0.5 471760 21288 ? Ssl 14:56 0:00 /usr/sbin/glusterfsd -s 10.17.66.112
--volfile-id test-vol2.10.17.66.112.mnt-dir1
-p /var/lib/glusterd/vols/test-vol2/run/10.17.66.112-mnt-dir1.pid
-S /var/run/b8a5734482b6ce3857f9005c434914d0.socket
--brick-name /mnt/dir1 -l /var/log/glusterfs/bricks/mnt-dir1.log
--xlator-option *-posix.glusterd-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
--brick-port 49154
--xlator-option test-vol2-server.listen-port=49154
root 8983 0.4 0.5 405196 21020 ? Ssl 14:56 0:00 /usr/sbin/glusterfsd -s 10.17.66.112
--volfile-id test-vol2.10.17.66.112.mnt-dir2
-p /var/lib/glusterd/vols/test-vol2/run/10.17.66.112-mnt-dir2.pid
-S /var/run/a8ed48f14e7a0330e27285433cb9d53c.socket
--brick-name /mnt/dir2
-l /var/log/glusterfs/bricks/mnt-dir2.log
--xlator-option *-posix.glusterd-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
--brick-port 49155
--xlator-option test-vol2-server.listen-port=49155
root 8992 0.1 0.5 471760 21000 ? Ssl 14:56 0:00 /usr/sbin/glusterfsd -s 10.17.66.112
--volfile-id test-vol2.10.17.66.112.mnt-dir3
-p /var/lib/glusterd/vols/test-vol2/run/10.17.66.112-mnt-dir3.pid
-S /var/run/951f675995a3c3e8fdd912d4eb113d45.socket
--brick-name /mnt/dir3
-l /var/log/glusterfs/bricks/mnt-dir3.log
--xlator-option *-posix.glusterd-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
--brick-port 49156
--xlator-option test-vol2-server.listen-port=49156
root 9001 0.1 0.5 405196 20988 ? Ssl 14:56 0:00 /usr/sbin/glusterfsd -s 10.17.66.112
--volfile-id test-vol2.10.17.66.112.mnt-dir4
-p /var/lib/glusterd/vols/test-vol2/run/10.17.66.112-mnt-dir4.pid
-S /var/run/0d19719155089fe9e262af568636017a.socket
--brick-name /mnt/dir4
-l /var/log/glusterfs/bricks/mnt-dir4.log
--xlator-option *-posix.glusterd-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
--brick-port 49157
--xlator-option test-vol2-server.listen-port=49157
root 9012 1.0 1.2 305420 50120 ? Ssl 14:56 0:00 /usr/sbin/glusterfs -s localhost
--volfile-id gluster/nfs
-p /var/lib/glusterd/nfs/run/nfs.pid
-l /var/log/glusterfs/nfs.log
-S /var/run/62d848065f5f7d608d3e2cddf8080ca1.socket
root 9018 0.6 0.6 543608 26844 ? Ssl 14:56 0:00 /usr/sbin/glusterfs -s localhost
--volfile-id gluster/glustershd
-p /var/lib/glusterd/glustershd/run/glustershd.pid
-l /var/log/glusterfs/glustershd.log
-S /var/run/0c9486f066c00daabdaf7b0ec89bdd57.socket
--xlator-option *replicate*.node-uuid=7c8a3979-923a-454e-ab34-087e8937c99e
Mounting GlusterFS
Mount GlusterFS by running the following on the client:
# glusterfs -f /etc/glusterfs/glusterfs.vol /gmnt/ -l /var/log/glusterfs/glusterfs.log
-f /etc/glusterfs/glusterfs.vol specifies the GlusterFS volume (configuration) file
/gmnt is the mount point
-l /var/log/glusterfs/glusterfs.log is the log file
GlusterFS can also be mounted at boot via fstab or autofs. Once mounted, files under /gmnt can be read and written just like on a local disk.
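A minimal /etc/fstab sketch for a native FUSE mount at boot, assuming the volume is named test-volume and served by 10.17.5.247 (values borrowed from the examples above; the glusterfs-fuse client providing mount.glusterfs must be installed):
# /etc/fstab
10.17.5.247:/test-volume  /gmnt  glusterfs  defaults,_netdev,log-file=/var/log/glusterfs/gmnt.log  0 0
After adding the line, mount -a (or a reboot) mounts the volume.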
The core of Gluster is the GlusterFS distributed file system. To suit different application workloads it exposes many tunable parameters; the main performance-related ones are listed below (a volume-set sketch follows the list):
(1) Global cache size (Cache-Size), default 32MB
(2) Per-file write cache size (Write-Cache-Size), default 1MB
(3) Number of concurrent I/O operations, default 16
(4) Read-ahead switch, default on
(5) Stripe size, default 128KB
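These tunables map roughly onto volume options that can be changed with gluster volume set (a sketch: the option names are the usual 3.x ones and the values are only examples; verify both against gluster volume set help on your release):
# gluster volume set test-volume performance.cache-size 256MB               // global io-cache size
# gluster volume set test-volume performance.write-behind-window-size 4MB   // per-file write-behind buffer
# gluster volume set test-volume performance.io-thread-count 32             // I/O concurrency
# gluster volume set test-volume performance.read-ahead off                 // read-ahead on/off
# gluster volume set test-volume cluster.stripe-block-size 128KB            // stripe size, striped volumes only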
FAQ:
1. Re-adding a brick that was previously part of a volume fails with an error:
Error message: volume replace-brick: failed: /data/share2 or a prefix of it is already part of a volume
This means /mnt/brick2 used to be a brick. To fix it:
# rm -rf /mnt/brick2/.glusterfs
# setfattr -x trusted.glusterfs.volume-id /mnt/brick2 // remove the extended attributes from the directory
# setfattr -x trusted.gfid /mnt/brick2
2. Operations on a glusterfs mount point fail with: Transport endpoint is not connected
volume info still shows the volume as healthy.
[root@vClass-ftYDC temp]# ls
ls: cannot open directory .: Transport endpoint is not connected
[root@vClass-ftYDC temp]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_vclassftydc-lv_root 13286512 3914004 8674552 32% /
tmpfs 2013148 4 2013144 1% /dev/shm
/dev/sda1 487652 31759 426197 7% /boot
/dev/mapper/vg_vclassftydc-lv_var 20511356 496752 18949644 3% /var
/dev/mapper/ssd-ssd 20836352 9829584 11006768 48% /mnt/787fe74d9bef93c17a7aa195a05245b3
/dev/mapper/defaultlocal-defaultlocal 25567232 10059680 15507552 40% /mnt/ab796251092bf7c1d7657e728940927b
df: "/mnt/glusterfs-mnt": Transport endpoint is not connected
[root@vClass-ftYDC temp]# cd ..
-bash: cd: ..: Transport endpoint is not connected
[root@vClass-ftYDC temp]#
Cause: the situation described in item 3 below, i.e. the glusterfs mount point has been unmounted.
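Recovery is usually just a matter of cleaning up the dead mount and mounting again (a sketch; the mount point and volume name are taken from the examples in this document):
# umount -l /mnt/glusterfs-mnt                                     // lazily detach the dead FUSE mount
# mount -t glusterfs 10.17.5.247:/test-volume /mnt/glusterfs-mnt   // mount the volume again
# df -h /mnt/glusterfs-mnt                                         // verify the mount answers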
3. Some time after mounting glusterfs, the mount point is found to have been unmounted; the mount log shows:
[2017-06-28 05:45:51.861186] I [dht-layout.c:726:dht_layout_dir_mismatch] 0-test-vol2-dht: / - disk layout missing
[2017-06-28 05:45:51.861332] I [dht-common.c:623:dht_revalidate_cbk] 0-test-vol2-dht: mismatching layouts for /
[2017-06-28 05:46:02.499449] I [dht-layout.c:726:dht_layout_dir_mismatch] 0-test-vol2-dht: / - disk layout missing
[2017-06-28 05:46:02.499520] I [dht-common.c:623:dht_revalidate_cbk] 0-test-vol2-dht: mismatching layouts for /
[2017-06-28 05:46:05.701259] I [fuse-bridge.c:4628:fuse_thread_proc] 0-fuse: unmounting /mnt/glusterfs-mnt/
[2017-06-28 05:46:05.701898] W [glusterfsd.c:1002:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f09e546090d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f09e5aff851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x40533d]))) 0-: received signum (15), shutting down
[2017-06-28 05:46:05.701911] I [fuse-bridge.c:5260:fini] 0-fuse: Unmounting '/mnt/glusterfs-mnt/'.
4. Restoring a node's configuration
Symptom: the configuration on one of the nodes is incorrect.
Simulating the fault:
Delete part of server2's configuration.
Configuration location: /var/lib/glusterd/
Fix:
Trigger self-repair: sync the configuration from a healthy peer with the gluster tool:
#gluster volume sync server1 all
5. Replicated volume data out of sync
Symptom: the two copies of a replica-2 volume have diverged.
Simulating the fault: delete the data on one of the bricks.
Fix:
Mount the volume again.
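Re-mounting works because client access triggers self-heal. As a complementary sketch (volume name assumed; the heal sub-commands appear in the 3.10.2 help output above), the heal can also be driven and monitored explicitly:
# gluster volume heal test-volume full               // heal all files, not only those already flagged
# gluster volume heal test-volume info               // list entries still pending heal
# gluster volume heal test-volume info split-brain   // entries that need manual split-brain resolution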
6. A directory of a replicated volume was deleted
How to fix it:
Fix: replace the brick first:
#gluster volume replace-brick bbs_img 10.20.0.201:/brick1/share2 10.20.0.201:/brick1/share start
#gluster volume replace-brick bbs_img 10.20.0.201:/brick1/share2 10.20.0.201:/brick1/share commit
Note that an error is reported when restoring (re-using the old brick path).
Workaround:
rm -rf /data/share2/.glusterfs
setfattr -x trusted.glusterfs.volume-id /data/share2
setfattr -x trusted.gfid /data/share2
7. The following problem appears:
[root@vClass-v0jTX init.d]# gluster
gluster> volume set test_vol01 features.shard on
volume set: failed: One of the client 192.168.10.170:1023 is running with op-version 2 and doesn't support the required op-version 30700. This client needs to be upgraded or disconnected before running this command again
gluster>
8. Removing a brick fails as follows:
gluster> volume remove-brick test_vol01 192.168.10.186:/mnt/63ef41a63399e6640a3c4abefa725497 192.168.10.186:/mnt/ad242fbe177ba330a0ea75a9d23fc936 force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.
gluster>
Analysis: only one node was upgraded from 3.4.2 to 3.10.2; the remaining 3.4.2 nodes keep the cluster op-version below the required 30600.
Solution:
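The original note leaves the solution empty. A common approach, stated here as an assumption rather than the author's procedure, is to upgrade every node to the same release first and only then raise the cluster op-version (it can never exceed what the oldest node supports):
# gluster volume set all cluster.op-version 31000            // 31000 corresponds to the 3.10 series; adjust to your release
# grep operating-version /var/lib/glusterd/glusterd.info     // confirm the new cluster op-version on each node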
9. The gluster mount point is unusable and cannot be unmounted
[root@vClass-1iUBT etc]# umount /mnt/d1f561a32ac1bf17cf183f36baac34d4
umount: /mnt/d1f561a32ac1bf17cf183f36baac34d4: device is busy.
(In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
[root@vClass-1iUBT etc]#
Cause: a glusterfs process is still using the mount. Find it with ps (its arguments contain the mount point), kill it, and the umount then succeeds.
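A sketch of that procedure (the mount path comes from the transcript above; fuser or lsof can be used instead of ps):
# ps aux | grep glusterfs | grep /mnt/d1f561a32ac1bf17cf183f36baac34d4   // find the client process holding the mount
# fuser -m /mnt/d1f561a32ac1bf17cf183f36baac34d4                         // alternatively, list PIDs using the mount
# kill <PID>                                                             // stop the process found above
# umount /mnt/d1f561a32ac1bf17cf183f36baac34d4                           // the unmount now succeeds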
10. The mount point becomes unusable (client crash)
The mount log under /var/log/glusterfs/ contains the following.
pending frames: frame : type(1) op(TRUNCATE) frame : type(1) op(OPEN) frame : type(0) op(0) frame : type(1) op(OPEN) frame : type(0) op(0) patchset: git://git.gluster.org/glusterfs.git signal received: 11 time of crash: 2017-08-23 08:40:38 signal received: 11 time of crash: 2017-08-23 08:40:38 configuration details: argp 1 backtrace 1 dlfcn 1 libpthread 1 argp 1 backtrace 1 dlfcn 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3.10.2 package-string: glusterfs 3.10.2 /usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9c)[0x7fd00cb58f4c] /usr/lib64/libglusterfs.so.0(gf_print_trace+0x338)[0x7fd00cb64568] /lib64/libc.so.6(+0x32920)[0x7fd00b4d1920] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x16053)[0x7fcfff4fd053] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x1641d)[0x7fcfff4fd41d] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x16546)[0x7fcfff4fd546] /lib64/libc.so.6(+0x32920)[0x7fd00b4d1920] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x16053)[0x7fcfff4fd053] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x1641d)[0x7fcfff4fd41d] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x16546)[0x7fcfff4fd546] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x18f51)[0x7fcfff4fff51] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x19759)[0x7fcfff500759] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0x5f61)[0x7fcfff4ecf61] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0xf0ac)[0x7fcfff4f60ac] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0x4e51)[0x7fcfff2d3e51] /usr/lib64/libglusterfs.so.0(call_resume_keep_stub+0x78)[0x7fd00cb75378] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0x713a)[0x7fcfff2d613a] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0xa828)[0x7fcfff2d9828] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0xb238)[0x7fcfff2da238] /usr/lib64/glusterfs/3.10.2/xlator/performance/read-ahead.so(+0x5043)[0x7fcfff0c4043] /usr/lib64/libglusterfs.so.0(default_truncate+0xdf)[0x7fd00cbd81ef] /usr/lib64/glusterfs/3.10.2/xlator/performance/io-cache.so(+0x9032)[0x7fcffeca8032] /usr/lib64/glusterfs/3.10.2/xlator/performance/quick-read.so(+0x3dad)[0x7fcffea9adad] /usr/lib64/glusterfs/3.10.2/xlator/features/shard.so(+0xf0ac)[0x7fcfff4f60ac] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0x4e51)[0x7fcfff2d3e51] /usr/lib64/libglusterfs.so.0(call_resume_keep_stub+0x78)[0x7fd00cb75378] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0x713a)[0x7fcfff2d613a] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0xa828)[0x7fcfff2d9828] /usr/lib64/glusterfs/3.10.2/xlator/performance/write-behind.so(+0xb238)[0x7fcfff2da238] /usr/lib64/glusterfs/3.10.2/xlator/performance/read-ahead.so(+0x5043)[0x7fcfff0c4043] /usr/lib64/libglusterfs.so.0(default_truncate+0xdf)[0x7fd00cbd81ef] /usr/lib64/glusterfs/3.10.2/xlator/performance/io-cache.so(+0x9032)[0x7fcffeca8032] /usr/lib64/glusterfs/3.10.2/xlator/performance/quick-read.so(+0x3dad)[0x7fcffea9adad] /usr/lib64/libglusterfs.so.0(default_truncate+0xdf)[0x7fd00cbd81ef] /usr/lib64/glusterfs/3.10.2/xlator/performance/md-cache.so(+0x623e)[0x7fcffe67523e] /usr/lib64/glusterfs/3.10.2/xlator/debug/io-stats.so(+0x831e)[0x7fcffe44f31e] /usr/lib64/libglusterfs.so.0(default_truncate+0xdf)[0x7fd00cbd81ef] 
/usr/lib64/glusterfs/3.10.2/xlator/performance/md-cache.so(+0x623e)[0x7fcffe67523e] /usr/lib64/glusterfs/3.10.2/xlator/debug/io-stats.so(+0x831e)[0x7fcffe44f31e] /usr/lib64/libglusterfs.so.0(default_truncate+0xdf)[0x7fd00cbd81ef] /usr/lib64/glusterfs/3.10.2/xlator/meta.so(+0x3776)[0x7fcffe232776] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x10196)[0x7fd0046e7196] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x7245)[0x7fd0046de245] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x6f6e)[0x7fd0046ddf6e] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x728e)[0x7fd0046de28e] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x72f3)[0x7fd0046de2f3] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x7328)[0x7fd0046de328] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x71b8)[0x7fd0046de1b8] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x726e)[0x7fd0046de26e] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x72b8)[0x7fd0046de2b8] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x1ef20)[0x7fd0046f5f20] /lib64/libpthread.so.0(+0x7851)[0x7fd00bc26851] /lib64/libc.so.6(clone+0x6d)[0x7fd00b58790d] --------- /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x71b8)[0x7fd0046de1b8] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x726e)[0x7fd0046de26e] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x72b8)[0x7fd0046de2b8] /usr/lib64/glusterfs/3.10.2/xlator/mount/fuse.so(+0x1ef20)[0x7fd0046f5f20] /lib64/libpthread.so.0(+0x7851)[0x7fd00bc26851] /lib64/libc.so.6(clone+0x6d)[0x7fd00b58790d] ---------
11. An abnormally large file appears in the glusterfs storage directory.
Operations on that file then fail.
In the brick's actual backing directory, running scp vm_conf.xml.cacheBak vm_conf.xml at the mount point caused vm_conf.xml, as seen through the glusterfs mount point, to reach a size of 16EB.
Links:
Official site: https://www.gluster.org/
http://blog.gluster.org/
Source code: https://github.com/gluster/glusterfs
00086_GlusterFS性能監控&配額 https://github.com/jiobxn/one/wiki/00086_GlusterFS性能監控&配額
GlusterFS技術詳解 http://www.tuicool.com/articles/AbE7Vr
用GlusterFS搭建高可用存儲集群 http://blog.chinaunix.net/uid-144593-id-2804190.html
GlusterFS 筆記 https://my.oschina.net/kisops/blog/153466
使用Glusterfs作為kvm后端存儲 https://my.oschina.net/kisops/blog/153466
Gusterfs架構和維護 http://www.2cto.com/net/201607/526218.html
GlusterFS性能調優基本思路 http://blog.csdn.net/liuaigui/article/details/18015427
GlusterFS分布式集群文件系統安裝、配置及性能測試 http://www.3566t.com/news/show-748324.html
存儲 | 緩存和分層 https://sanwen8.cn/p/3728WOT.html
GlusterFS企業級功能之EC糾刪碼 http://www.taocloudx.com/index.php?a=shows&catid=4&id=68
初探glusterfs-處理腦裂(split-brain)狀態 http://nosmoking.blog.51cto.com/3263888/1718157/