How Data Is Stored In CEPH Cluster
This article is translated from How Data Is Stored In CEPH Cluster.
How exactly is data stored in a Ceph cluster?
Below is an easy-to-understand overview of Ceph data storage.
POOLS
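A Ceph cluster has pools, which are logical groups for storing objects. Pools are made up of placement groups (PGs). When creating a pool, we specify the number of PGs it will contain; the pool also keeps a number of replicas of each object.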
-
Create a pool with 128 PGs
ceph osd pool create pool-A 128
Output: pool 'pool-A' created
-
List the pools
ceph osd lspools
-
Show the number of PGs in a pool
ceph osd pool get pool-A pg_num
-
Find out the replication level used by a pool (see the rep size value in the output)
ceph osd dump | grep -i pool-A
-
Change the replication level
ceph osd pool set pool-A size 3
Output: set pool 36 size to 3
ceph osd dump | grep -i pool-A
Output: pool 36 'pool-A' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 4054 owner 0
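To read the replication level and PG count out of that line in a script, the line can be split into key/value pairs. This is only a minimal sketch: it assumes the older `ceph osd dump` line format shown above (other Ceph releases may print the fields differently), and `parse_pool_line` is a hypothetical helper, not part of Ceph.

```python
import re

def parse_pool_line(line):
    """Parse one 'pool ...' line of `ceph osd dump` (old format, as shown above)."""
    m = re.match(r"pool (\d+) '([^']+)' (.*)", line)
    if m is None:
        raise ValueError("not a pool line: %r" % line)
    info = {"id": int(m.group(1)), "name": m.group(2)}
    tokens = m.group(3).split()
    # "rep size 3" uses a two-word key; fold it into a single token first.
    if tokens[:2] == ["rep", "size"]:
        tokens = ["rep_size"] + tokens[2:]
    # The remaining tokens alternate key, value.
    for key, value in zip(tokens[0::2], tokens[1::2]):
        info[key] = int(value) if value.isdigit() else value
    return info

line = ("pool 36 'pool-A' rep size 3 min_size 1 crush_ruleset 0 "
        "object_hash rjenkins pg_num 128 pgp_num 128 last_change 4054 owner 0")
pool = parse_pool_line(line)
print(pool["name"], pool["rep_size"], pool["pg_num"])
```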
-
Upload an object to the pool
rados -p pool-A put object-A object-A
-
List the objects in the pool
rados -p pool-A ls
Placement Group
Ceph maps objects to PGs (object --> PG). The PGs that contain the objects are spread across multiple OSDs, which improves reliability.
Object
An object is the smallest unit of data storage in a Ceph cluster. Everything is stored in the form of objects, which is why a Ceph cluster is also known as an object storage cluster. Objects are mapped to PGs, and an object and its copies are always spread across different OSDs. This is how Ceph is designed.
-
Which PG does an object belong to?
ceph osd map pool-A object-A
Output:
osdmap e4055 (the OSD map version is e4055)
pool 'pool-A' (pool name: pool-A, pool ID: 36)
object 'object-A' (object name) -> pg 36.b301e3e8 (PG ID: 36.68)
-> up [122,63,62] acting [122,63,62] (the replication level is 3, so there are three copies, on OSDs 122, 63 and 62)
-
Check the actual path
Go to the mount path of one of these OSDs, find the PG's directory there, and list its contents
df -h /var/lib/ceph/osd/ceph-122
cd /var/lib/ceph/osd/ceph-122
ls -la | grep -i 36.68
ls -l
Result:
-rw-r--r-- 1 root root 10485760 Jan 24 16:45 object-A__head_B301E3E8__24
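The IDs in the `ceph osd map` output and in this file name are related by simple arithmetic. The sketch below rests on two assumptions: that for a power-of-two pg_num the PG number is the low bits of the object's 32-bit hash, and that the trailing field of the on-disk file name is the pool ID in hexadecimal. `pg_id` is a hypothetical helper, not a Ceph API.

```python
def pg_id(pool_id, object_hash, pg_num):
    """Return the 'pool.pg' ID for an object hash, assuming pg_num is a power of two."""
    if pg_num <= 0 or pg_num & (pg_num - 1):
        raise ValueError("this sketch only handles power-of-two pg_num")
    pg = object_hash & (pg_num - 1)      # keep the low log2(pg_num) bits
    return "%d.%x" % (pool_id, pg)       # the PG part is printed in hex

print(pg_id(36, 0xB301E3E8, 128))        # 36.68

# File name from the OSD directory: <object>__head_<HASH>__<pool ID in hex>
name = "object-A__head_B301E3E8__24"
obj, rest = name.split("__head_")
hash_hex, pool_hex = rest.split("__")
print(obj, int(pool_hex, 16), pg_id(int(pool_hex, 16), int(hash_hex, 16), 128))
```

With pg_num 128, the hash 0xb301e3e8 keeps its low 7 bits, 0x68, which matches the PG ID 36.68 reported above, and 0x24 is 36, the pool ID.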
Moral of the Story
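- A Ceph storage cluster can have multiple pools.
- Each pool should have multiple PGs. The more PGs, the better the cluster performs and the more reliable the setup is.
- A PG holds multiple objects.
- A PG is spread across multiple OSDs, so its objects are spread across those OSDs too. The first OSD mapped to a PG is its primary OSD, and the other OSDs of the same PG are its secondary OSDs. In short, an object is never stored on only one OSD, which keeps the data safe.
- An object maps to exactly one PG; the copies come from that PG being stored on several OSDs.
- Many PGs can map to one OSD (PG to OSD is n ---> 1).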
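These relationships can be pictured with a small toy model. Only PG 36.68 and its OSDs [122, 63, 62] come from the `ceph osd map` output earlier; the other two PGs and their OSD sets are made up purely for illustration.

```python
from collections import defaultdict

# PG -> acting set of OSDs; the first OSD in each list is the primary.
acting = {
    "36.68": [122, 63, 62],   # from the `ceph osd map` output
    "36.10": [63, 7, 122],    # hypothetical
    "36.2a": [62, 122, 7],    # hypothetical
}

primary = {pg: osds[0] for pg, osds in acting.items()}

# Invert the mapping: one OSD serves many PGs (the n ---> 1 relationship).
pgs_on_osd = defaultdict(list)
for pg, osds in acting.items():
    for osd in osds:
        pgs_on_osd[osd].append(pg)

print(primary["36.68"])          # 122
print(sorted(pgs_on_osd[122]))   # ['36.10', '36.2a', '36.68']
```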
How many placement groups does a pool need?
Total PGs = (OSDs * 100) / Replicas (number of copies)
Check the OSD status
ceph osd stat
Output: osdmap e4055: 154 osds: 154 up, 154 in
( 154 * 100 ) / 3 = 5133.33; round this up to the next power of 2, which gives 8192 PGs.
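The same calculation as a short sketch. `osd_count` and `total_pgs` are hypothetical helpers, and the regular expression assumes the `ceph osd stat` output format shown above.

```python
import re

def osd_count(stat_line):
    """Extract the total number of OSDs from a `ceph osd stat` line."""
    m = re.search(r"(\d+) osds", stat_line)
    if m is None:
        raise ValueError("unexpected format: %r" % stat_line)
    return int(m.group(1))

def total_pgs(osds, replicas, per_osd=100):
    """(OSDs * per_osd) / replicas, rounded up to the next power of two."""
    target = osds * per_osd / replicas
    pgs = 1
    while pgs < target:
        pgs *= 2
    return pgs

n = osd_count("osdmap e4055: 154 osds: 154 up, 154 in")
print(n, total_pgs(n, 3))   # 154 8192
```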