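When running Citus, configuring shard replicas matters for disaster recovery. What follows is a simple hands-on walkthrough.
Environment setup
Run everything with docker-compose
- docker-compose file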
version: "3"
services:
  graphql-engine:
    image: hasura/graphql-engine:v1.1.0
    ports:
      - "8080:8080"
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:dalong@pg-citus-master:5432/postgres
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true" # set to "false" to disable console
      HASURA_GRAPHQL_ENABLED_LOG_TYPES: startup, http-log, webhook-log, websocket-log, query-log
  pg-citus-master:
    container_name: pg-citus-master
    image: dalongrong/pgspider:citus-9.1
    volumes:
      - "./csvfiles:/opt/csv"
      - "./sql:/docker-entrypoint-initdb.d/"
    ports:
      - "5432:5432"
    environment:
      - "POSTGRES_PASSWORD=dalong"
  pg-citus-worker:
    container_name: pg-citus-worker
    image: dalongrong/pgspider:citus-9.1
    volumes:
      - "./csvfiles:/opt/csv"
      - "./sql:/docker-entrypoint-initdb.d/"
    ports:
      - "5433:5432"
  pg-citus-worker2:
    container_name: pg-citus-worker2
    image: dalongrong/pgspider:citus-9.1
    volumes:
      - "./csvfiles:/opt/csv"
      - "./sql:/docker-entrypoint-initdb.d/"
    ports:
      - "5434:5432"
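- Init SQL
It mainly creates the extension; because ./sql is mounted into every node, the script runs on the master and on both workers.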
-- wrap in transaction to ensure Docker flag always visible
BEGIN;
CREATE EXTENSION citus;
COMMIT;
- Required data
All of it comes from the official documentation
curl https://examples.citusdata.com/tutorial/companies.csv > csvfiles/companies.csv
curl https://examples.citusdata.com/tutorial/campaigns.csv > csvfiles/campaigns.csv
curl https://examples.citusdata.com/tutorial/ads.csv > csvfiles/ads.csv
- Start
docker-compose up -d
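The containers may need a few seconds before PostgreSQL accepts connections. Below is a minimal sketch of a wait loop. `wait_until` is a hypothetical helper (not part of docker-compose), and the example call assumes `pg_isready` is present in the image, which has not been verified for `dalongrong/pgspider:citus-9.1`.

```shell
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Example (requires the stack from the compose file to be running):
# wait_until 30 1 docker-compose exec -T pg-citus-master pg_isready -U postgres
```

The same call can be repeated for pg-citus-worker and pg-citus-worker2 before any node is registered.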
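Basic Citus usage
Run the following on the master node. The data can be imported either before or after the distributed tables are created.
- Create the tables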
CREATE TABLE companies (
    id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);
CREATE TABLE campaigns (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    name text NOT NULL,
    cost_model text NOT NULL,
    state text NOT NULL,
    monthly_budget bigint,
    blacklisted_site_urls text[],
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);
CREATE TABLE ads (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    target_url text,
    impressions_count bigint DEFAULT 0,
    clicks_count bigint DEFAULT 0,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);
- Add primary keys
ALTER TABLE companies
ADD PRIMARY KEY (id);
ALTER TABLE campaigns
ADD PRIMARY KEY (id, company_id);
ALTER TABLE ads
ADD PRIMARY KEY (id, company_id);
- Add the worker nodes
-- 5432 is the port inside the compose network, not the host-mapped 5433/5434
SELECT master_add_node('pg-citus-worker', 5432);
SELECT master_add_node('pg-citus-worker2', 5432);
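- Create the distributed tables
-- Set the replication factor to 2: there are only two workers, which keeps things simple and lets us test disaster recovery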
SET citus.shard_replication_factor = 2;
SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');
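The import step itself is not shown above. A minimal sketch of it follows, using psql's `\copy` as in the official Citus tutorial. `copy_sql` is a hypothetical helper that only prints the command. The `/opt/csv` path comes from the volume mount in the compose file, and the file names are assumed to be whatever the curl step saved.

```shell
# copy_sql TABLE FILE
# Prints a psql \copy meta-command that loads /opt/csv/FILE into TABLE.
# /opt/csv is the container path that ./csvfiles is mounted on.
copy_sql() {
  printf '\\copy %s FROM '\''/opt/csv/%s'\'' WITH csv\n' "$1" "$2"
}

# Example (requires the running stack; psql runs inside the master container,
# so the file path is resolved there):
# for t in companies campaigns ads; do copy_sql "$t" "$t.csv"; done |
#   docker-compose exec -T pg-citus-master psql -U postgres
```

Printing the commands first makes it easy to inspect them before piping them into psql.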
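Disaster recovery
The way membership-manager works, covered earlier, already gives the general idea: delete the failed node's shard placement metadata, then remove the node itself.
Data migration still needs attention, though (rebalancing data after a node is added is the harder part).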
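Deleting a node's placement metadata is only safe if every shard still has a copy elsewhere. A minimal sketch of a pre-check follows. `under_replicated_sql` is a hypothetical helper that prints a query against the `pg_dist_shard_placement` view. Treating `shardstate = 1` as "healthy" follows the Citus metadata documentation and is worth confirming for the version in use.

```shell
# under_replicated_sql MIN_PLACEMENTS
# Prints a query listing shards that have fewer than MIN_PLACEMENTS
# healthy placements. Shards with no healthy placement at all do not
# appear in the result, so an empty result is necessary but not sufficient.
under_replicated_sql() {
  cat <<SQL
SELECT shardid, count(*) AS placements
FROM pg_dist_shard_placement
WHERE shardstate = 1
GROUP BY shardid
HAVING count(*) < $1;
SQL
}

# Example (requires the running stack):
# under_replicated_sql 2 | docker-compose exec -T pg-citus-master psql -U postgres
```

With a replication factor of 2 and both workers up, the query should return no rows before a worker is taken down.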
- Stop worker2
Because the replication factor is 2, one node can be stopped
docker-compose stop pg-citus-worker2
- Delete the shard placement metadata
DELETE FROM pg_dist_placement WHERE groupid = (SELECT groupid FROM pg_dist_node WHERE nodename ='pg-citus-worker2' AND nodeport = '5432' LIMIT 1);
- Remove the node
-- Use the port the node was registered with (5432), not a host-mapped port
SELECT master_remove_node('pg-citus-worker2', 5432);
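- Query the data
Querying the data at this point shows that the application does not notice the change: queries keep working and return the same results.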
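One way to check this is sketched below. `post_failover_sql` is a hypothetical helper that prints a few verification queries. `master_get_active_worker_nodes()` and `pg_dist_shard_placement` are Citus metadata objects, and the table name passed in is whichever distributed table should be checked.

```shell
# post_failover_sql TABLE
# Prints queries to run on the master after a worker has been removed.
post_failover_sql() {
  cat <<SQL
-- Workers the master still considers active
SELECT * FROM master_get_active_worker_nodes();
-- Remaining shard placements per node
SELECT nodename, count(*) AS placements
FROM pg_dist_shard_placement
GROUP BY nodename;
-- A distributed query that should still succeed
SELECT count(*) FROM $1;
SQL
}

# Example (requires the running stack):
# post_failover_sql companies | docker-compose exec -T pg-citus-master psql -U postgres
```

Comparing the row count with one taken before worker2 was stopped is a simple way to confirm that no data went missing.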
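Notes
The open-source version of Citus provides a set of management functions, and with them a Citus cluster is reasonably easy to maintain. If that maintenance feels like too much work, yugabyte and cockroachdb are both viable alternatives; personally I recommend yugabyte.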
References
http://docs.citusdata.com/en/v9.2/admin_guide/cluster_management.html
https://github.com/citusdata/membership-manager/blob/master/manager.py