Environment
Based on CentOS Linux release 7.9.2009 (Core)
ip | hostname | role |
---|---|---|
172.17.0.4 | cd782d0a790b | etcd1 |
172.17.0.3 | 83d43a1203f6 | etcd2 |
172.17.0.2 | 99dac45f202c | etcd3 |
## Add the yum repositories first (yum-config-manager is provided by the yum-utils package)
## docker-ce
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
## epel
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
Install docker-ce
yum install -y yum-utils device-mapper-persistent-data lvm2 docker-ce
Install Go (optional; only needed if you build etcd from source)
yum install golang
Other tools
yum -y install ansible git iproute
Build the etcd cluster (installed via yum)
yum -y install etcd
## Check the version
[root@cd782d0a790b data]# etcdctl -v
etcdctl version: 3.3.11
API version: 2
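1. Building the cluster over HTTP
Edit the configuration file
cat /etc/etcd/etcd.conf
## etcd data directory
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
## URLs to listen on for traffic from other etcd members in the cluster
ETCD_LISTEN_PEER_URLS="http://172.17.0.4:2380"
## URLs to listen on for client traffic
ETCD_LISTEN_CLIENT_URLS="http://172.17.0.4:2379,http://127.0.0.1:2379"
## Name of this member (must be unique within the cluster)
ETCD_NAME="etcd1"
## Number of committed transactions that triggers a snapshot to disk
ETCD_SNAPSHOT_COUNT="10000"
## Heartbeat interval, in milliseconds
ETCD_HEARTBEAT_INTERVAL="250"
## Election timeout, in milliseconds
ETCD_ELECTION_TIMEOUT="5000"
## Peer URLs of this member, advertised to the other members of the cluster
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.17.0.4:2380"
## Client URLs of this member, advertised to the other members of the cluster
ETCD_ADVERTISE_CLIENT_URLS="http://172.17.0.4:2379"
## Initial cluster configuration used at bootstrap
ETCD_INITIAL_CLUSTER="etcd1=http://172.17.0.4:2380,etcd2=http://172.17.0.3:2380,etcd3=http://172.17.0.2:2380"
## Initial cluster token used during bootstrap
ETCD_INITIAL_CLUSTER_TOKEN="k8s_etcd"
## Initial cluster state: "new" when creating a new cluster, "existing" when joining an existing one
ETCD_INITIAL_CLUSTER_STATE="new"
## Proxy mode
ETCD_PROXY="off"
## Auto-compaction retention; 0 disables auto-compaction
ETCD_AUTO_COMPACTION_RETENTION="8"
## Metrics setting, exposed for monitoring integration
ETCD_METRICS="basic"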
Note: the three configuration files are almost identical; only ETCD_NAME and the local IP address in the URLs must be changed on each node.
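To make the per-node differences explicit, here is a minimal sketch that prints only the settings that vary between members. The function name `render_node_conf` is an assumption made for illustration; the node names and IPs come from the environment table above, and the remaining settings are assumed to be shared as-is.

```shell
# Hypothetical helper: print the node-specific lines of etcd.conf for one member.
render_node_conf() {
  local name="$1" ip="$2"
  cat <<EOF
ETCD_NAME="${name}"
ETCD_LISTEN_PEER_URLS="http://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${ip}:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://${ip}:2379"
EOF
}

# Print the node-specific settings for each of the three members.
while read -r name ip; do
  echo "## ${name}"
  render_node_conf "$name" "$ip"
done <<'NODES'
etcd1 172.17.0.4
etcd2 172.17.0.3
etcd3 172.17.0.2
NODES
```

Everything else in the file (ETCD_INITIAL_CLUSTER, the token, timeouts) stays the same on all three nodes.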
Manage the service with systemd
cat /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
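Because the unit loads /etc/etcd/etcd.conf through EnvironmentFile, a mistake in that file only shows up when the service starts. The following is a minimal sketch of a pre-flight check, not part of etcd itself: the function name `check_etcd_conf` is hypothetical, and it only verifies that ETCD_NAME appears as a member name in ETCD_INITIAL_CLUSTER, which etcd requires at bootstrap.

```shell
# Hypothetical pre-flight check: is ETCD_NAME listed in ETCD_INITIAL_CLUSTER?
check_etcd_conf() {
  local conf="$1"
  [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 2; }
  (
    # Source the file in a subshell so its variables do not leak into the caller.
    . "$conf"
    case ",${ETCD_INITIAL_CLUSTER}," in
      *",${ETCD_NAME}="*)
        echo "ok: ${ETCD_NAME} is listed in ETCD_INITIAL_CLUSTER" ;;
      *)
        echo "error: ${ETCD_NAME} is missing from ETCD_INITIAL_CLUSTER" >&2
        exit 1 ;;
    esac
  )
}

# Only run against the real file when it exists on this machine.
if [ -r /etc/etcd/etcd.conf ]; then
  check_etcd_conf /etc/etcd/etcd.conf
fi
```

The file can be sourced directly because it consists of plain KEY="value" lines and `#` comments.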
Start the service and check its health
## Start etcd on all three nodes
systemctl start etcd
## List the cluster members
[root@cd782d0a790b /]# etcdctl member list
d02233d35f3c4b94: name=etcd3 peerURLs=http://172.17.0.2:2380 clientURLs=http://172.17.0.2:2379 isLeader=false
e302fd1dad15f911: name=etcd1 peerURLs=http://172.17.0.4:2380 clientURLs=http://172.17.0.4:2379 isLeader=true
ef7057d9f69d96ad: name=etcd2 peerURLs=http://172.17.0.3:2380 clientURLs=http://172.17.0.3:2379 isLeader=false
## Check cluster health
[root@cd782d0a790b /]# etcdctl cluster-health
member d02233d35f3c4b94 is healthy: got healthy result from http://172.17.0.2:2379
member e302fd1dad15f911 is healthy: got healthy result from http://172.17.0.4:2379
member ef7057d9f69d96ad is healthy: got healthy result from http://172.17.0.3:2379
The commands above use the default API version 2. Switch to API version 3 and check again:
export ETCDCTL_API=3
HOST_1=172.17.0.2
HOST_2=172.17.0.3
HOST_3=172.17.0.4
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
## List the members
[root@cd782d0a790b /]# etcdctl --endpoints=$ENDPOINTS member list
d02233d35f3c4b94, started, etcd3, http://172.17.0.2:2380, http://172.17.0.2:2379
e302fd1dad15f911, started, etcd1, http://172.17.0.4:2380, http://172.17.0.4:2379
ef7057d9f69d96ad, started, etcd2, http://172.17.0.3:2380, http://172.17.0.3:2379
## Check health
[root@cd782d0a790b /]# etcdctl --endpoints=$ENDPOINTS endpoint health
172.17.0.2:2379 is healthy: successfully committed proposal: took = 7.5093ms
172.17.0.4:2379 is healthy: successfully committed proposal: took = 5.5682ms
172.17.0.3:2379 is healthy: successfully committed proposal: took = 8.0291ms
## Check status
[root@cd782d0a790b /]# etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| 172.17.0.2:2379 | d02233d35f3c4b94 | 3.3.11 | 16 kB | false | 129 | 12 |
| 172.17.0.3:2379 | ef7057d9f69d96ad | 3.3.11 | 16 kB | false | 129 | 12 |
| 172.17.0.4:2379 | e302fd1dad15f911 | 3.3.11 | 20 kB | true | 129 | 12 |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
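When scripting against this output, it is often enough to know which endpoint is the leader. The sketch below parses the table format shown above with awk; the function name `find_leader` is an assumption, and the sample input is the table from this section rather than live output.

```shell
# Hypothetical helper: read `endpoint status --write-out=table` output on stdin
# and print the endpoint whose IS LEADER column is "true".
find_leader() {
  awk -F'|' '
    NF >= 8 {
      gsub(/ /, "", $2)   # ENDPOINT column
      gsub(/ /, "", $6)   # IS LEADER column
      if ($6 == "true") print $2
    }'
}

find_leader <<'EOF'
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| 172.17.0.2:2379 | d02233d35f3c4b94 | 3.3.11 | 16 kB | false | 129 | 12 |
| 172.17.0.3:2379 | ef7057d9f69d96ad | 3.3.11 | 16 kB | false | 129 | 12 |
| 172.17.0.4:2379 | e302fd1dad15f911 | 3.3.11 | 20 kB | true | 129 | 12 |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
EOF
# prints 172.17.0.4:2379
```

Against a live cluster the same function can be fed from a pipe, for example `etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status | find_leader`.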
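For more operations, see the demo on the etcd website: https://etcd.io/docs/v3.4/demo/
2. Building the cluster over HTTPS
Certificates must be generated first. Download the certificate generation tools (cfssl):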
curl -s -L -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -s -L -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
curl -s -L -o /usr/local/bin/cfssl-certinfo https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x /usr/local/bin/cfssl*
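Generate the certificates
## CA signing configuration, valid for 10 years
[root@cd782d0a790b cert]# cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "etcd": {
        "expiry": "87600h",
        "usages": [ "signing", "key encipherment", "server auth", "client auth" ]
      }
    }
  }
}
EOF
Field descriptions
"ca-config.json": several profiles can be defined, each with its own expiry, usages and other parameters; one of these profiles is selected later when signing a certificate.
"signing": the certificate can be used to sign other certificates; the generated ca.pem has CA=TRUE.
"server auth": a client can use this CA to verify the certificate presented by a server.
"client auth": a server can use this CA to verify the certificate presented by a client.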
## CA signing request: CN is the Common Name, C the country, ST the state or province, L the locality (city)
[root@cd782d0a790b cert]# cat > ca-csr.json << EOF
{
  "CN": "etcd CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    { "C": "CN", "ST": "Beijing", "L": "Beijing" }
  ]
}
EOF
## Certificate request submitted to the CA (China, Beijing, Beijing). Every node uses the same certificate, so the IPs of all hosts must be listed
[root@cd782d0a790b cert]# cat > server-csr.json << EOF
{
  "CN": "etcd",
  "hosts": [ "172.17.0.2", "172.17.0.3", "172.17.0.4" ],
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "aa.com", "CN": "beijing.aa.com" }
  ]
}
EOF
Once all request files are in place:
## Generate the CA certificate and key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
## Generate the etcd certificate and key; the value of -profile must match a profile name defined under "profiles" in ca-config.json
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd server-csr.json | cfssljson -bare server
## The generated certificates
[root@cd782d0a790b ssl]# ls *.pem
ca-key.pem ca.pem server-key.pem server.pem
## Grant read permission
chmod 644 *.pem
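Before pointing etcd at the new files, it can be worth confirming that the certificate chains to the CA and lists the expected IPs. The following is a minimal sketch using the standard `openssl verify` and `openssl x509` commands; the function name `verify_cert` is hypothetical, and it assumes openssl is installed on the host.

```shell
# Hypothetical helper: check a certificate against the CA and show its SANs.
verify_cert() {
  local ca="$1" cert="$2"
  [ -r "$ca" ] && [ -r "$cert" ] || { echo "missing $ca or $cert" >&2; return 2; }
  # Confirm that the certificate was signed by this CA.
  openssl verify -CAfile "$ca" "$cert" || return 1
  # Print the Subject Alternative Name section, which should list the node IPs.
  openssl x509 -in "$cert" -noout -text | grep -A1 'Subject Alternative Name'
}

# Only run when the files generated above are in the current directory.
if [ -r ca.pem ] && [ -r server.pem ]; then
  verify_cert ca.pem server.pem
fi
```

If an IP that etcd listens on is missing from the SAN list, clients connecting to that address will typically fail TLS verification.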
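In the setup above, client, server and intra-cluster peer communication all share one certificate. In practice the certificate can be split into several, each with its own purpose and expiry, for example:
## CA signing configuration; several certificate types are defined here
[root@cd782d0a790b cert]# cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "168h"
    },
    "profiles": {
      "server": {
        "expiry": "8760h",
        "usages": [ "signing", "key encipherment", "server auth" ]
      },
      "client": {
        "expiry": "8760h",
        "usages": [ "signing", "key encipherment", "client auth" ]
      },
      "peer": {
        "expiry": "8760h",
        "usages": [ "signing", "key encipherment", "server auth", "client auth" ]
      }
    }
  }
}
EOF
Profile types
Three profiles are defined:
"server": the server certificate used when a server talks to clients
"client": the client certificate used when a client talks to a server
"peer": the certificate used between servers; it authenticates both the server and the client side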
cat > ca-csr.json << EOF
{
  "CN": "etcd CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    { "C": "CN", "ST": "Beijing", "L": "Beijing" }
  ]
}
EOF
## Per-node variant: each node uses its own peer certificate. The file names must differ, "hosts" must contain the node's own IP (172.17.0.4 for etcd1), and a request file must be created for every machine
[root@cd782d0a790b cert]# cat > etcd1-csr.json << EOF
{
  "CN": "etcd1",
  "hosts": [ "172.17.0.4" ],
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "aa.com", "CN": "beijing.aa.com" }
  ]
}
EOF
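Since one request file is needed per node, writing them by hand invites copy-paste mistakes. Here is a minimal sketch that generates them in a loop; the function name `write_csr` and the temporary output directory are assumptions, and the name-to-IP mapping is taken from the environment table at the top.

```shell
# Hypothetical helper: write <name>-csr.json with the node's own IP in "hosts".
write_csr() {
  local name="$1" ip="$2" dir="${3:-.}"
  cat > "${dir}/${name}-csr.json" <<EOF
{
  "CN": "${name}",
  "hosts": [ "${ip}" ],
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "aa.com", "CN": "beijing.aa.com" }
  ]
}
EOF
}

# Write the three request files into a scratch directory and list them.
outdir="$(mktemp -d)"
write_csr etcd1 172.17.0.4 "$outdir"
write_csr etcd2 172.17.0.3 "$outdir"
write_csr etcd3 172.17.0.2 "$outdir"
ls "$outdir"
```

Each resulting file can then be passed to `cfssl gencert` with `-profile=peer`.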
Once all request files are in place:
## Generate the CA certificate and key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
## Generate the etcd certificates and keys; the value of -profile must match a profile name defined under "profiles" in ca-config.json
for i in $(seq 1 5); do cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd${i}-csr.json | cfssljson -bare etcd${i}; done
[root@cd782d0a790b ssl]# ls
ca-config.json ca.csr etcd1-key.pem etcd2-csr.json etcd2.pem etcd3.csr etcd4-key.pem etcd5-csr.json etcd5.pem
ca-csr.json ca.pem etcd1.csr etcd2-key.pem etcd3-csr.json etcd3.pem etcd4.csr etcd5-key.pem server.pem
ca-key.pem etcd1-csr.json etcd1.pem etcd2.csr etcd3-key.pem etcd4-csr.json etcd4.pem etcd5.csr
## Grant read permission
chmod 644 *.pem
If each server has its own certificate, then the etcd configuration below, as well as the list and status commands, should all point to the local node's certificate.
Modify etcd.conf
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://172.17.0.4:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.17.0.4:2379,https://127.0.0.1:2379"
ETCD_NAME="etcd1"
ETCD_SNAPSHOT_COUNT="10000"
ETCD_HEARTBEAT_INTERVAL="250"
ETCD_ELECTION_TIMEOUT="5000"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.17.0.4:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://172.17.0.4:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://172.17.0.4:2380,etcd2=https://172.17.0.3:2380,etcd3=https://172.17.0.2:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_PROXY="off"
## Certificate and key for client-to-server communication
ETCD_CERT_FILE="/data/cert/ssl/etcd1.pem"
ETCD_KEY_FILE="/data/cert/ssl/etcd1-key.pem"
ETCD_CLIENT_CERT_AUTH="true"
## CA certificate
ETCD_TRUSTED_CA_FILE="/data/cert/ssl/ca.pem"
## Certificate and key for communication between cluster members
ETCD_PEER_CERT_FILE="/data/cert/ssl/etcd1.pem"
ETCD_PEER_KEY_FILE="/data/cert/ssl/etcd1-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/data/cert/ssl/ca.pem"
ETCD_AUTO_COMPACTION_RETENTION="8"
ETCD_METRICS="basic"
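Change every http to https, then set the certificate paths.
Restart the service
systemctl restart etcd
## On restart, errors like the following may appear:
request sent was ignored (cluster ID mismatch: peer[61c68880c0fd8e67]=47ca0413c1aaf745, local=755bf44e2e1770ae)
or
publish error: etcdserver: request timed out
## Cause: the HTTP-based cluster was started earlier and left data behind, and this stale data triggers the errors. Delete the data on all nodes, then restart.
rm -rf /var/lib/etcd/default.etcd/*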
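Deleting the data directory outright is irreversible. As a more cautious alternative, the sketch below renames the directory with a timestamp suffix instead of removing it. This is a swapped-in variation on the `rm -rf` step above, not what the original procedure does; the function name `set_aside_data_dir` is hypothetical, and it assumes etcd recreates the missing data directory on start, which requires the parent directory to be writable by the etcd user.

```shell
# Hypothetical helper: move the data directory out of the way instead of deleting it.
# Prints the path of the renamed directory on success.
set_aside_data_dir() {
  local data_dir="$1" backup
  [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
  backup="${data_dir}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$data_dir" "$backup" && echo "$backup"
}

# Demonstration on a scratch directory rather than the real /var/lib/etcd.
scratch="$(mktemp -d)"
mkdir "$scratch/default.etcd"
set_aside_data_dir "$scratch/default.etcd"
```

On a real node the call would be `set_aside_data_dir /var/lib/etcd/default.etcd`, run while etcd is stopped, on every member.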
Check the status
export ETCDCTL_API=3
HOST_1=https://172.17.0.2
HOST_2=https://172.17.0.3
HOST_3=https://172.17.0.4
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
## list
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" member list --write-out=table
+------------------+---------+-------+-------------------------+-------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+-------+-------------------------+-------------------------+
| 37ab29a4575d84d2 | started | etcd3 | https://172.17.0.2:2380 | https://172.17.0.2:2379 |
| 3e6a29fd4717a78b | started | etcd2 | https://172.17.0.3:2380 | https://172.17.0.3:2379 |
| 653155eddc689793 | started | etcd1 | https://172.17.0.4:2380 | https://172.17.0.4:2379 |
+------------------+---------+-------+-------------------------+-------------------------+
## status
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" endpoint status --write-out=table
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.17.0.2:2379 | 37ab29a4575d84d2 | 3.3.11 | 20 kB | false | 1064 | 139 |
| https://172.17.0.3:2379 | 3e6a29fd4717a78b | 3.3.11 | 20 kB | true | 1064 | 139 |
| https://172.17.0.4:2379 | 653155eddc689793 | 3.3.11 | 20 kB | false | 1064 | 139 |
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
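Every TLS-enabled command repeats the same four flags. A small wrapper keeps later commands short; the sketch below is one way to do it. The names `etcdctl_tls`, `ETCDCTL_BIN`, `CERT_DIR` and `NODE_NAME` are assumptions made for illustration, while the flags themselves are the ones used above.

```shell
# Hypothetical wrapper around etcdctl that adds the TLS flags used in this section.
ETCDCTL_BIN="${ETCDCTL_BIN:-etcdctl}"   # overridable, e.g. with `echo` for a dry run
CERT_DIR="${CERT_DIR:-/data/cert/ssl}"
NODE_NAME="${NODE_NAME:-etcd1}"
ENDPOINTS="${ENDPOINTS:-https://172.17.0.2:2379,https://172.17.0.3:2379,https://172.17.0.4:2379}"

etcdctl_tls() {
  ETCDCTL_API=3 "$ETCDCTL_BIN" \
    --endpoints="$ENDPOINTS" \
    --cacert="${CERT_DIR}/ca.pem" \
    --cert="${CERT_DIR}/${NODE_NAME}.pem" \
    --key="${CERT_DIR}/${NODE_NAME}-key.pem" \
    "$@"
}

# Usage (requires a running cluster):
#   etcdctl_tls member list --write-out=table
#   etcdctl_tls endpoint status --write-out=table
```

Setting `ETCDCTL_BIN=echo` prints the assembled command line instead of running it, which is handy for checking the paths before touching the cluster.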
3. Adding a node to the etcd cluster
Add the node with member add
## add
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" member add etcd4 --peer-urls=https://172.17.0.5:2380
Member 71f4582f1c4ba901 added to cluster a89c967de8e14b61
ETCD_NAME="etcd4"
ETCD_INITIAL_CLUSTER="etcd3=https://172.17.0.2:2380,etcd2=https://172.17.0.3:2380,etcd1=https://172.17.0.4:2380,etcd4=https://172.17.0.5:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.17.0.5:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
## list
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" member list --write-out=table
+------------------+-----------+-------+-------------------------+-------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+-----------+-------+-------------------------+-------------------------+
| 37ab29a4575d84d2 | started | etcd3 | https://172.17.0.2:2380 | https://172.17.0.2:2379 |
| 3e6a29fd4717a78b | started | etcd2 | https://172.17.0.3:2380 | https://172.17.0.3:2379 |
| 653155eddc689793 | started | etcd1 | https://172.17.0.4:2380 | https://172.17.0.4:2379 |
| e321a980939fe867 | unstarted | | https://172.17.0.5:2380 | |
+------------------+-----------+-------+-------------------------+-------------------------+
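Note: after adding a node, the cluster must be back in a healthy state (the new member started and joined) before the next node can be added; otherwise the command fails with an error such as: Error: etcdserver: unhealthy cluster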
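When adding several nodes from a script, one way to respect this rule is to poll the health check until it passes before issuing the next `member add`. The following is a minimal sketch of a generic retry loop; the function name `retry_until_ok` is hypothetical, and it assumes that `etcdctl endpoint health` exits with a non-zero status while any endpoint is unhealthy.

```shell
# Hypothetical helper: run a command until it succeeds.
#   $1 = maximum number of attempts, $2 = seconds to sleep between attempts,
#   remaining arguments = the command to run.
retry_until_ok() {
  local tries="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Self-contained demonstration with a command that always succeeds.
retry_until_ok 3 0 true && echo "check passed"

# Usage against a cluster (not run here):
#   retry_until_ok 30 2 etcdctl --endpoints=$ENDPOINTS --cacert=... --cert=... --key=... endpoint health \
#     && etcdctl ... member add etcd5 --peer-urls=...
```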
The final configuration file for etcd4 is as follows
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://172.17.0.5:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.17.0.5:2379,https://127.0.0.1:2379"
ETCD_NAME="etcd4"
ETCD_SNAPSHOT_COUNT="10000"
ETCD_HEARTBEAT_INTERVAL="250"
ETCD_ELECTION_TIMEOUT="5000"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.17.0.5:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://172.17.0.5:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://172.17.0.4:2380,etcd2=https://172.17.0.3:2380,etcd3=https://172.17.0.2:2380,etcd4=https://172.17.0.5:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_PROXY="off"
ETCD_CERT_FILE="/data/cert/ssl/etcd4.pem"
ETCD_KEY_FILE="/data/cert/ssl/etcd4-key.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/data/cert/ssl/ca.pem"
ETCD_PEER_CERT_FILE="/data/cert/ssl/etcd4.pem"
ETCD_PEER_KEY_FILE="/data/cert/ssl/etcd4-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/data/cert/ssl/ca.pem"
ETCD_AUTO_COMPACTION_RETENTION="8"
ETCD_METRICS="basic"
Start etcd4 and check the cluster status
systemctl start etcd
export ETCDCTL_API=3
HOST_1=https://172.17.0.2
HOST_2=https://172.17.0.3
HOST_3=https://172.17.0.4
HOST_4=https://172.17.0.5
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379,$HOST_4:2379
## list
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" member list --write-out=table
+------------------+---------+-------+-------------------------+-------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+-------+-------------------------+-------------------------+
| 37ab29a4575d84d2 | started | etcd3 | https://172.17.0.2:2380 | https://172.17.0.2:2379 |
| 3e6a29fd4717a78b | started | etcd2 | https://172.17.0.3:2380 | https://172.17.0.3:2379 |
| 653155eddc689793 | started | etcd1 | https://172.17.0.4:2380 | https://172.17.0.4:2379 |
| e321a980939fe867 | started | etcd4 | https://172.17.0.5:2380 | https://172.17.0.5:2379 |
+------------------+---------+-------+-------------------------+-------------------------+
## status
etcdctl --endpoints=$ENDPOINTS --cacert="/data/cert/ssl/ca.pem" --cert="/data/cert/ssl/etcd1.pem" --key="/data/cert/ssl/etcd1-key.pem" endpoint status --write-out=table
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.17.0.2:2379 | 37ab29a4575d84d2 | 3.3.11 | 20 kB | false | 1066 | 159 |
| https://172.17.0.3:2379 | 3e6a29fd4717a78b | 3.3.11 | 20 kB | false | 1066 | 159 |
| https://172.17.0.4:2379 | 653155eddc689793 | 3.3.11 | 20 kB | true | 1066 | 159 |
| https://172.17.0.5:2379 | e321a980939fe867 | 3.3.11 | 20 kB | false | 1066 | 159 |
+-------------------------+------------------+---------+---------+-----------+-----------+------------+
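A point worth keeping in mind when growing the cluster: etcd needs a majority of members (a quorum) to commit writes, so going from three to four members does not increase the number of failures the cluster can survive. The arithmetic can be sketched as follows; the function names are assumptions made for illustration.

```shell
# Majority arithmetic: quorum = floor(n/2) + 1, tolerated failures = n - quorum.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

This is why odd cluster sizes are usually preferred: the fourth member adds load without adding fault tolerance, while a fifth raises the tolerated failures to two.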
4. Backing up and restoring etcd cluster data
Backup
## Environment setup
export ETCDCTL_API=3
kubectl get nodes -o wide
HOST_1=https://10.36.234.169
HOST_2=https://10.36.234.180
HOST_3=https://10.36.235.19
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
## Backup
etcdctl --endpoints=$ENDPOINTS --cacert="/etc/ssl/etcd/ssl/ca.pem" --cert="/etc/ssl/etcd/ssl/member-gzbh-intelmbx043.gzbh.baidu.com.pem" --key="/etc/ssl/etcd/ssl/member-gzbh-intelmbx043.gzbh.baidu.com-key.pem" snapshot save my.db
Snapshot saved at my.db
## Check the result
[root@gzbh-intelmbx043 etcd_data]# ls
my.db
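A fixed file name such as my.db is overwritten by every backup. The sketch below shows one possible naming scheme with a timestamp, plus a basic sanity check that the file exists and is non-empty. Both function names and the file name pattern are assumptions; a non-empty file is only a minimal check, and `etcdctl snapshot status <file>` gives more detail on a real snapshot.

```shell
# Hypothetical helper: build a timestamped snapshot file name,
# e.g. etcd-snapshot-20210525-160502.db
snapshot_name() {
  echo "etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
}

# Hypothetical helper: succeed only if the snapshot file exists and is not empty.
snapshot_looks_ok() {
  [ -s "$1" ]
}

echo "next snapshot file: $(snapshot_name)"

# Usage with the backup command above (not run here):
#   f="$(snapshot_name)"
#   etcdctl --endpoints=... --cacert=... --cert=... --key=... snapshot save "$f" \
#     && snapshot_looks_ok "$f" \
#     && etcdctl snapshot status "$f" --write-out=table
```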
Restore
## Stop the etcd service
systemctl stop etcd
## Delete the old data (back it up first if it matters!)
rm -rf /var/lib/etcd
## Restore; in a multi-node cluster, the snapshot must be restored on every node
etcdctl --endpoints=https://10.61.187.39:2379 --cacert="/etc/ssl/etcd/ssl/ca.pem" --cert="/etc/ssl/etcd/ssl/member-yq01-aip-aikefu06e1a866.yq01.baidu.com.pem" --key="/etc/ssl/etcd/ssl/member-yq01-aip-aikefu06e1a866.yq01.baidu.com-key.pem" snapshot restore my.db --name=etcd1 --initial-cluster etcd1=https://10.61.187.39:2380 --initial-cluster-token etcd_test --initial-advertise-peer-urls https://10.61.187.39:2380 --data-dir=/var/lib/etcd/
2021-05-25 16:05:02.784608 I | mvcc: restore compact to 6104817
2021-05-25 16:05:02.802119 I | etcdserver/membership: added member 67745b5848ce7e3c [https://10.61.187.39:2380] to cluster 1256ee7f1ba66254
## Then start the service
systemctl start etcd
Note: backing up and restoring data is a sensitive operation, so proceed with great care!
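For a multi-node cluster, each member runs the restore locally with its own name and peer URL, while the member list and token stay identical everywhere. The sketch below only prints the commands so they can be reviewed first. The function name `restore_cmd` is hypothetical, the member list is assumed to be the three-node HTTPS cluster from section 2, and the TLS flags are left out on the assumption that `snapshot restore` works on the local file without contacting a server.

```shell
# Assumed member list, in the same "name=peer-url" format as --initial-cluster.
MEMBERS="etcd1=https://172.17.0.4:2380,etcd2=https://172.17.0.3:2380,etcd3=https://172.17.0.2:2380"

# Hypothetical helper: print the restore command for one member.
restore_cmd() {
  local name="$1" peer_url="$2"
  echo "ETCDCTL_API=3 etcdctl snapshot restore my.db" \
       "--name=${name}" \
       "--initial-cluster ${MEMBERS}" \
       "--initial-cluster-token etcd_test" \
       "--initial-advertise-peer-urls ${peer_url}" \
       "--data-dir=/var/lib/etcd/"
}

# One command per member; each is meant to be run on that member only.
echo "$MEMBERS" | tr ',' '\n' | while IFS='=' read -r name peer_url; do
  restore_cmd "$name" "$peer_url"
done
```

All members should restore from the same snapshot file; start etcd on each node only after the restore has finished there.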