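I. Environment Preparation
Servers: 4 (docker-ce installed on each)
OS: CentOS 7 (make sure you are running CentOS 7 or later)
Memory: 2 GB (minimum)
CPU: 1 core
II. Installation Steps
Tip:
In Xshell, right-click in the terminal and check [Send Key Input to All Sessions]; every command you type then runs on all connected machines at the same time.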
1. Install the gcc toolchain (make sure the virtual machine has internet access)
yum -y install gcc gcc-c++
2. Remove old versions
yum remove docker \
    docker-client \
    docker-client-latest \
    docker-common \
    docker-latest \
    docker-latest-logrotate \
    docker-logrotate \
    docker-engine
3. Install the required packages
yum install -y yum-utils
4. Add the Docker CE repository (Aliyun mirror)
yum-config-manager \
    --add-repo \
    https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
5. Update the yum package index
yum makecache fast
6. Install Docker CE
yum install -y docker-ce docker-ce-cli containerd.io
7. Start Docker
systemctl start docker
8. Test
docker version
docker run hello-world
docker images
docker ps -a
9. Uninstall (only when you need to remove Docker)
systemctl stop docker
yum -y remove docker-ce docker-ce-cli containerd.io
rm -rf /var/lib/docker
10. Configure a registry mirror
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://registry.docker-cn.com"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
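III. Building the Swarm Cluster
Official documentation: https://docs.docker.com/engine/swarm/how-swarm-mode-works/nodes/
(1) Requirements:
♦ At least 3 Manager nodes (with only 2, the cluster becomes unusable as soon as one of them goes down; see the experiment in Section IV below)
(2) How the cluster works:
1. A cluster has two kinds of nodes: Manager (management node) and worker (work node)
2. Manager nodes replicate the cluster state among themselves; worker nodes take no part in this management traffic
3. Manager nodes manage worker nodes; a worker cannot manage a Manager
4. All management commands must be run on a Manager; a worker cannot run them
(3) Build the cluster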
On docker-1:
1. List the networks
docker network ls
2. View the Swarm commands
[root@localhost ~]# docker swarm --help

Usage:  docker swarm COMMAND

Manage Swarm

Commands:
  ca          Display and rotate the root CA
  init        Initialize a swarm
  join        Join a swarm as a node and/or manager
  join-token  Manage join tokens
  leave       Leave the swarm
  unlock      Unlock swarm
  unlock-key  Manage the unlock key
  update      Update the swarm

Run 'docker swarm COMMAND --help' for more information on a command.
3. Initialize the cluster
[root@localhost ~]# docker swarm init --help

Usage:  docker swarm init [OPTIONS]

Initialize a swarm

Options:
      --advertise-addr string   Advertised address (format: <ip|interface>[:port])
# --advertise-addr is the key option here: it is the address this node advertises, i.e. the address other nodes use to reach it.
The address can be on either kind of network:
Private network (traffic stays off the internet, so access is fast and latency is low)
Public network (traffic goes over the internet)
4. Look up the server IP
[root@localhost ~]# ip addr
inet 192.168.1.230/24 brd 192.168.1.255 scope global ens32
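Instead of copying the address by hand, the IPv4 address of an interface can be extracted with a small helper. This is only a sketch: the function name `first_ipv4` is made up for illustration, and it assumes the one-line-per-address layout printed by `ip -o -4 addr show`, where the fourth field is `address/prefix`.

```shell
# Hypothetical helper (not part of Docker or iproute2):
# reads `ip -o -4 addr show` output on stdin and prints the first
# IPv4 address without its /prefix length.
first_ipv4() {
  awk '{ split($4, parts, "/"); print parts[1]; exit }'
}

# Intended use on the future manager (ens32 is the interface shown above):
#   ip -o -4 addr show dev ens32 | first_ipv4
#   docker swarm init --advertise-addr "$(ip -o -4 addr show dev ens32 | first_ipv4)"
```

On a machine with several interfaces, pass the interface that the other nodes can actually reach.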
5. Configure the Manager node
[root@localhost ~]# docker swarm init --advertise-addr 192.168.1.230
Swarm initialized: current node (bbksnbwq0aq96ap3z6s1znk09) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-f1o9oknc8uqazyay6tb8d8oqf 192.168.1.230:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
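6. Get the join tokens:
These commands only work on a Manager; a worker cannot run them.
docker swarm join-token manager    # print the join command (with token) for a manager
docker swarm join-token worker     # print the join command (with token) for a worker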
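`docker swarm join-token` also accepts `-q` (`--quiet`), which prints only the token. The sketch below assembles the full join command from a token and a manager address; `build_join_cmd` is a hypothetical helper name, and 2377 is the port that appears in the `docker swarm init` output above.

```shell
# Hypothetical helper: print the join command for a token ($1)
# and a manager IP ($2). It only prints; it does not join anything.
build_join_cmd() {
  echo "docker swarm join --token $1 $2:2377"
}

# Intended use on a manager node:
#   build_join_cmd "$(docker swarm join-token -q worker)" 192.168.1.230
```

Printing the command rather than running it makes it easy to paste into the session of the node that should join.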
On docker-2:
1. Join docker-2 to the swarm cluster
[root@localhost ~]# docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-f1o9oknc8uqazyay6tb8d8oqf 192.168.1.230:2377
Error:
If the command succeeds without an error, skip this part.
Error message:
Error response from daemon: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 192.168.1.230:2377: connect: no route to host"
Cause:
The join request cannot reach the manager: the firewall on the manager machine is still running and blocks port 2377.
Fix:
♦ Turn off the firewall on the Manager and worker servers
(1) Check the firewall status on the manager node
systemctl status firewalld.service
(2) Stop the firewall
systemctl stop firewalld.service
(3) Disable the firewall permanently (so it does not start again at boot)
systemctl disable firewalld.service
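Turning the firewall off is the quickest fix for a lab setup. Outside a lab it may be preferable to keep firewalld running and open only the ports that Docker's swarm documentation lists: 2377/tcp (cluster management), 7946/tcp and 7946/udp (communication among nodes), and 4789/udp (overlay network traffic). A sketch under that assumption; `swarm_firewall_cmds` is a made-up helper that only prints the commands so they can be reviewed before being applied:

```shell
# Hypothetical helper: print the firewall-cmd calls that open the
# swarm ports permanently, followed by a reload. Nothing is executed.
swarm_firewall_cmds() {
  for port in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    echo "firewall-cmd --permanent --add-port=${port}"
  done
  echo "firewall-cmd --reload"
}

# To apply on a node, as root:
#   swarm_firewall_cmds | sh
```

The same rules would be needed on every node in the cluster, managers and workers alike.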
2. Run the join command again
[root@localhost ~]# docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-f1o9oknc8uqazyay6tb8d8oqf 192.168.1.230:2377
This node joined a swarm as a worker.
On docker-1:
1. Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Leader           20.10.5
The MANAGER STATUS column shows:
Leader marks the leading Manager node; an empty value marks a worker node.
2. Print the worker token on docker-1 (it is the same token that was shown at initialization)
[root@localhost ~]# docker swarm join-token worker
To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-f1o9oknc8uqazyay6tb8d8oqf 192.168.1.230:2377
On docker-3:
Join docker-3 to the swarm cluster
[root@localhost ~]# docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-f1o9oknc8uqazyay6tb8d8oqf 192.168.1.230:2377
This node joined a swarm as a worker.
On docker-1:
1. Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Ready     Active                          20.10.5          # docker-3
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5          # docker-2
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Leader           20.10.5          # docker-1
2. Print the Manager token
[root@localhost ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-coihmyn50k2y7d156xq0l87m3 192.168.1.230:2377
On docker-4:
Join docker-4 to the swarm cluster as a Manager node
[root@localhost ~]# docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-coihmyn50k2y7d156xq0l87m3 192.168.1.230:2377
This node joined a swarm as a manager.
On docker-1:
1. Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Ready     Active                          20.10.5          # docker-3
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5          # docker-2
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Leader           20.10.5          # docker-1
ww14ifp4ki9gyw1gdhvzgbzff     localhost.localdomain   Ready     Active         Reachable        20.10.5          # docker-4
The MANAGER STATUS column shows:
Leader marks the leading Manager node. Reachable marks a Manager that follows the Leader; if the Leader goes down, a Reachable manager can be elected Leader in its place, provided a majority of managers is still online.
An empty value marks a worker node.
The swarm cluster is now up. However, a layout with two managers and two workers gains nothing: if either Manager goes down, the cluster stops working. Three Manager nodes are needed.
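To check how many managers a swarm currently has, the MANAGER STATUS column can be counted. A minimal sketch, assuming the `{{.ID}}` and `{{.ManagerStatus}}` placeholders of `docker node ls --format`; `count_managers` is a hypothetical helper name:

```shell
# Hypothetical helper: reads lines of "<node-id> <manager-status>" on stdin.
# The status field is empty for workers, so only manager lines have
# two or more whitespace-separated fields.
count_managers() {
  awk 'NF >= 2 { n++ } END { print n + 0 }'
}

# Intended use on a manager node:
#   docker node ls --format '{{.ID}} {{.ManagerStatus}}' | count_managers
```

If the count is below three, a worker can also be turned into a manager in place with `docker node promote <NODE-ID>`, run from an existing manager.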
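IV. The Raft Consensus Protocol
Raft protocol:
The cluster is usable only while a majority of the Manager nodes is alive. To survive the loss of 1 manager, the cluster therefore needs at least 3 managers.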
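The arithmetic behind this rule: with N managers, a majority is floor(N/2) + 1, so the swarm keeps working after losing at most floor((N-1)/2) managers. A small sketch (the function names are illustrative only):

```shell
# Majority needed among N managers.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of managers that can fail while a majority remains.
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 2 3 4 5; do
  echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Two managers tolerate zero failures, the same as one, and four tolerate one failure, the same as three. This is why an odd number of managers is the usual choice.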
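1. Question:
There are currently 2 Manager nodes. If one of them goes down, is the cluster still usable?
2. Experiment:
(1) Stop Docker on docker-1 to simulate an outage. With two managers and two workers, the remaining manager becomes unusable as well.
On docker-1:
Stop Docker
[root@localhost ~]# systemctl stop docker
On docker-4:
Check the node status
[root@localhost ~]# docker node ls
Error response from daemon: rpc error: code = Unknown desc = The swarm does not have a leader. It's possible that too few managers are online. Make sure more than half of the managers are online.
# Too few managers are online: only one of the two is left, which is not more than half.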
On docker-1:
(1) Start Docker on docker-1
[root@localhost ~]# systemctl start docker
(2) Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Unknown   Active                          20.10.5
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Reachable        20.10.5
ww14ifp4ki9gyw1gdhvzgbzff     localhost.localdomain   Ready     Active         Leader           20.10.5
As the output shows, docker-1 is now Reachable and docker-4 has become the Leader.
On docker-3:
Remove docker-3 from the cluster so that it can rejoin later
[root@localhost ~]# docker swarm leave
Node left the swarm.
On docker-1:
Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Down      Active                          20.10.5
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Reachable        20.10.5
ww14ifp4ki9gyw1gdhvzgbzff     localhost.localdomain   Ready     Active         Leader           20.10.5
docker-3 is now shown as Down.
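A node that has left the swarm stays in the list as Down until it is removed. A sketch for finding such entries, assuming the `{{.ID}}` and `{{.Status}}` placeholders of `docker node ls --format`; `down_node_ids` is a hypothetical helper name:

```shell
# Hypothetical helper: reads lines of "<node-id> <status>" on stdin
# and prints the IDs of nodes whose status is Down.
down_node_ids() {
  awk '$2 == "Down" { print $1 }'
}

# Intended use on a manager node:
#   docker node ls --format '{{.ID}} {{.Status}}' | down_node_ids
# Each printed ID can then be removed with:
#   docker node rm <NODE-ID>
```

Removing the stale entry is optional for this walkthrough; the listings that follow still include it.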
3. Add the removed docker-3 back to the cluster as a Manager node
On docker-1:
(1) Print the manager token on docker-1
[root@localhost ~]# docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-coihmyn50k2y7d156xq0l87m3 192.168.1.230:2377
On docker-3:
(2) Run the printed join command on docker-3
docker swarm join --token SWMTKN-1-0dnpje2x2vv04k51oc76styuwoe9y2ddeonh3ie0zpi4naviwn-coihmyn50k2y7d156xq0l87m3 192.168.1.230:2377
Then check the node status (the * in the listing below marks docker-1, so it was taken there):
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Down      Active                          20.10.5
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5
bbksnbwq0aq96ap3z6s1znk09 *   localhost.localdomain   Ready     Active         Reachable        20.10.5
ww14ifp4ki9gyw1gdhvzgbzff     localhost.localdomain   Ready     Active         Leader           20.10.5
xttcuf3iu5cvirxwwiydi3808     localhost.localdomain   Ready     Active         Reachable        20.10.5          # docker-3
As the output shows, docker-3 has rejoined under a new ID and is now Reachable as well. The old docker-3 entry remains listed as Down.
4. If docker-1 is stopped now, is the cluster still usable?
On docker-1:
Stop Docker on docker-1
[root@localhost ~]# systemctl stop docker
On docker-3:
Check the node status
[root@localhost ~]# docker node ls
ID                            HOSTNAME                STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
2giiey95hal5fv2rgfmx3souu     localhost.localdomain   Down      Active                          20.10.5
65ohc7a6u28qlanpb2tygpd8k     localhost.localdomain   Ready     Active                          20.10.5
bbksnbwq0aq96ap3z6s1znk09     localhost.localdomain   Down      Active         Unreachable      20.10.5
ww14ifp4ki9gyw1gdhvzgbzff     localhost.localdomain   Ready     Active         Leader           20.10.5
xttcuf3iu5cvirxwwiydi3808 *   localhost.localdomain   Ready     Active         Reachable        20.10.5
docker-1 is now shown as Unreachable, yet the cluster keeps working: two of the three managers are still online, which is a majority. This is the high availability we were after.
If docker-3 is stopped as well, only one of three managers remains and the cluster becomes unusable.