Current cluster state:
There are five nodes, each with 16 cores and 32 GB of memory; data nodes have a 1 TB SSD and non-data nodes have a 100 GB SSD.
Role plan:
node1 tidb pd
node2 tidb pd
node3 tikv pd
node4 tikv
node5 tikv
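1. Every operation requires editing the inventory.ini configuration file, and everything is run as the tidb user.
2. Initialization: set up passwordless SSH login for the tidb user between the new nodes and the existing nodes.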
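One way to approach step 2 is sketched below. It assumes the tidb user on the control machine already has an SSH key pair and that ssh-copy-id is installed; `run` and `setup_ssh_trust` are hypothetical helper names, and with DRY_RUN left at its default the script only prints what it would execute.

```shell
#!/bin/sh
# Hypothetical helpers (not part of tidb-ansible).
# run CMD...: print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_ssh_trust HOST...: push the current user's public key to tidb@HOST,
# then confirm that key-based login works.
setup_ssh_trust() {
    for host in "$@"; do
        run ssh-copy-id "tidb@${host}"
        # BatchMode=yes makes ssh fail instead of prompting for a password
        run ssh -o BatchMode=yes -o ConnectTimeout=5 "tidb@${host}" true
    done
}
```

Calling `setup_ssh_trust 10.80.xxx.xxx` with the real address prints the two commands for that host; setting DRY_RUN=0 runs them.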
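3. Scale out by two tidb nodes
Add the new hosts to inventory.ini:
## TiDB Cluster Part
[tidb_servers]
10.15.xxx.xxx --- existing tidb --- UC machine
10.15.xxx.xxx --- existing tidb --- UC machine
10.80.xxx.xxx --- newly added tidb --- Alibaba machine
10.80.xxx.xxx --- newly added tidb --- Alibaba machine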
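Before running a playbook with -l, it can help to confirm that the new addresses really appear under the intended group. The following is a minimal sketch, assuming an INI-style inventory with one host per line; `list_group` is a hypothetical helper name, not something tidb-ansible provides.

```shell
#!/bin/sh
# list_group FILE GROUP: print the hosts listed under [GROUP] in an INI inventory.
list_group() {
    awk -v group="[$2]" '
        /^\[/ { in_group = ($1 == group); next }    # section header: enter or leave the group
        in_group && NF && $1 !~ /^#/ { print $1 }   # first field of a non-comment line is the host
    ' "$1"
}
```

For example, `list_group inventory.ini tidb_servers` prints one address per line, ignoring anything that follows the address on the same line.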
3.1 Deploy
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook deploy.yml -l 10.80.xxx.xxx,10.80.xxx.xxx
...
PLAY RECAP ***********************************************************************************************************************************************************************************************
10.80.249.46 : ok=30 changed=12 unreachable=0 failed=0
10.80.249.47 : ok=30 changed=12 unreachable=0 failed=0
Congrats! All goes well. :-)
3.2 Start the new tidb nodes
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook start.yml -l 10.80.xxx.xxx,10.80.xxx.xxx
3.3 Update monitoring
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook rolling_update_monitor.yml --tags=prometheus
PLAY RECAP ***************************************************************************************************************************
10.15.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
10.15.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
10.15.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
10.15.xxx.xxx : ok=30 changed=8 unreachable=0 failed=0
10.15.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
10.80.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
10.80.xxx.xxx : ok=10 changed=0 unreachable=0 failed=0
localhost : ok=1 changed=0 unreachable=0 failed=0
Congrats! All goes well. :-)
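The order of the playbooks in section 3 can be captured in a small dry-run helper. This is only a sketch: `join_hosts` and `scale_out_tidb_cmds` are hypothetical names, and using start.yml for the start step is an assumption that mirrors step 4.2. The script prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# join_hosts HOST...: join the arguments with commas, the form "ansible-playbook -l" takes.
join_hosts() {
    ( IFS=,; echo "$*" )
}

# scale_out_tidb_cmds HOST...: print the playbook commands for the new tidb hosts, in order.
scale_out_tidb_cmds() {
    limit=$(join_hosts "$@")
    echo "ansible-playbook deploy.yml -l ${limit}"                        # 3.1 deploy
    echo "ansible-playbook start.yml -l ${limit}"                         # 3.2 start (assumed)
    echo "ansible-playbook rolling_update_monitor.yml --tags=prometheus"  # 3.3 update monitoring
}
```

Running `scale_out_tidb_cmds` with the two new addresses from inside ~/tidb-ansible prints three lines that can be pasted one at a time, checking each PLAY RECAP for failed=0 before continuing.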
4. Scale out the tikv nodes
[tikv_servers]
10.15.xxx.xxx
10.15.xxx.xxx
10.15.xxx.xxx
10.80.xxx.xxx
10.80.xxx.xxx
10.80.xxx.xxx
4.1 tikv nodes are data nodes; before scaling out, format, partition, and mount the SSD. The /data entry in /etc/fstab uses the nodelalloc and noatime options:
root@tikv.11.tidb.prod.ali:~/.ssh# vi /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sun Oct 15 15:19:00 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=eb448abb-3012-4d8d-bcde-94434d586a31 / ext4 defaults 1 1
#/dev/vdb /data ext4 defaults,noatime 0 0
/dev/vdb /data ext4 defaults,nodelalloc,noatime 0 0
root@tikv.11.tidb.prod.ali:~/.ssh#
root@tikv.11.tidb.prod.ali:~/.ssh# umount /data
root@tikv.11.tidb.prod.ali:~/.ssh# mount -a
root@tikv.11.tidb.prod.ali:~/.ssh# mount -t ext4
/dev/vda1 on / type ext4 (rw,relatime,data=ordered)
/dev/vdb on /data type ext4 (rw,noatime,nodelalloc,data=ordered)
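The remount check at the end of step 4.1 can be automated. The sketch below reads `mount` output on standard input and succeeds only if the given mount point carries every requested option; `has_mount_opts` is a hypothetical helper name, and the field positions assume the Linux `mount` output format shown above.

```shell
#!/bin/sh
# has_mount_opts MOUNTPOINT OPT...: read "mount" output on stdin; return 0 only if
# MOUNTPOINT is mounted with every listed option.
has_mount_opts() {
    mp=$1
    shift
    # Lines look like: /dev/vdb on /data type ext4 (rw,noatime,nodelalloc,data=ordered)
    opts=$(awk -v mp="$mp" '$2 == "on" && $3 == mp { gsub(/[()]/, "", $6); print $6 }')
    [ -n "$opts" ] || return 1            # mount point not found
    for want in "$@"; do
        case ",${opts}," in
            *",${want},"*) ;;             # option present, keep checking
            *) return 1 ;;                # option missing
        esac
    done
}
```

On a new tikv host, `mount -t ext4 | has_mount_opts /data nodelalloc noatime && echo ok` prints ok only when both options are in effect.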
4.2 Add the tikv nodes, start the services, and update monitoring
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook bootstrap.yml -l 10.80.249.59,10.80.249.60,10.80.249.58
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook start.yml -l 10.80.249.59,10.80.249.60,10.80.249.58
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook rolling_update_monitor.yml --tags=prometheus
5. Add a pd node (pd nodes must be added one at a time)
[pd_servers]
10.15.xxx.xxx
10.15.xxx.xxx
10.15.xxx.xxx
10.80.xxx.xxx
5.1 Initialize the new pd node
[tidb@tidb.01.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook bootstrap.yml -l 10.80.xxx.xxx
5.2 Deploy to the target server
[tidb@tidb.01.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook deploy.yml -l 10.80.xxx.xxx
PLAY RECAP ***************************************************************************************************************************
10.80.249.46 : ok=34 changed=0 unreachable=0 failed=0
Congrats! All goes well. :-)
5.3 Log in to the new pd node and edit run_pd.sh
root@tidb.11.tidb.prod.ali:/data/tidb/deploy/scripts# vi run_pd.sh
Remove the --initial-cluster line:
--initial-cluster="pd1=http://10.15.xxx.xxx:2380,pd2=http://10.15.xxx.xxx:2380,pd3=http://10.15.xxx.xxx:2380,pd4=http://10.80.xxx.xxx:2380" \
and add a --join line pointing at an existing pd node, so that the script reads:
#!/bin/bash
set -e
ulimit -n 1000000
# WARNING: This file was auto-generated. Do not edit!
# All your edit might be overwritten!
DEPLOY_DIR=/data/tidb/deploy
cd "${DEPLOY_DIR}" || exit 1
exec bin/pd-server \
    --name="pd4" \
    --client-urls="http://10.80.xxx.xxx:2379" \
    --advertise-client-urls="http://10.80.xxx.xxx:2379" \
    --peer-urls="http://10.80.xxx.xxx:2380" \
    --advertise-peer-urls="http://10.80.xxx.xxx:2380" \
    --data-dir="/data/tidb/deploy/data.pd" \
    --config=conf/pd.toml \
    --join="http://10.15.xxx.xxx:2380" \
    --log-file="/data/tidb/deploy/log/pd.log" 2>> "/data/tidb/deploy/log/pd_stderr.log"
Start the pd service manually on the target server:
tidb@tidb.11.tidb.prod.ali:/data/tidb/deploy/scripts$ sh -x start_pd.sh
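The manual edit in step 5.3 could also be scripted. The following is a minimal sketch, assuming run_pd.sh keeps one flag per line as in the listing above; `patch_run_pd` is a hypothetical helper name, and the join URL passed to it should be whatever address of an existing pd node is used in the hand-edited script.

```shell
#!/bin/sh
# patch_run_pd SCRIPT JOIN_URL: drop the --initial-cluster line and add a --join line
# just before the --log-file line. Re-running it replaces the previous --join line.
patch_run_pd() {
    script=$1
    join_url=$2
    tmp=$(mktemp) || return 1
    awk -v url="$join_url" '
        /--initial-cluster=/ || /--join=/ { next }   # drop old bootstrap/join flags
        /--log-file=/ && !added {                    # insert --join before --log-file
            printf "    --join=\"%s\" \\\n", url
            added = 1
        }
        { print }
    ' "$script" > "$tmp" && cat "$tmp" > "$script"   # cat keeps the file mode of SCRIPT
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It is worth taking a copy of run_pd.sh first and diffing the result, since the file is generated by the deploy playbook and its layout may differ between versions.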
5.4 Check the pd service on the target machine:
tidb@tidb.11.tidb.prod.ali:/data/tidb/deploy/scripts$ ps -ef | grep tidb
tidb 6922 1 0 14:29 ? 00:00:02 bin/pd-server --name=pd4 --client-urls=http://10.80.xxx.xxx:2379 --advertise-client-urls=http://10.80.xxx.xxx:2379 --peer-urls=http://10.80.xxx.xxx:2380 --advertise-peer-urls=http://10.80.xxx.xxx:2380 --data-dir=/data/tidb/deploy/data.pd --config=conf/pd.toml --join=http://10.15.xxx.xxx:2380 --log-file=/data/tidb/deploy/log/pd.log
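To confirm that the running process picked up the edited flags, the values can be pulled out of the process list instead of reading the long line by eye. This is a sketch only; `pd_flag` is a hypothetical helper name, and it assumes the flags appear as --name=value with no spaces, as in the ps output above.

```shell
#!/bin/sh
# pd_flag FLAG: read process command lines on stdin and print the value of --FLAG
# for the first pd-server process found.
pd_flag() {
    awk -v flag="--$1=" '
        /bin\/pd-server/ {
            for (i = 1; i <= NF; i++)
                if (index($i, flag) == 1) {           # field starts with --FLAG=
                    print substr($i, length(flag) + 1)
                    exit
                }
        }
    '
}
```

For instance, `ps -ef | pd_flag join` prints the address the new node joined through, and `ps -ef | pd_flag name` prints its member name.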
5.5 Run a rolling update of the cluster:
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook rolling_update.yml
5.6 Update the Prometheus configuration and restart:
[tidb@tidb.11.tidb.prod.uc:~/tidb-ansible]$ ansible-playbook rolling_update_monitor.yml --tags=prometheus