Key Steps for Installing a TiDB Cluster

Reposted article. 2019-11-12 10:51, tagged TIDB.

Reference: https://www.cnblogs.com/plyx/archive/2018/12/21/10158615.html

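1. Introduction to TiDB

TiDB is an open-source distributed HTAP (Hybrid Transactional and Analytical Processing) database designed by PingCAP. It combines the best features of traditional RDBMS and NoSQL systems. TiDB is compatible with MySQL, scales out horizontally, and provides strong consistency and high availability. Its goal is to offer a one-stop solution for both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) scenarios.

TiDB has the following features:

(1) High compatibility with MySQL

In most cases you can migrate from MySQL to TiDB without changing application code. A sharded MySQL cluster can also be migrated in real time with the TiDB tools.

(2) Horizontal elastic scaling

Adding nodes is all it takes to scale TiDB out. Throughput or storage can be expanded on demand to handle high concurrency and very large data volumes.

(3) Distributed transactions

TiDB fully supports standard ACID transactions.

(4) Financial-grade high availability

Compared with traditional master-slave (M-S) replication, the Raft-based majority election protocol provides financial-grade strong data consistency. As long as a majority of replicas survive, failures are recovered automatically (auto-failover) without manual intervention.

(5) One-stop HTAP solution

TiDB is a typical OLTP row-store database that also has strong OLAP performance. Together with TiSpark it provides a one-stop HTAP solution: a single copy of the data serves both OLTP and OLAP, with no need for the traditional, cumbersome ETL process.

(6) Cloud-native SQL database

TiDB is designed for the cloud. It supports public, private and hybrid clouds, which makes deployment, configuration and maintenance simple.

The main components are as follows.

TiDB Server:

TiDB Server receives SQL requests, handles the SQL logic, asks PD for the addresses of the TiKV nodes that store the data needed for the computation, exchanges data with TiKV, and returns the result. TiDB Server is stateless: it stores no data and only performs computation. It can be scaled out horizontally and exposed through a single access address by a load balancer such as LVS, HAProxy or F5.

PD Server:

Placement Driver (PD) is the management module of the whole cluster. It has three main jobs: (a) it stores the cluster metadata (which TiKV node holds a given key); (b) it schedules and load-balances the TiKV cluster (for example data migration and Raft group leader transfer); (c) it allocates globally unique, increasing transaction IDs. PD runs as a cluster and should be deployed on an odd number of nodes; at least 3 nodes are generally recommended in production.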
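
The odd-node recommendation follows from majority arithmetic. The sketch below assumes PD uses the usual majority-quorum rule (the function names are illustrative, not part of any TiDB tool): a 4-node group tolerates no more failures than a 3-node group, so the extra even node buys nothing.

```shell
#!/bin/sh
# Majority (quorum) size for a group of N voting members.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of member failures the group can survive while keeping a majority.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5; do
    echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 3 nodes one failure is tolerated, with 4 still only one, and with 5 two.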

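TiKV Server:

TiKV Server stores the data. From the outside, TiKV is a distributed, transactional key-value storage engine. The basic unit of storage is the Region: each Region holds the data of one key range (the left-closed, right-open interval from StartKey to EndKey), and each TiKV node serves many Regions. TiKV replicates data with the Raft protocol to keep it consistent and to survive failures. Replicas are managed per Region: the copies of a Region on different nodes form a Raft Group and are replicas of one another. PD balances data across TiKV nodes, again with the Region as the scheduling unit.

TiSpark:

TiSpark is the main TiDB component for complex OLAP workloads. It runs Spark SQL directly on the TiDB storage layer, benefits from the distributed TiKV cluster, and plugs into the big-data ecosystem. With it, one system supports both OLTP and OLAP, and users no longer have to synchronize data between systems.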

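2. Recommended Production Deployment

A standard TiDB cluster needs 6 machines:
###########################################################################
2 TiDB nodes
3 PD nodes
3 TiKV nodes; the first TiDB machine also serves as the monitoring host
By default, deploy only one TiKV instance per machine. If a TiKV machine has at least
twice the recommended CPU and memory and has two SSDs, or a single SSD larger than 2 TB,
you may deploy two instances, but more than two is not recommended.

Topology with one TiKV instance per machine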
Name   Host IP   Services
node1   172.16.10.1   PD1, TiDB1
node2   172.16.10.2   PD2, TiDB2
node3   172.16.10.3   PD3
node4   172.16.10.4   TiKV1
node5   172.16.10.5   TiKV2
node6   172.16.10.6   TiKV3
###########################################################################

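3. Personal Demo Environment

Machine allocation
# One TiKV instance per machine
Name       Host IP        Services
bj-db-m1   10.10.146.28   PD1, TiDB1, TiKV1
bj-db-m2   10.10.1.139    PD2, TiDB2, TiKV2
bj-db-m3   10.10.173.84   PD3, TiKV3

Control machine: 10.10.69.73 bj-db-manage

3-1. Install software on the control machine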

yum -y install epel-release git curl sshpass atop vim htop net-tools 
yum -y install python-pip

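3-2. Create the tidb user on the control machine and generate an SSH key

# Create the tidb user
useradd -m -d /home/tidb tidb
echo tidbpwd | passwd --stdin tidb

3-3. Grant sudo privileges to the tidb user

# Run visudo and add the following line in the editor
visudo
tidb ALL=(ALL) NOPASSWD: ALL


# Or append the rule directly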
cat >>/etc/sudoers<<"EOF"
tidb    ALL=(ALL)      NOPASSWD: ALL
EOF
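
Appending to /etc/sudoers directly can break sudo if a typo slips in. A more cautious variant, sketched below, writes the same rule to a drop-in file and installs it only after `visudo -cf` accepts it. It assumes the system includes /etc/sudoers.d (the CentOS default); `install_sudoers_dropin` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Print the sudoers rule that grants passwordless sudo to one user.
sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Hypothetical helper: write the rule to a temporary file and install it
# as a drop-in only if visudo accepts the syntax. Must be run as root.
install_sudoers_dropin() {
    user=$1
    tmp=$(mktemp) || return 1
    sudoers_rule "$user" > "$tmp"
    if visudo -cf "$tmp"; then
        install -m 0440 "$tmp" "/etc/sudoers.d/$user"
    else
        echo "sudoers syntax check failed; nothing installed" >&2
        rm -f "$tmp"
        return 1
    fi
    rm -f "$tmp"
}

# Usage (as root): install_sudoers_dropin tidb
```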

3-4. Set up passwordless SSH login

# Generate an SSH key as the tidb user
su - tidb 
ssh-keygen -t rsa 

# Press Enter at every prompt
[root@bj-db-manage ~]# su - tidb 
[tidb@bj-db-manage ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/tidb/.ssh/id_rsa): 
Created directory '/home/tidb/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/tidb/.ssh/id_rsa.
Your public key has been saved in /home/tidb/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:XcP3rQqjGN1TDdMsjpy9vsTN+vl+jicsLknJHjx9UIk tidb@bj-db-manage
The key's randomart image is:
+---[RSA 2048]----+
|             . . |
|           .Eoo  |
|            B.+  |
|         o *.B ..|
|        So=o+.. o|
|       . .Bo.+.. |
|      . .o=++o+  |
|       o .+*.oooo|
|      . .  o*++*=|
+----[SHA256]-----+
[tidb@bj-db-manage ~]$

4. Download TiDB-Ansible on the Control Machine

# (1) Download the release-2.0 branch of TiDB-Ansible
cd /home/tidb && git clone -b release-2.0 https://github.com/pingcap/tidb-ansible.git
# (2) Install Ansible and its dependencies
cd /home/tidb/tidb-ansible/ && pip install -r ./requirements.txt

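5. Configure SSH Trust and sudo Rules for the Target Machines from the Control Machine

# Configure hosts.ini (su starts a new shell, so run cd inside it)
su - tidb
cd /home/tidb/tidb-ansible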
[tidb@bj-db-manage tidb-ansible]$ cat /home/tidb/tidb-ansible/hosts.ini 
[servers]
10.10.146.28
10.10.1.139
10.10.173.84

[all:vars]
username = tidb
ntp_server = pool.ntp.org
[tidb@bj-db-manage tidb-ansible]$ 
# Set up SSH trust
ansible-playbook -i hosts.ini create_users.yml -u root -k
[tidb@bj-db-manage tidb-ansible]$ ansible-playbook -i hosts.ini create_users.yml -u root -k
SSH password: 

PLAY [all] **************************************************************************************************************************

TASK [create user] ******************************************************************************************************************
changed: [10.10.1.139]
changed: [10.10.146.28]
changed: [10.10.173.84]

TASK [set authorized key] ***********************************************************************************************************
changed: [10.10.146.28]
changed: [10.10.1.139]
changed: [10.10.173.84]

TASK [update sudoers file] **********************************************************************************************************
ok: [10.10.1.139]
ok: [10.10.146.28]
ok: [10.10.173.84]

PLAY RECAP **************************************************************************************************************************
10.10.1.139                : ok=3    changed=2    unreachable=0    failed=0   
10.10.146.28               : ok=3    changed=2    unreachable=0    failed=0   
10.10.173.84               : ok=3    changed=2    unreachable=0    failed=0   

Congrats! All goes well. :-)
[tidb@bj-db-manage tidb-ansible]$ 
# Enter the root password of the target machines at the "SSH password:" prompt

6. Install the NTP Service on the Target Machines

# From the control machine, install NTP on the target hosts
 cd /home/tidb/tidb-ansible
 ansible-playbook -i hosts.ini deploy_ntp.yml -u tidb -b

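7. Adjust cpufreq on the Target Machines

Note: this is needed on physical servers only. My environment uses cloud hosts, so it does not apply.

# Check which governors cpupower supports; virtual machines currently report none
cpupower frequency-info --governors
analyzing CPU 0:
available cpufreq governors: Not Available
# Set the cpufreq governor to performance
cpupower frequency-set --governor performance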

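8. Mount the Data Disk with an ext4 File System on the Target Machines

Note: this is needed on physical servers only. My environment uses cloud hosts, so it does not apply.

# Create a GPT partition table and one partition in a single command
parted -s -a optimal /dev/nvme0n1 mklabel gpt -- mkpart primary ext4 1 -1
# Or create the partition interactively
parted /dev/sdb
mklabel gpt
mkpart primary 0KB 210GB 
# Create the ext4 file system
mkfs.ext4 /dev/sdb
# Look up the UUID of the data disk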
[root@tidb-tikv1 ~]# lsblk -f
NAME FSTYPE LABEL UUID MOUNTPOINT
sda 
├─sda1 xfs f41c3b1b-125f-407c-81fa-5197367feb39 /boot
├─sda2 xfs 8119193b-c774-467f-a057-98329c66b3b3 /
├─sda3 
└─sda5 xfs 42356bb3-911a-4dc4-b56e-815bafd08db2 /home
sdb ext4 532697e9-970e-49d4-bdba-df386cac34d2 
# On each of the three machines, edit /etc/fstab and add the nodelalloc mount option
vim /etc/fstab
UUID=8119193b-c774-467f-a057-98329c66b3b3 / xfs defaults 0 0
UUID=f41c3b1b-125f-407c-81fa-5197367feb39 /boot xfs defaults 0 0
UUID=42356bb3-911a-4dc4-b56e-815bafd08db2 /home xfs defaults 0 0
UUID=532697e9-970e-49d4-bdba-df386cac34d2 /data ext4 defaults,nodelalloc,noatime 0 2
# Mount the data disk
mkdir /data
mount -a
mount -t ext4
/dev/sdb on /data type ext4 (rw,noatime,seclabel,nodelalloc,data=ordered)
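
To check the mount options without reading them by eye, a small helper can test whether a given option is in the list. This is a minimal sketch; `has_mount_opt` is an illustrative name, and the option string is copied from the mount output above.

```shell
#!/bin/sh
# Return 0 if the comma-separated option list $1 contains option $2.
has_mount_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Options copied from the `mount -t ext4` output above.
opts="rw,noatime,seclabel,nodelalloc,data=ordered"

for opt in nodelalloc noatime; do
    if has_mount_opt "$opts" "$opt"; then
        echo "$opt: present"
    else
        echo "$opt: MISSING"
    fi
done
```

On a live host the option string could come from `findmnt -no OPTIONS /data` instead of the hard-coded sample.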


9. Machine Allocation

# One TiKV instance per machine
Name         Host IP        Services
tidb-tikv1   10.10.146.28   PD1, TiDB1, TiKV1
tidb-tikv2   10.10.1.139    PD2, TiKV2
tidb-tikv3   10.10.173.84   TiKV3

10. Edit the inventory.ini File

# Edit inventory.ini
cd /home/tidb/tidb-ansible
vim /home/tidb/tidb-ansible/inventory.ini
## TiDB Cluster Part
[tidb_servers]
10.10.146.28

[tikv_servers]
10.10.146.28
10.10.1.139
10.10.173.84

[pd_servers]
10.10.146.28
10.10.1.139

[spark_master]

[spark_slaves]

## Monitoring Part
# prometheus and pushgateway servers
[monitoring_servers]
10.10.146.28
[grafana_servers]
10.10.146.28
# node_exporter and blackbox_exporter servers
[monitored_servers]
10.10.146.28
10.10.1.139
10.10.173.84
[alertmanager_servers]

[kafka_exporter_servers]

## Binlog Part
[pump_servers:children]
tidb_servers

[drainer_servers]

## Group variables
[pd_servers:vars]
# location_labels = ["zone","rack","host"]

## Global variables
[all:vars]
deploy_dir = /data/tidb/deploy

## Connection
# ssh via normal user
ansible_user = tidb

cluster_name = test-cluster

tidb_version = v2.0.11

# process supervision, [systemd, supervise]
process_supervision = systemd

timezone = Asia/Shanghai

enable_firewalld = False
# check NTP service
enable_ntpd = True
set_hostname = False

## binlog trigger
enable_binlog = False
# zookeeper address of kafka cluster for binlog, example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
zookeeper_addrs = ""
# kafka cluster address for monitoring, example:
# kafka_addrs = "192.168.0.11:9092,192.168.0.12:9092,192.168.0.13:9092"
kafka_addrs = ""

# store slow query log into seperate file
enable_slow_query_log = False

# enable TLS authentication in the TiDB cluster
enable_tls = False

# KV mode
deploy_without_tidb = False

# Optional: Set if you already have a alertmanager server.
# Format: alertmanager_host:alertmanager_port
alertmanager_target = ""

grafana_admin_user = "admin"
grafana_admin_password = "admin"


### Collect diagnosis
collect_log_recent_hours = 2

enable_bandwidth_limit = True
# default: 10Mb/s, unit: Kbit/s
collect_bandwidth_limit = 10000


11. Verify SSH Trust

[tidb@bj-db-manage tidb-ansible]$ ansible -i inventory.ini all -m shell -a 'whoami'
10.10.146.28 | SUCCESS | rc=0 >>
tidb

10.10.1.139 | SUCCESS | rc=0 >>
tidb

10.10.173.84 | SUCCESS | rc=0 >>
tidb

[tidb@bj-db-manage tidb-ansible]$

12. Verify Passwordless sudo for the tidb User

[tidb@bj-db-manage tidb-ansible]$  ansible -i inventory.ini all -m shell -a 'whoami' -b
10.10.146.28 | SUCCESS | rc=0 >>
root

10.10.173.84 | SUCCESS | rc=0 >>
root

10.10.1.139 | SUCCESS | rc=0 >>
root

[tidb@bj-db-manage tidb-ansible]$
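
With more than a handful of hosts, counting the SUCCESS lines is easier than scanning the output. The sketch below assumes the `host | SUCCESS | rc=0 >>` line format shown above (other Ansible versions may print a different status word); `count_success` is an illustrative helper, and the sample text stands in for a real run.

```shell
#!/bin/sh
# Count hosts that reported SUCCESS in `ansible ... -m shell` output read from stdin.
count_success() {
    grep -c ' | SUCCESS | '
}

# Sample output in the format shown above, used as a stand-in for a real run.
sample='10.10.146.28 | SUCCESS | rc=0 >>
root

10.10.173.84 | SUCCESS | rc=0 >>
root

10.10.1.139 | SUCCESS | rc=0 >>
root'

expected=3
actual=$(printf '%s\n' "$sample" | count_success)
if [ "$actual" -eq "$expected" ]; then
    echo "all $expected hosts answered"
else
    echo "only $actual of $expected hosts answered" >&2
fi
```

In practice the input would be piped in, for example `ansible -i inventory.ini all -m shell -a 'whoami' -b | count_success`.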

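13. Download the TiDB Binaries to the Control Machine

# Run the local_prepare.yml playbook to download the TiDB binaries to the control machine
ansible-playbook local_prepare.yml
# Next step: initialize the system environment and adjust kernel parameters
# Comment out the disk checks first
# Reference: https://blog.csdn.net/mayancheng7/article/details/93896233

14. Modify the Pre-check Tasks

Note: TiDB's hardware requirements are fairly high, so the pre-checks have to be skipped in a test environment.

14-1. fio_randread.yml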

[tidb@bj-db-manage tidb-ansible]$ cat  /home/tidb/tidb-ansible/roles/machine_benchmark/tasks/fio_randread.yml
---

#- name: fio randread benchmark on tikv_data_dir disk
#  shell: "cd {{ fio_deploy_dir }} && ./fio -ioengine=psync -bs=32k -fdatasync=1 -thread -rw=randread -size={{ benchmark_size }} -filename=fio_randread_test.txt -name='fio randread test' -iodepth=4 -runtime=60 -numjobs=4 -group_reporting --output-format=json --output=fio_randread_result.json"
#  register: fio_randread

#- name: clean fio randread benchmark temporary file
#  file:
#    path: "{{ fio_deploy_dir }}/fio_randread_test.txt"
#    state: absent

#- name: get fio randread iops
#  shell: "python parse_fio_output.py --target='fio_randread_result.json' --read-iops"
#  register: disk_randread_iops
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: get fio randread summary
#  shell: "python parse_fio_output.py --target='fio_randread_result.json' --summary"
#  register: disk_randread_smmary
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: fio randread benchmark command
#  debug:
#    msg: "fio randread benchmark command: {{ fio_randread.cmd }}."
#  run_once: true

#- name: fio randread benchmark summary
#  debug:
#    msg: "fio randread benchmark summary: {{ disk_randread_smmary.stdout }}."

#- name: Preflight check - Does fio randread iops of tikv_data_dir disk meet requirement
#  fail:
#    msg: 'fio: randread iops of tikv_data_dir disk is too low: {{ disk_randread_iops.stdout }} < {{ min_ssd_randread_iops }}, it is strongly recommended to use SSD disks for TiKV and PD, or there might be performance issues.'
#  when: disk_randread_iops.stdout|int < min_ssd_randread_iops|int

14-2. fio_randread_write_latency.yml

[tidb@bj-db-manage tidb-ansible]$ cat /home/tidb/tidb-ansible/roles/machine_benchmark/tasks/fio_randread_write_latency.yml          
---

#- name: fio mixed randread and sequential write benchmark for latency on tikv_data_dir disk
#  shell: "cd {{ fio_deploy_dir }} && ./fio -ioengine=psync -bs=32k -fdatasync=1 -thread -rw=randrw -percentage_random=100,0 -size={{ benchmark_size }} -filename=fio_randread_write_latency_test.txt -name='fio mixed randread and sequential write test' -iodepth=1 -runtime=60 -numjobs=1 -group_reporting --output-format=json --output=fio_randread_write_latency_test.json"
#  register: fio_randread_write_latency

#- name: clean fio mixed randread and sequential write benchmark for latency temporary file
#  file:
#    path: "{{ fio_deploy_dir }}/fio_randread_write_latency_test.txt"
#    state: absent

#- name: get fio mixed test randread latency
#  shell: "python parse_fio_output.py --target='fio_randread_write_latency_test.json' --read-lat"
#  register: disk_mix_randread_lat
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: get fio mixed test write latency
#  shell: "python parse_fio_output.py --target='fio_randread_write_latency_test.json' --write-lat"
#  register: disk_mix_write_lat
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: get fio mixed randread and sequential write for latency summary
#  shell: "python parse_fio_output.py --target='fio_randread_write_latency_test.json' --summary"
#  register: disk_mix_randread_write_latency_smmary
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: fio mixed randread and sequential write benchmark for latency command
#  debug:
#    msg: "fio mixed randread and sequential write benchmark for latency command: {{ fio_randread_write_latency.cmd }}."
#  run_once: true

#- name: fio mixed randread and sequential write benchmark for latency summary
#  debug:
#    msg: "fio mixed randread and sequential write benchmark summary: {{ disk_mix_randread_write_latency_smmary.stdout }}."

#- name: Preflight check - Does fio mixed randread and sequential write latency of tikv_data_dir disk meet requirement - randread
#  fail:
#    msg: 'fio mixed randread and sequential write test: randread latency of  tikv_data_dir disk is too low: {{ disk_mix_randread_lat.stdout }} ns > {{ max_ssd_mix_randread_lat }} ns, it is strongly recommended to use SSD disks for TiKV and PD, or there might be performance issues.'
#  when: disk_mix_randread_lat.stdout|int > max_ssd_mix_randread_lat|int

#- name: Preflight check - Does fio mixed randread and sequential write latency of tikv_data_dir disk meet requirement - sequential write
#  fail:
#    msg: 'fio mixed randread and sequential write test: sequential write latency of tikv_data_dir disk is too low: {{ disk_mix_write_lat.stdout }} ns > {{ max_ssd_mix_write_lat }} ns, it is strongly recommended to use SSD disks for TiKV and PD, or there might be performance issues.'
#  when: disk_mix_write_lat.stdout|int > max_ssd_mix_write_lat|int

14-3. fio_randread_write.yml

[tidb@bj-db-manage tidb-ansible]$ cat  /home/tidb/tidb-ansible/roles/machine_benchmark/tasks/fio_randread_write.yml      
---

#- name: fio mixed randread and sequential write benchmark on tikv_data_dir disk
#  shell: "cd {{ fio_deploy_dir }} && ./fio -ioengine=psync -bs=32k -fdatasync=1 -thread -rw=randrw -percentage_random=100,0 -size={{ benchmark_size }} -filename=fio_randread_write_test.txt -name='fio mixed randread and sequential write test' -iodepth=4 -runtime=60 -numjobs=4 -group_reporting --output-format=json --output=fio_randread_write_test.json"
#  register: fio_randread_write

#- name: clean fio mixed randread and sequential write benchmark temporary file
#  file:
#    path: "{{ fio_deploy_dir }}/fio_randread_write_test.txt"
#    state: absent

#- name: get fio mixed test randread iops
#  shell: "python parse_fio_output.py --target='fio_randread_write_test.json' --read-iops"
#  register: disk_mix_randread_iops
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: get fio mixed test write iops
#  shell: "python parse_fio_output.py --target='fio_randread_write_test.json' --write-iops"
#  register: disk_mix_write_iops
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: get fio mixed randread and sequential write summary
#  shell: "python parse_fio_output.py --target='fio_randread_write_test.json' --summary"
#  register: disk_mix_randread_write_smmary
#  args:
#    chdir: "{{ fio_deploy_dir }}/"

#- name: fio mixed randread and sequential write benchmark command
#  debug:
#    msg: "fio mixed randread and sequential write benchmark command: {{ fio_randread_write.cmd }}."
#  run_once: true

#- name: fio mixed randread and sequential write benchmark summary
#  debug:
#    msg: "fio mixed randread and sequential write benchmark summary: {{ disk_mix_randread_write_smmary.stdout }}."

#- name: Preflight check - Does fio mixed randread and sequential write iops of tikv_data_dir disk meet requirement - randread
#  fail:
#    msg: 'fio mixed randread and sequential write test: randread iops of  tikv_data_dir disk is too low: {{ disk_mix_randread_iops.stdout }} < {{ min_ssd_mix_randread_iops }}, it is strongly recommended to use SSD disks for TiKV and PD, or there might be performance issues.'
#  when: disk_mix_randread_iops.stdout|int < min_ssd_mix_randread_iops|int

#- name: Preflight check - Does fio mixed randread and sequential write iops of tikv_data_dir disk meet requirement - sequential write
#  fail:
#    msg: 'fio mixed randread and sequential write test: sequential write iops of tikv_data_dir disk is too low: {{ disk_mix_write_iops.stdout }} < {{ min_ssd_mix_write_iops }}, it is strongly recommended to use SSD disks for TiKV and PD, or there might be performance issues.'
#  when: disk_mix_write_iops.stdout|int < min_ssd_mix_write_iops|int
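
The three files above were commented out by hand. The same result can be sketched with a small helper that prefixes every task line with `#`, leaves the `---` marker and blank lines alone, and keeps the original as a `.bak` copy. `comment_out_tasks` is an illustrative name and not part of tidb-ansible.

```shell
#!/bin/sh
# Comment out every task line of a YAML file, keeping '---', blank lines and
# existing comments unchanged. The untouched original is kept as <file>.bak
# and is always used as the input, so running this twice is harmless.
comment_out_tasks() {
    file=$1
    [ -e "$file.bak" ] || cp "$file" "$file.bak" || return 1
    awk '$0 == "---" || $0 ~ /^[ \t]*$/ || $0 ~ /^#/ { print; next }
         { print "#" $0 }' "$file.bak" > "$file"
}

# Usage, run from /home/tidb/tidb-ansible:
# for f in fio_randread fio_randread_write_latency fio_randread_write; do
#     comment_out_tasks "roles/machine_benchmark/tasks/$f.yml"
# done
```

To restore a check later, copy the `.bak` file back over the edited one.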

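# Then run the following command to perform the checks:

ansible-playbook bootstrap.yml

15. Install the TiDB Cluster

ansible-playbook deploy.yml

16. Start the TiDB Cluster

# Start the cluster services
ansible-playbook start.yml

17. Test the Cluster

# Connect with a MySQL client; TCP port 4000 is the default TiDB service port
mysql -u root -h 10.10.146.28 -P 4000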
[tidb@bj-db-manage tidb-ansible]$ mysql -u root -h 10.10.146.28 -P 4000
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 41
Server version: 5.7.10-TiDB-v2.0.11 MySQL Community Server (Apache License 2.0)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| PERFORMANCE_SCHEMA |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.01 sec)

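18. Open the Monitoring Web Page

# Open the monitoring platform in a browser
URL: http://10.10.146.28:3000  Default account/password: admin/admin

19. Monitoring Configuration (Optional)

This section describes how to configure Grafana.

Step 1: Add the Prometheus data source
(1) Log in to Grafana.
    Default URL: http://10.10.146.28:3000
    Default account: admin
    Default password: admin
(2) Click the Grafana icon to open the sidebar.
(3) In the sidebar menu, click Data Source.
(4) Click Add data source.
(5) Fill in the data source details:
    In Name, give the data source a name.
    In Type, select Prometheus.
    In URL, enter the IP address of Prometheus.
    Fill in the other fields as needed.
(6) Click Add to save the new data source.

Step 2: Import the Grafana dashboards
Follow these steps to import the dashboards for PD Server, TiKV Server and TiDB Server:

(1) Click the Grafana icon in the sidebar.
(2) In the sidebar menu, click Dashboards > Import to open the Import Dashboard window.
(3) Click Upload .json File and upload the corresponding JSON file (download the TiDB Grafana configuration files first).

Note: the JSON files for the TiKV, PD and TiDB dashboards are tikv_summary.json, tikv_details.json,
tikv_trouble_shooting.json, pd.json, tidb.json and tidb_summary.json.

# Use the raw.githubusercontent.com URLs; the github.com blob pages return HTML rather than JSON
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/tikv_summary.json
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/tikv_details.json
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/tikv_trouble_shooting.json
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/pd.json
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/tidb.json
wget https://raw.githubusercontent.com/pingcap/tidb-ansible/master/scripts/tidb_summary.json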

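20. Common TiDB Operations

(1) Stop the cluster
ansible-playbook  stop.yml

(2) Start the cluster
ansible-playbook  start.yml

(3) Clean up cluster data. This stops the TiDB, Pump, TiKV and PD services and empties the Pump, TiKV and PD data directories.
[tidb@bj-db-manage ~]$ cd tidb-ansible
[tidb@bj-db-manage tidb-ansible]$ ansible-playbook unsafe_cleanup_data.yml

(4) Destroy the cluster. This stops the cluster and empties the deployment directory. If the deployment directory is a mount point, an error is reported, which can be ignored.
ansible-playbook unsafe_cleanup.yml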

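21. Backup and Restore

# Download the TiDB tool set used for backup and restore
wget http://download.pingcap.org/tidb-enterprise-tools-latest-linux-amd64.tar.gz && \
wget http://download.pingcap.org/tidb-enterprise-tools-latest-linux-amd64.sha256
# Verify file integrity
sha256sum -c tidb-enterprise-tools-latest-linux-amd64.sha256
# Unpack
tar -xzf tidb-enterprise-tools-latest-linux-amd64.tar.gz && \
cd tidb-enterprise-tools-latest-linux-amd64

# Full backup and restore with mydumper/loader
# mydumper is a powerful backup tool; see https://github.com/maxbube/mydumper
# Export data from TiDB with mydumper for backup, then import it back into TiDB with loader to restore.
/*
Note:

Use the mydumper from the enterprise tools package, not the one provided by your operating system's package manager.
The upstream mydumper does not handle TiDB correctly (#155). mysqldump is not recommended either, because both backup and restore take a long time with it.
*/

Best practices for full backup and restore with mydumper/loader. To back up and restore quickly, especially for very large databases:
  Keep the data files exported by mydumper small, preferably no larger than 64 MB; set the -F parameter to 64.
  Tune loader's -t parameter to the number of TiKV instances and their load; 32 is the recommended value. If TiKV is overloaded and
  "backoffer.maxSleep 15000ms is exceeded" appears frequently in the loader and TiDB logs, lower the value. If TiKV load is low, raise it.

Back up data from TiDB
Use mydumper to back up data from TiDB: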
./bin/mydumper -h 10.10.146.28 -P 4000 -u root -t 32 -F 64 -B test -T t1,t2 --skip-tz-utc -o /data/test.t1.t1.dumper.sql
[tidb@bj-db-manage tidb-enterprise-tools-latest-linux-amd64]$ ./bin/mydumper -h 10.10.146.28 -P 4000 -u root -t 32 -F 64 -B test -T t1,t2 --skip-tz-utc -o /data/test.t1.t1.dumper.sql
[tidb@bj-db-manage tidb-enterprise-tools-latest-linux-amd64]$ ll /data/test.t1.t1.dumper.sql/
total 16
-rw-rw-r-- 1 tidb tidb 129 Nov 12 16:34 metadata
-rw-rw-r-- 1 tidb tidb  64 Nov 12 16:34 test-schema-create.sql
-rw-rw-r-- 1 tidb tidb 134 Nov 12 16:34 test.t1-schema.sql
-rw-rw-r-- 1 tidb tidb  34 Nov 12 16:34 test.t1.sql
[tidb@bj-db-manage tidb-enterprise-tools-latest-linux-amd64]$ 

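In the command above, -B test selects the test database and -T t1,t2 exports only the tables t1 and t2.

-t 32 exports with 32 threads. -F 64 sets the chunk size that tables are split into, here 64 MB per chunk.

--skip-tz-utc disables automatic time zone conversion, so a time zone mismatch between TiDB and the machine running the export is ignored.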
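
To keep the flags straight when the backup is repeated for other databases, the command can be assembled from named settings. The sketch below only prints the command line and does not run mydumper. Every flag comes from the command above; the function and variable names are illustrative.

```shell
#!/bin/sh
# Print (do not run) a mydumper command line built from named settings.
mydumper_cmd() {
    host=$1 port=$2 db=$3 tables=$4 outdir=$5
    threads=${6:-32}      # -t: number of export threads
    chunk_mb=${7:-64}     # -F: chunk size in MB
    echo "./bin/mydumper -h $host -P $port -u root -t $threads -F $chunk_mb -B $db -T $tables --skip-tz-utc -o $outdir"
}

mydumper_cmd 10.10.146.28 4000 test t1,t2 /data/test.t1.t1.dumper.sql
```

Passing a sixth and seventh argument overrides the thread count and chunk size, for example when TiKV is under load.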

Restore data to TiDB

MySQL [(none)]> drop database test;
Query OK, 0 rows affected (0.24 sec)

[tidb@bj-db-manage tidb-enterprise-tools-latest-linux-amd64]$  ./bin/loader  -h 10.10.146.28 -P 4000  -u root  -t 32 -d /data/test.t1.t1.dumper.sql/
2019/11/12 16:36:44 printer.go:52: [info] Welcome to loader
2019/11/12 16:36:44 printer.go:53: [info] Release Version: v1.0.0-76-gad009d9
2019/11/12 16:36:44 printer.go:54: [info] Git Commit Hash: ad009d917b2cdc2a9cc26bc4e7046884c1ff43e7
2019/11/12 16:36:44 printer.go:55: [info] Git Branch: master
2019/11/12 16:36:44 printer.go:56: [info] UTC Build Time: 2019-10-21 06:22:03
2019/11/12 16:36:44 printer.go:57: [info] Go Version: go version go1.12 linux/amd64
2019/11/12 16:36:44 main.go:51: [info] config: {"log-level":"info","log-file":"","status-addr":":8272","pool-size":32,"dir":"/data/test.t1.t1.dumper.sql/","db":{"host":"10.10.146.28","user":"root","port":4000,"sql-mode":"@DownstreamDefault","max-allowed-packet":67108864},"checkpoint-schema":"tidb_loader","config-file":"","route-rules":null,"do-table":null,"do-db":null,"ignore-table":null,"ignore-db":null,"rm-checkpoint":false}
 2019/11/12 16:36:44 loader.go:532: [info] [loader] prepare takes 0.000103 seconds
2019/11/12 16:36:44 checkpoint.go:207: [info] calc checkpoint finished. finished tables (map[])
2019/11/12 16:36:44 loader.go:715: [info] [loader][run db schema]/data/test.t1.t1.dumper.sql//test-schema-create.sql[start]
2019/11/12 16:36:45 loader.go:720: [info] [loader][run db schema]/data/test.t1.t1.dumper.sql//test-schema-create.sql[finished]
2019/11/12 16:36:45 loader.go:736: [info] [loader][run table schema]/data/test.t1.t1.dumper.sql//test.t1-schema.sql[start]
2019/11/12 16:36:45 loader.go:741: [info] [loader][run table schema]/data/test.t1.t1.dumper.sql//test.t1-schema.sql[finished]
2019/11/12 16:36:45 loader.go:773: [info] [loader] create tables takes 0.290772 seconds
2019/11/12 16:36:45 loader.go:788: [info] [loader] all data files have been dispatched, waiting for them finished 
2019/11/12 16:36:45 loader.go:158: [info] [loader][restore table data sql]/data/test.t1.t1.dumper.sql//test.t1.sql[start]
2019/11/12 16:36:46 loader.go:216: [info] data file /data/test.t1.t1.dumper.sql/test.t1.sql scanned finished.
2019/11/12 16:36:46 loader.go:165: [info] [loader][restore table data sql]/data/test.t1.t1.dumper.sql//test.t1.sql[finished]
2019/11/12 16:36:46 loader.go:791: [info] [loader] all data files has been finished, takes 2.039493 seconds
2019/11/12 16:36:46 status.go:32: [info] [loader] finished_bytes = 34, total_bytes = GetAllRestoringFiles34, progress = 100.00 %
2019/11/12 16:36:46 main.go:88: [info] loader stopped and exits 
[tidb@bj-db-manage tidb-enterprise-tools-latest-linux-amd64]$

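22. Scale Out: Add a TiDB/TiKV Node (a TiDB node is added here)

cd  /home/tidb/tidb-ansible
(1) Edit inventory.ini and add the new node:
vim /home/tidb/tidb-ansible/inventory.ini 

(2) Initialize the new node:
ansible-playbook bootstrap.yml -l 10.10.1.139
(3) Deploy the new node:
ansible-playbook deploy.yml -l 10.10.1.139
(4) Start the new node:
ansible-playbook start.yml -l 10.10.1.139
(5) Update the Prometheus configuration and restart it:
ansible-playbook rolling_update_monitor.yml --tags=prometheus
(6) Open the monitoring platform in a browser at http://10.10.146.28:3000 and check the status of the whole cluster and the new node.
The same steps add a TiKV node. Adding a PD node, however, requires updating some configuration files by hand.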
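
The playbook order matters, so it can help to print the sequence for a given node before running anything. This is a dry-run sketch of the steps above; `scale_out_plan` is an illustrative name, and inventory.ini still has to be edited by hand first.

```shell
#!/bin/sh
# Print the playbook sequence for adding one TiDB or TiKV node.
# The order follows the numbered steps above; nothing is executed here.
scale_out_plan() {
    node=$1
    for playbook in bootstrap.yml deploy.yml start.yml; do
        echo "ansible-playbook $playbook -l $node"
    done
    echo "ansible-playbook rolling_update_monitor.yml --tags=prometheus"
}

scale_out_plan 10.10.1.139
```

Review the printed lines, then run them one at a time from /home/tidb/tidb-ansible, stopping at the first failure.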

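23. Scale In: Remove a TiDB Node

# Remove a TiDB node
(1) Stop the services on that node:
ansible-playbook stop.yml -l 10.10.1.139
(2) Edit inventory.ini and remove the node's entry:
vim /home/tidb/tidb-ansible/inventory.ini 
(3) Update the Prometheus configuration and restart it:
ansible-playbook rolling_update_monitor.yml --tags=prometheus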

24. Upgrade the TiDB Version

Install Ansible and its dependencies:
https://pingcap.com/docs-cn/stable/how-to/deploy/orchestrated/ansible#%E5%9C%A8%E4%B8%AD%E6%8E%A7%E6%9C%BA%E5%99%A8%E4%B8%8A%E5%AE%89%E8%A3%85-ansible-%E5%8F%8A%E5%85%B6%E4%BE%9D%E8%B5%96

Upgrade guide:
https://pingcap.com/docs-cn/stable/how-to/upgrade/from-previous-version/

Download TiDB Ansible on the control machine
# Log in to the control machine as the tidb user, go to /home/tidb, and back up the tidb-ansible folder of TiDB 2.0 or TiDB 2.1:
su - tidb
mv tidb-ansible tidb-ansible-bak
# Download the tidb-ansible tag that matches TiDB 3.0. The default folder name is tidb-ansible.

git clone -b v3.0.5 https://github.com/pingcap/tidb-ansible.git
[tidb@bj-db-manage tidb-ansible]$ cat /home/tidb/tidb-ansible/hosts.ini 
[servers]
10.10.146.28
10.10.1.139
10.10.173.84

[all:vars]
username = tidb
ntp_server = pool.ntp.org
[tidb@bj-db-manage tidb-ansible]$ 

# Edit inventory.ini and the configuration files
# Log in to the control machine as the tidb user and go to /home/tidb/tidb-ansible
[tidb@bj-db-manage tidb-ansible]$ cat /home/tidb/tidb-ansible/inventory.ini 
## TiDB Cluster Part
[tidb_servers]
10.10.146.28
10.10.1.139

[tikv_servers]
10.10.146.28
10.10.1.139
10.10.173.84

[pd_servers]
10.10.146.28
10.10.1.139

[spark_master]

[spark_slaves]

[lightning_server]

[importer_server]

## Monitoring Part
# prometheus and pushgateway servers
[monitoring_servers]
10.10.146.28

[grafana_servers]
10.10.146.28

# node_exporter and blackbox_exporter servers
[monitored_servers]
10.10.146.28
10.10.1.139
10.10.173.84

[alertmanager_servers]
10.10.146.28

[kafka_exporter_servers]

## Binlog Part
[pump_servers]

[drainer_servers]

## Group variables
[pd_servers:vars]
# location_labels = ["zone","rack","host"]

## Global variables
[all:vars]
deploy_dir = /data/tidb/deploy

## Connection
# ssh via normal user
ansible_user = tidb

cluster_name = Pro-cluster

tidb_version = v3.0.5

# process supervision, [systemd, supervise]
process_supervision = systemd

timezone = Asia/Shanghai

enable_firewalld = False
# check NTP service
enable_ntpd = True
set_hostname = False

## binlog trigger
enable_binlog = False

# kafka cluster address for monitoring, example:
# kafka_addrs = "192.168.0.11:9092,192.168.0.12:9092,192.168.0.13:9092"
kafka_addrs = ""

# store slow query log into seperate file
enable_slow_query_log = True

# zookeeper address of kafka cluster for monitoring, example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
zookeeper_addrs = ""

# enable TLS authentication in the TiDB cluster
enable_tls = False

# KV mode
deploy_without_tidb = False

# wait for region replication complete before start tidb-server.
wait_replication = True

# Optional: Set if you already have a alertmanager server.
# Format: alertmanager_host:alertmanager_port
alertmanager_target = ""

grafana_admin_user = "admin"
grafana_admin_password = "admin"


### Collect diagnosis
collect_log_recent_hours = 2

enable_bandwidth_limit = True
# default: 10Mb/s, unit: Kbit/s
collect_bandwidth_limit = 20000
[tidb@bj-db-manage tidb-ansible]$ 


vi inventory.ini    # update the IPs and paths based on the previous inventory file
vi /home/tidb/tidb-ansible/conf/tikv.yml    # update memory and other parameters based on the previous file
vi /home/tidb/tidb-ansible/conf/alertmanager.yml    # update the mail settings based on the previous file
vi /home/tidb/tidb-ansible/conf/tidb.yml    # set split-table: true
vi /home/tidb/tidb-ansible/conf/pump.yml    # set gc: 10
vi /home/tidb/tidb-ansible/group_vars/all.yml    # set tikv_metric_method: "pull"
vi /home/tidb/tidb-ansible/roles/prometheus/defaults/main.yml    # set prometheus_storage_retention: "90d"

Download the TiDB 3.0 binaries to the control machine

[tidb@bj-db-manage tidb-ansible]$ pwd
/home/tidb/tidb-ansible
[tidb@bj-db-manage tidb-ansible]$ ansible-playbook local_prepare.yml

# Rolling upgrade of the TiDB cluster components
# If process_supervision uses the default systemd value, upgrade with excessive_rolling_update.yml.
ansible-playbook excessive_rolling_update.yml

# If process_supervision is set to supervise, upgrade with rolling_update.yml.
ansible-playbook rolling_update.yml
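
Only one of the two playbooks applies to a given cluster. The sketch below reads `process_supervision` from an inventory file and prints the matching playbook name, following the rule stated above. Both function names are illustrative, and the demo runs against a throwaway file rather than the real inventory.ini.

```shell
#!/bin/sh
# Read the first uncommented process_supervision assignment from an inventory file.
read_supervision() {
    sed -n 's/^process_supervision[[:space:]]*=[[:space:]]*\([a-z]*\).*/\1/p' "$1" | head -n 1
}

# Map the supervision mode to the upgrade playbook named in this article.
pick_upgrade_playbook() {
    case "$1" in
        systemd)   echo excessive_rolling_update.yml ;;
        supervise) echo rolling_update.yml ;;
        *) echo "unknown process_supervision: $1" >&2; return 1 ;;
    esac
}

# Demo against a temporary file containing the lines used in this article's inventory.ini.
demo=$(mktemp)
printf '%s\n' '# process supervision, [systemd, supervise]' 'process_supervision = systemd' > "$demo"
pick_upgrade_playbook "$(read_supervision "$demo")"
rm -f "$demo"
```

Against the real file this would be `ansible-playbook "$(pick_upgrade_playbook "$(read_supervision inventory.ini)")"`.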


# Rolling upgrade of the TiDB monitoring components
ansible-playbook rolling_update_monitor.yml

Test after the upgrade

[tidb@bj-db-manage tidb-ansible]$ mysql -h 10.10.146.28 -P 4000  -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.25-TiDB-v3.0.5 MySQL Community Server (Apache License 2.0)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| INFORMATION_SCHEMA |
| PERFORMANCE_SCHEMA |
| mysql              |
| test               |
| tidb_loader        |
+--------------------+
5 rows in set (0.00 sec)

MySQL [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MySQL [test]> select * from t1;
+----+------+
| id | name |
+----+------+
|  1 | aa   |
+----+------+
1 row in set (0.00 sec)

MySQL [test]> show tables from tidb_loader;
+-----------------------+
| Tables_in_tidb_loader |
+-----------------------+
| checkpoint            |
+-----------------------+
1 row in set (0.00 sec)

MySQL [test]> 
# Upgrade complete

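25. TiDB Limitations

https://pingcap.com/docs-cn/stable/reference/mysql-compatibility/

Strongly recommended minimum for production:
3 TiKV + 3 PD + 2 TiDB

Pitfalls encountered with TiDB
(1) Disk requirements are high; TiDB is not recommended without SSDs.

(2) Partitioning is not supported, and deleting large amounts of data is a major pitfall.

Workaround: set @@session.tidb_batch_delete=1;

(3) Inserting too much data at once also fails with an error.

Workaround: set @@session.tidb_batch_insert=1;

(4) DELETE does not support table aliases.

delete from table_name alias where alias.col = '1' fails with an error.

(5) Memory usage is problematic. Because TiDB is written in Go, it is hard to know when garbage collection will run. Excessive memory use can bring TiDB down (quite unlike MySQL).

In testing on a machine with 32 GB of memory, only about half was reclaimed after 10 minutes.

(6) During data writes, the load on TiDB is heavy and TiKV CPU usage is also high.

(7) GBK is not supported.

(8) Stored procedures are not supported.

# sysbench test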
curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.rpm.sh | sudo bash
sudo yum -y install sysbench

mysql -h 10.10.146.28 -P 4000 -u root
create database sbtest;

sysbench /usr/share/sysbench/oltp_common.lua --mysql-host=10.10.146.28 --mysql-port=4000 --mysql-user=root --tables=20 --table_size=20000000 --threads=100 --max-requests=0 prepare

sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=10.10.146.28 --mysql-port=4000 --mysql-user=root --tables=20 --table_size=2000000 --threads=10 --max-requests=0 run

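############ TiDB pros and cons ############
A few questions to guide the study of TiDB: why use TiDB (what are its advantages)?

Strongly recommended minimum for production:
3 TiKV + 3 PD + 2 TiDB

Pros:
(1) TiDB supports SQL syntax and is well compatible with MySQL. It can be used on its own as a database that is written to directly,
or as a replica of a MySQL database (not recommended when the versions differ).

(2) Queries on large tables are very fast: tables with hundreds of millions of rows are queried within seconds (a main reason for using it).

(3) Nodes (PD, TiDB, TiKV) can be added online to scale out horizontally (a main reason for using it).

Cons:
(1) Every piece of data is stored as three replicas, which uses a lot of storage.
(2) When TiDB is used as a replica, an upstream change to a column's length can only lengthen the column, never shorten it. Shortening a column,
or changing it from unsigned to signed, breaks replication.
(3) During data writes, the load on TiDB is heavy, TiKV CPU usage is high, and writes are relatively slow.
(4) Disk requirements are high; TiDB is not recommended without SSDs.
(5) After deleting data (DML), a corresponding command has to be run before the space (high-water mark) is reclaimed.

Partitioning is not supported, and deleting large amounts of data is a major pitfall.
Inserting too much data at once also fails with an error.
GBK is not supported.
DELETE does not support table aliases.

Note: when TiDB is used standalone (and conventions are followed), there are fewer problems.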
