Greenplum 5.16.0 Installation Tutorial



1. Environment Overview

1.1 Official Documentation

Official Greenplum installation guide: https://gpdb.docs.pivotal.io/5160/install_guide/install_extensions.html

1.2 Hardware Requirements

Server requirements for a Greenplum Database cluster:

Operating System

- SUSE Linux Enterprise Server 11 SP2
- CentOS 5.0 or higher
- Red Hat Enterprise Linux (RHEL) 5.0 or higher
- Oracle Unbreakable Linux 5.5

Note: See the Greenplum Database Release Notes for current supported platform information.

File Systems

- xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system)

Minimum CPU

- Pentium Pro compatible (P3/Athlon and above)

Minimum Memory

- 16 GB RAM per server

Disk Requirements

- 150 MB per host for the Greenplum installation
- Approximately 300 MB per segment instance for metadata
- Appropriate free space for data, with disks at no more than 70% capacity
- High-speed, local storage

Network Requirements

- 10 Gigabit Ethernet within the array
- Dedicated, non-blocking switch
- NIC bonding is recommended when multiple interfaces are present

Software and Utilities

- zlib compression libraries
- bash shell
- GNU tar
- GNU zip
- GNU sed (used by Greenplum Database gpinitsystem)
- perl
- secure shell

1.3 Deployment Environment

Operating system: Red Hat Enterprise Linux Server release 6.4 (Santiago) x86_64

Hosts:

Cluster layout: 1 master node + 1 standby master + 3 segment (data) nodes (the master is protected by a standby host; the segments are protected by mirrors).

Master node: 192.25.108.86 (mdw)

Segment nodes: 192.25.108.85 (sdw1), 192.25.108.84 (sdw2), 192.25.108.86 (sdw3)

Standby master: 192.25.108.85 (smdw)

Note: servers are limited here, so 86 serves as both the master and a segment node, and 85 serves as both a segment node and the standby master; all three servers have identical configurations.

2. Installation (strongly recommended to run as the root user)

Note: steps marked "all three hosts" must be performed on all three machines; "master node" means the step is performed only on the mdw host.

2.1 System Settings

1. Disable the firewall (all three hosts). (For learning you can simply turn it off; in production, open the required ports instead.)

# systemctl status firewalld    (check the firewall status)

If the output shows the service as inactive, the firewall is already disabled (screenshot omitted).

# systemctl stop firewalld    (stop the firewall)

# systemctl disable firewalld    (disable the firewall at boot)

If iptables is installed, it needs to be disabled as well:

service iptables stop    (stops the firewall service; it starts again after a reboot)
chkconfig iptables off    (disables the service at boot; takes effect after a reboot)
Use the two commands together to avoid a reboot (reboot command: reboot).

 

2. Modify the /etc/hosts file (all three hosts)

Command: vi /etc/hosts

Add or modify the following entries in the hosts file:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.25.108.86   mdw

192.25.108.85   sdw1

192.25.108.84   sdw2

192.25.108.86   sdw3

192.25.108.85   smdw

After adding the entries, test them with the ping command, e.g. ping sdw1 to check that the sdw1 node is reachable.
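To check every entry at once, a small shell loop can help (a sketch using the host names above):

for h in mdw sdw1 sdw2 sdw3 smdw; do
    ping -c 1 "$h" > /dev/null && echo "$h ok" || echo "$h FAILED"
done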

 

3. Modify or add the following in /etc/sysctl.conf (all three hosts):

kernel.shmmax = 500000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 500 1024000 200 4096

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 10000 65535

net.core.netdev_max_backlog = 10000

net.core.rmem_max = 2097152

net.core.wmem_max = 2097152

vm.overcommit_memory = 2

vm.swappiness = 10

vm.dirty_expire_centisecs = 500

vm.dirty_writeback_centisecs = 100

vm.dirty_background_ratio = 0

vm.dirty_ratio=0

vm.dirty_background_bytes = 1610612736

vm.dirty_bytes = 4294967296

Then apply the settings with: sysctl -p
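To spot-check that a value took effect, query it back (sysctl prints the setting as key = value):

# sysctl kernel.shmmax
kernel.shmmax = 500000000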

 

4. Edit /etc/security/limits.conf and add the following (all three hosts); comments are kept on their own lines so pam_limits parses the values cleanly:

# maximum number of open files
* soft nofile 65536
* hard nofile 65536

# maximum number of processes
* soft nproc 131072
* hard nproc 131072
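The limits apply only to new sessions; after logging in again (for example with su - gpadmin), verify them with ulimit:

$ ulimit -n    # max open files, expect 65536
$ ulimit -u    # max user processes, expect 131072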

 

5. Set the read-ahead value to 16384 (all three hosts) (optional)

# /sbin/blockdev --getra /dev/sda    (check the read-ahead value; the default is 256)

# /sbin/blockdev --setra 16384 /dev/sda    (set the read-ahead value)

 

6. Set the disk I/O scheduler to deadline (all three hosts) (optional)

# echo deadline > /sys/block/sda/queue/scheduler

 

7. Create the files all_hosts (all host names) and all_segs (segment host names) (all three hosts)

The path is up to you; /home is used here.

File contents:

/home/all_hosts:
mdw

sdw1

sdw2

sdw3

smdw

/home/all_segs:

sdw1

sdw2

sdw3
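Both files can also be created in one go with here-documents (a sketch using the same paths):

cat > /home/all_hosts <<EOF
mdw
sdw1
sdw2
sdw3
smdw
EOF

cat > /home/all_segs <<EOF
sdw1
sdw2
sdw3
EOF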

2.2 Installing Greenplum

1. Prepare the Greenplum database installation file (master node)

Path: /home/greenplum:

greenplum-db-5.16.0-rhel7-x86_64.rpm

 

2. Install the software (master node)

# cd /home/greenplum

# rpm -Uvh ./greenplum-db-5.16.0-rhel7-x86_64.rpm

The installer prints a few prompts during installation; accepting the defaults is fine.

Note: the default installation path is /usr/local/ (screenshot omitted).

 

# chown -R gpadmin /usr/local/greenplum*    (run after the gpadmin user has been created)

# chgrp -R gpadmin /usr/local/greenplum*    (run after the gpadmin user has been created)

 

3. Load the environment parameters (master node)
# source /usr/local/greenplum-db-5.16.0/greenplum_path.sh

Check the environment variable:

# echo $GPHOME

 

4. Run the gpseginstall utility (master node)

# cd /usr/local/greenplum-db-5.16.0

# source greenplum_path.sh    (after switching users, run this before any gp commands)

# cd /usr/local/greenplum-db-5.16.0/bin    (gp commands are run from the bin directory)
# gpseginstall -f /home/all_hosts -u gpadmin -p gp38281850
all_hosts is the file created in the previous step. During installation you are prompted for the passwords of the three hosts; a success message is shown on completion (screenshot omitted).

 

Note: if the command fails, first establish passwordless SSH between the servers (master node):

# cd /usr/local/greenplum-db-5.16.0

# source greenplum_path.sh    (after switching users, run this before any gp commands)

# cd /usr/local/greenplum-db-5.16.0/bin    (gp commands are run from the bin directory)
# gpssh-exkeys -f /home/all_hosts    (if gpadmin was created manually on the master node, this can also be run as the gpadmin user)

Once that succeeds, run the gpseginstall command again.

If gpseginstall -f all_hosts -u gpadmin -p gpadmin fails:

  • Re-check the OS configuration: the firewall, /etc/sysctl.conf, /etc/hosts, greenplum_path.sh, and so on.
  • gpseginstall installs the files on the hosts in the given list, creates the gpadmin system user, and automatically establishes trust between root and the system user (gpadmin). Before re-running it, clean up any directories it generated.
  • All of this can also be done by hand: create the user with useradd, establish trust with gpssh-exkeys, and create the directories manually.

Commands to create the gpadmin group and user:

# groupdel gpadmin

# userdel gpadmin

# groupadd -g 530 gpadmin

# useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin

# passwd gpadmin

 

Error handling:

[root@sjck-db003tf bin]# gpseginstall -f /home/all_segs -u gpadmin -p gp38281808

20190312:09:30:46:041587 gpseginstall:sjck-db003tf:root-[INFO]:-Installation Info:

link_name greenplum-db

binary_path /usr/local/greenplum-db-5.16.0

binary_dir_location /usr/local

binary_dir_name greenplum-db-5.16.0

20190312:09:30:46:041587 gpseginstall:sjck-db003tf:root-[INFO]:-check cluster password access

20190312:09:30:47:041587 gpseginstall:sjck-db003tf:root-[INFO]:-de-duplicate hostnames

20190312:09:30:47:041587 gpseginstall:sjck-db003tf:root-[INFO]:-master hostname: sjck-db003tf

20190312:09:30:47:041587 gpseginstall:sjck-db003tf:root-[INFO]:-check for user gpadmin on cluster

20190312:09:30:47:041587 gpseginstall:sjck-db003tf:root-[INFO]:-add user gpadmin on master

20190312:09:30:48:041587 gpseginstall:sjck-db003tf:root-[INFO]:-add user gpadmin on cluster

20190312:09:30:48:041587 gpseginstall:sjck-db003tf:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db

20190312:09:30:48:041587 gpseginstall:sjck-db003tf:root-[INFO]:-chown -R gpadmin:gpadmin /usr/local/greenplum-db-5.16.0

20190312:09:30:48:041587 gpseginstall:sjck-db003tf:root-[INFO]:-rm -f /usr/local/greenplum-db-5.16.0.tar; rm -f /usr/local/greenplum-db-5.16.0.tar.gz

20190312:09:30:48:041587 gpseginstall:sjck-db003tf:root-[INFO]:-cd /usr/local; tar cf greenplum-db-5.16.0.tar greenplum-db-5.16.0

20190312:09:31:03:041587 gpseginstall:sjck-db003tf:root-[INFO]:-gzip /usr/local/greenplum-db-5.16.0.tar

20190312:09:31:38:041587 gpseginstall:sjck-db003tf:root-[INFO]:-remote command: mkdir -p /usr/local

20190312:09:31:39:041587 gpseginstall:sjck-db003tf:root-[INFO]:-remote command: rm -rf /usr/local/greenplum-db-5.16.0

20190312:09:31:40:041587 gpseginstall:sjck-db003tf:root-[INFO]:-scp software to remote location

The authenticity of host 'smdw (192.25.108.85)' can't be established.

ECDSA key fingerprint is SHA256:FVZzJbgTMrxJp2gjhQlAgyXAg+cOlZ3mp+nun+ujTwM.

Are you sure you want to continue connecting (yes/no)? yes

Warning: the ECDSA host key for 'smdw' differs from the key for the IP address '192.25.108.85'

Offending key for IP in /root/.ssh/known_hosts:7

Are you sure you want to continue connecting (yes/no)?

-- Solution:

Delete /root/.ssh/known_hosts, or remove just the offending key entries.

Then run:

gpssh-exkeys -f /home/all_hosts

to regenerate the keys, and then run:

gpseginstall -f /home/all_segs -u gpadmin -p gp38281808

 

5. Switch to the gpadmin user and verify passwordless login (master node)

        (1) Switch user

            $ su - gpadmin

Difference between su [user] and su - [user]: su [user] switches to the other user but keeps the current environment variables, while su - [user] performs a full login and loads the new user's environment.
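A quick way to see the difference (a minimal sketch; it assumes greenplum_path.sh is sourced from gpadmin's ~/.bashrc as configured in step 6 below, and that the calling shell has not itself sourced it):

$ su gpadmin -c 'echo $GPHOME'      # environment carried over: prints an empty line
$ su - gpadmin -c 'echo $GPHOME'    # full login: prints the Greenplum install path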

        (2) Use the gpssh tool to test passwordless login to all hosts (screenshot omitted):

            $ gpssh -f /home/all_hosts -e ls -l $GPHOME

 

 

6. Configure environment variables (master node)

vi ~/.bashrc

source /usr/local/greenplum-db-5.16.0/greenplum_path.sh

export MASTER_DATA_DIRECTORY=/home/data/master/gpseg-1

 

export PGPORT=5432

export PGUSER=gpadmin

export PGDATABASE=postgres    (the default database)

 

Apply the changes: source ~/.bashrc

Standby master: copy the environment file to the standby host:

$ cd ~

# replace standby_hostname with the actual standby host name (smdw here)

$ scp .bashrc standby_hostname:`pwd`

Verify: the new variables are visible only in the gpadmin user's environment.

 

7. Create the storage areas (master node) [run as root]

Note: use # df -hl to check the free space of each file system, and put the data directory on the one with the most space.

    (1) Create the master data storage area

        # mkdir -p /home/data/master

    (2) Change ownership of the directory

        # chown gpadmin:gpadmin /home/data/master

(3) Use the gpssh tool to create the primary and mirror data directories on all segment hosts; if mirroring is not configured, the mirror directories can be skipped (run the commands below):

mkdir /home/data

mkdir /home/data/master

chown gpadmin:gpadmin /home/data/master/

source /usr/local/greenplum-db-5.16.0/greenplum_path.sh

-- standby master host

gpssh -h smdw -e 'mkdir /home/data/master'

gpssh -h smdw -e 'chown gpadmin /home/data/master'

-- segment hosts

gpssh -f /home/all_segs -e 'mkdir /home/data'

gpssh -f /home/all_segs -e 'mkdir /home/data/primary'

gpssh -f /home/all_segs -e 'mkdir /home/data/mirror'

gpssh -f /home/all_segs -e 'chown gpadmin /home/data/primary'

gpssh -f /home/all_segs -e 'chown gpadmin /home/data/mirror'

Note: afterwards, log in to the segment hosts to verify that the directories were created.

 

8. Synchronize system time (master node)

     (1) On the master host, edit /etc/ntp.conf and set:

  server 127.127.1.0

     (2) On the segment hosts, edit /etc/ntp.conf:

  server mdw prefer

server smdw

# If there is a designated data-center NTP server, point mdw and smdw at its IP instead.

     (3) On the master host, synchronize the system clocks via the NTP daemon (switch with su - gpadmin):

    $ gpssh -f /home/all_hosts -v -e 'ntpd'

 

(4) Verify that the times now match (switch with su - gpadmin):

$ gpssh -f /home/all_hosts -v date

 

Note: if the ntp.conf file does not exist, install NTP with: yum -y install ntp

 

9. Check the system environment

Run the host OS parameter checks from the master (master node) [switch with su - gpadmin]:

$ gpcheck -f /home/all_hosts -m mdw

Fixes for some errors reported by gpcheck:

$ gpssh -f /home/all_hosts -e 'echo deadline > /sys/block/sr0/queue/scheduler'

$ gpssh -f /home/all_hosts -e 'echo deadline > /sys/block/sr1/queue/scheduler'

$ gpssh -f /home/all_hosts -e 'echo deadline > /sys/block/sda/queue/scheduler'

$ /sbin/blockdev --setra 16384 /dev/sda*
$ /sbin/blockdev --getra /dev/sda*

2.3 Checking Hardware Performance

1. Check network performance (master node) [switch with su - gpadmin]

$ gpcheckperf -f /home/all_segs -r N -d /tmp > subnet1.out

$ cat subnet1.out

(screenshot omitted)

2. Check disk I/O and memory bandwidth (master node) [switch with su - gpadmin]

$ gpcheckperf -f /home/all_hosts -d /home/data/mirror -r ds

[gpadmin@sjck-db003tf bin]$ gpcheckperf -f /home/all_hosts -d /home/data/mirror -r ds

/usr/local/greenplum-db/./bin/gpcheckperf -f /home/all_hosts -d /home/data/mirror -r ds

--------------------

--  DISK WRITE TEST

--------------------

--------------------

--  DISK READ TEST

--------------------

--------------------

--  STREAM TEST

--------------------

====================

==  RESULT

====================

 disk write avg time (sec): 60.64

 disk write tot bytes: 100862754816

 disk write tot bandwidth (MB/s): 2711.94

 disk write min bandwidth (MB/s): 247.14 [sdw3]

 disk write max bandwidth (MB/s): 1297.59 [sdw1]

 

 disk read avg time (sec): 37.11

 disk read tot bytes: 100862754816

 disk read tot bandwidth (MB/s): 2730.62

 disk read min bandwidth (MB/s): 745.66 [sdw3]

 disk read max bandwidth (MB/s): 1224.26 [smdw]

 

 stream tot bandwidth (MB/s): 32104.45

 stream min bandwidth (MB/s): 9512.24 [smdw]

 stream max bandwidth (MB/s): 11821.18 [sdw2]

 

3. Initializing the Greenplum Database

3.1 The Database Configuration File

1. Log in as the gpadmin user

  # su - gpadmin

2. Copy a gpinitsystem_config file from the template:

cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpinitsystem_config

chown gpadmin:gpadmin /home/gpadmin/gpinitsystem_config

 

3. Set all required parameters

# Name of this Greenplum system

ARRAY_NAME="EMC Greenplum DW"

# Prefix for segments (covers both data segments and their mirrors, under /home/data/primary and /home/data/mirror)

SEG_PREFIX=gpseg

# Base port for primary segments

PORT_BASE=40000

# Primary segment data directories; the number of DATA_DIRECTORY entries determines how many instances each segment host runs (point the paths at the primary directory; numbered subdirectories are generated automatically)

declare -a DATA_DIRECTORY=(/home/data/primary /home/data/primary /home/data/primary /home/data/primary)

# Hostname of the master host

MASTER_HOSTNAME=mdw

# Master directory

MASTER_DIRECTORY=/home/data/master

# Master port

MASTER_PORT=5432

# Trusted shell (the official doc prints "TRUSTED SHELL", but the correct parameter name is TRUSTED_SHELL, with an underscore)

TRUSTED_SHELL=ssh

# Checkpoint segment size; a larger value improves bulk-load performance but lengthens crash recovery time. The default is 8 (on server-class hosts with enough resources, >16 GB RAM and >16 cores, 256 is reasonable)

CHECK_POINT_SEGMENTS=8

# Character set

ENCODING=UNICODE

-- Mirror settings

# Base port for mirror segments

MIRROR_PORT_BASE=7000

# Base replication port for primary segments

REPLICATION_PORT_BASE=8000

# Base replication port for mirror segments

MIRROR_REPLICATION_PORT_BASE=9000

# Mirror segment directories (same count as the primary segments)

declare -a MIRROR_DATA_DIRECTORY=(/home/data/mirror /home/data/mirror /home/data/mirror /home/data/mirror)

3.2 Run the Initialization Utility

1. Initialize the database

# Without a standby master

$ cd ~

$ gpinitsystem -c /home/gpadmin/gpinitsystem_config -h /home/all_segs -S

or

# With a standby master

$ cd ~
$ gpinitsystem -c /home/gpadmin/gpinitsystem_config -h /home/all_segs -s smdw -S

Note: the -S flag makes mirror placement spread; without it, mirrors default to grouped placement (the lowercase -s names the standby master host). A query to verify the placement follows the start/stop test below.

On success the database is up and running (screenshot of the output omitted).

 

 

2. Test starting and stopping the database:
  $ gpstart
  $ gpstop
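With the cluster up, mirror placement can be verified from the catalog (a minimal sketch; with spread placement, a segment's primary and mirror should never share a host):

$ psql -d postgres
postgres=# select content, preferred_role, hostname, port from gp_segment_configuration order by content, preferred_role;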

3.3 Accessing the Database (connects to the postgres database by default)

$ psql -d postgres

Run a query:

postgres=# select datname,datdba,encoding,datacl from pg_database;

 

 

3.4 Exiting the Database

postgres=# \q    (or press Ctrl+D)

3.5 Database Status

Check the status: $ gpstate

  • gpstate -c: show the mapping between primary and mirror instances;
  • gpstate -m: list only the status and configuration of the mirror instances;
  • gpstate -f: show details of the standby master.

Output like the above (screenshot omitted) indicates a healthy cluster.

 

4. Expanding Segment Nodes

4.1 Expansion Procedure

Under the hardware configuration recommended for Greenplum, run one segment instance per effective CPU core; for example, a segment host with two dual-core CPUs can be configured with four primary instances.

Count the logical CPUs with: $ cat /proc/cpuinfo | grep "processor" | wc -l

Expanding the number of segments takes three steps in total:

(1) Add the new hosts to the cluster (skip this step when expanding on existing hosts).

This step is mainly environment configuration, for example:

OS kernel parameters;

creating the gp admin user;

exchanging SSH keys (using gpssh-exkeys -e exist_hosts -x new_hosts);

copying the Greenplum binaries;

checking with gpcheck (gpcheck -f new_hosts);

checking performance with gpcheckperf (gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v).

(2) Initialize the segments and add them to the cluster.

This step mainly produces the configuration file:

command: gpexpand -f new_hosts_file (the configuration file can also be written by hand);

initialize the segment databases from the configuration file (gpexpand -i cnf -D dbname);

add the new segment information to the master catalog.

What if expansion fails? See step (6) in the example below.

(3) Redistribute the tables:

plan the priority order in which tables are redistributed;

redistribute the table data across the new segments;

analyze the tables.

4.2 Examples

4.2.1 Example 1: Adding Segments on the Existing Hosts

Suppose we expand in place by 9 segments, adding 3 on each of the 3 existing hosts.

(1) Since no new hosts are added, go straight to step two.

(2) Create the host file /home/seg_hosts listing the hosts to expand:

$ vi /home/seg_hosts

sdw1

sdw2

sdw3

(3) Generate the configuration file:

Create a database:

$ psql -d postgres

postgres=# create database addseg;

Generate the configuration file:

$ gpexpand -f /home/seg_hosts -D addseg

Please refer to the Admin Guide for more information.

 

Would you like to initiate a new System Expansion Yy|Nn (default=N):

> y

What type of mirroring strategy would you like?

 spread|grouped (default=grouped):

> spread

How many new primary segments per host do you want to add? (default=0):

> 3

Enter new primary data directory 1:

> /home/data/primary

Enter new primary data directory 2:

> /home/data/primary

Enter new primary data directory 3:

> /home/data/primary

Enter new mirror data directory 1:

> /home/data/mirror

Enter new mirror data directory 2:

> /home/data/mirror

Enter new mirror data directory 3:

> /home/data/mirror

Generating configuration file...

20190312:18:22:22:094698 gpexpand:sjck-db003tf:gpadmin-[INFO]:-Generating input file...

Input configuration files were written to 'gpexpand_inputfile_20190312_182222' and 'None'.

Please review the file and make sure that it is correct then re-run

with: gpexpand -i gpexpand_inputfile_20190312_182222 -D addseg

20190312:18:22:22:094698 gpexpand:sjck-db003tf:gpadmin-[INFO]:-Exiting...

Inspect the generated configuration file: $ cat gpexpand_inputfile_20190313_112226

sdw1:sdw1:40001:/home/data/primary/gpseg3:9:3:p:8001

sdw2:sdw2:7001:/home/data/mirror/gpseg3:10:3:m:9001

sdw2:sdw2:40001:/home/data/primary/gpseg4:11:4:p:8001

sdw3:sdw3:7001:/home/data/mirror/gpseg4:12:4:m:9001

sdw3:sdw3:40001:/home/data/primary/gpseg5:13:5:p:8001

sdw1:sdw1:7001:/home/data/mirror/gpseg5:14:5:m:9001

sdw1:sdw1:40002:/home/data/primary/gpseg6:15:6:p:8002

sdw3:sdw3:7002:/home/data/mirror/gpseg6:16:6:m:9002

sdw3:sdw3:40002:/home/data/primary/gpseg7:17:7:p:8002

sdw2:sdw2:7002:/home/data/mirror/gpseg7:18:7:m:9002

sdw2:sdw2:40002:/home/data/primary/gpseg8:19:8:p:8002

sdw1:sdw1:7002:/home/data/mirror/gpseg8:20:8:m:9002

sdw1:sdw1:40003:/home/data/primary/gpseg9:21:9:p:8003

sdw2:sdw2:7003:/home/data/mirror/gpseg9:22:9:m:9003

sdw2:sdw2:40003:/home/data/primary/gpseg10:23:10:p:8003

sdw3:sdw3:7003:/home/data/mirror/gpseg10:24:10:m:9003

sdw3:sdw3:40003:/home/data/primary/gpseg11:25:11:p:8003

sdw1:sdw1:7003:/home/data/mirror/gpseg11:26:11:m:9003

Configuration file format and field meanings:

hostname:address:port:fselocation:dbid:content:preferred_role:replication_port

hostname: the host name

address: similar to the host name

port: the segment listen port, incremented from the configured primary/mirror base ports (watch out for ports already in use)

fselocation: the segment data directory; note that it is a full path

dbid: the unique ID within the Greenplum cluster, obtainable from the gp_segment_configuration table; it must increase monotonically

content: also obtainable from the gp_segment_configuration table; it must increase monotonically

preferred_role: the role, p or m (primary, mirror)

replication_port: not needed if there are no mirrors (used as the replication port); incremented from the configured primary/mirror replication base ports (watch out for ports already in use)

Note: if anything above looks wrong, you can edit the file or write it by hand; keep in mind that with spread mirroring, a segment's primary and mirror must not sit on the same host. The query below can help pick the next dbid and content values.
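A minimal sketch for finding the current maximum dbid and content, from which the new values must continue:

$ psql -d postgres
postgres=# select max(dbid) as max_dbid, max(content) as max_content from gp_segment_configuration;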

(4) Next, adjust the permissions on the Greenplum bin directory:

gpexpand needs to write some files into this directory.

$ chmod -R 700 /usr/local/greenplum-db-5.16.0/

(5) Run gpexpand to perform the expansion:

$ gpexpand -i gpexpand_inputfile_20190313_112226 -D addseg -S -V -v -n 1 -B 1 -t /tmp

Options:

-B batch_size

Batch size of remote commands to send to a given host before making a one-second pause. Default is 16. Valid values are 1-128.

The gpexpand utility issues a number of setup commands that may exceed the host's maximum threshold for unauthenticated connections as defined by MaxStartups in the SSH daemon configuration. The one-second pause allows authentications to be completed before gpexpand issues any more commands.

The default value does not normally need to be changed. However, it may be necessary to reduce the maximum number of commands if gpexpand fails with connection errors such as 'ssh_exchange_identification: Connection closed by remote host.'

-c | --clean

Remove the expansion schema.

-d | --duration hh:mm:ss

Duration of the expansion session from beginning to end.

-D database_name

Specifies the database in which to create the expansion schema and tables. If this option is not given, the setting for the environment variable PGDATABASE is used. The database templates template1 and template0 cannot be used.

-e | --end 'YYYY-MM-DD hh:mm:ss'

Ending date and time for the expansion session.

-f | --hosts-file filename

Specifies the name of a file that contains a list of new hosts for system expansion. Each line of the file must contain a single host name.

This file can contain hostnames with or without network interfaces specified. The gpexpand utility handles either case, adding interface numbers to end of the hostname if the original nodes are configured with multiple network interfaces.

Note: The Greenplum Database segment host naming convention is sdwN where sdw is a prefix and N is an integer. For example, sdw1, sdw2 and so on. For hosts with multiple interfaces, the convention is to append a dash (-) and number to the host name. For example, sdw1-1 and sdw1-2 are the two interface names for host sdw1.

-i | --input input_file

Specifies the name of the expansion configuration file, which contains one line for each segment to be added in the format of:

hostname:address:port:fselocation:dbid:content:preferred_role:replication_port

If your system has filespaces, gpexpand will expect a filespace configuration file (input_file_name.fs) to exist in the same directory as your expansion configuration file. The filespace configuration file is in the format of:

filespaceOrder=filespace1_name:filespace2_name: ...

dbid:/path/for/filespace1:/path/for/filespace2: ...

dbid:/path/for/filespace1:/path/for/filespace2: ...

...

-n parallel_processes

The number of tables to redistribute simultaneously. Valid values are 1 - 96.

Each table redistribution process requires two database connections: one to alter the table, and another to update the table's status in the expansion schema. Before increasing -n, check the current value of the server configuration parameter max_connections and make sure the maximum connection limit is not exceeded.

-r | --rollback

Roll back a failed expansion setup operation. If the rollback command fails, attempt again using the -D option to specify the database that contains the expansion schema for the operation that you want to roll back.

-s | --silent

Runs in silent mode. Does not prompt for confirmation to proceed on warnings.

-S | --simple-progress

If specified, the gpexpand utility records only the minimum progress information in the Greenplum Database table gpexpand.expansion_progress. The utility does not record the relation size information and status information in the table gpexpand.status_detail.

Specifying this option can improve performance by reducing the amount of progress information written to the gpexpand tables.

[-t | --tardir] directory

The fully qualified path to a directory on segment hosts where the gpexpand utility copies a temporary tar file. The file contains Greenplum Database files that are used to create segment instances. The default directory is the user home directory.

-v | --verbose

Verbose debugging output. With this option, the utility will output all DDL and DML used to expand the database.

--version

Display the utility's version number and exit.

-V | --novacuum

Do not vacuum catalog tables before creating schema copy.

On success, the output ends with:

20190313:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-Exiting...

(6) What if adding the nodes fails?

Start in master-only (maintenance) mode and roll back:

$ gpstart -m

$ gpexpand -r -D addseg    or    gpexpand --rollback -D addseg

$ gpstart -a

Then find and fix the problem, and repeat the previous step until it succeeds.

After success, log in to the database and confirm that the segment count has grown:

postgres=# select * from gp_segment_configuration;

(7) Redistribute the tables

Until the data is redistributed, the new segments hold none of the pre-existing data, as the check below illustrates.
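One way to see this is to compare a table's per-segment row counts before and after redistribution (a minimal sketch; public.lineitem is only an example table name, and gp_segment_id is the system column available on every distributed table):

addseg=# select gp_segment_id, count(*) from public.lineitem group by gp_segment_id order by 1;

Before redistribution, rows appear only on the original segments; afterwards they should cover the new ones as well.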

To view the table scheduling order of the planned redistribution tasks:

postgres=# \c addseg

addseg=# select * from gpexpand.status;

View the pending tasks; to adjust their order, simply update the rank column:

addseg=# select * from gpexpand.status_detail;

For example:

addseg=# update gpexpand.status_detail set rank=10;

addseg=# update gpexpand.status_detail set rank=1 where fq_name = 'public.lineitem';

addseg=# update gpexpand.status_detail set rank=2 where fq_name = 'public.orders';

To see how many tables still await redistribution:

addseg=# select * from gpexpand.expansion_progress;

Run the redistribution command, specifying either how long the session may run or the time by which it should finish; the utility then performs the redistribution automatically:

$ gpexpand -a -d 1:00:00 -D addseg -S -t /tmp -v -n 1

During redistribution, you can watch the progress:

addseg=# select * from gpexpand.expansion_progress;

addseg=# select * from gpexpand.status_detail;

(8) Finally, clean up the gpexpand schema created for the redistribution:

$ gpexpand -c -D addseg

It asks whether to dump the status information to a file before the gpexpand schema is removed:

Do you want to dump the gpexpand.status_detail table to file? Yy|Nn (default=Y)

>y

Note: if it reports that segments are beyond the point of incremental recovery, use a full recovery ($ gprecoverseg -F).

4.2.2 Example 2: Adding One New Host with 4 Segments

New host: 192.25.108.87 (sdw4)

(1) Disable the firewall (sdw4):

# systemctl status firewalld    (check the firewall status)

# systemctl stop firewalld    (stop the firewall)

# systemctl disable firewalld    (disable the firewall at boot)

(2) Modify the /etc/hosts file (all four hosts)

Command: vi /etc/hosts

Add or modify the following entries in the hosts file:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.25.108.86   mdw

192.25.108.85   sdw1

192.25.108.84   sdw2

192.25.108.86   sdw3

192.25.108.87   sdw4

192.25.108.85   smdw

(3) Modify or add the following in /etc/sysctl.conf (sdw4):

kernel.shmmax = 500000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 500 1024000 200 4096

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 10000 65535

net.core.netdev_max_backlog = 10000

net.core.rmem_max = 2097152

net.core.wmem_max = 2097152

vm.overcommit_memory = 2

vm.swappiness = 10

vm.dirty_expire_centisecs = 500

vm.dirty_writeback_centisecs = 100

vm.dirty_background_ratio = 0

vm.dirty_ratio=0

vm.dirty_background_bytes = 1610612736

vm.dirty_bytes = 4294967296

Then apply the settings with: sysctl -p

(4) Edit /etc/security/limits.conf and add the following (sdw4):

# maximum number of open files
* soft nofile 65536
* hard nofile 65536

# maximum number of processes
* soft nproc 131072
* hard nproc 131072

(5) Set the read-ahead value to 16384 (sdw4) (optional)

# /sbin/blockdev --getra /dev/sda    (check the read-ahead value; the default is 256)

# /sbin/blockdev --setra 16384 /dev/sda    (set the read-ahead value)

(6) Set the disk I/O scheduler to deadline (sdw4) (optional)

# echo deadline > /sys/block/sda/queue/scheduler

(7) Create the gpadmin user and the data directory (sdw4)

Keep them consistent with the other three hosts: /home/gpadmin and /home/data (both owned by gpadmin).
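A minimal sketch of those commands, reusing the uid/gid and paths from section 2.2 (adjust the password to your environment):

# groupadd -g 530 gpadmin
# useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
# passwd gpadmin
# mkdir -p /home/data
# chown gpadmin:gpadmin /home/data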

(8) Create the files exist_hosts and new_hosts (master node):

$ vi /home/exist_hosts

mdw
sdw1
sdw2

sdw3

smdw

$ vi /home/new_hosts

sdw4

(9) Exchange SSH keys (master node)

This lets the master's admin user (root) reach every segment without a password; the master's public key is copied into each segment's authorized_keys.

$ gpssh-exkeys -e /home/exist_hosts -x /home/new_hosts

(10) Install the software on the new segment host:

$ gpseginstall -f /home/new_hosts -u gpadmin -p gp38281850

(11) Check performance with gpcheckperf:

$ gpcheckperf -f /home/new_hosts -d /tmp -v

 

The remaining steps largely mirror Example 1 in 4.2.1.

 

(12) Generate the configuration file

$ gpexpand -f /home/new_hosts -D addseg

(If a gpexpand schema from a previous expansion still exists, remove it first with gpexpand -c -D addseg.)

Note: dbid and contentid must both be consecutive; check them in gp_segment_configuration (and make sure ports do not conflict on the same host).

 

(13) Adjust the permissions on the Greenplum bin directory

gpexpand needs to write some files into this directory.

 

$ chmod -R 700 /usr/local/greenplum-db-5.16.0/

 

(14) Run gpexpand to perform the expansion:

 

$ gpexpand -i gpexpand_inputfile_20190313_112380 -D addseg -S -V -v -n 1 -B 1 -t /tmp

 

(15) Run the redistribution command

Specify either how long the session may run or the time by which it should finish; the utility then performs the redistribution automatically:

 

$ gpexpand -a -d 1:00:00 -D addseg -S -t /tmp -v -n 1

 

5. Troubleshooting

Log path: /home/gpadmin/gpAdminLogs

5.1 Problem 1: Expansion Reports that the Keys for the Hostname and the IP Differ

Problem:

Warning: the ECDSA host key for 'sjck-db001tf' differs from the key for the IP address '192.25.108.84'

Solution:

(1) Inspect: # cat /home/gpadmin/.ssh/known_hosts

(2) Delete the keys: # rm -f /root/.ssh/known_hosts and # rm -f /home/gpadmin/.ssh/known_hosts (or remove only the offending entries).

(3) Put every hostname involved (all node aliases plus each node's own hostname) into /home/all_hostnames (# vi /home/all_hostnames), with contents:

mdw

smdw

sdw1

sdw2

sdw3

sjck-db001tf

sjck-db002tf

sjck-db003tf

(4) Regenerate the SSH keys: # gpssh-exkeys -f /home/all_hostnames    (creates the /root/.ssh/known_hosts file)

(5) Regenerate the SSH keys for the gpadmin account: # gpseginstall -f /home/all_hosts -u gpadmin -p gp38281808    (creates the /home/gpadmin/.ssh/known_hosts file)

(6) Check whether /root/.ssh/known_hosts and /home/gpadmin/.ssh/known_hosts have consistent contents (# cat each file).

(7) If they differ, delete the files and regenerate them.

