Greenplum: Common Problems and Solutions

Reposted 2019-09-24 09:49. Categories: Data Warehouse DB-Greenplum / Data Warehouse DB-Deepgreen, error handling

Article link: https://blog.csdn.net/q936889811/article/details/85612046

Table of contents

1. Error during database initialization: gpinitsystem -c gpconfigs/gpinitsystem_config -h list

2. Error during the system check: gpcheck -f list

3. Error: gpadmin-[CRITICAL]:-gpstate failed. (Reason='Environment Variable MASTER_DATA_DIRECTORY not set!') exiting...

4. Error: Reason='[Errno 12] Cannot allocate memory'

5. ERROR: permission denied: "gp_segment_configuration" is a system catalog

6. Error: "FATAL","XX000","could not create shared memory segment: Cannot allocate memory (pg_shmem.c:183)"

7. The database fails to start after shared_buffers is changed

8. Connections fail after Greenplum has been running for a while

9. File "/home/gpadmin/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run

10. could not create shared memory segment: Invalid argument (pg_shmem.c:136),Failed

11. "failed to acquire resources on one or more segments","connection pointer is NULL

12. Setting max_connections and max_prepared_transactions

13. VM protect failed to allocate 131080 bytes from system, VM Protect 8098 MB available

14. psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1602)
1. Error during database initialization: gpinitsystem -c gpconfigs/gpinitsystem_config -h list
Error message:
2018-08-29 16:51:01.338476 CST,,,p21229,th406714176,,,,0,,,seg-999,,,,,"FATAL","XX000","could not create semaphores: No space left on device (pg_sema.c:129)","Failed system call was semget(127, 17, 03600).","This error does *not* mean that you have run out of disk space.
It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded. You need to raise the respective kernel parameter. Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter (currently 753).
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.",,,,,,"InternalIpcSemaphoreCreate","pg_sema.c",129,1 0x95661b postgres errstart (elog.c:521)

Solution:
[root@bj-ksy-g1-mongos-02 primary]# cat /proc/sys/kernel/sem
250 32000 32 128

Change kernel.sem in /etc/sysctl.conf to the following, then apply it as root with sysctl -p:
[root@bj-ksy-g1-mongos-02 primary]# cat /etc/sysctl.conf
kernel.sem = 250 512000 100 2048

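
The four fields of kernel.sem are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI; the error text above names SEMMNS and SEMMNI as the limits that were hit. As a minimal sketch (the helper names parse_kernel_sem and sem_shortfalls are made up for illustration and are not part of Greenplum), the following Python compares a current setting with a target setting and lists the fields that are too low:

```python
from typing import Dict, List

# Field order used by /proc/sys/kernel/sem and the kernel.sem sysctl.
SEM_FIELDS = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")


def parse_kernel_sem(text: str) -> Dict[str, int]:
    """Parse a kernel.sem value such as '250 32000 32 128'."""
    values = text.split()
    if len(values) != len(SEM_FIELDS):
        raise ValueError(f"expected 4 fields, got {len(values)}: {text!r}")
    return dict(zip(SEM_FIELDS, (int(v) for v in values)))


def sem_shortfalls(current: str, target: str) -> List[str]:
    """Return the names of the fields where current is below target."""
    cur, tgt = parse_kernel_sem(current), parse_kernel_sem(target)
    return [name for name in SEM_FIELDS if cur[name] < tgt[name]]


if __name__ == "__main__":
    # Values taken from the transcript above.
    print(sem_shortfalls("250 32000 32 128", "250 512000 100 2048"))
```

With the two values from this section, the fields reported as too low are SEMMNS, SEMOPM and SEMMNI.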
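2. Error during the system check: gpcheck -f list
Error message:
XFS filesystem on device /dev/vdb1 is missing the recommended mount option 'allocsize=16m'

Solution: add the option to the mount entry in /etc/fstab (16384k is the same size as 16m). It takes effect the next time the filesystem is mounted.
[gpadmin@bj-ksy-g1-mongos-01 ~]$ cat /etc/fstab
/dev/vdb1 /opt xfs defaults,allocsize=16384k,inode64,noatime 1 1
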
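
Because the option can be written with different unit suffixes, it is easy to mistype: 16384k equals 16m, while a value such as 16348k does not. The sketch below is only an illustration (allocsize_bytes and matches_16m are hypothetical helpers; it makes no claim about how gpcheck itself compares the option) that converts the allocsize value in a mount-option string to bytes:

```python
import re
from typing import Optional

# Binary multipliers for the suffixes handled by this sketch.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def allocsize_bytes(mount_options: str) -> Optional[int]:
    """Return the allocsize value in bytes, or None if the option is absent."""
    for option in mount_options.split(","):
        match = re.fullmatch(r"allocsize=(\d+)([kmg]?)", option.strip().lower())
        if match:
            return int(match.group(1)) * _UNITS[match.group(2)]
    return None


def matches_16m(mount_options: str) -> bool:
    """True if the options contain an allocsize of exactly 16 MiB."""
    return allocsize_bytes(mount_options) == 16 * 1024 ** 2


if __name__ == "__main__":
    print(matches_16m("defaults,allocsize=16384k,inode64,noatime"))
    print(matches_16m("defaults,allocsize=16348k,inode64,noatime"))
```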
3. Error: gpadmin-[CRITICAL]:-gpstate failed. (Reason='Environment Variable MASTER_DATA_DIRECTORY not set!') exiting...
Error message:
[gpadmin@bj-ksy-g1-mongos-01 ~]$ gpstop
20180830:09:11:42:011904 gpstop:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Starting gpstop with args:
20180830:09:11:42:011904 gpstop:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Gathering information and validating the environment...
20180830:09:11:42:011904 gpstop:bj-ksy-g1-mongos-01:gpadmin-[CRITICAL]:-gpstop failed. (Reason='Environment Variable MASTER_DATA_DIRECTORY not set!') exiting...
[gpadmin@bj-ksy-g1-mongos-01 ~]$ gpstop -M fast
20180830:09:12:07:011962 gpstop:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Starting gpstop with args: -M fast
20180830:09:12:07:011962 gpstop:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Gathering information and validating the environment...
20180830:09:12:07:011962 gpstop:bj-ksy-g1-mongos-01:gpadmin-[CRITICAL]:-gpstop failed. (Reason='Environment Variable MASTER_DATA_DIRECTORY not set!') exiting...
[gpadmin@bj-ksy-g1-mongos-01 ~]$ gpstate
20180830:09:13:03:012093 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Starting gpstate with args:
20180830:09:13:03:012093 gpstate:bj-ksy-g1-mongos-01:gpadmin-[CRITICAL]:-gpstate failed. (Reason='Environment Variable MASTER_DATA_DIRECTORY not set!') exiting...
Solution:
[gpadmin@bj-ksy-g1-mongos-01 ~]$ vim ~/.bashrc
Add the following lines, then run source ~/.bashrc or log in again so that they take effect:
MASTER_DATA_DIRECTORY=/opt/data/master/gpseg-1
export MASTER_DATA_DIRECTORY
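
A wrapper script can check for the variable before it calls gpstate or gpstop, so the failure is reported with a clearer hint. This is a minimal sketch; missing_env_vars is a hypothetical helper, and the only variable it checks by default is the one named in the error above:

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    env: Mapping[str, str],
    required: Iterable[str] = ("MASTER_DATA_DIRECTORY",),
) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(os.environ)
    if missing:
        print("Set these variables in ~/.bashrc first:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```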
4. Error: Reason='[Errno 12] Cannot allocate memory'

gpstart, gpstate and gpstop all fail with the same error.

Error message:
[gpadmin@bj-ksy-g1-mongos-01 ~]$ gpstate -s
20180830:09:22:01:013309 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Starting gpstate with args: -s
20180830:09:22:01:013309 gpstate:bj-ksy-g1-mongos-01:gpadmin-[CRITICAL]:-gpstate failed. (Reason='[Errno 12] Cannot allocate memory') exiting...
Solution:
As the root user, add swap space:

[root@bj-ksy-g1-mongos-01 ~]# swapon -s # show current swap usage
[root@bj-ksy-g1-mongos-01 ~]# dd if=/dev/zero of=/swapfile bs=1024 count=1024k
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 3.20053 s, 335 MB/s
[root@bj-ksy-g1-mongos-01 ~]# mkswap /swapfile
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=3e8ef2b3-5d9e-4e04-9718-36caefbfc21d
[root@bj-ksy-g1-mongos-01 ~]# swapon /swapfile
swapon: /swapfile: insecure permissions 0644, 0600 suggested.

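Running chmod 600 /swapfile before swapon avoids the permissions warning above.
[root@bj-ksy-g1-mongos-01 ~]# vim /etc/fstab # make the swap file persistent across reboots
Add: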
/swapfile none swap sw 0 0
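
The size of the swap file follows directly from the dd arguments: 1048576 records of 1024 bytes are 1073741824 bytes, i.e. 1 GiB, and mkswap reports 4 KiB less than that as usable swap space. The sketch below reproduces that arithmetic and reads SwapTotal from text in /proc/meminfo format. The helper names are made up for illustration, and apart from the 1048572 kB figure the sample values are invented:

```python
def dd_bytes(block_size: int, records: int) -> int:
    """Bytes written by dd for a given block size and record count."""
    return block_size * records


def swap_total_kib(meminfo_text: str) -> int:
    """Extract SwapTotal (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    raise ValueError("SwapTotal not found")


if __name__ == "__main__":
    size = dd_bytes(1024, 1048576)  # bs=1024 count=1024k from the transcript
    print(size, size == 1024 ** 3)

    # Invented sample; on a real host read /proc/meminfo instead.
    sample = "MemTotal:       16265516 kB\nSwapTotal:       1048572 kB\nSwapFree:        1048572 kB\n"
    print(swap_total_kib(sample))
```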

Switch back to the gpadmin user and verify the result:
[gpadmin@bj-ksy-g1-mongos-01 ~]$ gpstate -s
20180830:09:34:56:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Starting gpstate with args: -s
20180830:09:34:56:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.4.0 build commit:1971b301f52979ac74fb3d0a141bbaae06b70857'
20180830:09:34:56:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.4.0 build commit:1971b301f52979ac74fb3d0a141bbaae06b70857) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 12 2018 21:15:36'
20180830:09:34:56:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Obtaining Segment details from master...
20180830:09:34:56:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-Gathering data from segments...
20180830:09:34:57:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:-----------------------------------------------------
20180830:09:34:57:015816 gpstate:bj-ksy-g1-mongos-01:gpadmin-[INFO]:--Master Configuration & Status
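5. ERROR: permission denied: "gp_segment_configuration" is a system catalog
Error:

ERROR: permission denied: "gp_segment_configuration" is a system catalog

Solution: enable allow_system_table_mods for the session before modifying the catalog. Editing a system catalog by hand can damage the cluster, so do this only when you are sure of the change.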
postgres=# delete from gp_segment_configuration where role='m';
ERROR: permission denied: "gp_segment_configuration" is a system catalog
postgres=# set allow_system_table_mods='dml';
SET
postgres=# delete from gp_segment_configuration where role='m';
DELETE 9
postgres=#
6. Error: "FATAL","XX000","could not create shared memory segment: Cannot allocate memory (pg_shmem.c:183)"
2018-10-15 19:45:37.841672 CST,,,p10296,th624441152,,,,0,,,seg-1,,,,,"FATAL","XX000","could not create shared memory segment: Cannot allocate memory (pg_shmem.c:183)","Failed system call was shmget(key=40002001, size=267762784, 03600).","This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory or swap space. To reduce the request size (currently 267762784 bytes), reduce PostgreSQL's shared_buffers parameter (currently 4000) and/or its max_connections parameter (currently 753).
The PostgreSQL documentation contains more information about shared memory configuration.",,,,,,"InternalIpcMemoryCreate","pg_shmem.c",183,1    0x95661b postgres errstart (elog.c:521)
2    0x7bc723 postgres <symbol not found> (pg_shmem.c:145)
3    0x7bc9ba postgres PGSharedMemoryCreate (pg_shmem.c:387)
4    0x812d69 postgres CreateSharedMemoryAndSemaphores (ipci.c:242)
5    0x7d47dc postgres PostmasterMain (postmaster.c:3996)
6    0x4c8af7 postgres main (main.c:206)
7    0x7f372083ab15 libc.so.6 __libc_start_main + 0xf5
8    0x4c904c postgres <symbol not found> + 0x4c904c

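Solution:
As the root user, add swap space as described in problem 4 and make it persistent. Alternatively, as the log message suggests, reduce shared_buffers or max_connections.

[root@bj-ksy-g1-mongos-01 ~]# vim /etc/fstab # make the swap file persistent across reboots
Add:
/swapfile none swap sw 0 0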
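7. The database fails to start after shared_buffers is changed
gpconfig -c shared_buffers -v "8192MB"
After shared_buffers is changed with the command above, Greenplum fails to start.
Cause: kernel.shmmax is set too low. Its value is 500000000 bytes (about 476 MB), so the database cannot start once shared_buffers exceeds roughly that size.

Solution: increase kernel.shmmax, ideally to about 50% of total physical memory.
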
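
The numbers in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: recommended_shmmax and fits_in_shmmax are hypothetical helpers, the 16 GiB host is an invented example, and the real shared memory request is somewhat larger than shared_buffers alone, so the comparison is a lower bound.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def recommended_shmmax(total_memory_bytes: int, percent: int = 50) -> int:
    """kernel.shmmax as a percentage of physical memory (50% per the text)."""
    return total_memory_bytes * percent // 100


def fits_in_shmmax(shared_buffers_bytes: int, shmmax_bytes: int) -> bool:
    """Lower-bound check: shared_buffers alone must not exceed shmmax."""
    return shared_buffers_bytes <= shmmax_bytes


if __name__ == "__main__":
    old_shmmax = 500_000_000
    print(old_shmmax // MIB)                          # whole MiB in the old limit
    print(fits_in_shmmax(8192 * MIB, old_shmmax))     # the 8192MB setting

    new_shmmax = recommended_shmmax(16 * GIB)         # assumed 16 GiB host
    print(new_shmmax, fits_in_shmmax(8192 * MIB, new_shmmax))
```

The resulting value would go into /etc/sysctl.conf as kernel.shmmax and be applied with sysctl -p.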
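8. Connections fail after Greenplum has been running for a while
After running for some time, Greenplum starts refusing connections even though the number of connections shown in pg_stat_activity has not reached the configured limit. Adjust the following kernel parameters:
net.core.somaxconn=65535
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.somaxconn is a Linux kernel parameter that caps the backlog of a listening socket. The backlog is the socket's listen queue: a connection request that has not yet been handled or established waits there. The server removes requests from the queue as it accepts them, and accepted requests no longer occupy the queue. When the server is slow to process requests and the queue fills up, new requests are rejected.
net.core.somaxconn defaults to 128 on older kernels. For a busy server, such as a Hadoop NameNode or JobTracker, 128 is far from enough, so the backlog has to be raised. For example, our 3000-node cluster sets ipc.server.listen.queue.size to 32768; for that setting to have the intended effect, net.core.somaxconn must also be set to a value of at least 32768.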
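
The interaction between the two settings comes down to a minimum: the kernel silently caps the backlog an application passes to listen() at net.core.somaxconn. A minimal sketch of that rule follows; effective_backlog and read_somaxconn are illustrative helper names, not an existing API.

```python
def effective_backlog(requested: int, somaxconn: int) -> int:
    """Backlog actually used: the requested value capped at somaxconn."""
    return min(requested, somaxconn)


def read_somaxconn(path: str = "/proc/sys/net/core/somaxconn") -> int:
    """Read the current limit on a Linux host."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    # A server asking for 32768 gets only 128 while the limit is 128.
    print(effective_backlog(32768, 128))
    # With the value recommended in this section the request is honoured.
    print(effective_backlog(32768, 65535))
```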
9. File "/home/gpadmin/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
Symptoms:
gpstate -s
reports every segment as failed. Stopping Greenplum with
gpstop -a
then fails with the following output:
20181227:10:18:11:2243549 gpstop:hrdskf-k:gpadmin-[ERROR]:-ExecutionError: 'non-zero rc: 1' occured. Details: 'ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=60 hrdskf-k ". /home/gpadmin/greenplum-db/./greenplum_path.sh; $GPHOME/sbin/gpoperation.py"' cmd had rc=1 completed=True halted=False
stdout=''
stderr='\S
Kernel \r on an \m
Warm tips :Authorized for Haier Utility's Uses only. All activity may be monitored and reported.
If you have any questions,please contact us.
Mailbox:dts.jxjg@haier.com
Phone:68066686 / 1000 / 8173
WARNING: Your password has expired.
Password change required but no TTY available.
'
Traceback (most recent call last):
File "/home/gpadmin/greenplum-db/lib/python/gppylib/commands/base.py", line 243, in run
    self.cmd.run()
File "/home/gpadmin/greenplum-db/lib/python/gppylib/operations/__init__.py", line 53, in run
    self.ret = self.execute()
File "/home/gpadmin/greenplum-db/lib/python/gppylib/operations/utils.py", line 48, in execute
    cmd.run(validateAfter=True)
File "/home/gpadmin/greenplum-db/lib/python/gppylib/commands/base.py", line 717, in run
    self.validate()
File "/home/gpadmin/greenplum-db/lib/python/gppylib/commands/base.py", line 764, in validate
    raise ExecutionError("non-zero rc: %d" % self.results.rc, self)
ExecutionError: ExecutionError: 'non-zero rc: 1' occured. Details: 'ssh -o StrictHostKeyChecking=no -o ServerAliveInterval=60 hrdskf-k ". /home/gpadmin/greenplum-db/./greenplum_path.sh; $GPHOME/sbin/gpoperation.py"' cmd had rc=1 completed=True halted=False
stdout=''
stderr='\S
Kernel \r on an \m
Warm tips :Authorized for Haier Utility's Uses only. All activity may be monitored and reported.
If you have any questions,please contact us.
Mailbox:dts.jxjg@haier.com
Phone:68066686 / 1000 / 8173
WARNING: Your password has expired.
Password change required but no TTY available.
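Approach:
The log points to an ssh problem.
1. Check whether passwordless login still works.
2. It turns out the password has to be reset.
3. Running ssh hostname prompts for a password change.
By default, passwords of ordinary accounts on this server expire after a fixed period.
Check and change the password validity period: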
[root@hrdskf-m ~]# chage -l gpadmin
Last password change                                    : Dec 27, 2018
Password expires                                        : Feb 25, 2019
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 1
Maximum number of days between password change          : 60
Number of days of warning before password expires       : 14
[root@hrdskf-m ~]# chage -l root
Last password change                                    : Dec 24, 2018
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7
[root@hrdskf-m ~]# chage -M 99999 gpadmin   # with this setting the password never expires
[root@hrdskf-m ~]# chage -l gpadmin
Last password change                                    : Dec 27, 2018
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 1
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 14
[root@hrdskf-m ~]#
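
To catch this before it breaks the cluster utilities, the output of chage -l can be checked on every host. The sketch below parses text in the format shown above; parse_chage and password_never_expires are hypothetical helpers, and collecting the output from each host (for example over ssh) is left out.

```python
from typing import Dict


def parse_chage(output: str) -> Dict[str, str]:
    """Turn 'chage -l' output into a {field name: value} mapping."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Collapse the padding that chage puts before the colon.
            fields[" ".join(key.split())] = value.strip()
    return fields


def password_never_expires(output: str) -> bool:
    """True if the 'Password expires' field reads 'never'."""
    return parse_chage(output).get("Password expires") == "never"


if __name__ == "__main__":
    sample = (
        "Last password change                                    : Dec 27, 2018\n"
        "Password expires                                        : Feb 25, 2019\n"
        "Maximum number of days between password change          : 60\n"
    )
    print(parse_chage(sample)["Password expires"])
    print(password_never_expires(sample))
```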
10. could not create shared memory segment: Invalid argument (pg_shmem.c:136),Failed
Error: "could not create shared memory segment: Invalid argument (pg_shmem.c:136),Failed "
Solution: reduce the value of the max_connections parameter. This error usually means that the shared memory request exceeded the kernel's SHMMAX limit, so raising kernel.shmmax (see problem 7) is an alternative.
11. "failed to acquire resources on one or more segments","connection pointer is NULL
Error: 2018-11-09 10:08:13.279910 CST,"gpadmin","xn_report",p119553,th-1821042816,"172.23.0.74","16532",2018-11-09 10:08:13 CST,0,con10783,,seg-1,,dx2364872,,sx1,"ERROR","58M01","failed to acquire resources on one or more segments","connection pointer is NULL
This is related to the Query Dispatcher (QD) process on the master. It indicates that the QD process on the master has a problem connecting to the postmaster process on the master.
You can change the parameter gp_reject_internal_tcp_connection to "off". Its default value is "on". The parameter controls whether internal TCP connections to the master are rejected. Ideally, UNIX domain sockets should be used instead of TCP connections, which is why the default is "on".
This is a restricted parameter, so you must pass the --skipvalidation option when setting it. Run the following command:
gpconfig -c gp_reject_internal_tcp_connection -v off --skipvalidation
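Note: the database must be restarted after this parameter is set.
https://community.pivotal.io/s/article/Error-Failed-to-acquire-resources-on-one-or-more-segments-in-Pivotal-Greenplum
12. Setting max_connections and max_prepared_transactions
max_connections: the maximum number of concurrent connections to the database server. In a Greenplum system, user client connections go through the Greenplum master instance only. Segment instances should allow 5 to 10 times as many connections as the master. When you increase this parameter, you must also increase max_prepared_transactions.
max_prepared_transactions:
Sets the maximum number of transactions that can be in the prepared state at the same time. Greenplum uses prepared transactions internally to ensure data integrity across the segments. The value must be at least as large as max_connections on the master. Segment instances should be set to the same value as the master.
In the commands below, -v sets the value on the segments and -m sets the value on the master: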
gpconfig -c max_prepared_transactions -v 500
gpconfig -c max_connections -v 2500 -m 500
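
The two rules above (segments allow 5 to 10 times the master's connections, and max_prepared_transactions is at least the master's max_connections) are easy to check before running gpconfig. The sketch below encodes only those two rules as stated in this section; check_connection_settings is a hypothetical helper, not a Greenplum utility.

```python
from typing import List


def check_connection_settings(
    master_max_connections: int,
    segment_max_connections: int,
    max_prepared_transactions: int,
) -> List[str]:
    """Return a list of rule violations; an empty list means both rules hold."""
    problems = []
    if max_prepared_transactions < master_max_connections:
        problems.append(
            "max_prepared_transactions is below the master's max_connections"
        )
    low, high = 5 * master_max_connections, 10 * master_max_connections
    if not low <= segment_max_connections <= high:
        problems.append(
            f"segment max_connections should be between {low} and {high}"
        )
    return problems


if __name__ == "__main__":
    # Values from the gpconfig commands above: master 500, segments 2500.
    print(check_connection_settings(500, 2500, 500))
    # An invented bad combination for comparison.
    print(check_connection_settings(500, 1000, 250))
```

The values used in this section (500 on the master, 2500 on the segments, 500 prepared transactions) satisfy both rules.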
13. VM protect failed to allocate 131080 bytes from system, VM Protect 8098 MB available
Error message:
VM protect failed to allocate 131080 bytes from system, VM Protect 8098 MB available

Solution:
gpconfig -c gp_max_plan_size -v "200MB"

14. psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1602)
Error message:
psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1602)

The database has started and every node is reported as up and healthy.
Solution: connect to the master in utility mode:
PGOPTIONS='-c gp_session_role=utility' psql -d postgres
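----------------
Copyright notice: this is an original article by CSDN blogger 「柔於似水」, published under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/q936889811/article/details/85612046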
