How to Analyze a KingbaseES Database Startup Failure


 

Keywords:

   KingbaseES, sys_ctl, startup log

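1. KingbaseES database service startup

1.1 Database startup mechanism

1) The database service process, kingbase, is started manually with the sys_ctl tool.

2) sys_ctl takes the path of the database data directory through its -D option.

3) At startup the database reads kingbase.conf to obtain the parameter settings used to initialize the instance.

4) Log messages produced during startup can be written to a specified log file or printed to standard output.

5) The startup log is the primary source for identifying and analyzing the cause of a startup failure.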
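As a rough illustration of point 5, the sketch below pulls the higher-severity lines out of a captured startup log. It assumes the line layout seen in the transcripts in this article (timestamp, time zone, PID in brackets, severity, message). The function name `find_problems` and the default list of severities are illustrative choices, not part of KingbaseES.

```python
import re

# Matches lines such as:
# 2021-03-01 12:52:31.991 CST [15825] FATAL:  could not create any TCP/IP sockets
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (?P<tz>\S+) "
    r"\[(?P<pid>\d+)\] (?P<level>[A-Z]+):\s+(?P<msg>.*)"
)


def find_problems(log_text, levels=("FATAL", "PANIC", "ERROR", "WARNING", "HINT")):
    """Return (level, message) pairs for lines whose severity is in `levels`.

    The severity list is an assumption made for this sketch.
    """
    problems = []
    for line in log_text.splitlines():
        # search() rather than match(): sys_ctl may prefix the first log
        # line with "waiting for server to start....".
        m = LOG_LINE.search(line)
        if m and m.group("level") in levels:
            problems.append((m.group("level"), m.group("msg").strip()))
    return problems


sample = """waiting for server to start....2021-03-01 12:52:31.989 CST [15825] LOG:  sepapower extension initialized
2021-03-01 12:52:31.991 CST [15825] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2021-03-01 12:52:31.991 CST [15825] WARNING:  could not create listen socket for "*"
2021-03-01 12:52:31.991 CST [15825] FATAL:  could not create any TCP/IP sockets
"""

for level, msg in find_problems(sample):
    print(level, msg)
```

Note that the root cause can be logged at LOG severity (the "could not bind" line in the sample), so a severity filter only narrows the search. The lines just before the first FATAL entry still need to be read.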

1.2 The sys_ctl service startup tool

Figure 1-1: sys_ctl help output

 

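2. Analyzing database service startup failures

2.1 Case: database port already in use

Case description:

At startup the log reports "could not bind IPv4 address "0.0.0.0": Address already in use". Checking the database service port (default: 54321) shows that it is already in the LISTEN state on this host, held by another database service. When several database instances run on the same host, give each one a different port setting so that their service ports do not conflict.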

 

Symptom:

[kingbase@node1 data]$ /opt/Kingbase/ES/V8R6_021/Server/bin/sys_ctl start -D /data/kingbase/v8r6_021/data

 

waiting for server to start....2021-03-01 12:52:31.989 CST [15825] LOG:  sepapower extension initialized

2021-03-01 12:52:31.991 CST [15825] LOG:  starting KingbaseES V008R006C004B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

2021-03-01 12:52:31.991 CST [15825] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use

2021-03-01 12:52:31.991 CST [15825] HINT:  Is another kingbase already running on port 54321? If not, wait a few seconds and retry.

2021-03-01 12:52:31.991 CST [15825] LOG:  could not bind IPv6 address "::": Address already in use

2021-03-01 12:52:31.991 CST [15825] HINT:  Is another kingbase already running on port 54321? If not, wait a few seconds and retry.

2021-03-01 12:52:31.991 CST [15825] WARNING:  could not create listen socket for "*"

2021-03-01 12:52:31.991 CST [15825] FATAL:  could not create any TCP/IP sockets

2021-03-01 12:52:31.991 CST [15825] LOG:  database system is shut down

 stopped waiting

sys_ctl: could not start server

Examine the log output.

 

Analysis:

Check how port 54321 is being used. The output shows that it is already taken:

 

[kingbase@node1 data]$ netstat -antlp|grep -i listen|grep :54321

 

tcp        0      0 0.0.0.0:54321           0.0.0.0:*               LISTEN      14665/kingbase     

tcp6       0      0 :::54321                :::*                    LISTEN      14665/kingbase  
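The same check can be sketched in Python by trying to bind the port before starting the instance. This is a minimal illustration, not a KingbaseES facility. The helper name `port_is_free` is hypothetical, and the check covers IPv4 only.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind host:port (IPv4 only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "Address already in use": something else holds the port.
            return False
        return True


if __name__ == "__main__":
    # 54321 is the default port mentioned in this article.
    print(port_is_free(54321))
```

The probe socket is closed as soon as the check finishes, so another process could still take the port before the database starts. The result is a hint, not a guarantee.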

 

Check the processes of the database service that holds the port (PID 14665):

[kingbase@node1 data]$ ps -ef |grep 14665

 

kingbase 14665     1  0 12:51 ?        00:00:00 /home/kingbase/cluster/R6HA/KHA/kingbase/bin/kingbase -D /home/kingbase/cluster/R6HA/KHA/kingbase/data

kingbase 14669 14665  0 12:51 ?        00:00:00 kingbase: logger  

kingbase 14671 14665  0 12:51 ?        00:00:00 kingbase: startup   recovering 000000070000000200000086

kingbase 14672 14665  0 12:51 ?        00:00:00 kingbase: checkpointer  

kingbase 14673 14665  0 12:51 ?        00:00:00 kingbase: background writer  

kingbase 14674 14665  0 12:51 ?        00:00:00 kingbase: stats collector  

kingbase 14676 14665  0 12:51 ?        00:00:02 kingbase: walreceiver   streaming 2/860023B0

kingbase 15088 14665  0 12:52 ?        00:00:01 kingbase: esrep esrep 192.168.7.248(26056) idle

kingbase 15769 14665  0 12:52 ?        00:00:00 kingbase: system test ::1(26355) idle

 

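Resolution:

Change the port of the instance being started in its kingbase.conf. The change takes effect only after a restart, so start the instance again afterwards: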

 [kingbase@node1 data]$ cat kingbase.conf |grep port

port = 54322                            # (change requires restart)
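`grep port` also matches commented-out lines and any other setting whose name or comment contains "port". The sketch below reads one setting from the file's text instead. It assumes kingbase.conf uses the `name = value  # comment` layout shown above and that a later assignment overrides an earlier one. `read_setting` is a hypothetical helper, and the parsing is simplified: it does not handle a `#` inside a quoted value or assignments written without `=`.

```python
def read_setting(conf_text, name):
    """Return the last uncommented value assigned to `name`, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue  # blank line or a line that was only a comment
        key, sep, val = line.partition("=")
        if sep and key.strip() == name:
            value = val.strip().strip("'")  # remove optional single quotes
    return value


conf = """#port = 54321
port = 54322                            # (change requires restart)
shared_buffers = 1024MB                 # min 128kB
"""

print(read_setting(conf, "port"))            # 54322
print(read_setting(conf, "shared_buffers"))  # 1024MB
```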

 

 

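2.2 Case: memory allocation error at startup

Case description:

   At startup the log reports "could not map anonymous shared memory: Cannot allocate memory". The database service cannot obtain the memory for its shared buffers, so the instance fails to start. The fix is either to reconfigure the kernel to allow a larger shared memory size or to reduce the database shared buffer size (shared_buffers).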

 

Symptom:

[kingbase@node1 data]$ /opt/Kingbase/ES/V8R6_021/Server/bin/sys_ctl start -D /data/kingbase/v8r6_021/data

 

waiting for server to start....2021-03-01 13:01:46.176 CST [20183] LOG:  sepapower extension initialized

2021-03-01 13:01:46.179 CST [20183] LOG:  starting KingbaseES V008R006C004B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit

2021-03-01 13:01:46.179 CST [20183] LOG:  listening on IPv4 address "0.0.0.0", port 54322

2021-03-01 13:01:46.179 CST [20183] LOG:  listening on IPv6 address "::", port 54322

2021-03-01 13:01:46.316 CST [20183] LOG:  listening on Unix socket "/tmp/.s.KINGBASE.54322"

2021-03-01 13:01:46.383 CST [20183] FATAL:  could not map anonymous shared memory: Cannot allocate memory

2021-03-01 13:01:46.383 CST [20183] HINT:  This error usually means that Kingbase's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 8850808832 bytes), reduce Kingbase's shared memory usage, perhaps by reducing shared_buffers or max_connections.

2021-03-01 13:01:46.383 CST [20183] LOG:  database system is shut down

 stopped waiting

sys_ctl: could not start server

Examine the log output.

 

Analysis:

Check the buffer settings in kingbase.conf:

[kingbase@node1 data]$ cat kingbase.conf |grep buffer

shared_buffers = 8192MB                 # min 128kB

 

Check system memory usage:

[kingbase@node1 data]$ free -m

              total        used        free      shared  buff/cache   available

Mem:           3381         435        2060          70         885        1833

Swap:          2815           0        2815

 

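The shared_buffers value in kingbase.conf (8192 MB) exceeds the sum of physical memory and swap on this system (3381 + 2815 = 6196 MB). The instance cannot obtain the requested buffer, so startup fails. The full shared memory request reported in the HINT line, 8850808832 bytes (about 8441 MB), is larger still.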
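The comparison can be written out as a short calculation. This is a sketch under stated assumptions: `size_to_mb` and `fits_in_memory` are hypothetical helpers, only the unit suffixes kB, MB, GB and TB are handled, and the memory figures are the `free -m` values from this case.

```python
# Conversion factors from each supported unit suffix to megabytes.
UNITS_MB = {"kB": 1 / 1024, "MB": 1, "GB": 1024, "TB": 1024 * 1024}


def size_to_mb(text):
    """Convert a setting such as '8192MB' or '1GB' to megabytes."""
    for unit, factor in UNITS_MB.items():
        if text.endswith(unit):
            return float(text[: -len(unit)]) * factor
    raise ValueError(f"unrecognized size: {text!r}")


def fits_in_memory(shared_buffers, ram_mb, swap_mb):
    """Return True if shared_buffers does not exceed RAM plus swap."""
    return size_to_mb(shared_buffers) <= ram_mb + swap_mb


# Values from `free -m` in this case: 3381 MB RAM, 2815 MB swap.
print(fits_in_memory("8192MB", 3381, 2815))  # False
print(fits_in_memory("1024MB", 3381, 2815))  # True
```

Passing this check is necessary but not sufficient. The actual request is larger than shared_buffers alone, as the HINT line shows, and other processes also need memory.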

 

Resolution:

Edit kingbase.conf to reduce the buffer size, then start the instance again:

[kingbase@node1 data]$ cat kingbase.conf |grep -i shared_buffer

shared_buffers = 1024MB                 # min 128kB

 

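3. Summary

When the database service fails to start, the startup log is the place to look for the cause. Most startup failures trace back to parameters in the database configuration file (kingbase.conf), so check those settings together with the system environment (ports in use, available memory) when analyzing and resolving the problem.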


