Keywords:
KingbaseES, sys_ctl, startup log
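1. Starting the KingbaseES Database Service
1.1 Database startup mechanism
1) The database service (kingbase) is started manually with the sys_ctl tool.
2) sys_ctl needs the -D option to locate the database data directory.
3) At startup the database reads kingbase.conf to obtain the parameters used to initialize the instance.
4) Log messages produced during startup can be written to a specified log file or shown on standard output.
5) The startup log is the main source for identifying and analyzing the cause of a startup failure.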
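As a rough illustration of points 4) and 5), the sketch below reduces a startup log to its FATAL and PANIC messages. The function name startup_errors is hypothetical, and the pattern assumes the log line layout that appears in this article (timestamp, [pid], severity, message).

```shell
# Read a startup log on stdin and print only the FATAL/PANIC messages,
# with the timestamp and [pid] prefix removed.
# Assumes lines shaped like: "<timestamp> [<pid>] FATAL: <message>".
startup_errors() {
    grep -E '\] (FATAL|PANIC): ' |
        sed -E 's/^.*\] ((FATAL|PANIC): )/\1/'
}
```

Assuming sys_ctl accepts a -l option for a log file, as pg_ctl does (the sys_ctl help output confirms the exact options), one possible use is to start with `sys_ctl start -D /data/kingbase/v8r6_021/data -l /tmp/kb_start.log` and then run `startup_errors < /tmp/kb_start.log`. The log path here is only an example.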
1.2 The sys_ctl service startup tool
Figure 1-1 sys_ctl help information
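2. Analyzing Database Service Startup Failures
2.1 Case: the database port is already in use
Case description:
When the database starts, the log reports "could not bind IPv4 address "0.0.0.0": Address already in use". Checking the database service port (default: 54321) shows that it is already in the LISTEN state, held by another database service. When several database instances run on the same host, give each instance its own port setting so that their service ports do not conflict.
Symptom: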
[kingbase@node1 data]$ /opt/Kingbase/ES/V8R6_021/Server/bin/sys_ctl start -D /data/kingbase/v8r6_021/data
waiting for server to start....2021-03-01 12:52:31.989 CST [15825] LOG: sepapower extension initialized
2021-03-01 12:52:31.991 CST [15825] LOG: starting KingbaseES V008R006C004B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2021-03-01 12:52:31.991 CST [15825] LOG: could not bind IPv4 address "0.0.0.0": Address already in use
2021-03-01 12:52:31.991 CST [15825] HINT: Is another kingbase already running on port 54321? If not, wait a few seconds and retry.
2021-03-01 12:52:31.991 CST [15825] LOG: could not bind IPv6 address "::": Address already in use
2021-03-01 12:52:31.991 CST [15825] HINT: Is another kingbase already running on port 54321? If not, wait a few seconds and retry.
2021-03-01 12:52:31.991 CST [15825] WARNING: could not create listen socket for "*"
2021-03-01 12:52:31.991 CST [15825] FATAL: could not create any TCP/IP sockets
2021-03-01 12:52:31.991 CST [15825] LOG: database system is shut down
stopped waiting
sys_ctl: could not start server
Examine the log output.
Analysis:
Check how port 54321 is being used. The output shows that the port is already taken:
[kingbase@node1 data]$ netstat -antlp|grep -i listen|grep :54321
tcp 0 0 0.0.0.0:54321 0.0.0.0:* LISTEN 14665/kingbase
tcp6 0 0 :::54321 :::* LISTEN 14665/kingbase
Inspect the processes of the database service that holds the port (PID 14665):
[kingbase@node1 data]$ ps -ef |grep 14665
kingbase 14665 1 0 12:51 ? 00:00:00 /home/kingbase/cluster/R6HA/KHA/kingbase/bin/kingbase -D /home/kingbase/cluster/R6HA/KHA/kingbase/data
kingbase 14669 14665 0 12:51 ? 00:00:00 kingbase: logger
kingbase 14671 14665 0 12:51 ? 00:00:00 kingbase: startup recovering 000000070000000200000086
kingbase 14672 14665 0 12:51 ? 00:00:00 kingbase: checkpointer
kingbase 14673 14665 0 12:51 ? 00:00:00 kingbase: background writer
kingbase 14674 14665 0 12:51 ? 00:00:00 kingbase: stats collector
kingbase 14676 14665 0 12:51 ? 00:00:02 kingbase: walreceiver streaming 2/860023B0
kingbase 15088 14665 0 12:52 ? 00:00:01 kingbase: esrep esrep 192.168.7.248(26056) idle
kingbase 15769 14665 0 12:52 ? 00:00:00 kingbase: system test ::1(26355) idle
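The netstat check above can be wrapped in a small helper that reports which process is listening on a given port. This is a minimal sketch: listeners_on_port is a hypothetical name, and it assumes the column layout of `netstat -antlp` shown in this case (local address in column 4, state in column 6, PID/program in column 7).

```shell
# Read `netstat -antlp` style output on stdin and print the "PID/program"
# entry of every socket that is in LISTEN state on the given port.
listeners_on_port() {
    port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" {
            # The port is the text after the last ":" (covers 0.0.0.0:54321 and :::54321).
            n = split($4, parts, ":")
            if (parts[n] == port) print $7
        }
    ' | sort -u
}
```

For example, `netstat -antlp 2>/dev/null | listeners_on_port 54321` would print 14665/kingbase for the output shown above. Empty output suggests that nothing is listening on the port.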
Resolution:
Change the database service port number (port) in kingbase.conf:
[kingbase@node1 data]$ cat kingbase.conf |grep port
port = 54322 # (change requires restart)
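A plain `grep port` also matches commented-out lines and any other parameter whose name or comment contains "port". The sketch below reads the effective value of a single parameter instead. The helper name conf_value is hypothetical. It assumes the `name = value  # comment` layout seen above and that, as in PostgreSQL-style configuration files, the last uncommented occurrence wins. It does not handle a "#" inside a quoted value.

```shell
# Print the effective value of one parameter from a kingbase.conf-style file.
# Usage: conf_value <parameter> <file>
conf_value() {
    name="$1"
    file="$2"
    awk -v name="$name" '
        { line = $0; sub(/#.*/, "", line) }            # drop the comment part
        line ~ ("^[ \t]*" name "[ \t]*=") {
            sub(/^[^=]*=[ \t]*/, "", line)             # remove "name ="
            gsub(/[ \t]+$/, "", line)                  # trim trailing blanks
            value = line                               # a later setting overrides an earlier one
        }
        END { if (value != "") print value }
    ' "$file"
}
```

With the file shown above, `conf_value port kingbase.conf` would print 54322.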
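2.2 Case: memory allocation error at startup
Case description:
When the database instance starts, the log reports "could not map anonymous shared memory: Cannot allocate memory". The database service cannot allocate its shared buffers, so the instance fails to start. The problem is resolved either by reconfiguring the kernel to increase the shared memory size or by reducing the database shared buffer size (shared_buffers).
Symptom: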
[kingbase@node1 data]$ /opt/Kingbase/ES/V8R6_021/Server/bin/sys_ctl start -D /data/kingbase/v8r6_021/data
waiting for server to start....2021-03-01 13:01:46.176 CST [20183] LOG: sepapower extension initialized
2021-03-01 13:01:46.179 CST [20183] LOG: starting KingbaseES V008R006C004B0021 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
2021-03-01 13:01:46.179 CST [20183] LOG: listening on IPv4 address "0.0.0.0", port 54322
2021-03-01 13:01:46.179 CST [20183] LOG: listening on IPv6 address "::", port 54322
2021-03-01 13:01:46.316 CST [20183] LOG: listening on Unix socket "/tmp/.s.KINGBASE.54322"
2021-03-01 13:01:46.383 CST [20183] FATAL: could not map anonymous shared memory: Cannot allocate memory
2021-03-01 13:01:46.383 CST [20183] HINT: This error usually means that Kingbase's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 8850808832 bytes), reduce Kingbase's shared memory usage, perhaps by reducing shared_buffers or max_connections.
2021-03-01 13:01:46.383 CST [20183] LOG: database system is shut down
stopped waiting
sys_ctl: could not start server
Examine the log output.
Analysis:
Check the buffer-related parameters in kingbase.conf:
[kingbase@node1 data]$ cat kingbase.conf |grep buffer
shared_buffers = 8192MB # min 128kB
Check the system memory usage:
[kingbase@node1 data]$ free -m
total used free shared buff/cache available
Mem: 3381 435 2060 70 885 1833
Swap: 2815 0 2815
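The shared_buffers setting in kingbase.conf (8192 MB) exceeds the sum of the system's physical memory and swap space (3381 MB + 2815 MB = 6196 MB). The instance therefore cannot obtain the requested buffer and fails to start.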
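The comparison above can be sketched as a quick pre-start check. The function names mem_totals_mb and fits_in_memory are hypothetical, and the parsing assumes the English `free -m` layout shown in this case. The check is only a coarse upper bound: the request reported in the log (8850808832 bytes) is somewhat larger than shared_buffers alone.

```shell
# Read `free -m` output on stdin and print "<ram_mb> <swap_mb>" (the total columns).
mem_totals_mb() {
    awk '/^Mem:/ { ram = $2 } /^Swap:/ { swap = $2 } END { print ram, swap }'
}

# Succeed (exit 0) only if requested_mb fits within ram_mb + swap_mb.
# Usage: fits_in_memory <requested_mb> <ram_mb> <swap_mb>
fits_in_memory() {
    requested_mb="$1"
    ram_mb="$2"
    swap_mb="$3"
    [ "$requested_mb" -le "$((ram_mb + swap_mb))" ]
}
```

With the figures from this case, `fits_in_memory 8192 3381 2815` fails, while `fits_in_memory 1024 3381 2815` succeeds. The live totals can be obtained with `free -m | mem_totals_mb`.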
Resolution:
Edit kingbase.conf and reduce the buffer size:
[kingbase@node1 data]$ cat kingbase.conf |grep -i shared_buffer
shared_buffers = 1024MB # min 128kB
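3. Summary
When the database service fails to start, the startup log is the basis for analyzing and identifying the cause. Most startup failures are related to parameters in the database configuration file (kingbase.conf), so troubleshooting should consider both the settings in that file and the configuration of the system environment.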
Reference:
[Installation and Upgrade] Database Software Installation Guide for Linux (Standalone Edition)