Reading the PostgreSQL Source Code with Me (1): The psql Command

This article is a repost. Originally published 2016-10-31 22:05. Tag: postgresql

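Since joining my company, everything I have worked on has been PostgreSQL-related, but it has always been testing and tweaking things at the edges. That left me feeling stuck at the surface, without a deep understanding of what makes this open-source database tick. So I decided to read the PostgreSQL source code, both to deepen my understanding of databases and to improve my own skills.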

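The goal is attractive, but reality is harsh: the PostgreSQL source tree already runs to over a million lines, with more than 1,000 .c files alone. What to do? Grit my teeth and read. Fortunately, the PostgreSQL source is very well organized, which should save me a lot of trouble. I'm setting myself a small goal: read a little source every day. I can't promise daily posts, so let's say at least one post a week. I hope readers of this blog will learn along with me, and if I get something wrong, please point it out~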

Most people first meet PostgreSQL through the psql command-line tool, so let's start today with the psql source.

By the way, the code I'm reading is PostgreSQL 9.5.4; other versions will of course differ.

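The code behind psql falls into two parts. One is psql's own front-end code, which lives in /src/bin/psql. The other is the back-end query processing code, which is larger and more involved and is spread across many subdirectories of /src/backend/. This post is a trial run, so it covers only the front-end code; the back-end code will be covered in detail in later posts (if there are any~).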

Let's open the /src/bin/psql directory, which holds psql's front-end code. Almost every program has a main function, and psql's is in startup.c.

Let's first look at two data structures:

enum _actions
{
	ACT_NOTHING = 0,
	ACT_SINGLE_SLASH,
	ACT_LIST_DB,
	ACT_SINGLE_QUERY,
	ACT_FILE
};

struct adhoc_opts
{
	char	   *dbname;
	char	   *host;
	char	   *port;
	char	   *username;
	char	   *logfilename;
	enum _actions action;
	char	   *action_string;
	bool		no_readline;
	bool		no_psqlrc;
	bool		single_txn;
};

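Where:

The enum _actions records which action psql was asked to perform by its command-line options, for example running a single query, running a single backslash command, executing a file, or listing databases.

The struct adhoc_opts stores the options collected from the command line for this session: the database, host, port and user name to connect with, the log file name, and so on.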
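To make the relationship between the two structures concrete, here is a minimal sketch of how command-line options might be folded into such a struct using POSIX getopt. It is not the real parse_psql_options: the names demo_action, demo_opts and demo_parse_options are made up for illustration, only a handful of short options are handled, and treating a -c argument that starts with a backslash as a "single slash" action is my reading of the enum names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>		/* getopt, optarg, optind (POSIX) */

/* Hypothetical, trimmed-down counterparts of _actions and adhoc_opts. */
enum demo_action
{
	DEMO_NOTHING = 0,
	DEMO_SINGLE_SLASH,
	DEMO_LIST_DB,
	DEMO_SINGLE_QUERY,
	DEMO_FILE
};

struct demo_opts
{
	const char *dbname;
	const char *host;
	const char *port;
	const char *username;
	enum demo_action action;
	const char *action_string;
};

/*
 * Fill *opts from argv.  Returns true on success, false on an unknown
 * option or a missing option argument.
 */
static bool
demo_parse_options(int argc, char *argv[], struct demo_opts *opts)
{
	int			c;

	assert(opts != NULL);
	*opts = (struct demo_opts) {0};
	optind = 1;					/* restart scanning on every call */

	while ((c = getopt(argc, argv, "c:d:f:h:lp:U:")) != -1)
	{
		switch (c)
		{
			case 'c':			/* one command: SQL or backslash command */
				opts->action_string = optarg;
				opts->action = (optarg[0] == '\\') ?
					DEMO_SINGLE_SLASH : DEMO_SINGLE_QUERY;
				break;
			case 'd':
				opts->dbname = optarg;
				break;
			case 'f':			/* read commands from a file */
				opts->action = DEMO_FILE;
				opts->action_string = optarg;
				break;
			case 'h':
				opts->host = optarg;
				break;
			case 'l':			/* list databases */
				opts->action = DEMO_LIST_DB;
				break;
			case 'p':
				opts->port = optarg;
				break;
			case 'U':
				opts->username = optarg;
				break;
			default:			/* getopt has already reported the problem */
				return false;
		}
	}

	/* Assumption: a first non-option argument names the database. */
	if (optind < argc && opts->dbname == NULL)
		opts->dbname = argv[optind];
	return true;
}
```

Called with the arguments of `psql -U postgres -p 26500 mydb`, the sketch leaves action at DEMO_NOTHING, which is the interactive case.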

There are also a few small helper functions:

static void parse_psql_options(int argc, char *argv[],   // parse the command-line options
    			   struct adhoc_opts * options);
static void process_psqlrc(char *argv0);                 // load the .psqlrc file, if it exists
static void process_psqlrc_file(char *filename);         // called by process_psqlrc()
static void showVersion(void);                           // print the PostgreSQL version
static void EstablishVariableSpace(void);                // create the psql variable space (pset.vars) and install its assign hooks

By the way, the two macros at the top of the file name the startup files used to customize psql:

#ifndef WIN32
#define SYSPSQLRC    "psqlrc"
#define PSQLRC		".psqlrc"
#else
#define SYSPSQLRC	"psqlrc"
#define PSQLRC		"psqlrc.conf"
#endif

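SYSPSQLRC names the system-wide startup file and PSQLRC the per-user one; the #ifndef WIN32 branch picks the per-user file name for each platform (.psqlrc on Linux and other non-Windows systems, psqlrc.conf on Windows). With these files you can customize how your psql session looks and behaves, which is very handy.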
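As a small illustration of how such a macro ends up as a file name, the sketch below joins a home directory and the per-user file name. The helper demo_user_psqlrc_path and the macro DEMO_PSQLRC are hypothetical, and the sketch assumes the file sits directly in the home directory; how startup.c actually locates the directory is not shown in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_PSQLRC ".psqlrc"	/* same value as PSQLRC on non-Windows builds */

/*
 * Write "<home>/.psqlrc" into buf.
 * Returns false (and leaves buf untouched) if the result would not fit.
 */
static bool
demo_user_psqlrc_path(const char *home, char *buf, size_t buflen)
{
	assert(home != NULL && buf != NULL);

	/* home + '/' + file name + terminating NUL must fit */
	if (strlen(home) + 1 + strlen(DEMO_PSQLRC) + 1 > buflen)
		return false;

	snprintf(buf, buflen, "%s/%s", home, DEMO_PSQLRC);
	return true;
}
```

The caller supplies the buffer, so the helper needs no allocation and fails cleanly when the path is too long.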

Enough talk; into the main function.
The first if clearly handles psql's help and version options.

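A little further on comes an important variable: pset. Its type, PsqlSettings, is defined in src/bin/psql/settings.h. The structure holds the settings and state of the current psql session, and the program checks these fields to decide what to do next.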

typedef struct _psqlSettings
{
	PGconn	   *db;				/* connection to backend */
	int			encoding;		/* client_encoding */
	FILE	   *queryFout;		/* where to send the query results */
	bool		queryFoutPipe;	/* queryFout is from a popen() */
	FILE	   *copyStream;		/* Stream to read/write for \copy command */
	printQueryOpt popt;
	char	   *gfname;			/* one-shot file output argument for \g */
	char	   *gset_prefix;	/* one-shot prefix argument for \gset */
	bool		notty;			/* stdin or stdout is not a tty (as determined on startup) */							 
	enum trivalue getPassword;	/* prompt the user for a username and password */
	FILE	   *cur_cmd_source; /* describe the status of the current main  loop */								
	bool		cur_cmd_interactive;
	int			sversion;		/* backend server version */
	const char *progname;		/* in case you renamed psql */
	char	   *inputfile;		/* file being currently processed, if any */
	uint64		lineno;			/* also for error reporting */
	uint64		stmt_lineno;	/* line number inside the current statement */
	bool		timing;			/* enable timing of all queries */
	FILE	   *logfile;		/* session log file handle */
	VariableSpace vars;			/* "shell variable" repository */

	/*
	 * The remaining fields are set by assign hooks associated with entries in
	 * "vars".  They should not be set directly except by those hook
	 * functions.
	 */
	bool		autocommit;
	bool		on_error_stop;
	bool		quiet;
	bool		singleline;
	bool		singlestep;
	int			fetch_count;
	PSQL_ECHO	echo;
	PSQL_ECHO_HIDDEN echo_hidden;
	PSQL_ERROR_ROLLBACK on_error_rollback;
	PSQL_COMP_CASE comp_case;
	HistControl histcontrol;
	const char *prompt1;
	const char *prompt2;
	const char *prompt3;
	PGVerbosity verbosity;		/* current error verbosity level */
} PsqlSettings;

So what does main actually do? Suppose you type:

psql -U postgres -p 26500 -w

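Before reading the arguments that follow psql, the program initializes some environment settings. Once it has determined that the arguments are not ones like --version or --help, which print information and exit, it uses them to initialize pset, the _psqlSettings variable described above. It then handles password authentication (if a password was specified) and enters the MainLoop function at line 334. MainLoop is defined in mainloop.c in the same directory. Its main body is one big loop: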

read a query from the command line --> send it to the back end --> fetch the result from the back end.

It is worth mentioning that MainLoop maintains three buffers of type PQExpBuffer: query_buf, previous_buf and history_buf. The type is defined in src/interfaces/libpq/pqexpbuffer.h as follows:

typedef struct PQExpBufferData
{
	char	   *data;
	size_t		len;
	size_t		maxlen;
} PQExpBufferData;

typedef PQExpBufferData *PQExpBuffer;

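query_buf accumulates the query currently being entered. A psql command can span several lines (input continues until a terminating semicolon or a backslash command is seen), so each line read into char *line is appended to query_buf until the command is ready to send. previous_buf keeps the previously executed query, so that backslash commands such as \e still have something to work on while query_buf is empty. history_buf collects the earlier lines of a multi-line command until they are saved to the readline history.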
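The three fields data, len and maxlen are all a growable string buffer needs. Here is a minimal sketch of that idea, written from scratch rather than copied from pqexpbuffer.c: DemoBuf and the demo_buf_* functions are hypothetical names, and the initial size of 16 bytes and the doubling strategy are my own assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PQExpBufferData, with the same three fields. */
typedef struct DemoBuf
{
	char	   *data;			/* NUL-terminated contents */
	size_t		len;			/* bytes used, not counting the NUL */
	size_t		maxlen;			/* bytes allocated */
} DemoBuf;

/* Allocate an empty buffer.  Returns false if out of memory. */
static bool
demo_buf_init(DemoBuf *b)
{
	assert(b != NULL);
	b->len = 0;
	b->maxlen = 16;
	b->data = malloc(b->maxlen);
	if (b->data == NULL)
		return false;
	b->data[0] = '\0';
	return true;
}

/* Append s, doubling the allocation until it fits. */
static bool
demo_buf_append(DemoBuf *b, const char *s)
{
	size_t		add = strlen(s);
	size_t		need = b->len + add + 1;	/* +1 for the NUL */

	if (need > b->maxlen)
	{
		size_t		newmax = b->maxlen;
		char	   *p;

		while (newmax < need)
			newmax *= 2;
		p = realloc(b->data, newmax);
		if (p == NULL)
			return false;		/* old contents remain valid */
		b->data = p;
		b->maxlen = newmax;
	}
	memcpy(b->data + b->len, s, add + 1);
	b->len += add;
	return true;
}

/* Empty the buffer but keep its memory for the next query. */
static void
demo_buf_reset(DemoBuf *b)
{
	b->len = 0;
	b->data[0] = '\0';
}
```

A loop like MainLoop can then append each input line to one such buffer, hand data to the server when the command is complete, and reset the buffer instead of freeing it.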

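Once a line of input has been read, psql_scan_setup is called to start lexical analysis of that line, and then psql_scan is called, which returns the state of the statement. The possible return values are: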

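PSCAN_SEMICOLON     found a command-ending semicolon
PSCAN_BACKSLASH     found a backslash command
PSCAN_INCOMPLETE    reached end of line with the SQL statement incomplete
PSCAN_EOL           reached end of line, SQL possibly complete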

MainLoop uses this return value to manage the buffers, and sends a command to the back end for execution once it has been entered completely.
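To show how four status codes like these can drive such a loop, here is a toy line scanner. It is only a sketch and far simpler than psql's real flex-based lexer: it knows about single-quoted strings and parentheses and nothing else (no comments, no dollar quoting, no double quotes), and it stops at the first semicolon or backslash on a line. All DEMO_ and demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes mirroring the four values listed above. */
typedef enum
{
	DEMO_SEMICOLON,				/* command-ending semicolon found */
	DEMO_BACKSLASH,				/* backslash command found */
	DEMO_INCOMPLETE,			/* line ended inside a quote or parenthesis */
	DEMO_EOL					/* line ended, nothing left open */
} DemoScanResult;

/* Lexer state that has to survive from one input line to the next. */
typedef struct
{
	int			in_quote;		/* inside a '...' literal? */
	int			paren_depth;	/* number of unclosed '(' */
} DemoScanState;

/* Scan one line and report why scanning stopped. */
static DemoScanResult
demo_scan_line(DemoScanState *st, const char *line)
{
	size_t		i;

	assert(st != NULL && line != NULL);

	for (i = 0; line[i] != '\0'; i++)
	{
		char		c = line[i];

		if (st->in_quote)
		{
			/* everything up to the closing quote is literal text */
			if (c == '\'')
				st->in_quote = 0;
			continue;
		}

		switch (c)
		{
			case '\'':
				st->in_quote = 1;
				break;
			case '(':
				st->paren_depth++;
				break;
			case ')':
				if (st->paren_depth > 0)
					st->paren_depth--;
				break;
			case ';':
				/* a semicolon only ends the command outside parentheses */
				if (st->paren_depth == 0)
					return DEMO_SEMICOLON;
				break;
			case '\\':
				return DEMO_BACKSLASH;
			default:
				break;
		}
	}

	return (st->in_quote || st->paren_depth > 0) ? DEMO_INCOMPLETE : DEMO_EOL;
}
```

A caller would keep appending lines to its query buffer while the result is DEMO_INCOMPLETE or DEMO_EOL, send the buffer on DEMO_SEMICOLON, and hand the rest of the line to a backslash-command handler on DEMO_BACKSLASH.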

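After lexical analysis, SendQuery(const char *query) is called to execute the command. This function takes care of details such as a missing database connection and transaction handling. It then calls results = PQexec(pset.db, query) to obtain the result returned by the database back end. After the command has run, ProcessCopyResult(results) is used to display the result on screen.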

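When the back end finishes a query, it notifies the front end that it is idle, and the front end can then send a new query. The data structure that holds what the back end returns is shown below; the front end displays results according to it. One point worth noting: although the output in a psql session could be treated as nothing more than a string, with no need to tell the individual fields of a row apart, the protocol psql uses to talk to the back end is the same one every front end (GUI clients included) uses. psql simply converts the result to a text representation for display.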

struct pg_result
{
	int			ntups;
	int			numAttributes;
	PGresAttDesc *attDescs;
	PGresAttValue **tuples;		/* each PGresTuple is an array of
								 * PGresAttValue's */
	int			tupArrSize;		/* allocated size of tuples array */
	int			numParameters;
	PGresParamDesc *paramDescs;
	ExecStatusType resultStatus;
	char		cmdStatus[CMDSTATUS_LEN];		/* cmd status from the query */
	int			binary;			/* binary tuple values if binary == 1,
								 * otherwise text */
	/*
	 * These fields are copied from the originating PGconn, so that operations
	 * on the PGresult don't have to reference the PGconn.
	 */
	PGNoticeHooks noticeHooks;
	PGEvent    *events;
	int			nEvents;
	int			client_encoding;	/* encoding id */
	/*
	 * Error information (all NULL if not an error result).  errMsg is the
	 * "overall" error message returned by PQresultErrorMessage.  If we have
	 * per-field info then it is stored in a linked list.
	 */
	char	   *errMsg;			/* error message, or NULL if no error */
	PGMessageField *errFields;	/* message broken into fields */
	/* All NULL attributes in the query result point to this null string */
	char		null_field[1];
	/*
	 * Space management information.  Note that attDescs and error stuff, if
	 * not null, point into allocated blocks.  But tuples points to a
	 * separately malloc'd block, so that we can realloc it.
	 */
	PGresult_data *curBlock;	/* most recently allocated block */
	int			curOffset;		/* start offset of free space in block */
	int			spaceLeft;		/* number of free bytes remaining in block */
};

This structure is defined in src/interfaces/libpq/libpq-int.h.
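To illustrate the "structured result in, plain text out" step, here is a small sketch that turns a row-and-column result into '|'-separated lines, similar in spirit to psql's unaligned output. It does not touch the real pg_result: DemoResult is a hypothetical, much reduced stand-in that borrows only the ntups and numAttributes field names, and the formatting rules (empty string for NULL, one line per row) are my own simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct pg_result. */
typedef struct DemoResult
{
	int			ntups;			/* number of rows */
	int			numAttributes;	/* number of columns */
	const char **attNames;		/* numAttributes column names */
	const char **values;		/* ntups * numAttributes cells, row by row;
								 * a NULL pointer stands for SQL NULL */
} DemoResult;

/* Append s to out if it fits; returns 0 on success, -1 if out is too small. */
static int
demo_append(char *out, size_t outlen, size_t *used, const char *s)
{
	size_t		n = strlen(s);

	if (*used + n + 1 > outlen)
		return -1;
	memcpy(out + *used, s, n + 1);
	*used += n;
	return 0;
}

/*
 * Render res as a header line followed by one line per row.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int
demo_format_result(const DemoResult *res, char *out, size_t outlen)
{
	size_t		used = 0;
	int			row,
				col;

	assert(res != NULL && out != NULL && outlen > 0);
	out[0] = '\0';

	/* row -1 is the header; the remaining rows come from values[] */
	for (row = -1; row < res->ntups; row++)
	{
		for (col = 0; col < res->numAttributes; col++)
		{
			const char *cell = (row < 0)
				? res->attNames[col]
				: res->values[row * res->numAttributes + col];

			if (col > 0 && demo_append(out, outlen, &used, "|") < 0)
				return -1;
			if (demo_append(out, outlen, &used, cell ? cell : "") < 0)
				return -1;
		}
		if (demo_append(out, outlen, &used, "\n") < 0)
			return -1;
	}
	return (int) used;
}
```

A GUI client could take the same structure and fill a grid widget instead; only this last formatting step is specific to a text front end.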

In short, that's the front end. Its logic is much simpler than the back end's query processing.

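This is my first attempt at writing up source code, so my thinking is a bit scattered and the writing is rather disorganized; I mostly wrote things down as they came to mind. Clearly I need more writing practice. I also feel I pasted a bit too much source this time and will try to watch that next time. That's it for now. At least I've made a start; I'll get up early tomorrow and revise it.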

I hope I can keep this up.
