TinyHTTPd Source Code Analysis

Reposted article, 2017-05-02 14:49, category: C/C++

TinyHTTPd

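TinyHTTPd is an ultra-lightweight HTTP server written in C. The code is just over 500 lines. It is not meant for production use, only for learning. Reading it is a good way to get a first understanding of what a web server really does.

Home page: http://tinyhttpd.sourceforge.net/

Annotated source: https://github.com/tw1996/TinyHTTPd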

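The HTTP Protocol

Before reading the source, we need a basic understanding of HTTP. Put simply, HTTP defines the format in which a client and a server talk to each other. It is built on top of TCP and uses port 80 by default. It is not concerned with how packets are transported; it only specifies the format of the messages. HTTP has no connection setup of its own: once the TCP connection is established, data can be sent right away, and retransmission of lost packets is handled by TCP. The main HTTP versions are introduced briefly below.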

HTTP/0.9

After the TCP connection is established, the client can only make GET requests:

GET /index.html

The server can only respond with a string of HTML:

<html>
  <body>Hello World</body>
</html>

Once the response has been sent, the TCP connection is closed immediately.

HTTP/1.0

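Compared with HTTP/0.9, it adds many new features. Content of any format can be transferred, including text, binary data, files, audio and so on. The GET, POST and HEAD commands are supported.

The message format also changed: header information was added. Other additions include status codes, multiple character sets, multi-part types, authorization, caching and content encoding. An HTTP message therefore has three parts: the start line, the header lines and the entity body. The header lines and the entity body are separated by an empty line, and the start line and each header line end with \r\n. For example: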

Request

GET / HTTP/1.0    // request line
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)    // request header
Accept: */*    // request header

Response

HTTP/1.0 200 OK    // status line
Content-Type: text/plain    // response header
Content-Length: 137582
Expires: Thu, 05 Dec 1997 16:00:00 GMT
Last-Modified: Wed, 5 August 1996 15:55:28 GMT
Server: Apache 0.84

<html>    // response body
  <body>Hello World</body>
</html>
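
The boundary between the header lines and the entity body can be located with a few lines of C. The helper below is a hypothetical illustration (the name http_body_start is not from TinyHTTPd) and assumes the whole message is held in one NUL-terminated string, which only works for text bodies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of TinyHTTPd.
 * Returns a pointer to the first byte of the entity body, or NULL if the
 * empty line that ends the header block has not been received yet. */
const char *http_body_start(const char *msg)
{
    /* The last header line ends with \r\n and the empty line adds another
     * \r\n, so the header block ends at the first \r\n\r\n. */
    const char *sep = strstr(msg, "\r\n\r\n");
    return sep != NULL ? sep + 4 : NULL;
}
```

A binary body can contain zero bytes, so a real server would track lengths explicitly instead of relying on string functions.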

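HTTP/1.0 requires the header information to be ASCII, while the data that follows can be in any format; the Content-Type field states which format it is. Some common Content-Type values are listed below.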

text/plain
text/html
text/css
image/jpeg
image/png
image/svg+xml
audio/mp4
video/mp4
application/javascript
application/pdf
application/zip
application/atom+xml
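
A server usually picks the Content-Type from the file extension. The lookup below is a minimal sketch of that idea; the table, the function name content_type_for and the application/octet-stream fallback are assumptions made for illustration and are not taken from the TinyHTTPd source.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical extension table covering some of the types listed above. */
static const struct { const char *ext; const char *type; } mime_table[] = {
    { ".txt",  "text/plain" },
    { ".html", "text/html" },
    { ".css",  "text/css" },
    { ".jpg",  "image/jpeg" },
    { ".png",  "image/png" },
    { ".svg",  "image/svg+xml" },
    { ".js",   "application/javascript" },
    { ".pdf",  "application/pdf" },
    { ".zip",  "application/zip" },
};

/* Returns the Content-Type for path based on its last ".ext" suffix,
 * compared case-insensitively; unknown suffixes get a generic binary type. */
const char *content_type_for(const char *path)
{
    const char *dot = strrchr(path, '.');
    size_t i;

    if (dot != NULL) {
        for (i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++) {
            if (strcasecmp(dot, mime_table[i].ext) == 0)
                return mime_table[i].type;
        }
    }
    return "application/octet-stream";
}
```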

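To improve efficiency, some browsers used a non-standard field, Connection: keep-alive, which keeps one TCP connection open and sends multiple HTTP requests over it until the client or the server actively closes it.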

HTTP/1.1

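Currently the most widely used version of HTTP. TCP connections are reused by default, so Connection: keep-alive no longer needs to be set explicitly; the client sends Connection: close with its last request to have the connection closed.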
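
The difference between the HTTP/1.0 and HTTP/1.1 defaults can be written as one small decision function. This is a simplified sketch: should_keep_alive is a hypothetical name, and it assumes the Connection header carries a single token, whereas a real header may hold a comma-separated list.

```c
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper.
 * minor:      0 for HTTP/1.0, 1 for HTTP/1.1
 * connection: value of the Connection header, or NULL if it is absent
 * Returns 1 if the TCP connection should stay open after the response. */
int should_keep_alive(int minor, const char *connection)
{
    if (connection != NULL) {
        if (strcasecmp(connection, "close") == 0)
            return 0;
        if (strcasecmp(connection, "keep-alive") == 0)
            return 1;
    }
    /* No explicit request: HTTP/1.1 keeps the connection, HTTP/1.0 closes it. */
    return minor >= 1;
}
```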

Several methods were added, including PUT, PATCH, OPTIONS and DELETE.

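Pipelining was introduced. Previously the client sent one request and waited for the response before sending the next. Now it can send several requests back to back without waiting, but the server still responds in order. The Content-Length field tells the client where one response ends and the next begins.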
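
To make the role of Content-Length concrete, the sketch below computes where the first message in a buffer of back-to-back responses ends. The function name first_message_size is hypothetical, and the code assumes text bodies in a NUL-terminated buffer and treats a missing Content-Length as an empty body.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper.
 * Returns the size in bytes of the first message in buf (header block,
 * empty line, and Content-Length bytes of body), or -1 if the header
 * block is not complete yet. */
long first_message_size(const char *buf)
{
    const char *end = strstr(buf, "\r\n\r\n");   /* end of the header block */
    const char *line = buf;
    long body = 0;

    if (end == NULL)
        return -1;

    /* Walk the start line and header lines one at a time. */
    while (line < end) {
        if (strncasecmp(line, "Content-Length:", 15) == 0)
            body = strtol(line + 15, NULL, 10);  /* strtol skips the space */
        /* A \r\n always exists at or before end, so strstr cannot fail. */
        line = strstr(line, "\r\n") + 2;
    }
    return (long)(end - buf) + 4 + body;
}
```

The next pipelined response, if any, starts at buf plus the returned size.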

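Because responses come back in order, a slow one holds up everything behind it (head-of-line blocking). There are only two ways to avoid this: send less data, or open several persistent connections at the same time.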

HTTP/2

Not covered here.

CGI and FastCGI

See this article: http://www.php-internals.com/book/?p=chapt02/02-02-03-fastcgi

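Workflow

1. The server starts. If no port is specified, a port is chosen dynamically. A socket is created to listen for client connections.
2. accept() blocks until a client connects; once one does, a new thread is created to handle that connection.
3. accept_request() does the main work for a connection. It first parses the HTTP request. Only GET and POST are supported; anything else gets an HTTP 501 error. Request parameters, if any, are recorded in query_string. The requested path is recorded in path; if a directory is requested, the index.html file in that directory is served.
4. Finally the request type is determined. For a static request, the file is read and sent to the client directly. For a dynamic request, the server calls fork() to create a child process, and the child calls one of the exec() functions to run the CGI script. The parent then reads the child's output; parent and child communicate through pipes.
5. The parent waits for the child to exit, closes the connection, and the HTTP request is complete.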
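
The request-line handling described above (method, path under htdocs, query string, index.html for a trailing slash) can be sketched in isolation. This is a simplified stand-in written for this walkthrough, not the project's code: struct request_line and parse_request_line are hypothetical names, and sscanf replaces the manual character loops used in the real function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parsed pieces of a request line. */
struct request_line {
    char method[16];
    char path[512];
    char query[256];
};

/* Parses a line such as "GET /a/b.cgi?x=1 HTTP/1.0".
 * Returns 0 on success, -1 if method or URL is missing. */
int parse_request_line(const char *line, struct request_line *out)
{
    char url[256];
    char *q;

    if (sscanf(line, "%15s %255s", out->method, url) != 2)
        return -1;

    /* Split the URL at the first '?' into path part and query string. */
    out->query[0] = '\0';
    q = strchr(url, '?');
    if (q != NULL) {
        *q = '\0';
        strcpy(out->query, q + 1);          /* fits: query is as large as url */
    }

    /* Map the URL onto the htdocs document root. */
    snprintf(out->path, sizeof(out->path), "htdocs%s", url);
    if (out->path[strlen(out->path) - 1] == '/')
        strcat(out->path, "index.html");    /* fits: 6 + 255 + 10 < 512 */
    return 0;
}
```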

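Source Code Analysis

Start with the program entry point. It creates a socket, binds it to a sockaddr_in structure, and then calls listen to wait for connection requests on that socket. These steps are all implemented in startup().

The server then accepts client requests with accept(). If there is no pending request, accept() blocks; when a request arrives, a new thread is created to handle it.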

int main(void)
{
    /* Socket-related state */
    int server_sock = -1;
    u_short port = 4000;
    int client_sock = -1;
    struct sockaddr_in client_name;
    socklen_t  client_name_len = sizeof(client_name);
    pthread_t newthread;

    server_sock = startup(&port);
    printf("httpd running on port %d\n", port);

    while (1)
    {
        /* accept() blocks until a client connects */
        client_sock = accept(server_sock,
                (struct sockaddr *)&client_name,
                &client_name_len);
        if (client_sock == -1)
            error_die("accept");
        /* accept_request(&client_sock); */
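        /* Handle the client in a new thread.
         * NOTE: passing &client_sock is racy: the next accept() can overwrite
         * client_sock before the new thread has copied it. */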
        if (pthread_create(&newthread , NULL, accept_request, (void *)&client_sock) != 0)
            perror("pthread_create");
    }

    close(server_sock);

    return(0);
}
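
startup() itself is not listed in this article. The sketch below shows what the socket, bind and listen sequence described above could look like; it is written for this walkthrough under stated assumptions rather than copied from the project. The name startup_sketch, the backlog of 5, the SO_REUSEADDR option and returning -1 on error (instead of exiting) are all choices made here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for startup().
 * Returns a listening socket, or -1 on error. If *port is 0 the kernel
 * picks a free port, which is written back through the pointer. */
int startup_sketch(unsigned short *port)
{
    struct sockaddr_in name;
    socklen_t namelen = sizeof(name);
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    /* Allow quick restarts on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons(*port);
    name.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&name, sizeof(name)) < 0)
        goto fail;

    if (*port == 0) {
        /* Ask the kernel which port it assigned. */
        if (getsockname(fd, (struct sockaddr *)&name, &namelen) == -1)
            goto fail;
        *port = ntohs(name.sin_port);
    }
    if (listen(fd, 5) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```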

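accept_request() handles the client request and does basic error handling. Its main job is to decide whether the request is static or dynamic: a static request is served by reading the file and sending it to the client, while a dynamic request is handed to execute_cgi().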

/**********************************************************************/
/* A request has caused a call to accept() on the server port to
 * return.  Process the request appropriately.
 * Parameters: the socket connected to the client
 * Handles one client connection.
 * */
/**********************************************************************/
void *accept_request(void *arg)
{
    int client = *(int*)arg;
    char buf[1024];
    size_t numchars;
    char method[255];
    char url[255];
    char path[512];
    size_t i, j;
    struct stat st;
    int cgi = 0;      /* becomes true if server decides this is a CGI
                       * program */
    char *query_string = NULL;
    
    /* Read the request line; returns the number of bytes read, e.g. "GET /index.html HTTP/1.1" */
    numchars = get_line(client, buf, sizeof(buf));
    /* debug */
    //printf("%s", buf);

    /* Extract the request method (GET or POST) into method */
    i = 0; j = 0;
    while (!ISspace(buf[i]) && (i < sizeof(method) - 1))
    {
        method[i] = buf[i];
        i++;
    }
    j=i;
    method[i] = '\0';

    /* Only GET and POST are supported */
    if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
    {
        unimplemented(client);
        return NULL;
    }

    /* POST requests are always handled by CGI */
    if (strcasecmp(method, "POST") == 0)
        cgi = 1;

    i = 0;
    while (ISspace(buf[j]) && (j < numchars))
        j++;
    while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < numchars))
    {
        url[i] = buf[j];
        i++; j++;
    }
    /* url now holds the request URL, including any query parameters */
    url[i] = '\0';

    //printf("%s\n", url);

    if (strcasecmp(method, "GET") == 0)
    {
        /* query_string holds the request parameters: for index.php?r=param it is the r=param after the '?' */
        query_string = url;
        while ((*query_string != '?') && (*query_string != '\0'))
            query_string++;
        /* A '?' marks a dynamic request, so enable CGI */
        if (*query_string == '?')
        {
            cgi = 1;
            *query_string = '\0';
            query_string++;
        }
    }

//    printf("%s\n", query_string);

    /* The document root is htdocs; a path ending in '/' defaults to index.html in that directory */
    sprintf(path, "htdocs%s", url);
    if (path[strlen(path) - 1] == '/')
        strcat(path, "index.html");

    //printf("%s\n", path);
    /* Look up the file; its metadata is stored in st */
    if (stat(path, &st) == -1) {
        /* File not found: read and discard the remaining request headers */
        while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
            numchars = get_line(client, buf, sizeof(buf));
        /* 404 Not Found */
        not_found(client);
    }
    else
    {

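        // If the path is a directory, serve its index.html
        if ((st.st_mode & S_IFMT) == S_IFDIR)
            strcat(path, "/index.html");        
        // If the file is executable by anyone, treat it as a CGI program
        // NOTE: after the strcat above, st still describes the directory, not index.html
        if ((st.st_mode & S_IXUSR) ||
                (st.st_mode & S_IXGRP) ||
                (st.st_mode & S_IXOTH)    )
            cgi = 1;
        if (!cgi)
            /* Static page request */
            serve_file(client, path);
        else
            /* Run the CGI program */
            execute_cgi(client, path, method, query_string);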
    }

    close(client);
    return NULL;
}
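
accept_request() relies on get_line(), which is not listed here. Judging from how it is called (the loops compare the buffer with "\n" and stop when the return value is 0), it returns one line at a time with the line ending normalized to '\n'. The function below is a sketch of a line reader with that behavior; the name read_line_sketch is hypothetical and the project's own get_line may differ in detail.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical line reader.
 * Reads one line from sock into buf (at most size - 1 bytes), converting
 * a trailing \r\n or lone \r into \n. Returns the number of bytes stored,
 * or 0 if the peer closed the connection before sending anything. */
int read_line_sketch(int sock, char *buf, int size)
{
    int i = 0;
    char c = '\0';
    ssize_t n;

    while ((i < size - 1) && (c != '\n')) {
        n = recv(sock, &c, 1, 0);
        if (n <= 0)
            break;                          /* peer closed or error: stop */
        if (c == '\r') {
            /* Peek at the next byte; consume it if it is the \n of a CRLF pair. */
            n = recv(sock, &c, 1, MSG_PEEK);
            if ((n > 0) && (c == '\n'))
                recv(sock, &c, 1, 0);
            else
                c = '\n';
        }
        buf[i++] = c;
    }
    buf[i] = '\0';
    return i;
}
```

Reading one byte per recv() call is slow but keeps the logic easy to follow, which matches the learning purpose of the project.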


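The next function is the key part. The idea is as follows:

Call fork() to create a CGI child process and have the child call one of the exec functions to run the request. The parent reads the result from the child and sends it to the client.

Parent and child communicate through anonymous pipes. CGI programs use standard input and standard output, so to reach them we redirect them to pipes: stdin is redirected to the cgi_input pipe and stdout to the cgi_output pipe.

The parent closes the read end of cgi_input and the write end of cgi_output; the child closes the write end of cgi_input and the read end of cgi_output.

Data flow: cgi_input[1] (parent) -----> cgi_input[0] (child) [run the CGI program] -----> stdin -----> stdout -----> cgi_output[1] (child) -----> cgi_output[0] (parent) [send the result to the client]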
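
The two-pipe arrangement can be tried out without any HTTP code. The sketch below feeds a string to a child program's standard input and collects its standard output, following the same data flow. It is a hypothetical example (the function run_filter and the pipe names to_child and from_child are not from TinyHTTPd) and assumes small inputs, since the parent writes everything before it starts reading.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo of the two-pipe pattern.
 * Runs prog, sends input to its stdin, stores up to outsz - 1 bytes of
 * its stdout in out. Returns the number of bytes stored, or -1 on error. */
ssize_t run_filter(const char *prog, const char *input, char *out, size_t outsz)
{
    int to_child[2];     /* parent writes [1], child reads [0]  */
    int from_child[2];   /* child writes [1], parent reads [0]  */
    pid_t pid;
    ssize_t n;
    size_t total = 0;

    if (pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout, then become prog. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp(prog, prog, (char *)NULL);
        _exit(127);                       /* only reached if exec failed */
    }
    /* Parent: close the ends that belong to the child. */
    close(to_child[0]);
    close(from_child[1]);
    n = write(to_child[1], input, strlen(input));
    close(to_child[1]);                   /* child sees EOF on stdin */
    if (n < 0) {
        close(from_child[0]);
        waitpid(pid, NULL, 0);
        return -1;
    }
    while (total < outsz - 1 &&
           (n = read(from_child[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Calling run_filter("cat", "hello", out, sizeof(out)) on a system where cat is on the PATH should copy the input back unchanged. Closing the unused pipe ends matters: if the parent kept to_child[1] open, the child would never see end-of-file on its standard input.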

/**********************************************************************/
/* Execute a CGI script.  Will need to set environment variables as
 * appropriate.
 * Parameters: client socket descriptor
 *             path to the CGI script */
/**********************************************************************/
void execute_cgi(int client, const char *path,
        const char *method, const char *query_string)
{
    char buf[1024];
    int cgi_output[2];
    int cgi_input[2];
    pid_t pid;
    int status;
    int i;
    char c;
    int numchars = 1;
    int content_length = -1;

    buf[0] = 'A'; buf[1] = '\0';
    if (strcasecmp(method, "GET") == 0)
            /* Read and discard the HTTP request headers */
        while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
            numchars = get_line(client, buf, sizeof(buf));
    else if (strcasecmp(method, "POST") == 0) /*POST*/
    {
        numchars = get_line(client, buf, sizeof(buf));
        while ((numchars > 0) && strcmp("\n", buf))
        {
            buf[15] = '\0';
            /* Extract the length of the request body */
            if (strcasecmp(buf, "Content-Length:") == 0)
                content_length = atoi(&(buf[16]));
            numchars = get_line(client, buf, sizeof(buf));
        }
        if (content_length == -1) {
            bad_request(client);
            return;
        }
    }
    else/*HEAD or other*/
    {
    }


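    /* 
     * Create two pipes for parent/child communication. CGI uses standard input and standard output.
     * To reach them, redirect stdin to the cgi_input pipe and stdout to the cgi_output pipe.
     * Why two pipes? A pipe carries data in one direction only: one end is for reading, the other for writing. We have two streams, standard input and standard output, so we need two pipes.
     * */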
    if (pipe(cgi_output) < 0) {
        cannot_execute(client);
        return;
    }
    if (pipe(cgi_input) < 0) {
        cannot_execute(client);
        return;
    }

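    /* Fork a child to run the CGI program; its standard output comes back to the parent through the pipe, and the parent forwards it to the client. */
    if ( (pid = fork()) < 0 ) {
        cannot_execute(client);
        return;
    }
    /* Send the 200 OK status line from the parent only; without this check the child would send a second copy */
    if (pid > 0) {
        sprintf(buf, "HTTP/1.0 200 OK\r\n");
        send(client, buf, strlen(buf), 0);
    }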

    /* The child process runs the CGI script */
    if (pid == 0)  /* child: CGI script */
    {
        char meth_env[255];
        char query_env[255];
        char length_env[255];
        
        dup2(cgi_output[1], STDOUT);    // redirect standard output to the write end of cgi_output
        dup2(cgi_input[0], STDIN);        // redirect standard input to the read end of cgi_input
        close(cgi_output[0]);            // close the read end of cgi_output
        close(cgi_input[1]);            // close the write end of cgi_input
        
        /* Add the CGI variables to the child's environment */
        snprintf(meth_env, sizeof(meth_env), "REQUEST_METHOD=%s", method);
        putenv(meth_env);
        if (strcasecmp(method, "GET") == 0) {
            // set the QUERY_STRING environment variable
            snprintf(query_env, sizeof(query_env), "QUERY_STRING=%s", query_string);
            putenv(query_env);
        }
        else {   /* POST */
            snprintf(length_env, sizeof(length_env), "CONTENT_LENGTH=%d", content_length);
            putenv(length_env);
        }
        // Finally, the child replaces itself with the external script via exec
        execl(path, path, (char *)NULL);
        exit(0);
    } else {    /* parent */
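        /* The parent closes the write end of cgi_output and the read end of cgi_input */
        close(cgi_output[1]);
        close(cgi_input[0]);
        /* For POST, keep reading the request body and write it to the cgi_input pipe, from which the child reads it */
        if (strcasecmp(method, "POST") == 0)
            for (i = 0; i < content_length; i++) {
                if (recv(client, &c, 1, 0) <= 0)   /* stop if the client closed early */
                    break;
                write(cgi_input[1], &c, 1);
            }
        /* Read the child's output from the cgi_output pipe and send it to the client */
        while (read(cgi_output[0], &c, 1) > 0)
            send(client, &c, 1, 0);
        /* Close the pipes */
        close(cgi_output[0]);
        close(cgi_input[1]);
        /* Wait for the child to exit */
        waitpid(pid, &status, 0);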
    }
}
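
On the other side of the pipes sits the CGI program. Since execute_cgi() sends only the status line, the program has to write the remaining headers, an empty line, and then the body to its standard output. The sketch below builds such a response from the environment variables set above; the function name cgi_response and the text/plain output format are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical CGI-side helper.
 * Formats the headers, the empty line and a one-line body into out.
 * Returns the value returned by snprintf. */
int cgi_response(char *out, size_t outsz)
{
    /* These variables are set by the server before exec. */
    const char *method = getenv("REQUEST_METHOD");
    const char *query = getenv("QUERY_STRING");

    return snprintf(out, outsz,
                    "Content-Type: text/plain\r\n"
                    "\r\n"
                    "method=%s query=%s\n",
                    method != NULL ? method : "",
                    query != NULL ? query : "");
}
```

A real CGI program would call this from main() and write the buffer to standard output, for example with fputs(out, stdout). The file also needs its execute permission set, because accept_request() uses the execute bits to recognize CGI programs.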
