TCP/IP Network Programming: Implementing a Multithreaded Server (Part 1)

Reposted article · 2018-09-26 21:36 · Category: C / TCP/IP Network Programming

Why threads were introduced

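To let a server handle client requests concurrently, we have so far covered the multi-process model, select, and epoll. Each of the three has its strengths and weaknesses. Creating (copying) a process places a heavy load on the operating system. In addition, every process has its own independent memory space, which makes inter-process communication harder to implement. Switching between processes is expensive as well.

What is a process switch? A computer with a single CPU can still run several processes at what looks like the same time, because the system divides CPU time into small slices and hands them out to the processes in turn. Suppose process B runs after process A. When the CPU time given to A runs out, the kernel has to save A's execution context (registers, program counter, and so on), load B's context, and switch to B's address space; if physical memory is short, parts of A's memory may also have to be swapped out to disk. A context switch therefore takes a comparatively long time, and there is a limit to how much optimization can speed it up.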

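Threads were introduced to keep the advantages of multiple processes while overcoming some of their drawbacks. A thread is a kind of "lightweight process" designed to reduce the disadvantages of processes as far as possible. Compared with processes, threads have the following advantages:

Creating a thread and switching context between threads is faster than creating a process and switching context between processes
Exchanging data between threads requires no special technique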

Differences between threads and processes

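The memory space of each process consists of a "data area" that holds global variables, a heap that supplies memory to dynamic allocation functions such as malloc, and a stack used while functions are running. Every process has its own independent copy of this layout. The structure for multiple processes is shown in Figure 1-1.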

Figure 1-1: Independent memory of each process

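However, if the main goal is simply to obtain multiple execution flows of code, there is no need to separate the memory layout completely as in Figure 1-1. Separating only the stack area is enough, and doing so brings the following advantages:

The data area and the heap do not need to be switched during a context switch
The data area and the heap can be used to exchange data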

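This is exactly what a thread is. To maintain multiple execution flows, threads separate only the stack area, which gives the memory structure shown in Figure 1-2.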

Figure 1-2: Memory structure of threads

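As Figure 1-2 shows, multiple threads share the data area and the heap. To maintain this structure, threads are created and run inside a process. In other words, processes and threads can be defined as follows:

Process: the unit that forms a separate execution flow within the operating system
Thread: the unit that forms a separate execution flow within a process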

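If processes create multiple execution flows inside the operating system, threads create multiple execution flows inside a single process. The relationship among the operating system, processes, and threads can therefore be pictured as in Figure 1-3.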

Figure 1-3: Relationship among the operating system, processes, and threads

Creating and running threads

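A thread has its own execution flow, so it needs its own "main" function, defined separately, and the operating system must be asked to run that function in a separate execution flow. The function that does this is: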

#include <pthread.h>
int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void *), void *restrict arg);
// Returns 0 on success, a nonzero error number on failure

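thread: address of the variable that receives the ID of the newly created thread. Like processes, threads need IDs to tell them apart
attr: thread attributes. Passing NULL creates a thread with the default attributes
start_routine: address (function pointer) of the function that acts as the thread's main function and runs in the separate execution flow
arg: address of the variable holding the argument that is passed to the function given as the third parameter when it is called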

Let us look at an example.

thread1.c

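#include <stdio.h>
#include <unistd.h>   /* sleep() */
#include <pthread.h>

void *thread_main(void *arg);

int main(int argc, char *argv[])
{
    pthread_t t_id;
    int thread_param = 5;

    /* Run thread_main in a separate execution flow, passing &thread_param. */
    if (pthread_create(&t_id, NULL, thread_main, (void *)&thread_param) != 0)
    {
        puts("pthread_create() error");
        return -1;
    }
    /* Delay the end of the process so the thread has time to finish. */
    sleep(10);
    puts("end of main");
    return 0;
}

void *thread_main(void *arg)
{
    int i;
    int cnt = *((int *)arg);   /* arg is the fourth argument of pthread_create */

    for (i = 0; i < cnt; i++)
    {
        sleep(1);
        puts("running thread");
    }
    return NULL;
}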

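The pthread_create call in main: requests the creation of a thread whose separate execution flow starts with a call to thread_main, and passes the address of thread_param to thread_main as its argument
The sleep(10) call in main: pauses main for 10 seconds in order to delay the termination of the process. Once main executes its return statement the process terminates, and the threads created inside it terminate with it, so this call is there to let the thread run to completion
The arg parameter of thread_main: receives the fourth argument that was passed to pthread_create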

Compile thread1.c and run it

# gcc thread1.c -o thread1 -lpthread
# ./thread1 
running thread
running thread
running thread
running thread
running thread
end of main

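As the compile command above shows, code that uses threads is built with the -lpthread option, which tells the linker to link the thread library so that the functions declared in pthread.h can be resolved. (With GCC, the -pthread option is generally the recommended way to do this, because it sets the necessary compile flags as well as the link flags.) The execution flow of the program is shown in Figure 1-4.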

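Figure 1-4: Execution flow of the example thread1.c

In Figure 1-4 the dashed lines show how execution proceeds: the downward arrows are execution flows and the horizontal arrows are function calls.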

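Next, try changing the sleep(10) call in main to sleep(2). When you run the program again, "running thread" is no longer printed five times, because the whole process is destroyed as soon as main returns, as shown in Figure 1-5.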

Figure 1-5: Termination of the process and its thread

This is why the earlier example called sleep: to give the thread enough time to finish.

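So if we want the program to end only after the thread has finished, do we have to call sleep? If so, another question arises: when does the thread finish? Not every program has a thread whose running time is as predictable as in thread1.c. Passing a very large value to sleep would make sure the thread can finish, but the program would then keep sleeping long after the thread is done, which wastes time and resources. The way out of this dilemma is the pthread_join function.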

#include <pthread.h>
int pthread_join(pthread_t thread, void **status);
// Returns 0 on success, a nonzero error number on failure

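thread: ID of the thread to wait for. pthread_join returns only after that thread has terminated; in other words, the calling thread blocks until the thread identified by this ID has finished
status: address of the pointer variable that receives the return value of the thread's main function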

thread2.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* sleep() */
#include <pthread.h>

void *thread_main(void *arg);

int main(int argc, char *argv[])
{
    pthread_t t_id;
    int thread_param = 5;
    void *thr_ret;   /* receives the value returned by thread_main */

    if (pthread_create(&t_id, NULL, thread_main, (void *)&thread_param) != 0)
    {
        puts("pthread_create() error");
        return -1;
    }

    /* Block until the thread identified by t_id has terminated. */
    if (pthread_join(t_id, &thr_ret) != 0)
    {
        puts("pthread_join() error");
        return -1;
    }

    if (thr_ret != NULL)
        printf("Thread return message: %s \n", (char *)thr_ret);
    free(thr_ret);   /* the buffer was allocated inside thread_main */
    return 0;
}

void *thread_main(void *arg)
{
    int i;
    int cnt = *((int *)arg);
    char *msg = (char *)malloc(sizeof(char) * 50);

    if (msg == NULL)   /* allocation failed: nothing to return */
        return NULL;
    strcpy(msg, "Hello, I'am thread~ \n");

    for (i = 0; i < cnt; i++)
    {
        sleep(1);
        puts("running thread");
    }
    return (void *)msg;   /* heap memory outlives the thread */
}

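The pthread_join call in main: it is applied to the thread created by the preceding pthread_create call, so main waits until the thread whose ID is stored in t_id has terminated
The variable thr_ret, the pthread_join call, and the return statement of thread_main: the value returned by thread_main is stored in thr_ret, whose address is the second argument of pthread_join. Note that this return value is the address of memory allocated dynamically inside thread_main, which is why main frees it after use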

Compile thread2.c and run it

# gcc thread2.c -o thread2 -lpthread
# ./thread2 
running thread
running thread
running thread
running thread
running thread
Thread return message: Hello, I'am thread~

The execution flow of thread2.c is shown in Figure 1-6.

Figure 1-6: Calling the pthread_join function

Functions that can be called in a critical section

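The previous examples created only one thread; the following ones create several. The way a thread is created is the same no matter how many there are. What does need attention is the problems that can arise when multiple threads call (execute) the same function at the same time. Such functions contain a critical section, that is, a piece of code that may cause problems when several threads execute it simultaneously. Depending on whether the critical section causes problems, functions fall into two categories:

Thread-safe functions
Thread-unsafe functions

A thread-safe function can be called by multiple threads at the same time without problems. A thread-unsafe function, by contrast, may cause problems when it is called that way.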

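The next example computes the sum of 1 to 10. The calculation is not done in main; instead, two threads are created, one summing 1 to 5 and the other summing 6 to 10, and main only prints the result. This style of programming is called the "worker thread model": the threads that sum 1 to 5 and 6 to 10 are workers managed by the main thread. Before the code, the execution flow of the program is shown in Figure 1-7.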

Figure 1-7: Execution flow of the example thread3.c

thread3.c

#include <stdio.h>
#include <pthread.h>
void *thread_summation(void *arg);
int sum = 0;

int main(int argc, char *argv[])
{
    pthread_t id_t1, id_t2;
    int range1[] = {1, 5};
    int range2[] = {6, 10};

    pthread_create(&id_t1, NULL, thread_summation, (void *)range1);
    pthread_create(&id_t2, NULL, thread_summation, (void *)range2);

    pthread_join(id_t1, NULL);
    pthread_join(id_t2, NULL);
    printf("result: %d \n", sum);
    return 0;
}

void *thread_summation(void *arg)
{
    int start = ((int *)arg)[0];
    int end = ((int *)arg)[1];

    while (start <= end)
    {
        sum += start;
        start++;
    }
    return NULL;
}

Note that both threads access the global variable sum directly.

Compile thread3.c and run it

# gcc thread3.c -o thread3 -lpthread
# ./thread3 
result: 55

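The result, 55, is correct, but the example itself has a problem related to critical sections. The next example is similar to this one, except that it makes the critical-section error much more likely to show up.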

thread4.c

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#define NUM_THREAD 100

void *thread_inc(void *arg);
void *thread_des(void *arg);
long long num = 0;

int main(int argc, char *argv[])
{
    pthread_t thread_id[NUM_THREAD];
    int i;

    printf("sizeof long long: %zu \n", sizeof(long long));
    for (i = 0; i < NUM_THREAD; i++)
    {
        if (i % 2)
            pthread_create(&(thread_id[i]), NULL, thread_inc, NULL);
        else
            pthread_create(&(thread_id[i]), NULL, thread_des, NULL);
    }

    for (i = 0; i < NUM_THREAD; i++)
        pthread_join(thread_id[i], NULL);

    printf("result: %lld \n", num);
    return 0;
}

void *thread_inc(void *arg)
{
    int i;
    for (i = 0; i < 50000000; i++)
        num += 1;
    return NULL;
}
void *thread_des(void *arg)
{
    int i;
    for (i = 0; i < 50000000; i++)
        num -= 1;
    return NULL;
}

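The example above creates 100 threads. Half of them run the code in thread_inc and the other half run the code in thread_des, so after all the increments and decrements the global variable num should still be 0. Let us compile and run the program: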

# gcc thread4.c -o thread4 -lpthread
# ./thread4 
sizeof long long: 8 
result: 10862532

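As you can see, the result is not what we expected. The cause is not yet clear, but one thing is certain: letting threads operate on a shared variable carelessly can lead to problems. What exactly the problem is, and how to solve it, will be covered in a later chapter.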
