Deploying a High-Concurrency Flask Service

This article is a repost. 2021-06-08 15:52, Python

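Keeping the AI Model Resident in Memory

OOP:

Using an object-oriented design, the algorithm is instantiated once and kept alive in memory: the model is loaded a single time and then serves many requests.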

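import logging
import os
import threading

import tensorflow as tf  # TensorFlow 1.x API (tf.ConfigProto / tf.Session)

# The original snippet uses `logger` without defining it; a standard
# library logger is assumed here.
logger = logging.getLogger(__name__)


class ocr_infer_class(threading.Thread):

    def __init__(self, input_queue, output_queue):
        super().__init__()
        logger.info("Model Init Start ...")
        # Files needed by YOLO
        self.yolo_model_def = os.path.join("../config", "yolov3.cfg")
        self.yolo_weights_path = os.path.join("../checkpoints", "yolo.pth")

        # Weights needed by the OCR model
        self.ocr_weights_path = os.path.join("../checkpoints", "ocr.h5")

        # Request and result queues
        self.input_queue = input_queue
        self.output_queue = output_queue

        # Let TensorFlow grow GPU memory on demand instead of grabbing it all
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        self.graph = tf.get_default_graph()
        self.sess = tf.Session(config=config)

        # Build the YOLO model and load its weights
        self.yolo_init()
        # Build the OCR model and load its weights
        self.ocr_init()

    def yolo_init(self):
        # load model (omitted in the original)
        logger.info("YOLO Init Success ...")

    def ocr_init(self):
        with self.graph.as_default():
            with self.sess.as_default():
                # load model (omitted in the original)
                pass
        logger.info("OCR Init Success ...")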

Thread+Queue

A thread and a pair of queues are used to listen for requests.

# Receiving side (a method of ocr_infer_class; requires `import queue`)
# Loop on input_queue and run inference as soon as data arrives
def run(self):
    while True:
        try:
            # Block until a request arrives, for at most 2000 seconds
            data = self.input_queue.get(True, 2000)
        except queue.Empty:  # no data arrived within the timeout
            logger.info('waiting for request...')
        else:
            # Unpack the request, then run inference
            image = data[0]
            image_name = data[1]

            # Inference code
            result = self.get_detect_result(image, image_name,
                                            height=image.shape[0], width=image.shape[1])

            # Put the result on output_queue once inference is done
            self.output_queue.put(result)

#################################################################################################

# Sending side
# The data taken from the HTTP request is put on the queue being listened to.
def get_result(self, image, image_name):
    # Putting data on the listened queue triggers inference
    self.input_queue.put((image, image_name))
    logger.info("Getting Request ...")
    # The worker puts the result on output_queue when inference is done
    res = self.output_queue.get()

    return res
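
The pattern above can be tried without a real model. The sketch below is a minimal, self-contained illustration under stated assumptions: FakeModel and InferenceWorker are hypothetical stand-ins, not the original OCR code. It also varies the design slightly: each request carries its own reply queue instead of sharing one output_queue, so that concurrent callers cannot pick up each other's results.

```python
import queue
import threading


class FakeModel:
    """Hypothetical stand-in for a real model; counts how often it is loaded."""
    load_count = 0

    def __init__(self):
        FakeModel.load_count += 1  # expensive weight loading would happen here

    def predict(self, image_name):
        return f"result for {image_name}"


class InferenceWorker(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self.input_queue = queue.Queue()
        self.model = FakeModel()  # loaded once, reused for every request

    def run(self):
        while True:
            image_name, reply_queue = self.input_queue.get()
            reply_queue.put(self.model.predict(image_name))

    def get_result(self, image_name):
        # A private reply queue per request keeps results from being mixed up
        reply_queue = queue.Queue(maxsize=1)
        self.input_queue.put((image_name, reply_queue))
        return reply_queue.get(timeout=5)


worker = InferenceWorker()
worker.start()
results = [worker.get_result(name) for name in ("a.jpg", "b.jpg", "c.jpg")]
print(results, "model loads:", FakeModel.load_count)
```

The model is constructed once in __init__, and every later call to get_result reuses it, which is the "load once, call many times" idea from the previous section.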

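Some Terminology

High concurrency

Concurrency is an operating-system concept: over a period of time, multiple tasks take turns executing. "High concurrency" loosely refers to business scenarios with heavy traffic and a high request volume, such as the Double 11 (Singles' Day) flash sales or booking train tickets for the Spring Festival travel rush.

Metrics: QPS, TPS, RT, and concurrency

QPS: Queries Per Second, the number of queries the server handles per second.

TPS: Transactions Per Second. A transaction is the whole process of a client sending a request and the server responding to it. The client starts timing when it sends the request and stops when it receives the response; the elapsed time and the number of completed transactions are computed from this.

RT: Response Time, the time from the client issuing a request to receiving the server's response. It directly reflects how fast the system is.

Concurrency: the number of requests the system can handle at the same time. It reflects the system's load capacity.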

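Throughput: the system's capacity under load. It is closely tied to how much CPU and IO each request consumes. The more CPU a single request uses, and the slower the external interfaces and IO, the lower the throughput; and vice versa.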
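
These metrics are linked. As a rough sketch, assuming a system in steady state, Little's law gives concurrency ≈ throughput × response time. The function names and numbers below are illustrative assumptions, not measurements from this article.

```python
def concurrency_from(qps: float, rt_seconds: float) -> float:
    """Little's law: average requests in flight = arrival rate x time in system."""
    return qps * rt_seconds


def qps_from(concurrency: float, rt_seconds: float) -> float:
    """The same relation rearranged: throughput = concurrency / response time."""
    return concurrency / rt_seconds


# Hypothetical service: 500 requests/s, each taking 0.2 s on average
in_flight = concurrency_from(500, 0.2)
print(in_flight)

# With 100 requests in flight and a 0.2 s response time
print(qps_from(100, 0.2))
```

Read the other way round, if response time grows while the number of in-flight requests is capped, throughput has to fall.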
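
Ways to achieve high concurrency
- Scale up the hardware: more cores, higher clock speeds, more storage, more bandwidth.
- Optimize the software: improve the architecture, use multithreading and coroutines, use faster data structures, and so on.
- Layer the architecture, split services, decouple into microservices, and so on, implemented through distributed clusters and distributed computing.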

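High availability

Designing the system to minimize the time during which it cannot serve requests. If a system can always serve requests, its availability is 100%. One example is the hot-standby pair built with keepalived+Nginx: when one Nginx server goes down, the other takes over.

High performance

The faster a program runs, the less memory it uses, and the lower its CPU usage, the higher its performance.

Concurrency vs. parallelism

Concurrency: several tasks are in progress over the same period, but at any given moment one may be running while another is paused or already finished.
Parallelism: several tasks are literally executing at the same instant, for example on different CPU cores.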
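
A small sketch of concurrency in Python, assuming I/O-bound tasks (simulated here with time.sleep; fake_io_task is a hypothetical name). The threads overlap their waiting time, so four 0.2 s tasks finish in roughly 0.2 s rather than 0.8 s. True parallelism for CPU-bound Python code would usually need multiple processes instead.

```python
import threading
import time


def fake_io_task(results, index):
    time.sleep(0.2)  # stands in for waiting on a network or disk
    results[index] = index * index


results = [None] * 4
threads = [threading.Thread(target=fake_io_task, args=(results, i)) for i in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(results, f"elapsed: {elapsed:.2f}s")
```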

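Flask

Flask is a synchronous web framework that makes it easy to build a simple web service. However, its built-in server handles requests in a single process, so the service starts to block when too many clients hit it at once.

Pros: easy to debug, and a web server interface can be implemented very quickly.

Cons: the built-in web server is not robust and cannot withstand a large number of concurrent requests.

WSGI

Web Server Gateway Interface: a common interface between web servers and web applications/frameworks. It is a protocol, a specification, designed to solve compatibility problems between the many web servers and web applications/frameworks.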
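
To show what the interface looks like in practice, here is a minimal WSGI application written with the standard library only. It is a sketch for illustration: simple_app and the fake start_response are hypothetical names, and the app is called directly, the way a server would call it, without opening a socket.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A WSGI application: a callable taking (environ, start_response)."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


# Call the app the way a server would, without opening a socket
environ = {}
setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
environ["PATH_INFO"] = "/id_recognition"

captured = {}


def start_response(status, response_headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = response_headers


response_body = b"".join(simple_app(environ, start_response))
print(captured["status"], response_body)
```

A Flask app object is itself such a callable, which is why a WSGI server can load it by a module:app reference.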

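Gunicorn

What's Gunicorn?

Gunicorn is an HTTP server that implements the WSGI protocol. Similar options include uWSGI and gevent's WSGI server.

What's pre-fork?

With the pre-fork model, several child processes are created ahead of time to carry requests, so that concurrent requests can be accepted and handled.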
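
The idea can be sketched with the standard library alone. This is a toy illustration of pre-forking under stated assumptions (Unix-only because of os.fork, and each worker deliberately handles a single connection). It is not how Gunicorn is actually implemented, and prefork_demo is a hypothetical name.

```python
import os
import socket


def prefork_demo(num_workers=2):
    # The parent opens the listening socket BEFORE forking,
    # so every child inherits the same socket.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(8)
    port = listener.getsockname()[1]

    child_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child: serve one connection, then exit
            try:
                conn, _addr = listener.accept()
                conn.sendall(str(os.getpid()).encode())
                conn.close()
            finally:
                os._exit(0)
        child_pids.append(pid)

    # Parent acts as the client: one connection per worker
    served_by = []
    for _ in range(num_workers):
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            served_by.append(int(client.recv(64).decode()))

    for pid in child_pids:
        os.waitpid(pid, 0)
    listener.close()
    return child_pids, served_by


child_pids, served_by = prefork_demo()
print("workers:", child_pids, "served by:", served_by)
```

Every reply comes from a child process rather than the parent, which only creates the socket and the workers.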

Installing Gunicorn

pip install gunicorn -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
# Install the third-party libraries needed by the async workers
pip install gevent eventlet greenlet

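Configuring Gunicorn

$ vim gunicorn.conf
bind = ":5000"
workers = 2  # recommended: (CPU cores x 2) + 1 for best performance
worker_class = 'gevent'  # coroutine-based; other options include sync (the default), eventlet and gthread
threads = 1
worker_connections = 2000
timeout = 600  # loading a deep learning model is slow, so allow plenty of time
reload = True
daemon = False

accesslog = '../logs/access.log'
errorlog = '../logs/error.log'
loglevel = 'debug'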
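
Since a Gunicorn config file is ordinary Python, the worker count can be computed rather than hard-coded. This is a minimal sketch, assuming the cores x 2 + 1 rule of thumb from the comment in the config. With this article's design every worker process would load its own copy of the model, so GPU or RAM limits may call for a smaller number.

```python
import multiprocessing

# Rule of thumb: (2 x CPU cores) + 1 worker processes
cpu_cores = multiprocessing.cpu_count()
workers = cpu_cores * 2 + 1

bind = ":5000"
worker_class = "gevent"

print(f"{cpu_cores} cores -> {workers} workers")
```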

Starting Gunicorn

# The Flask entry file is Flask_App.py and the Flask instance is named app
# Using the config file, start one master process and two worker processes
$ gunicorn -c gunicorn.conf Flask_App:app

Request example

Gunicorn pre-forked two Flask processes listening on port 5000; we send requests to the port mapped by the container: 30006->5000/tcp
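
The original request screenshot is not available. As a hedged sketch, a request could be built with the standard library as below. The /id_recognition path and port 30006 come from this article; the JSON field names (image_name, image_base64) are purely hypothetical, since the real payload format is not shown.

```python
import base64
import json
import urllib.request


def build_request(url, image_bytes, image_name):
    # Hypothetical payload layout; adapt it to what the Flask view expects
    payload = {
        "image_name": image_name,
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("http://127.0.0.1:30006/id_recognition", b"fake-image", "demo.jpg")
print(req.get_method(), req.full_url)

# Sending it requires the service to be running:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read())
```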

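At this point, OOP + thread + queue + Gunicorn gives us a web service that keeps the model resident and can handle concurrent requests. The HTTP server's performance is still not enough, though, so Nginx is added next.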

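Nginx

Forward proxy vs. reverse proxy

Forward proxy:

The proxy server fetches resources that the client cannot reach directly and returns them to the client (a middleman).

Uses:

- Getting around access restrictions (the so-called "ladder").
- Speeding up access: content requested by one client may be cached and served directly when another client asks for it.
- Hiding the client's real IP to protect it from attacks.

Reverse proxy:

The proxy server accepts connection requests from the Internet, forwards them to servers on the internal network, and returns their results to the requesting client on the Internet. A proxy server acting this way is a reverse proxy (think of a sub-landlord).

Uses:

- Hiding the servers' real IPs from clients.
- Load balancing: the proxy decides which real server receives each client request based on the load of all the real servers.
- Speeding up access: as with a forward proxy, some content, such as static files, can be cached and returned directly.
- Security: it can act as a firewall, protect against web attacks, provide encryption for the servers, and so on.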
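
To make the load-balancing idea concrete, here is a minimal sketch of round-robin, the simplest strategy, which rotates through the backends and ignores their actual load. The backend addresses and the RoundRobinBalancer class are hypothetical. Nginx offers this and other strategies through its own configuration.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


# Hypothetical Gunicorn instances behind the proxy
balancer = RoundRobinBalancer(["127.0.0.1:5000", "127.0.0.1:5001", "127.0.0.1:5002"])
assignments = [balancer.pick() for _ in range(6)]
print(Counter(assignments))
```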

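What's Nginx?

A more capable HTTP server. It acts as a reverse proxy for outside requests, can cache static files to speed up access, can do load balancing to relieve pressure on the servers, and can improve server security.

Nginx is used here to forward traffic to the Gunicorn service. Why put Nginx on top of Gunicorn? First, Nginx makes up for some of Gunicorn's weaknesses, such as SSL support, handling of high concurrency, and load balancing. Second, a real website has static files to host in addition to the service itself, which is also one of Nginx's strengths.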

Installing Nginx

apt update && apt install nginx

Configuring Nginx

$ vim /etc/nginx/sites-available/default
# Change the public port
server {
    listen 8000 default_server;
    listen [::]:8000 default_server;

    # Add the reverse proxy
    location / {
        # Remember to comment this line out: we do not cache static files here, this is only a request endpoint
        # try_files $uri $uri/ =404;

        # Forward to the Gunicorn service on local port 5000
        proxy_pass http://127.0.0.1:5000/;
        proxy_redirect off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Starting Nginx

/etc/init.d/nginx start

Request example

Nginx proxies the Gunicorn service on port 5000 and listens on port 8000, which is mapped to 30007 outside the container.

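Load Testing with ab

What's ab?

ApacheBench (ab) is simple to use, efficient, reports thorough statistics, and puts little memory pressure on the machine generating the load. It is a recommended load-testing tool on Unix machines.

Installing ab

apt-get install apache2-utils

ab command options

Option	Meaning
-r	Do not exit when a socket receive error occurs
-t	Maximum time, in seconds, to spend sending requests
-c	Concurrency: the number of requests issued at a time
-n	Total number of requests to send
-p	postfile: the file containing the POST data
-T	Content-Type of the request body for POST and PUT requests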

Concurrency test of the Flask service

Flask on its own, with a 60 s time limit, 1k concurrency, and 100k total requests:

$ ab -r -t 60 -c 1000 -n 100000 -p /workspace/pressure_test/post_data.txt -T 'application/json' http://127.0.0.1:8705/id_recognition

This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Finished 44442 requests


Server Software:        Werkzeug/2.0.1
Server Hostname:        127.0.0.1
Server Port:            8705

Document Path:          /id_recognition
Document Length:        78 bytes

Concurrency Level:      1000
Time taken for tests:   60.000 seconds
Complete requests:      44442
Failed requests:        2790
   (Connect: 0, Receive: 930, Length: 930, Exceptions: 930)
Total transferred:      10312344 bytes
Total body sent:        7696691010
HTML transferred:       3393936 bytes
Requests per second:    740.70 [#/sec] (mean)
Time per request:       1350.080 [ms] (mean)
Time per request:       1.350 [ms] (mean, across all concurrent requests)
Transfer rate:          167.84 [Kbytes/sec] received
                        125271.12 kb/s sent
                        125438.96 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  620 2456.7      0   31086
Processing:     0  458 2627.2    175   53163
Waiting:        0  235 613.7    173   26668
Total:        142 1079 3959.1    176   54166

Percentage of the requests served within a certain time (ms)
  50%    176
  66%    186
  75%    205
  80%   1172
  90%   1395
  95%   3215
  98%   8853
  99%  17271
 100%  54166 (longest request)
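
The summary figures in this report can be cross-checked from the raw counts. The sketch below uses only numbers copied from the output above; the variable names are my own. Small differences from the printed values are expected because ab prints the elapsed time rounded to milliseconds.

```python
# Figures copied from the ab report for the plain Flask run
complete_requests = 44442
failed_requests = 2790
time_taken_s = 60.000
concurrency = 1000

requests_per_second = complete_requests / time_taken_s
# Mean latency seen by one client slot: concurrency / throughput
time_per_request_ms = concurrency * time_taken_s / complete_requests * 1000
# Mean time across all concurrent requests: 1 / throughput
time_per_request_all_ms = time_taken_s / complete_requests * 1000
failure_ratio = failed_requests / complete_requests

print(f"requests/s: {requests_per_second:.2f}")
print(f"time per request: {time_per_request_ms:.1f} ms")
print(f"time per request (across all): {time_per_request_all_ms:.3f} ms")
print(f"failed/complete: {failure_ratio:.1%}")
```

The same arithmetic can be applied to the Gunicorn and Nginx reports that follow by swapping in their counts.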

Concurrency test of the Flask+Gunicorn service

$ ab -r -t 60 -c 1000 -n 100000 -p /workspace/pressure_test/post_data.txt -T 'application/json' http://127.0.0.1:5000/id_recognition

This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Finished 48453 requests


Server Software:        gunicorn
Server Hostname:        127.0.0.1
Server Port:            5000

Document Path:          /id_recognition
Document Length:        96 bytes

Concurrency Level:      1000
Time taken for tests:   60.000 seconds
Complete requests:      48453
Failed requests:        706
   (Connect: 0, Receive: 193, Length: 320, Exceptions: 193)
Total transferred:      12273915 bytes
Total body sent:        8364221730
HTML transferred:       4620768 bytes
Requests per second:    807.55 [#/sec] (mean)
Time per request:       1238.315 [ms] (mean)
Time per request:       1.238 [ms] (mean, across all concurrent requests)
Transfer rate:          199.77 [Kbytes/sec] received
                        136136.24 kb/s sent
                        136336.01 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  747 1805.0      1   31078
Processing:     0  417 2215.4    171   53472
Waiting:        0  312 994.1    169   33853
Total:         69 1164 2977.1    193   54476

Percentage of the requests served within a certain time (ms)
  50%    193
  66%   1157
  75%   1180
  80%   1195
  90%   3147
  95%   3390
  98%   7197
  99%  12255
 100%  54476 (longest request)

Concurrency test of the Flask+Gunicorn+Nginx service

$ ab -r -t 60 -c 1000 -n 100000 -p /workspace/pressure_test/post_data.txt -T 'application/json' http://127.0.0.1:8000/id_recognition

This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Finished 79043 requests


Server Software:        
Server Hostname:        127.0.0.1
Server Port:            8000

Document Path:          /id_recognition
Document Length:        0 bytes

Concurrency Level:      1000
Time taken for tests:   60.006 seconds
Complete requests:      79043
Failed requests:        100893
   (Connect: 0, Receive: 21865, Length: 46413, Exceptions: 32615)
Non-2xx responses:      1850
Total transferred:      12631524 bytes
Total body sent:        13739835630
HTML transferred:       4650808 bytes
Requests per second:    1317.26 [#/sec] (mean)
Time per request:       759.154 [ms] (mean)
Time per request:       0.759 [ms] (mean, across all concurrent requests)
Transfer rate:          205.57 [Kbytes/sec] received
                        223608.53 kb/s sent
                        223814.10 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8  45.2      1    1009
Processing:     0  693 1742.3    177   32773
Waiting:        0  686 1744.0    176   32773
Total:          0  701 1745.3    179   32774

Percentage of the requests served within a certain time (ms)
  50%    179
  66%    209
  75%    604
  80%   1190
  90%   1499
  95%   3219
  98%   7186
  99%   7362
 100%  32774 (longest request)

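At this point, Gunicorn+Nginx gives us a high-concurrency web service with reverse proxying that can also do load balancing. Across the three runs, throughput rose from 740.70 to 807.55 to 1317.26 requests per second. Note, however, that the Nginx run also reports 100893 failed requests and a document length of 0 bytes, so its figures should be read with caution. And what if a process stops unexpectedly?

We use supervisor for process management: it watches the Nginx and Gunicorn services and controls starting, stopping, and restarting them.

Supervisor

supervisor is a process management tool written in Python. It makes it easy to monitor, start, stop, and restart one or more processes. When a process is killed unexpectedly, supervisor notices and can bring it back automatically.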
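
The core idea, restarting a child process whenever it exits, can be sketched in a few lines of Python. This only illustrates the autorestart behaviour under simplifying assumptions (a bounded number of restarts, no backoff, no logging). The supervise function is hypothetical and is not part of supervisor.

```python
import subprocess
import sys


def supervise(command, max_restarts=3):
    """Run `command`; start it again each time it exits with an error."""
    exit_codes = []
    for _attempt in range(max_restarts + 1):
        completed = subprocess.run(command)
        exit_codes.append(completed.returncode)
        if completed.returncode == 0:
            break  # clean exit: nothing to restart
    return exit_codes


# A child process that always crashes with exit code 1
crashing_child = [sys.executable, "-c", "import sys; sys.exit(1)"]
codes = supervise(crashing_child, max_restarts=2)
print("exit codes:", codes)
```

The child is started once and restarted twice, so three exit codes are collected before the loop gives up.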

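Installing supervisor

# There are many ways to install supervisor, and recent versions support Python 3 natively, but the apt package is simpler to configure and has fewer pitfalls, so it is recommended
apt-get install supervisor

# Start, stop, or restart (choose one action)
$ /etc/init.d/supervisor start    # or: stop / restart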

Configuring the Gunicorn service

# To check that it works, run `command` by hand inside `directory`; if there is no error, it is fine
$ vim /etc/supervisor/conf.d/gunicorn.conf
[program:gunicorn_flask]
command=gunicorn -c gunicorn.conf Flask_App:app
directory=/workspace/Flask/
autostart=true
autorestart=true
user=root
redirect_stderr=true

Configuring the Nginx service

$ vim /etc/supervisor/conf.d/nginx.conf
[program:nginx_flask]
; nginx must stay in the foreground so that supervisor can track the process
command=/usr/sbin/nginx -g 'daemon off;'
autostart=true
autorestart=true
user=root
redirect_stderr=true

Viewing the supervisor log

cat /var/log/supervisor/supervisord.log

Startup test

First stop the Gunicorn and Nginx services.

$ kill -9 $(ps -ef| egrep 'gunicorn|nginx' | grep -v grep | awk '{print $2}')
$ ps ax | egrep 'gunicorn|nginx'

Start Gunicorn and Nginx through supervisor

$ /etc/init.d/supervisor start
$ ps ax | egrep 'gunicorn|nginx'

Send a test request to the Nginx proxy

Restart test

All done! Thanks for reading.
