A Tornado-based file upload demo

Reprinted article, dated 2017-03-10 14:13. Tags: python, technology, Architecture, HTML.

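The web framework here is Tornado 4.0, and the file upload component is bootstrap-fileinput.

I wrote this small demo for a partner. It simulates an app that takes photos with the camera and uploads them to a back-end service for image recognition. If the recognition result is OK, the client is told that it does not need to upload any more pictures; if the result is not OK, it has to keep uploading. The idea is that each uploaded picture may differ in shooting angle, lighting and so on, which gives the back-end recognition system more evidence to base its decision on.

One more point to note: HTTP request handling is generally session-based, so how do we mark several image uploads as belonging to the same business flow? It is much like a user logging in to an HTTP back end and keeping one session alive until logging out. In this scenario a single session may contain several business flows, so how do we tell them apart? There is a way, and it follows the same principle as an HTTP session. In this demo I use a timestamp: every time the front end uploads a picture it includes a timestamp field (sent as the sequence parameter), whose value is generated in the front end by JavaScript. When the recognition result is OK the timestamp is refreshed; otherwise it keeps its current value.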
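The refresh rule for the timestamp can be sketched in a few lines. This is only an illustration of the rule described above, written in Python 3 rather than the JavaScript the front end actually uses; the class name UploadFlow and its methods are hypothetical and do not appear in the demo.

```python
import time


class UploadFlow:
    """Client-side bookkeeping for one business flow (hypothetical helper)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.sequence = self._new_sequence()

    def _new_sequence(self):
        # A millisecond timestamp, the kind of value sent in the
        # "sequence" field of every upload.
        return str(int(self._clock() * 1000))

    def handle_result(self, result):
        """Start a new flow on "OK"; keep the current one otherwise."""
        if result == "OK":
            self.sequence = self._new_sequence()
        return self.sequence
```

Every upload sends flow.sequence; the value only changes after the server answers OK, so all the pictures uploaded up to that point are grouped under one key on the server.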

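The web back end uses multiple processes. Because of Python's GIL (Global Interpreter Lock), multithreading cannot deliver real parallelism for CPU-bound work, so I went with multiple processes instead. The processes are responsible for:

1> Receiving the image data uploaded by the client and writing it to a file, which is kept as training material for later.

2> Running the image recognition logic and writing the result to a shared data area.

On that note, in a Tornado-based web application the code that handles an incoming HTTP request also runs in a process of its own. So this demo effectively comes down to three processes cooperating, and cooperation between processes means we have to think about synchronization and resource sharing.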
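To make the resource-sharing concern concrete, here is a minimal sketch, separate from the demo's code. It targets Python 3 and assumes a POSIX system where the fork start method is available; the function names record, run_child and demo are made up for this illustration. A child process that writes into an ordinary dict only changes its own copy, while a dict created by a multiprocessing Manager is visible to the parent.

```python
import multiprocessing

# Assumption: a POSIX system, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def record(store, key, value):
    """Runs in the child process and writes one entry into store."""
    store[key] = value


def run_child(store):
    """Start a child that writes to store, wait for it, return a dict copy."""
    child = ctx.Process(target=record, args=(store, "flow-1", 7))
    child.start()
    child.join()  # wait until the child has finished
    return dict(store)


def demo():
    plain = {}
    after_plain = run_child(plain)        # the child changed only its own copy
    with ctx.Manager() as manager:
        shared = manager.dict()
        after_shared = run_child(shared)  # the write goes through the manager
    return after_plain, after_shared


if __name__ == "__main__":
    print(demo())
```

The parent sees an empty dict in the first case and the child's entry in the second, which is why the demo keeps its results in a Manager dict.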

《一》First, here is the web back-end service code, followed by some explanation to help readers follow it:

 1 #!/usr/bin/env python
 2 #-*- coding:utf-8 -*-
 3 #__author__ "shihuc"
 4
 5 import tornado.ioloop
 6 import tornado.web
 7 import os
 8 import json
 9 import multiprocessing
10
11 import aibusiness
12
13 procPool = multiprocessing.Pool()
14
15 class MainHandler(tornado.web.RequestHandler):
16     def get(self):
17         self.render("uploadAI.html")
18
19 class UploadHandler(tornado.web.RequestHandler):
20
21     def post(self,*args,**kwargs):
22         file_metas=self.request.files['tkai_file']                         # file metadata for the form field whose name is 'tkai_file'
23         timestamp = self.get_argument("sequence")
24         xsrf = self.get_argument("_xsrf")
25
26         res = {}
27         # Note: with async upload, each HTTP request carries only one file
28         for meta in file_metas:
29             filename=meta['filename']
30             procPool.apply_async(aibusiness.doWriteImageJob, (filename, meta['body'],))
31             p = multiprocessing.Process(target=aibusiness.doRecJob, args=(timestamp, meta['body'],))
32             p.start()
33             p.join()
34         retVal = aibusiness.reportResult(timestamp)
35         print("timestamp: %s, xsrf: %s, res: %s, filename: %s\r\n" % (timestamp, xsrf, retVal, filename))
36         res['result'] = retVal
37         self.write(json.dumps(res))
38
39
40
41 settings = {
42     'template_path': 'page',          # HTML templates
43     'static_path': 'resource',        # static files (css, js, img)
44     'static_url_prefix': '/resource/',# URL prefix for static files
45     'cookie_secret': 'shihuc',        # custom secret string used when signing cookies
46     'xsrf_cookies': True              # protect against cross-site request forgery
47 }
48
49 def make_app():
50     return tornado.web.Application([
51         (r"/", MainHandler),(r"/upload", UploadHandler)
52     ], default_host='',transforms=None, **settings)
53
54 if __name__ == "__main__":
55     app = make_app()
56     app.listen(9909)
57     tornado.ioloop.IOLoop.current().start()
58     procPool.close()

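A brief explanation of the code above:

a> In this demo, receiving an image and writing it to a file is handled by a process pool. Note line 13, where the global variable procPool is defined: multiprocessing.Pool() is called without arguments, so by default the number of worker processes is determined by the number of CPU cores on the host.

b> Image recognition is handled by starting a new process for each request. The recognition logic here is only simulated, with a random number standing in for the real result. That logic lives in the aibusiness module, whose code is given further down.

c> See line 41: settings supplies the basic configuration for the Tornado web application. It covers where the application's page templates are stored, where the resource files used by the HTML files are stored, and the security-related options.

For example, HTML files are stored in the page directory, and the root directory for resource files (css, js, images and so on) is resource.
On the security side, cookie_secret is a custom secret string that Tornado uses when signing cookies, and xsrf_cookies switches protection against cross-site request forgery (CSRF) on or off. Tornado calls CSRF xsrf, and in this example the switch is on.

d> Synchronization between processes: the main concern is synchronizing the process that receives the HTTP message with the image recognition process. Because the recognition result has to be returned to the client, the receiving process must wait until the recognition process has finished. This is done with the join call on line 33 (p.join()). Note that join blocks the handler, and with it Tornado's event loop, until recognition completes, so requests are served one at a time; that is acceptable for a demo.

e> See lines 26, 36 and 37 (the res dictionary and the self.write call). When the HTTP handler function post finishes, it must return a JSON-formatted result to the client, because the bootstrap-fileinput plugin requires this when it checks the response.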
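Point a> can be tried on its own. The sketch below is my own illustration and not part of the demo: write_image and save_all are hypothetical names, it targets Python 3, and it assumes a POSIX system where the fork start method is available. Passing processes=None leaves the pool size to multiprocessing, which derives it from the CPU count reported by the operating system.

```python
import multiprocessing
import os
import tempfile

# Assumption: a POSIX system, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def write_image(directory, filename, body):
    """Pool job: persist uploaded bytes (a stand-in for doWriteImageJob)."""
    path = os.path.join(directory, filename)
    with open(path, "wb") as fh:
        fh.write(body)
    return path


def save_all(uploads, directory, processes=None):
    """Write every (filename, body) pair through a process pool.

    processes=None lets multiprocessing size the pool from the CPU count.
    """
    with ctx.Pool(processes=processes) as pool:
        pending = [pool.apply_async(write_image, (directory, name, body))
                   for name, body in uploads]
        # get() waits for each job and re-raises any error from the worker.
        return [job.get(timeout=30) for job in pending]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_all([("a.jpg", b"\xff\xd8"), ("b.jpg", b"\xff\xd8\xff")], tmp))
```

The demo's handler calls apply_async and never looks at the returned object, so a failure while writing the file goes unnoticed. Calling get(), as this sketch does, surfaces such errors at the cost of waiting for the write to finish.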

《二》Next, the contents of the aibusiness module

 1 #!/usr/bin/env python
 2 #-*- coding:utf-8 -*-
 3 #__author__ "shihuc"
 4
 5 import os
 6 import json
 7 import random
 8 import multiprocessing
 9
10
11 # Tracks the pictures uploaded for one business request. The key is the timestamp sent by the
12 # front end; the value is a list holding the recognition results for that timestamp.
13 timestamp_filecount_map = multiprocessing.Manager().dict()
14
15 procLock = multiprocessing.Lock()
16 procEvent = multiprocessing.Event()
17
18 upload_path=os.path.join(os.path.dirname(__file__),'uploadfiles')  # temporary storage path for uploaded files
19
20 def doWriteImageJob(filename, imgData):
21        """ 1. Add your business logic here, write image data as file!
22        """
23        #Below, write the image data to disk
24        filepath=os.path.join(upload_path,filename)
25        with open(filepath,'wb') as up:                                # binary mode, since some files must be stored as binary; adjust as needed
26             up.write(imgData)
27
28 def doRecJob(timestamp, imgData):
29        """ 1. Add your business logic here, for example, image recognition!
30            2. After the image recognition process, you must update timestamp_filecount_map
31        so that the final result can be checked in the next step.
32        """
33        #Here, do recognition, simulate the result by random
34        procLock.acquire()
35        result = random.randrange(0, 10, 1)
36        #Below do result update
37        res = []
38        if timestamp_filecount_map.get(str(timestamp)) is None:
39           res.append(result)
40        else:
41           res = timestamp_filecount_map.get(str(timestamp))
42           res.append(result)
43        timestamp_filecount_map[str(timestamp)] = res
44        print(timestamp_filecount_map)
45        procLock.release()
46
47
48 def reportResult(timestamp):
49        """ Add your business logic here, check whether the result is ok or not.
50        Here, I simulate the logic that checks whether the existing results are
51        accepted as OK, e.g. the share of images with an acceptable result is no
52        less than 80%, which is defined to be OK.
53        """
54        #Here, simulation: check all the results; if at least 80% of the images have
55        #a result no less than 2, then the final result is OK.
56        procLock.acquire()
57        tempCnt = 0
58        try:
59            detail_info = timestamp_filecount_map.get(str(timestamp))
60            if detail_info is None:
61               return "OK"
62            else:
63               for elem in detail_info:
64                  if elem >= 2:
65                      tempCnt += 1
66               if tempCnt >= len(detail_info) * 0.8:
67                  del timestamp_filecount_map[str(timestamp)]
68                  return "OK"
69               else:
70                  return "NOK"
71        finally:
72            procLock.release()

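A few points about the code above need explaining:

1> Synchronization between processes uses a multiprocessing Lock, for example line 15: procLock = multiprocessing.Lock(). Each process takes the lock around this logic, because they all operate on the shared resource timestamp_filecount_map. Locking keeps each update intact and avoids dirty reads.

2> Data shared between processes must use a structure created by the multiprocessing module's Manager, for example line 13: timestamp_filecount_map = multiprocessing.Manager().dict(). If an ordinary dictionary were used instead, such as timestamp_filecount_map = {}, the data could not be passed between processes. The typical symptom in testing is that every call to reportResult gets None for detail_info when it reaches line 59.

3> In the code above the image recognition logic is simulated with random numbers: a random number of at least 2 means the recognition result is OK. The final verdict on whether a business flow is OK depends on whether the count of numbers not less than 2 in the list is at least 80% of the total count of random numbers; if so the result is OK, otherwise NOK.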
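Points 1> and 3> can be checked with a small worked example. The sketch below restates the locked update and the 80% rule as two standalone Python 3 functions; record_score and is_acceptable are hypothetical names of my own, and the fork start method (POSIX) is assumed for the lock. It is a restatement for illustration, not the module's actual code.

```python
import multiprocessing

# Assumption: a POSIX system, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def record_score(store, lock, flow_id, score):
    """Append one score to the flow's list as a single locked step."""
    with lock:  # released even if an exception is raised
        scores = store.get(flow_id) or []
        scores.append(score)
        store[flow_id] = scores  # reassign so a Manager dict sees the change


def is_acceptable(scores, min_score=2, min_ratio=0.8):
    """True when at least min_ratio of the scores are >= min_score."""
    if not scores:
        # reportResult also answers "OK" when nothing is recorded for the flow.
        return True
    good = sum(1 for score in scores if score >= min_score)
    return good >= len(scores) * min_ratio


if __name__ == "__main__":
    print(is_acceptable([5, 1, 7, 9, 3]))  # 4 of 5 scores are >= 2
    print(is_acceptable([5, 1, 0, 9]))     # only 2 of 4 scores are >= 2
```

With the scores 5, 1, 7, 9, 3, four of five values are at least 2, which is exactly 80%, so the flow is accepted. With 5, 1, 0, 9 only half qualify, so the client would be asked for more pictures.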

《三》The front end, based on bootstrap-fileinput

 1 <!doctype html>
 2 <html>
 3 <head>
 4     <meta charset="UTF-8">
 5     <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
 6     <meta name="viewport" content="width=device-width, initial-scale=1.0">
 7     <title>Python multiprocessing DEMO</title>
 8     <link href="{{static_url('css/bootstrap.min.css')}}" rel="stylesheet">
 9     <link rel="stylesheet" type="text/css" href="{{static_url('css/default.css')}}">
10     <link href="{{static_url('css/fileinput.css')}}" media="all" rel="stylesheet" type="text/css" />
11     <script src="{{static_url('js/jquery-2.1.1.min.js')}}"></script>
12     <script src="{{static_url('js/fileinput.js')}}" type="text/javascript"></script>
13     <script src="{{static_url('js/bootstrap.min.js')}}" type="text/javascript"></script>
14     <script src="{{static_url('js/bootbox.js')}}" type="text/javascript"></script>
15 </head>
16 <body>
17     <div class="htmleaf-container">
18         <div class="container kv-main">
19            <div class="page-header">
20              <h2>Python concurrency demo</h2>
21            </div>
22            <form enctype="multipart/form-data" method="post">
23               <div class="form-group">
24                   {% module xsrf_form_html() %}
25                   <input type="file" name="tkai_file" id="tkai_input" multiple>
26               </div>
27               <hr>
28            </form>
29         </div>
30     </div>
31     <script>
32             $(document).ready(function() {
33                 if(sessionStorage.image_ai_sequence == null || sessionStorage.image_ai_sequence == undefined){
34                     sessionStorage.image_ai_sequence = Date.parse(new Date());
35                 }
36                 var fileInput= $("#tkai_input").fileinput({
37                         uploadUrl: "/upload",
38                         uploadAsync: true,
39                         maxFileCount: 15,
40                         allowedFileExtensions : ['jpg','jpeg','png','gif'],// allowed file types
41                         showUpload: false,                                 // whether to show the upload button
42                         showCaption: true,                                 // whether to show the caption
43                         showPreview: true,
44                         autoReplace: true,
45                         dropZoneEnabled: true,
46                         uploadExtraData: function() { return {'sequence': sessionStorage.image_ai_sequence, '_xsrf': document.getElementsByName('_xsrf')[0].value}}
47                     }).on('filepreajax', function(event, previewId, index) {
48                         console.log('previewId:' + previewId + ', index: ' + index + ', seq: ' + sessionStorage.image_ai_sequence);
49                     }).on('filepreupload', function(event, data, previewId, index, jqXHR){
50                         //console.log('filepreupload');
51                     }).on('fileuploaded',function(event, data) {     // callback after a single file has uploaded successfully
52                         //console.log('fileuploaded');
53                         var res=data.response;
54                         if(res.result == "NOK"){
55                             ;                                        // NOK from the back end: recognition fell short of expectations, keep uploading pictures
56                         }else if (res.result == "OK"){
57                             sessionStorage.image_ai_sequence = Date.parse(new Date());       // recognition met expectations, no more files are needed
58                             bootbox.alert("Result is acceptable!");
59                         }
60                    }).on('filecustomerror', function(event, params, msg) {
61                         //console.log(params)
62                         //console.log(msg)
63                    }).on('fileclear', function(event,data) {         // callback for the remove button
64                         //console.log(data);
65                    }).on('filebatchuploadsuccess', function(event,data) { // callback for a batch upload
66                         //console.log(data);
67                    });
68             });
69     </script>
70 </body>
71 </html>

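Some necessary notes on this code as well:

1> On line 8, static_url is a Tornado template function. {{static_url('css/bootstrap.min.css')}} has to be read together with the static resource settings described in part one: the resource path prefix in this demo is /resource/, so after the template is rendered the path is /resource/css/bootstrap.min.css. The same holds everywhere else static_url appears in the code above. Expressions in templates are written as {{ ... }}. One benefit of loading resources this way is that Tornado appends a version parameter (?v=...) derived from the file's content to the URL, so when a file changes its URL changes too and the browser fetches the new copy instead of serving a stale one from its cache.

2> Line 24 also uses Tornado's template language. It pulls in a snippet that generates the xsrf-related markup: an input element of type hidden, with the name _xsrf and a value that is a string generated by Tornado. This value acts as a random token for preventing cross-site request forgery. If a submitted form lacks this value, or the value does not match the one held by the Tornado back end, the submission is rejected. Template statements of this kind are written as {% ... %}.

3> Lines 33-35 correspond to the timestamp that marks one business flow, as described earlier. This value could of course be generated by the back end; since this is a demo, it is generated in the front end. sessionStorage is used to hold it, so that a page refresh does not leave the value inconsistent.

4> For multi-file uploads, the fileinput plugin supports two modes: a synchronous batch upload, and an asynchronous upload that sends the files one at a time. Line 38 (uploadAsync: true) selects the asynchronous one-file-at-a-time mode. In this mode each HTTP request that reaches the back end contains only one file. With batch upload the back end receives several files at once, in the form of an array. Our Python back-end code supports both, because it loops over whatever files the request contains.

5> Line 46 uses fileinput's ability to send extra data along with the upload, so that user-defined data is sent in addition to the data in the form. uploadExtraData is set to a callback function so that the data is fetched afresh before every upload; otherwise every upload would send the initial values from when the page was loaded.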
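The versioned URLs mentioned in point 1> can be illustrated with a short sketch. This is not Tornado's implementation: versioned_url is a hypothetical helper, and the choice of SHA-256 and of putting the full digest in the URL are assumptions made only for this example. It shows the idea that, as far as I understand static_url, the version is derived from the file's content, so the URL stays stable until the file changes.

```python
import hashlib


def versioned_url(prefix, relative_path, content):
    """Build '<prefix><path>?v=<hash of content>' (illustrative only)."""
    digest = hashlib.sha256(content).hexdigest()
    return "%s%s?v=%s" % (prefix, relative_path, digest)


if __name__ == "__main__":
    old = versioned_url("/resource/", "css/default.css", b"body{}")
    new = versioned_url("/resource/", "css/default.css", b"body{color:red}")
    print(old != new)  # a changed file yields a different URL
```

Because the browser treats a different query string as a different resource, an unchanged file can be cached for a long time while an edited file is picked up on the next page load.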

Finally, here is the directory structure of the whole Tornado-based web project:

[root@localhost demo]# ll
total 20
-rw-r--r-- 1 root root 2686 03-09 10:36 aibusiness.py
drwxr-xr-x 2 root root 4096 03-10 14:12 page
drwxr-xr-x 6 root root 4096 03-03 15:07 resource
drwxr-xr-x 2 root root 4096 03-07 17:07 uploadfiles
-rw-r--r-- 1 root root 1858 03-07 17:05 web_server.py

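Once the project is running, open it in a browser to see the upload page (the screenshot from the original post is not included here).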

All the source files for this demo are also on GitHub at https://github.com/shihuc/fileupload for anyone interested.
