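I was using AWS Lambda together with API Gateway to write and test an API endpoint. The documentation says a request has to return a result within 29 seconds or it times out. Our crawlers handle a fairly large amount of data, so crawling just a bit more easily runs past that limit. What should we do about it?
So I picked a lightweight framework for some quick learning: Tornado.
Here I write an endpoint that extracts prepositional phrases and use it for the endpoint test. I happen to need to test prepositional-phrase extraction at the moment, so the code is built around that example.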
First, install Tornado:
pip install tornado
Next, let's look at the code:
# -*- coding: utf-8 -*-
import json
import sys

sys.path.append('../')  # make the local extractor package importable

import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
from tornado.options import define, options

from extractor import phrases_extractor

define("port", default=8000, help="run on the given port", type=int)


# Request handler for the endpoint
class IndexHandler(tornado.web.RequestHandler):
    # Handle GET requests
    def get(self, input):
        # Split the path argument into sentences on the "=_=" separator
        data_list = str(input).split("=_=")
        # data_list = ["i go to work by bus", "hello world", "go to school", "by car", "go to school byebye good morning", "the apple in the box"]
        # Build the response and write it out as JSON
        returnItem = {}
        returnItem["getPhrase"] = getScopeOfApplication(data_list)
        returnItem["getPhraseWithoutPre"] = getProductCharacteristics(data_list)
        self.write(json.dumps(returnItem))


def getScopeOfApplication(data_list):
    # At least two sentences are needed: data_list[0] is the current title,
    # the rest is the text to search for prepositional phrases.
    if len(data_list) < 2:
        return []
    text = ','.join(data_list[1:])
    grammar = r"""
        NP: {<DT>?<JJ|CC>*<NN>+}
            {<NNP>+}
        PP: {<IN><NP>}
    """
    label = 'PP'
    phrase_list = phrases_extractor.get_phrases(text, grammar, label)
    # Keep each phrase once, and only phrases longer than two words
    result_list = []
    for phrase in phrase_list:
        if phrase not in result_list and len(phrase.split(' ')) > 2:
            result_list.append(phrase)
    # Fall back to noun phrases when no prepositional phrase was found
    if len(result_list) < 1:
        result_list = getProductCharacteristics(data_list)
    return result_list


def getProductCharacteristics(data_list):
    if len(data_list) < 2:
        return []
    cur_title = data_list[0]
    other_titles = ','.join(data_list[1:])
    grammar = r"""
        NP: {<DT>?<JJ|CC>*<NN>+}
    """
    label = 'NP'
    phrase_list = phrases_extractor.get_phrases(cur_title, grammar, label)
    cur_list = list(set(phrase_list))
    other_phrase_list = phrases_extractor.get_phrases(other_titles, grammar, label)
    # Keep noun phrases of the current title that also occur in the other titles
    result_list = []
    for phrase in cur_list:
        if phrase in other_phrase_list:
            result_list.append(phrase)
    return result_list


# Route definition: everything after /getwords/ is passed to the handler
app = tornado.web.Application(handlers=[(r"/getwords/(.*?)$", IndexHandler)])

# Entry point
if __name__ == "__main__":
    tornado.options.parse_command_line()
    http_server = tornado.httpserver.HTTPServer(app, max_buffer_size=504857600, max_body_size=504857600)
    # http_server.listen(options.port)
    http_server.bind(options.port)
    # Start five worker processes (start(n) forks processes, not threads)
    http_server.start(5)
    tornado.ioloop.IOLoop.current().start()
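The two filtering steps in the handler code can be tried on their own, without Tornado or the extractor module. The sketch below is only an illustration: the helper names keep_long_unique and shared_phrases are made up for this example, the phrase lists are hand-written rather than real extractor output, and the second helper sorts its result (the original does not) so that the output is predictable.

```python
def keep_long_unique(phrases, min_words=3):
    """Keep each phrase once (first-seen order) and drop phrases
    with fewer than min_words words, like getScopeOfApplication does."""
    result = []
    for phrase in phrases:
        if phrase not in result and len(phrase.split(' ')) >= min_words:
            result.append(phrase)
    return result


def shared_phrases(current, others):
    """Return the phrases of `current` that also occur in `others`,
    like getProductCharacteristics does. Sorted here for stable output."""
    others_set = set(others)
    return sorted(p for p in set(current) if p in others_set)


if __name__ == "__main__":
    # Hand-written sample phrases, not real extractor output
    print(keep_long_unique(["in the box", "by bus", "in the box", "on the big table"]))
    print(shared_phrases(["the apple", "the box", "the apple"], ["the box", "a car"]))
```

Keeping these steps as small pure functions makes them easy to check before wiring them into the request handler.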
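Then start the service with python your_program.py and leave it running on the server.
Then, in the browser, enter http://your-ip-address:8000/getwords/ followed by the sentences to process, separated by =_=. For example: http://your-ip-address:8000/getwords/i go to work by bus=_=the apple in the box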
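A browser usually escapes the spaces in such a URL for you, but a script has to do it itself. Here is a minimal sketch of building the request URL with the standard library. The function names build_getwords_url and split_path_argument are made up for this example, 127.0.0.1 is a placeholder host, and the decoding helper only imitates what the server side does with the path argument.

```python
from urllib.parse import quote, unquote

SEPARATOR = "=_="  # the separator the handler splits on


def build_getwords_url(host, sentences, port=8000):
    """Join the sentences with the separator and percent-encode the result.
    '=' is left unescaped so the separator stays readable in the URL."""
    path_arg = SEPARATOR.join(sentences)
    return "http://{}:{}/getwords/{}".format(host, port, quote(path_arg, safe="="))


def split_path_argument(path_arg):
    """Decode a percent-encoded path argument and split it into sentences."""
    return unquote(path_arg).split(SEPARATOR)


if __name__ == "__main__":
    url = build_getwords_url("127.0.0.1", ["i go to work by bus", "the apple in the box"])
    print(url)
    print(split_path_argument(url.rsplit("/", 1)[1]))
```

Note that a sentence which itself contains =_= would be split in the wrong place, so the separator has to be something that never appears in the input.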
The browser then shows the JSON response.
That's all it takes to write a simple endpoint. Perfect!