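I was using Amazon's Lambda together with API Gateway to write an API for testing. The documentation says a result must come back within 29 seconds, otherwise the request times out. Our crawler handles a fair amount of data, and crawling a bit more easily runs past that limit. So what should we do?
So I picked a lightweight framework for a quick study: tornado.
Here I write an API that extracts prepositional phrases, as a test. I happen to need to test prepositional phrase extraction anyway, so the code is built around that example.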
First, install tornado:
pip install tornado
Now for the code example:
# -*- coding: utf-8 -*-
import json
import sys

# Make the parent directory importable so the local `extractor` package can be found
sys.path.append('../')

import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
from tornado.options import define, options

from extractor import phrases_extractor

# Command-line option: the port the server listens on (default 8000)
define("port", default=8000, help="run on the given port", type=int)
# Define the request handler
class IndexHandler(tornado.web.RequestHandler):
    # Handle GET requests; `input` is the text captured from the URL path
    def get(self, input):
        # Split the captured text into sentences on the "=_=" separator
        data_list = str(input).split("=_=")
        #data_list=["i go to work by bus", "hello world", "go to school","by car","go to school byebye good morning","the apple in the box"]
        returnItem = {}
        returnItem["getPhrase"] = getScopeOfApplication(data_list)
        returnItem["getPhraseWithoutPre"] = getProductCharacteristics(data_list)
        # Write the result to the response as JSON
        self.write(json.dumps(returnItem))
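def getScopeOfApplication(data_list):
    # At least two sentences are needed: data_list[0] is the current title,
    # the rest are the other titles (a single sentence would raise IndexError below)
    if len(data_list) < 2:
        return []
    # Join every sentence after the first into one comma-separated text
    text = data_list[1]
    for i in data_list[2:]:
        text += ',' + i
    # Chunk grammar: noun phrases (NP), and prepositional phrases (PP) as preposition + NP
    grammar = r"""
        NP:
            {<DT>?<JJ|CC>*<NN>+}
            {<NNP>+}
        PP: {<IN><NP>}
        """
    label = 'PP'
    phrase_list = phrases_extractor.get_phrases(text, grammar, label)
    # Drop duplicates and keep only phrases longer than two words
    result_list = []
    for phrase in phrase_list:
        if phrase not in result_list and len(phrase.split(' ')) > 2:
            result_list.append(phrase)
    # Fall back to plain noun phrases when no prepositional phrase was found
    if len(result_list) < 1:
        result_list = getProductCharacteristics(data_list)
    return result_list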
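def getProductCharacteristics(data_list):
    # At least two sentences are needed (a single sentence would raise IndexError below)
    if len(data_list) < 2:
        return []
    cur_title = data_list[0]
    # Join every sentence after the first into one comma-separated text
    other_titles = data_list[1]
    for i in data_list[2:]:
        other_titles += ',' + i
    # Chunk grammar: noun phrases only
    grammar = r"""
        NP:
            {<DT>?<JJ|CC>*<NN>+}
        """
    label = 'NP'
    phrase_list = phrases_extractor.get_phrases(cur_title, grammar, label)
    cur_list = list(set(phrase_list))
    other_phrase_list = phrases_extractor.get_phrases(other_titles, grammar, label)
    # Keep the noun phrases of the current title that also occur in the other titles
    result_list = []
    for phrase in cur_list:
        if phrase in other_phrase_list:
            result_list.append(phrase)
    return result_list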
# Map the URL pattern to the handler; the captured group is passed to get()
app = tornado.web.Application(handlers=[(r"/getwords/(.*?)$", IndexHandler)])
# Entry point
if __name__ == "__main__":
    tornado.options.parse_command_line()
    http_server = tornado.httpserver.HTTPServer(app, max_buffer_size=504857600, max_body_size=504857600)
    # http_server.listen(options.port)
    http_server.bind(options.port)
    # Fork five worker processes (these are processes, not threads)
    http_server.start(5)
    tornado.ioloop.IOLoop.current().start()
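The post-processing in the two functions above can be tried on its own, without tornado or the extractor module. The following is only a minimal sketch: the helper names filter_phrases and common_phrases and the sample phrase lists are made up for illustration, and unlike the list(set(...)) call in the original, this version keeps the phrases in first-seen order.

```python
def filter_phrases(phrases, min_words=3):
    """Drop duplicates (keeping first-seen order) and phrases with fewer than min_words words."""
    result = []
    for phrase in phrases:
        if phrase not in result and len(phrase.split(' ')) >= min_words:
            result.append(phrase)
    return result


def common_phrases(current, others):
    """Return the distinct phrases of `current` that also appear in `others`."""
    others_set = set(others)
    # dict.fromkeys removes duplicates while preserving order
    return [p for p in dict.fromkeys(current) if p in others_set]


# Hypothetical extractor output, not real results from the service
sample_pp = ["in the box", "by bus", "in the box", "on the wooden table"]
sample_np_current = ["the apple", "the box", "the apple"]
sample_np_others = ["the box", "the bus"]

print(filter_phrases(sample_pp))
print(common_phrases(sample_np_current, sample_np_others))
```

"by bus" is dropped because the original condition requires more than two words, which is what min_words=3 expresses here.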
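Then start the service with python your_script.py and leave it running on the server.
Then, in a browser, enter http://your-ip-address:8000/getwords/ followed by the sentences to process, separated by =_=. For example: http://your-ip-address:8000/getwords/i go to work by bus=_=the apple in the box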
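A browser encodes the spaces in such a URL for you, but a script calling the API has to do it itself. Here is a small sketch of building the request URL with the standard library. The function names build_getwords_url and split_payload and the 127.0.0.1 host are assumptions for illustration; split_payload only mimics what the handler does after the path has been percent-decoded, which as far as I know Tornado does for captured path arguments.

```python
from urllib.parse import quote, unquote

SEPARATOR = "=_="  # the separator the handler splits on


def build_getwords_url(host, sentences, port=8000):
    """Join the sentences with the separator and percent-encode them into the URL path."""
    payload = SEPARATOR.join(sentences)
    return "http://{}:{}/getwords/{}".format(host, port, quote(payload, safe=""))


def split_payload(path_argument):
    """Mimic the server side: decode the path argument and split it back into sentences."""
    return unquote(path_argument).split(SEPARATOR)


sentences = ["i go to work by bus", "the apple in the box"]
url = build_getwords_url("127.0.0.1", sentences)
print(url)
# Round trip: the last path segment decodes back to the original sentences
print(split_payload(url.rsplit("/", 1)[1]))
```

Note that a sentence which itself contains =_= would be split in two, so the separator has to be something that never occurs in the data.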
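Opening the URL in the browser then shows the JSON response, with the keys getPhrase and getPhraseWithoutPre.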

And with that, a simple API is up and running. Perfect!
