1. First, what is a proxy?
A proxy is simply the thing in the middle that forwards traffic, so the code logic works exactly the same way: receive on one side, forward on the other.
2. The basic logic of an HTTP proxy in Python:
(1) Accept the request sent by the browser, parse it, reassemble it into the raw form it should have, and then send it out over a socket (a minimal sketch of this step follows right after this list).
(2) And that's it; the demo really is that simple.
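To make step (1) concrete, here is a minimal sketch of the "send it out over a socket" part, assuming the raw request has already been rebuilt by hand and the target is example.com on port 80 (both the host and the request text are placeholders, not part of the demo below):

import socket

# A hand-written raw HTTP request; a real proxy rebuilds this from the browser's request.
raw_request = ('GET / HTTP/1.0\r\n'
               'Host: example.com\r\n'
               'Connection: close\r\n'
               '\r\n')

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('example.com', 80))
s.sendall(raw_request)

response = ''
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    response += chunk
s.close()
print response[:200]    # status line and the first few headers

The full handler below does exactly this, except the raw request is rebuilt from whatever the browser actually sent.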
3. Now, how to accept the request from the browser. Actually, any request will do; it does not have to come from a browser, and a request sent from anywhere else works just as well.
# Accepting requests simply means acting as a server, no argument there, so we use the BaseHTTPServer library.
# -*- coding: utf-8 -*-

import urllib
import socket
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

# Handler class: BaseHTTPRequestHandler dispatches each verb to do_<METHOD>,
# so only GET requests are handled in this demo.
class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # For a proxy-style request, self.path is the full URL, e.g. http://host:port/path
        url = self.path
        print "url:", url
        protocol, rest = urllib.splittype(url)
        print "protocol:", protocol
        host, rest = urllib.splithost(rest)
        print "host:", host
        path = rest
        print "path:", path
        host, port = urllib.splitnport(host)
        print "host:", host
        port = 80 if port < 0 else port
        host_ip = socket.gethostbyname(host)
        print (host_ip, port)
        # Remove the Proxy-Connection header and force Connection: close,
        # so the upstream server will not keep the connection alive.
        del self.headers['Proxy-Connection']
        print self.headers
        self.headers['Connection'] = 'close'
        # Rebuild the raw request, just like what you would see in Burp Suite.
        send_data = 'GET ' + path + ' ' + self.protocol_version + '\r\n'
        head = ''
        for key, val in self.headers.items():
            head = head + "%s: %s\r\n" % (key, val)
        send_data = send_data + head + '\r\n'
        print send_data
        # Forward the rebuilt request to the real server over a plain socket.
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.connect((host_ip, port))
        client.sendall(send_data)
        # Read the whole response until the server closes the connection.
        data = ''
        while True:
            tmp = client.recv(4096)
            if not tmp:
                break
            data = data + tmp
        client.close()
        # Hand the raw response (status line, headers, body) back to the browser.
        self.wfile.write(data)
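As a side note, here is what the three urllib helpers return for a sample URL (the URL itself is just an example); it also shows why the handler treats a negative port as "default to 80":

import urllib

print urllib.splittype('http://www.baidu.com/index.html')
# ('http', '//www.baidu.com/index.html')
print urllib.splithost('//www.baidu.com/index.html')
# ('www.baidu.com', '/index.html')
print urllib.splitnport('www.baidu.com')
# ('www.baidu.com', -1)  -- no port given, so the handler falls back to 80
print urllib.splitnport('www.baidu.com:8080')
# ('www.baidu.com', 8080)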
The logic is easy to follow: BaseHTTPServer receives the request, and a socket forwards it.
Start it with a main function:
def main():
    try:
        server = HTTPServer(('127.0.0.1', 8888), MyHandler)
        print 'Welcome to the machine...'
        server.serve_forever()
    except KeyboardInterrupt:
        print '^C received, shutting down server'
        server.socket.close()

if __name__ == '__main__':
    main()
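Once the server is running, one quick way to exercise it without touching any browser settings is to point a Python client at the proxy; a minimal sketch, assuming the proxy listens on 127.0.0.1:8888 as in main() above and using example.com as a stand-in target:

import urllib2

# Route plain HTTP traffic through the local proxy started by main().
proxy = urllib2.ProxyHandler({'http': 'http://127.0.0.1:8888'})
opener = urllib2.build_opener(proxy)
print opener.open('http://example.com/').read()[:200]

Alternatively, set the browser's HTTP proxy to 127.0.0.1:8888 and browse to any plain HTTP site.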
Here you can see it already works in principle. Because of redirects and blocking on Baidu's side, the proxy did not complete the round trip successfully in this test, but the packet really was forwarded out, so the code logic itself is fine.
Reference:
http://www.lyyyuna.com/2016/01/16/http-proxy-get1/