
[Repost] Python's httplib, urllib, and urllib2: differences and usage

Reposted article, 2017-02-22 09:54, tagged python.

Original link: http://blog.csdn.net/dolphin_h/article/details/45296353

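I am gradually summarizing these modules, since writing a summary is the best way to learn.

Overview

First, let's look at how they differ.

urllib and urllib2

Both urllib and urllib2 are modules for sending URL requests, but urllib2.urlopen() can accept an instance of the Request class, which lets you set the headers of the request, whereas urllib.urlopen() accepts only a URL.

This means that with urllib.urlopen() you cannot set per-request headers, so you cannot easily disguise your User-Agent string, for example.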
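As a minimal sketch of that difference, the snippet below builds a Request carrying a custom User-Agent and inspects it without sending anything. The URL and the User-Agent value are placeholders, and the try/except import is an assumption added here so the same code also runs on Python 3, where urllib2 was merged into urllib.request.

```python
try:
    from urllib2 import Request            # Python 2, as used in this post
except ImportError:
    from urllib.request import Request     # Python 3 equivalent

# Placeholder URL and User-Agent, for illustration only
req = Request("http://www.example.com/",
              headers={"User-Agent": "Mozilla/5.0 (sketch)"})

# Request stores header names capitalized as "User-agent"
user_agent = req.get_header("User-agent")
print(user_agent)
```

Nothing is sent until the request object is passed to urlopen().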

urllib provides urlencode(), which builds GET query strings, while urllib2 does not. This is why urllib and urllib2 are often used together.
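A small sketch of what urlencode() produces, assuming a made-up search term. The fallback import is an addition so the snippet also runs on Python 3, where urlencode lives in urllib.parse.

```python
try:
    from urllib import urlencode           # Python 2
except ImportError:
    from urllib.parse import urlencode     # Python 3

params = {"wd": "python urllib2"}          # hypothetical search term
query = urlencode(params)                  # spaces become '+'
url = "http://www.baidu.com/s?" + query
print(url)
```

The resulting string can be appended to a URL for GET, or passed as the data argument for POST.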

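In Python 2 code, most HTTP requests are made through urllib2.

httplib

httplib implements the client side of the HTTP and HTTPS protocols. It is usually not used directly; Python's higher-level wrapper modules (urllib, urllib2) use its HTTP implementation.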
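To give a feel for how low-level httplib is, here is a minimal sketch that only creates a connection object and looks at it. Creating the object does not open a socket; that happens on the first request. The fallback import is an assumption added for Python 3, where the module is named http.client.

```python
try:
    import httplib                     # Python 2
except ImportError:
    import http.client as httplib      # Python 3

# No network traffic yet: the socket is opened lazily on the first request
conn = httplib.HTTPConnection("bugs.python.org")

print("%s:%d" % (conn.host, conn.port))
print("default ports: %d (HTTP), %d (HTTPS)" % (httplib.HTTP_PORT, httplib.HTTPS_PORT))
```

With httplib you manage the connection, method, path, and headers yourself, which is the work urllib2 does for you.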

Basic urllib usage

urllib.urlopen(url[, data[, proxies]]):

```python
import urllib

google = urllib.urlopen('http://www.google.com')
print 'http header:\n', google.info()
print 'http status:', google.getcode()
print 'url:', google.geturl()
for line in google:  # iterate over the response just like a local file
    print line,
google.close()
```


Basic urllib2 usage

The simplest form

```python
import urllib2

response = urllib2.urlopen('http://www.douban.com')
html = response.read()
```

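The actual steps:

1. urllib2.Request() constructs the request; the returned req is a fully built request object.

2. urllib2.urlopen() sends the request req that was just built and returns a file-like object, response, which contains everything the server sent back.

3. response.read() reads the HTML from response, and response.info() returns additional information such as the response headers.

For example:

```python
#!/usr/bin/env python
import urllib2

req = urllib2.Request("http://www.douban.com")  # step 1: build the request
response = urllib2.urlopen(req)                 # step 2: send it
html = response.read()                          # step 3: read the body
print html
```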
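The same three steps can be tried without Internet access. The sketch below is not from the original post: it starts a throwaway local server (the HelloHandler class and its response body are hypothetical) and runs Request, urlopen, read, and info against it. The fallback imports are added so it also runs on Python 3, and the ProxyHandler({}) opener is an assumption made so that proxy environment variables do not interfere with the local request.

```python
import threading

try:                                     # Python 2 names, as used in this post
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
    from urllib2 import ProxyHandler, Request, build_opener, install_opener, urlopen
except ImportError:                      # Python 3 equivalents
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import ProxyHandler, Request, build_opener, install_opener, urlopen


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real site: answers every GET with a tiny page."""

    def do_GET(self):
        body = b"<html>hello</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
worker = threading.Thread(target=server.serve_forever)
worker.daemon = True
worker.start()

install_opener(build_opener(ProxyHandler({})))       # ignore proxy environment variables

url = "http://127.0.0.1:%d/" % server.server_address[1]
req = Request(url)                                   # step 1: build the request
response = urlopen(req)                              # step 2: send it
html = response.read()                               # step 3: read the body
content_type = response.info().get("Content-Type")   # extra information from the headers
status = response.getcode()
response.close()

server.shutdown()
server.server_close()

print(status)
print(content_type)
print(html)
```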

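Sometimes your program is correct, yet the server refuses your request. Why? The problem lies in the request headers. Some servers are picky and do not like being accessed by programs. In that case you need to disguise your program as a browser when sending the request. How the request presents itself is carried in the headers, in particular the User-Agent field.
A common case:

```python
import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
# Put user_agent into the headers so the request looks like a browser
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'who', 'password': '123456'}
headers = {'User-Agent': user_agent}
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
```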

Here values is the POST data.

The GET method

For example, Baidu:

Baidu runs searches through URLs of the form http://www.baidu.com/s?wd=XXX, so we need to urlencode the dictionary {'wd': 'xxx'}.

```python
# coding: utf-8
import urllib
import urllib2

url = 'http://www.baidu.com/s'
values = {'wd': 'D_in'}
data = urllib.urlencode(values)   # 'wd=D_in'
print data
url2 = url + '?' + data           # append the query string to the URL
response = urllib2.urlopen(url2)
the_page = response.read()
print the_page
```

The POST method

```python
import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
# Put user_agent into the headers
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'who', 'password': '123456'}  # POST data
headers = {'User-Agent': user_agent}
data = urllib.urlencode(values)                 # URL-encode the POST data
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
```
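The GET and POST examples differ only in where the encoded data goes: appended to the URL for GET, passed as the data argument for POST. A minimal sketch that checks this without sending anything, using the placeholder URL from the examples; the fallback imports and the .encode("ascii") call are additions so it also runs on Python 3, which requires bytes for data.

```python
try:
    from urllib import urlencode
    from urllib2 import Request                 # Python 2
except ImportError:
    from urllib.parse import urlencode
    from urllib.request import Request          # Python 3

url = "http://www.someserver.com/cgi-bin/register.cgi"
values = {"name": "who"}
data = urlencode(values).encode("ascii")        # bytes are required on Python 3

get_req = Request(url + "?" + urlencode(values))  # no data argument: GET
post_req = Request(url, data)                     # data argument given: POST

print(get_req.get_method())
print(post_req.get_method())
```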

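Using urllib2 with cookies

```python
# coding: utf-8
import cookielib
import urllib
import urllib2

url = r'http://www.renren.com/ajaxLogin'
# Placeholder credentials; replace with your own
email = 'user@example.com'
password = 'secret'   # note: "pass" is a reserved word and cannot be a variable name
# Create cj, a container for cookies
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Encode the data to be POSTed
data = urllib.urlencode({"email": email, "password": password})
r = opener.open(url, data)
print cj
```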
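To see what the cookie jar actually does, here is a sketch against a throwaway local server instead of a real login page. The CookieHandler class, the cookie name, and its value are all hypothetical; the fallback imports are added so the code also runs on Python 3 (cookielib became http.cookiejar), and ProxyHandler({}) is an assumption made to bypass any proxy settings. The first response sets a cookie, and the opener sends it back on the second request.

```python
import threading

try:                                     # Python 2 names, as used in this post
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
    from cookielib import CookieJar
    from urllib2 import HTTPCookieProcessor, ProxyHandler, build_opener
except ImportError:                      # Python 3 equivalents
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from http.cookiejar import CookieJar
    from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener

seen_cookie_headers = []                 # Cookie header of each incoming request


class CookieHandler(BaseHTTPRequestHandler):
    """Hypothetical server: records the Cookie header and always sets one cookie."""

    def do_GET(self):
        seen_cookie_headers.append(self.headers.get("Cookie"))
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), CookieHandler)  # port 0: the OS picks a free port
worker = threading.Thread(target=server.serve_forever)
worker.daemon = True
worker.start()

url = "http://127.0.0.1:%d/" % server.server_address[1]
cj = CookieJar()
opener = build_opener(ProxyHandler({}), HTTPCookieProcessor(cj))

opener.open(url).close()   # first request: no cookie sent, server sets one
opener.open(url).close()   # second request: the jar sends the cookie back

server.shutdown()
server.server_close()

cookie_names = [cookie.name for cookie in cj]
print(cookie_names)
print(seen_cookie_headers)
```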

Basic httplib usage

A simple example

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import httplib
import urllib


def sendhttp():
    # URL-encode the form fields to POST
    data = urllib.urlencode({'@number': 12524, '@type': 'issue', '@action': 'show'})
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain"}
    conn = httplib.HTTPConnection('bugs.python.org')
    conn.request('POST', '/', data, headers)
    httpres = conn.getresponse()
    print httpres.status
    print httpres.reason
    print httpres.read()
    conn.close()


if __name__ == '__main__':
    sendhttp()
```

