Python Modules (Part 1)


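What Is a Python Module?

1.1 Introduction to Python Modules

  Modules let you organize your Python code logically. Grouping related code into one module makes that code easier to use and easier to understand. A module is also a Python object, with arbitrarily named attributes that you can bind and reference. Put simply, a module is a file that holds Python code. A module can define functions, classes, and variables, and it can also contain executable code.

The Python code for a module named aname normally lives in a file named aname.py. The example below is a simple module, support.py.

def print_func(par):
    print("Hello : ", par)
    return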

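1.2 Importing Modules

  When the interpreter meets an import statement, it imports the module if the module can be found on the current search path. The search path is the list of directories that the interpreter searches for modules. A module is loaded only once, no matter how many times you run import, which keeps its top-level code from being executed over and over.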

# A single module
import xxx
# A module nested inside a package (folder)
from xxx import xxx
from xxx import *
from xxx import xxx as xxx

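Importing a module simply tells the Python interpreter to execute that .py file:

  • Importing a .py file makes the interpreter execute that file
  • Importing a package makes the interpreter execute the __init__.py file in that package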
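The two points above (an import executes the file, and a module is loaded only once) can be checked with a small experiment. This is a minimal sketch, not part of the original article: the module name demo_import_once and the sys.demo_import_log attribute are invented for the demo, and the module file is written to a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen only for this demo.
MODULE_NAME = "demo_import_once"

with tempfile.TemporaryDirectory() as tmp_dir:
    # The module's top-level code appends to a list stored on sys,
    # so every execution of the file leaves a visible trace.
    source = (
        "import sys\n"
        "sys.demo_import_log.append(__name__)\n"
    )
    with open(os.path.join(tmp_dir, MODULE_NAME + ".py"), "w") as f:
        f.write(source)

    sys.demo_import_log = []          # demo-only attribute, not a real sys API
    sys.path.insert(0, tmp_dir)       # make the temporary directory importable
    try:
        importlib.invalidate_caches()
        first = importlib.import_module(MODULE_NAME)   # executes the file
        second = importlib.import_module(MODULE_NAME)  # served from sys.modules
    finally:
        sys.path.remove(tmp_dir)

run_count = len(sys.demo_import_log)  # how many times the top-level code ran
same_object = first is second         # both imports return the same module object
print(run_count, same_object)
```

The second import finds the module in sys.modules and returns the existing object, so the file's top-level code is not run again.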

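So which paths does an import start from? When you import a module, the interpreter searches for it in this order:

  • The current directory (for a script, the directory that contains the script)
  • If the module is not there, every directory listed in the shell variable PYTHONPATH
  • If it is still not found, the installation-dependent default path. On UNIX this is typically /usr/local/lib/python/.

The module search path is stored in the sys.path variable of the sys module. It contains the current directory, PYTHONPATH, and the default directories determined at installation time.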

Module paths:

# Print the search path
>>> import sys
>>> for i in sys.path:
...     print(i)
...

C:\Python35-32\python35.zip
C:\Python35-32\DLLs
C:\Python35-32\lib                         # standard library
C:\Python35-32
C:\Python35-32\lib\site-packages        # third-party and extension packages

>>> import sys
>>> import os
>>> pre_path = os.path.abspath('../')
>>> sys.path.append(pre_path)
>>> for i in sys.path:
...     print(i)
...

C:\Python35-32\python35.zip
C:\Python35-32\DLLs
C:\Python35-32\lib
C:\Python35-32
C:\Python35-32\lib\site-packages
C:\Users

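1.3 The dir() Function

dir() returns a sorted list of strings containing the names defined in a module. The list includes every module, variable, and function defined in that module. A simple example: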

>>> for i in dir(os):
...     print(i)
...
..... (output truncated)
sys
system
urandom
utime
waitpid
walk
write

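1.4 The reload() Function

  When a module is imported into a script, its top-level code runs only once. To run that top-level code again, use the reload() function, which re-imports a module that has already been imported. reload() is a built-in function in Python 2; in Python 3 it lives in importlib, so call importlib.reload() instead. The syntax is:

reload(module_name)

Here module_name must be the module object itself, not a string containing its name. For example, to reload the hello module:

reload(hello)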
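For Python 3, the same idea can be tried end to end with importlib.reload(). The sketch below is illustrative only: the module name demo_reload_target and its VALUE variable are made up, and the module is created in a temporary directory so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "demo_reload_target"  # hypothetical module name for this demo

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = os.path.join(tmp_dir, MODULE_NAME + ".py")
    with open(module_path, "w") as f:
        f.write("VALUE = 1\n")

    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(MODULE_NAME)
        before = module.VALUE

        # Edit the source. The new text has a different length, so a cached
        # .pyc file cannot be mistaken for an up-to-date one.
        with open(module_path, "w") as f:
            f.write("VALUE = 22\n")

        # A second import is a no-op: the module is already in sys.modules.
        importlib.import_module(MODULE_NAME)
        after_second_import = module.VALUE

        # reload() re-executes the top-level code in the existing module object.
        importlib.reload(module)
        after_reload = module.VALUE
    finally:
        sys.path.remove(tmp_dir)

print(before, after_second_import, after_reload)
```

Note that reload() takes the module object, exactly as described above, and updates it in place, so existing references to the module see the new value.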

1.5 Module Categories

Custom modules
Built-in modules
Open-source modules

Open-Source Modules

1.6.1 Installing Modules

1) Using a Python package manager

pip
# Generate the dependency list, then install from it
pip freeze > requirements.txt
pip install -r requirements.txt
pip documentation: https://pip.pypa.io/en/stable/quickstart/
easy_install

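2) Installing from source

# A source install needs the gcc compiler and the Python development headers, so first run:
yum install gcc
yum install python-devel
or
apt-get install python-dev

# Source installation steps
Download the source
Unpack the source
Enter the directory
Build the source      python setup.py build
Install the source    python setup.py install

# After a successful install, the module is placed in one of the directories on sys.path, for example:
/usr/lib/python2.7/site-packages/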

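1.6.2 Importing the Module

  Import it the same way as a custom module.

1.6.3 Worked Example

  paramiko is a module for remote control: it lets you run commands on a remote server and transfer files to and from it. Notably, fabric and ansible both use paramiko internally to implement remote management.

Website: http://www.paramiko.org/

1) Installing the module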

# Install with pip
python -m pip install paramiko   #  Python 2.x
python3 -m pip install paramiko  #  Python 3.x

# Install from source
# paramiko depends on pycrypto internally, so download and install pycrypto first
# Download and install pycrypto
wget http://files.cnblogs.com/files/wupeiqi/pycrypto-2.6.1.tar.gz
tar -xvf pycrypto-2.6.1.tar.gz
cd pycrypto-2.6.1
python setup.py build
python setup.py install

# Start Python and import Crypto to check that the install succeeded
# Download and install paramiko
wget http://files.cnblogs.com/files/wupeiqi/paramiko-1.10.1.tar.gz
tar -xvf paramiko-1.10.1.tar.gz
cd paramiko-1.10.1
python setup.py build
python setup.py install
# Start Python and import paramiko to check that the install succeeded
python3 -c "import paramiko"

2) Using the module

# Connect to a server with a username and password
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('192.168.1.108', 22, 'alex', '123')
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read())
ssh.close()

# SSHClient wrapping a Transport
import paramiko

transport = paramiko.Transport(('hostname', 22))
transport.connect(username='wupeiqi', password='123')
ssh = paramiko.SSHClient()
ssh._transport = transport
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read())
transport.close()

# Connect to a server with a private key
import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# Replace the host name, port, and user name with your own values.
# The key must be passed as pkey; the fourth positional argument is the password.
ssh.connect('hostname', 22, 'username', pkey=key)
stdin, stdout, stderr = ssh.exec_command('df')
print(stdout.read())
ssh.close()

# SSHClient wrapping a Transport
import paramiko

private_key = paramiko.RSAKey.from_private_key_file('/home/auto/.ssh/id_rsa')
transport = paramiko.Transport(('hostname', 22))
transport.connect(username='wupeiqi', pkey=private_key)
ssh = paramiko.SSHClient()
ssh._transport = transport
stdin, stdout, stderr = ssh.exec_command('df')
transport.close()

# Upload or download a file, authenticating with a username and password
import os,sys
import paramiko

t = paramiko.Transport(('192.168.1.1',22))
t.connect(username='root',password='123456')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/test.py','/tmp/test.py')
# sftp.get('/tmp/test.py','/tmp/test2.py')
t.close()

# Upload or download a file, authenticating with a private key
import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)

t = paramiko.Transport(('192.168.1.100',22))
t.connect(username='root',pkey=key)

sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/test3.py','/tmp/test3.py')
# sftp.get('/tmp/test3.py','/tmp/test4.py')
t.close()

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import paramiko
import uuid

class SSHConnection(object):

    def __init__(self, host='172.16.103.191', port=22, username='wupeiqi',pwd='123'):
        self.host = host
        self.port = port
        self.username = username
        self.pwd = pwd
        self.__k = None

    def create_file(self):
        file_name = str(uuid.uuid4())
        with open(file_name,'w') as f:
            f.write('sb')
        return file_name

    def run(self):
        self.connect()
        self.upload('/home/wupeiqi/tttttttttttt.py')
        self.rename('/home/wupeiqi/tttttttttttt.py', '/home/wupeiqi/ooooooooo.py')
        self.close()

    def connect(self):
        transport = paramiko.Transport((self.host,self.port))
        transport.connect(username=self.username,password=self.pwd)
        self.__transport = transport

    def close(self):

        self.__transport.close()

    def upload(self,target_path):
        # Connect and upload
        file_name = self.create_file()

        sftp = paramiko.SFTPClient.from_transport(self.__transport)
        # Upload the generated local file to target_path on the server
        sftp.put(file_name, target_path)

    def rename(self, old_path, new_path):

        ssh = paramiko.SSHClient()
        ssh._transport = self.__transport
        # Run the command
        cmd = "mv %s %s" % (old_path, new_path,)
        stdin, stdout, stderr = ssh.exec_command(cmd)
        # Read the command output
        result = stdout.read()

    def cmd(self, command):
        ssh = paramiko.SSHClient()
        ssh._transport = self.__transport
        # Run the command
        stdin, stdout, stderr = ssh.exec_command(command)
        # Read the command output
        result = stdout.read()
        return result
        


ha = SSHConnection()
ha.run()

Example 1 (code above)

import paramiko
import uuid

class SSHConnection(object):

    def __init__(self, host='192.168.11.61', port=22, username='alex',pwd='alex3714'):
        self.host = host
        self.port = port
        self.username = username
        self.pwd = pwd
        self.__k = None

    def run(self):
        self.connect()
        pass
        self.close()

    def connect(self):
        transport = paramiko.Transport((self.host,self.port))
        transport.connect(username=self.username,password=self.pwd)
        self.__transport = transport

    def close(self):
        self.__transport.close()

    def cmd(self, command):
        ssh = paramiko.SSHClient()
        ssh._transport = self.__transport
        # Run the command
        stdin, stdout, stderr = ssh.exec_command(command)
        # Read the command output
        result = stdout.read()
        return result

    def upload(self,local_path, target_path):
        # Connect and upload
        sftp = paramiko.SFTPClient.from_transport(self.__transport)
        # Upload local_path to target_path on the server
        sftp.put(local_path, target_path)

ssh = SSHConnection()
ssh.connect()
r1 = ssh.cmd('df')
ssh.upload('s2.py', "/home/alex/s7.py")
ssh.close()

Example 2 (code above)

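Built-in Modules

  Python ships with a library of standard modules, documented separately in the Python Library Reference ("Library Reference" hereafter). Some modules are built into the interpreter. They provide operations that are not part of the core language but are built in anyway, either for efficiency or to give access to operating-system primitives such as system calls. The set of such modules is a configuration option that also depends on the underlying platform. For example, the winreg module is only provided on Windows. One module deserves special attention: sys, which is built into every Python interpreter. The variables sys.ps1 and sys.ps2 define the strings used as the primary and secondary prompts:

>>> import sys
>>> sys.ps1                      # these two variables are defined only in interactive mode
'>>> '
>>> sys.ps2
'... '
>>> sys.ps1 = 'mads>>>'          # assign a new value to ps1
mads>>>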

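Special Variables in a Module

1) __doc__ holds the module's docstring. Only a triple-quoted string at the top of the file ("""like this""") is captured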

#!/usr/bin/env python
# encoding: utf-8

"""
@version: v1.0
@author: XXX
@license: Apache Licence
@contact: xxx@qq.com
@site: http://madsstudy.blog.51cto.com/
@software: PyCharm
@file: mod_index.py
@time: 2016/6/11 16:22
"""

print(__doc__)

Output:
@version: v1.0
@author: XXX
@license: Apache Licence
@contact: xxx@qq.com
@site: http://madsstudy.blog.51cto.com/
@software: PyCharm
@file: mod_index.py
@time: 2016/6/11 16:22

2) __file__ holds the path of the current file

print(__file__)
E:/開源中國/192.168.21.140/trunk/Python/scripts/S13/s13day06/模塊/mod_index.py

# Typical use: add the directory of the current file to the module search path
import sys
import os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

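3) __name__ equals "__main__" only in the file that is being run directly; in an imported module it is the module's name

4) __cached__ holds the path of the module's compiled bytecode (.pyc) file

5) __package__ holds the name of the package the module belongs to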

from bin import admin
print(__package__)
print(admin.__package__)

# Output:
None
bin
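The __name__ variable is most often used as a guard, so that a file can act both as an importable module and as a script. The sketch below is an illustration rather than code from the original article; the function names main and describe are made up.

```python
def main():
    """Work that should only happen when the file is run as a script."""
    return "running as a script"


def describe(module_name):
    """Report how a module with this __name__ value was loaded."""
    return "script" if module_name == "__main__" else "imported module"


# When this file is run directly, __name__ is "__main__" and main() is called.
# When it is imported, __name__ is the module's own name and nothing runs here.
if __name__ == "__main__":
    print(main())
```

Code placed under the guard is skipped on import, which keeps test runs and helper scripts from triggering the module's script behaviour by accident.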

 

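17.5 The json and pickle Modules

Two modules used for serialization
1) json
Purpose: converts between strings and Python data types
Characteristics: well suited to exchanging data across languages; handles Python's basic data types
Methods:
loads: converts a JSON string into Python basic data types
dumps: converts Python basic data types into a JSON string
load : reads a file and deserializes its contents
dump : serializes and writes to a file (opened in text mode)

2) pickle
Purpose: converts between Python objects and a Python-specific byte stream
Characteristics: can serialize almost any Python type, but works only with Python and may run into version compatibility problems. Typical use case: saving custom player data in a game
Methods:
dumps: serializes an object to bytes
loads: rebuilds an object from bytes
 dump: serializes an object to a file, which must be opened in binary mode ('wb')
 load: reads a file opened in binary mode ('rb') and rebuilds the object

3) Examples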

import json

dic = {"name": "alex", "age": 29}  # dict
print(dic, type(dic))
st = json.dumps(dic)
print(st, type(st))


li = ["alex", 29]  # list
print(li, type(li))
st = json.dumps(li)
print(st, type(st))


st = '{"k1": "v1" ,"k2": 23}'   # JSON requires double quotes inside the text, so quote the literal with single quotes; otherwise loads raises an error
print(st, type(st))
dic = json.loads(st)
print(dic, type(dic))


li = ["alex", 29]  # list
print(li, type(li))
with open('user_info.txt', 'w') as f:
    json.dump(li, f)  # dump writes to the file and returns None

with open('user_info.txt', 'r') as f:
    st = json.load(f)
print(st, type(st))

import pickle

li = [11, 22, 33, ]
r = pickle.dumps(li)
print(r)

res = pickle.loads(r)
print(res)

li = [11, 22, 33, ]
with open('user_info.pickle', 'wb') as f:
    pickle.dump(li, f)

with open('user_info.pickle', 'rb') as f:
    res = pickle.load(f)
print(res)
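The difference between the two modules shows up as soon as the data contains something outside JSON's basic types. The following sketch uses an invented record with a date and a tuple; it is only meant to illustrate the trade-off described above.

```python
import json
import pickle
from datetime import date

# Illustrative record: a date and a tuple are not JSON-native types.
record = {"name": "alex", "joined": date(2016, 6, 11), "scores": (90, 85)}

# json refuses objects it does not know how to represent.
try:
    json.dumps(record)
    json_error = None
except TypeError as exc:
    json_error = str(exc)

# With default=str the date is stored as text, and the tuple comes back as a list.
as_json = json.loads(json.dumps(record, default=str))

# pickle round-trips the original Python objects unchanged.
restored = pickle.loads(pickle.dumps(record))

print(json_error)
print(as_json)
print(restored == record)
```

json trades fidelity for portability, while pickle keeps the exact Python types but can only be read back by Python. Unpickling data from an untrusted source is unsafe, because loading a pickle can execute arbitrary code.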

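17.6 The logging Module

  Many programs need to keep logs, and those logs contain both normal access records and output such as errors and warnings. Python's logging module provides a standard logging interface through which you can store logs in various formats. Log messages fall into five levels: debug(), info(), warning(), error(), and critical().
Printing log messages to the screen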

>>> import logging
>>>
>>> logging.debug('This is debug message')
>>> logging.info('This is info message')
>>> logging.warning('This is warning message')
WARNING:root:This is warning message
>>>

By default, logging prints messages to the screen at level WARNING and above.
The levels are ordered CRITICAL > ERROR > WARNING > INFO > DEBUG > NOTSET, and you can also define your own levels.
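The level ordering, and the option of defining a custom level, can be seen in a short experiment. This is a sketch under assumptions: the logger name level_demo and the NOTICE level (value 25, between INFO and WARNING) are invented, and output goes to an in-memory buffer so it can be inspected.

```python
import io
import logging

# Capture output in memory instead of printing to the screen.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("level_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False                  # keep the demo away from the root logger
logger.addHandler(handler)

logger.debug("dropped: below WARNING")
logger.info("dropped: below WARNING")
logger.warning("kept")
logger.error("kept too")

# A custom level: 25 sits between INFO (20) and WARNING (30).
NOTICE = 25
logging.addLevelName(NOTICE, "NOTICE")
logger.setLevel(NOTICE)
logger.info("still dropped: INFO is below NOTICE")
logger.log(NOTICE, "custom level")

lines = buffer.getvalue().splitlines()
print(lines)
```

Only records at or above the logger's level reach the handler, which is why the DEBUG and INFO calls leave no trace.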

Writing logs to a file

import logging

logging.basicConfig(level=logging.DEBUG,                       # set the log level to DEBUG
                format='%(asctime)s %(filename)s [line:%(lineno)d] %(levelname)s %(message)s',
                datefmt='%a, %d %b %Y %H:%M:%S',
                filename='my_app.log',
                filemode='w')

logging.debug('This is debug message')
logging.info('This is info message')
logging.warning('This is warning message')

# Log file contents
E:\開源中國\192.168.21.140\trunk\Python\scripts\S13\s13day05>cat my_app.log
Mon, 06 Jun 2016 22:34:31 logging_v1.py [line:24] DEBUG This is debug message
Mon, 06 Jun 2016 22:34:31 logging_v1.py [line:25] INFO This is info message
Mon, 06 Jun 2016 22:34:31 logging_v1.py [line:26] WARNING This is warning message

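 Parameters:

filename: name of the log file
filemode: how the log file is opened, with the same meaning as the mode argument of open(): 'w' or 'a' (the default is 'a')
format: format and content of each record; it can include a lot of useful information, as the example above shows:
%(levelno)s: numeric value of the log level
%(levelname)s: name of the log level
%(pathname)s: full path of the source file that issued the logging call
%(filename)s: file name part of that path
%(funcName)s: name of the function that issued the logging call
%(lineno)d: line number of the logging call
%(asctime)s: time of the log record
%(thread)d: thread ID
%(threadName)s: thread name
%(process)d: process ID
%(message)s: the log message
datefmt: time format, the same as for time.strftime()
level: log level, logging.WARNING by default
stream: output stream for the log, such as sys.stderr, sys.stdout, or a file object; the default is sys.stderr. Older versions ignore stream when both stream and filename are given; since Python 3.3 passing both raises ValueError

Logging to both the screen and a file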

import logging

logging.basicConfig(level=logging.DEBUG,
                format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                datefmt='%a, %d %b %Y %H:%M:%S',
                filename='my_app.log',
                filemode='w')


# Define a StreamHandler that prints messages of level INFO or higher to standard error, and add it to the root logger
console = logging.StreamHandler()
console.setLevel(logging.INFO)
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
console.setFormatter(formatter)
logging.getLogger('').addHandler(console)


logging.debug('This is debug message')
logging.info('This is info message')
logging.warning('This is warning message')

# Printed on the screen:
# root        : INFO     This is info message
# root        : WARNING  This is warning message
#
# Contents of ./my_app.log:
# Sun, 24 May 2009 21:48:54 demo2.py[line:11] DEBUG This is debug message
# Sun, 24 May 2009 21:48:54 demo2.py[line:12] INFO This is info message
# Sun, 24 May 2009 21:48:54 demo2.py[line:13] WARNING This is warning message

Log rotation

import logging
from logging.handlers import RotatingFileHandler

# Define a RotatingFileHandler that keeps at most 5 backup log files, each at most 10 MB
Rthandler = RotatingFileHandler('my_app.log', maxBytes=10*1024*1024, backupCount=5)
Rthandler.setLevel(logging.INFO)
formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
Rthandler.setFormatter(formatter)
logging.getLogger('').addHandler(Rthandler)
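To watch rotation happen without writing 10 MB of logs, the limits can be shrunk. The sketch below is illustrative: the file name rotate_demo.log, the logger name, and the tiny maxBytes=200 / backupCount=2 settings are chosen only to force several rollovers quickly in a temporary directory.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "rotate_demo.log")

    # Tiny limits so that rotation is triggered after a handful of records.
    handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=2)
    handler.setFormatter(logging.Formatter("%(message)s"))

    logger = logging.getLogger("rotate_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)

    for i in range(50):
        logger.info("message number %03d", i)

    # Detach and close the handler before the directory is removed.
    logger.removeHandler(handler)
    handler.close()

    log_files = sorted(os.listdir(tmp_dir))
    largest = max(os.path.getsize(os.path.join(tmp_dir, name)) for name in log_files)

print(log_files, largest)
```

The current file keeps its name, older data moves to .1 and then .2, and anything older than backupCount files is discarded.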

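As this example and the previous one show, logging has one main logger object, and every other way of handling records is attached to it with addHandler.
The handler classes that logging provides are:
logging.StreamHandler: writes logs to a stream, which can be sys.stderr, sys.stdout, or a file
logging.FileHandler: writes logs to a file

Log rotation handlers; in practice use RotatingFileHandler or TimedRotatingFileHandler
logging.handlers.BaseRotatingHandler
logging.handlers.RotatingFileHandler
logging.handlers.TimedRotatingFileHandler

logging.handlers.SocketHandler: sends logs to a remote TCP/IP socket
logging.handlers.DatagramHandler: sends logs to a remote UDP socket
logging.handlers.SMTPHandler: sends logs to an email address
logging.handlers.SysLogHandler: writes logs to syslog
logging.handlers.NTEventLogHandler: writes logs to the Windows NT/2000/XP event log
logging.handlers.MemoryHandler: writes logs to a specified in-memory buffer
logging.handlers.HTTPHandler: sends logs to an HTTP server using "GET" or "POST"
  StreamHandler and FileHandler are the most commonly used handlers, so they are included directly in the logging module, while the others live in the logging.handlers module.
For how to use the other handlers, see the Python library reference.

Configuring logging with the logging.config module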

#logger.conf

###############################################

[loggers]
keys=root,example01,example02

[logger_root]
level=DEBUG
handlers=hand01,hand02

[logger_example01]
handlers=hand01,hand02
qualname=example01
propagate=0

[logger_example02]
handlers=hand01,hand03
qualname=example02
propagate=0

###############################################

[handlers]
keys=hand01,hand02,hand03

[handler_hand01]
class=StreamHandler
level=INFO
formatter=form02
args=(sys.stderr,)

[handler_hand02]
class=FileHandler
level=DEBUG
formatter=form01
args=('myapp.log', 'a')

[handler_hand03]
class=handlers.RotatingFileHandler
level=INFO
formatter=form02
args=('myapp.log', 'a', 10*1024*1024, 5)

###############################################

[formatters]
keys=form01,form02

[formatter_form01]
format=%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s
datefmt=%a, %d %b %Y %H:%M:%S

[formatter_form02]
format=%(name)-12s: %(levelname)-8s %(message)s
datefmt=

Example 3:

import logging
import logging.config

logging.config.fileConfig("logger.conf")
logger = logging.getLogger("example01")

logger.debug('This is debug message')
logger.info('This is info message')
logger.warning('This is warning message')

Example 4:

import logging
import logging.config

logging.config.fileConfig("logger.conf")
logger = logging.getLogger("example02")

logger.debug('This is debug message')
logger.info('This is info message')
logger.warning('This is warning message')
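The same kind of configuration can also be expressed as a dictionary and loaded with logging.config.dictConfig(), which avoids a separate file. The sketch below is an assumption-laden reduction of logger.conf: it keeps only the example01 logger with the hand01 stream handler and the form02 formatter, and leaves out the file handlers.

```python
import logging
import logging.config

# A reduced, dictionary-based counterpart of logger.conf (illustrative only).
CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "form02": {"format": "%(name)-12s: %(levelname)-8s %(message)s"},
    },
    "handlers": {
        "hand01": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "form02",
            "stream": "ext://sys.stderr",
        },
    },
    "loggers": {
        "example01": {
            "level": "DEBUG",
            "handlers": ["hand01"],
            "propagate": False,  # same effect as propagate=0 in the .conf file
        },
    },
}

logging.config.dictConfig(CONFIG)
logger = logging.getLogger("example01")

logger.debug("This is debug message")      # passes the logger, dropped by the INFO handler
logger.info("This is info message")        # printed to standard error
logger.warning("This is warning message")  # printed to standard error
```

As in the file-based version, the logger level decides which records are created and each handler's level decides which of them it emits.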

logging is thread-safe.

17.7 hashlib

  Used for hashing operations. It replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.

# Python 2.x only; the md5 module was removed in Python 3.x
import md5

hash = md5.new()
hash.update('admin')
print hash.hexdigest()

# Python 2.x only; the sha module was removed in Python 3.x
import sha

hash = sha.new()
hash.update('admin')
print hash.hexdigest()

import hashlib

# ######## md5 ########

obj = hashlib.md5()
obj.update(bytes('123', encoding='utf-8'))
print("md5:", obj.hexdigest())

# ######## sha1 ########

obj = hashlib.sha1()
obj.update(bytes('123', encoding='utf-8'))
print("sha1:", obj.hexdigest())

# ######## sha256 ########

obj = hashlib.sha256()
obj.update(bytes('123', encoding='utf-8'))
print("sha256:", obj.hexdigest())

# ######## sha384 ########

obj = hashlib.sha384()
obj.update(bytes('123', encoding='utf-8'))
print("sha384:", obj.hexdigest())

# ######## sha512 ########

obj = hashlib.sha512()
obj.update(bytes('123', encoding='utf-8'))
print("sha512:", obj.hexdigest())

# Output:
# md5: 202cb962ac59075b964b07152d234b70
# sha1: 40bd001563085fc35165329ea1ff5c5ecbdbbeef
# sha256: a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3
# sha384: 9a0a82f0c0cf31470d7affede3406cc9aa8410671520b727044eda15b4c25532a9b5cd8aaf9cec4919d76255b6bfb00f
# sha512: 3c9909afec25354d551dae21590bb26e38d53f2173b8d3dc3eee4c047e7ab1c1eb8b85103e3be7ba613b31bb5c9c36214dc9f14a42fd7a2fdb84856bca5c44c2

import hashlib

print(hashlib.algorithms_available)  # algorithms available in this interpreter
# Output: {'SHA', 'SHA1', 'SHA512', 'md4', 'DSA', 'SHA224', 'sha1', 'sha256', 'md5', 'sha224', 'ecdsa-with-SHA1', 'sha', 'RIPEMD160', 'dsaEncryption', 'ripemd160', 'MD4', 'DSA-SHA', 'whirlpool', 'MD5', 'SHA256', 'dsaWithSHA', 'sha384', 'SHA384', 'sha512'}
print(hashlib.algorithms_guaranteed)  # algorithms guaranteed to be available on every platform
# Output: {'sha384', 'sha1', 'sha256', 'md5', 'sha224', 'sha512'}

     Although these algorithms are strong, a plain hash of a common password can still be reversed with a precomputed lookup table. It is therefore worth mixing a custom key (a salt) into the hash.

import hashlib

obj = hashlib.md5(bytes('gyyx', encoding='utf-8'))
obj.update(bytes('123', encoding='utf-8'))
result = obj.hexdigest()
print("md5:", result)

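# Output:
# md5: 83a2156b15c1376ae1e93980d7d1af56
# Python also has an hmac module, which processes the key and the message internally before hashing
import hmac

obj = hmac.new(b'mykey', digestmod='md5')  # digestmod is required since Python 3.8; MD5 was the old default
obj.update(b'mymessage')
print(obj.hexdigest())

# Output:
# hmac:d811630c4e62c6ef90d1bfe540212aaf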
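A common use of hmac is to sign a message and later check the signature. The sketch below is not from the original article: it swaps MD5 for SHA-256, and the SECRET_KEY value and the sign/verify helper names are placeholders.

```python
import hashlib
import hmac

SECRET_KEY = b"mykey"  # placeholder key for the demo; keep real keys out of source code


def sign(message: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(message), signature)


signature = sign(b"mymessage")
print(signature)
print(verify(b"mymessage", signature))
print(verify(b"tampered message", signature))
```

hmac.compare_digest() is used instead of == so that the comparison time does not reveal how many leading characters matched.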

Example application:

import hashlib

def md5(arg):
    """
    Hash a string with salted MD5
    :param arg: the string entered by the user
    :return: the hex digest of the salted string
    """
    obj = hashlib.md5(bytes('oldboy', encoding='utf-8'))
    obj.update(bytes(arg, encoding='utf-8'))
    return obj.hexdigest()

user_name = 'admin'
pass_word = '1e5413879f3e972653fdcf9101698b29'  # plaintext: 123456
for i in range(3):
    username = input("Username: ")
    password = input("Password: ")
    if username == user_name and md5(password) == pass_word:
        print("Login successful")
        break
    else:
        print("Wrong username or password")
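For storing real passwords, a single salted MD5 is generally considered too fast to resist brute force. A commonly recommended alternative in the standard library is hashlib.pbkdf2_hmac() with a random per-user salt. The sketch below swaps in that technique; the helper names and the iteration count of 100,000 are assumptions for illustration, not values from the original article.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Store salt and digest together; the plaintext is never saved.
salt, stored_digest = hash_password("123456")
print(check_password("123456", salt, stored_digest))
print(check_password("wrong-password", salt, stored_digest))
```

Because every user gets a different salt, identical passwords produce different digests, and a precomputed lookup table is of no use.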


17.8 random

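17.9 Running System Commands

Modules and functions that can run shell commands:

  • os.system
  • os.spawn*
  • os.popen*  -- deprecated
  • popen2.*    -- deprecated
  • commands.*  -- deprecated; removed in 3.x

import commands  # Python 2 only

result = commands.getoutput('cmd')
result = commands.getstatus('cmd')
result = commands.getstatusoutput('cmd')

  Python 2.4 introduced the subprocess module to manage child processes and to replace the functions of several older modules. Everything the modules and functions above do is available in subprocess, which also offers richer functionality.

subprocess.call: runs a command and returns its exit status (with shell=True the command may be given as a single string)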

import subprocess

res = subprocess.call('ls -l', shell=True)
print(res)

res = subprocess.call(["ls", "-l"], shell=False)
print(res)

subprocess.check_call: runs a command; returns 0 if the exit status is 0, otherwise raises an exception

import subprocess

res = subprocess.check_call(["ls", "-l"])
print(res)

res = subprocess.check_call("lss", shell=True)  # "lss" is not a valid command, so this raises CalledProcessError
print(res)

subprocess.check_output: runs a command; returns its output if the exit status is 0, otherwise raises an exception

import subprocess

res = subprocess.check_output(["echo", "Hello World!"])
print(res)

res = subprocess.check_output("exit 1", shell=True)  # raises CalledProcessError
print(res)
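Since Python 3.5 the three helpers above are covered by a single function, subprocess.run(). The sketch below is illustrative and uses sys.executable as the child command so that it does not depend on ls or echo being available on the system.

```python
import subprocess
import sys

# Equivalent of check_output: capture what the child prints.
ok = subprocess.run(
    [sys.executable, "-c", "print('Hello World!')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # return str instead of bytes
)
print(ok.returncode, ok.stdout.strip())

# Equivalent of check_call: check=True raises on a non-zero exit status.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"], check=True)
    failed_code = None
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
print(failed_code)
```

Passing the command as a list with the default shell=False avoids shell quoting problems, which matters whenever part of the command comes from user input.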

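subprocess.Popen(): for running more complex system commands

Parameters:

  • args: the command, as a string or a sequence (such as a list or tuple)
  • bufsize: buffering. 0 means unbuffered, 1 means line buffered, any other positive value is the buffer size, and a negative value means the system default
  • stdin, stdout, stderr: the program's standard input, output, and error handles
  • preexec_fn: Unix only; a callable object that is called in the child process just before the command is executed
  • close_fds: on Windows, if close_fds is True the child process does not inherit the parent's input, output, and error pipes, so you cannot set close_fds to True and redirect the child's stdin, stdout, or stderr at the same time (this restriction applies before Python 3.7)
  • shell: as above
  • cwd: the working directory of the child process
  • env: the environment variables of the child process. If env is None, the child inherits the parent's environment.
  • universal_newlines: line endings differ between systems; True means \n is used uniformly
  • startupinfo and creationflags: Windows only; passed to the underlying CreateProcess() function to set properties of the child process, such as the appearance of the main window and the process priority

import subprocess
ret1 = subprocess.Popen(["mkdir","t1"])
ret2 = subprocess.Popen("mkdir t2", shell=True)

Commands typed at a terminal fall into two kinds:

  • Commands that produce their output immediately, such as ifconfig
  • Commands that enter an environment in which you then type further commands, such as python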

# Case 1: the command produces its output immediately
import subprocess
obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

# Case 2: enter an environment, send it commands, and read the results
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')
obj.stdin.close()

cmd_out = obj.stdout.read()
obj.stdout.close()
cmd_error = obj.stderr.read()
obj.stderr.close()

print(cmd_out)
print(cmd_error)

# Case 2 again: compared with the previous version, communicate() collects standard output and standard error in a single call
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')

out_error_list = obj.communicate()
print(out_error_list)
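communicate() can also deliver the input itself, which avoids writing to stdin by hand and the risk of a deadlock when the pipes fill up. The sketch below is a variation on the example above, assuming Python 3; it runs the current interpreter through sys.executable and the 30-second timeout is an arbitrary choice.

```python
import subprocess
import sys

# The program to feed to the child interpreter through standard input.
script = "print(1)\nprint(2)\nprint(3)\nprint(4)\n"

proc = subprocess.Popen(
    [sys.executable, "-"],        # "-" tells Python to read the program from stdin
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,      # exchange str instead of bytes
)

# Send the input, close stdin, and collect both output streams in one call.
out, err = proc.communicate(input=script, timeout=30)
numbers = out.split()
print(numbers, proc.returncode)
```

communicate() returns a (stdout, stderr) tuple and waits for the child to exit, so returncode is set by the time it returns.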

17.10 shutil

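High-level operations on files, folders, and archives

shutil.copyfileobj(fsrc, fdst[, length]): copies the contents of one file object to another; length is the size of the chunks read at a time

# Source implementation (the listings in this section use Python 2 syntax)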
def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

shutil.copyfile(src, dst): copies a file

# Source implementation
def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

shutil.copymode(src, dst): copies only the permission bits; the content, group, and owner are unchanged

# Source implementation
def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)

shutil.copystat(src, dst): copies the status information, including mode bits, atime, mtime, and flags

# Source implementation
def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError, why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise

shutil.copy(src, dst): copies a file together with its permission bits

# Source implementation
def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)

shutil.copy2(src, dst): copies a file together with its status information

# Source implementation
def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None): recursively copies a directory tree

For example: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

# Source implementation
def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):
    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.
    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the
    source tree result in symbolic links in the destination tree; if
    it is false, the contents of the files pointed to by symbolic
    links are copied.

    The optional ignore argument is a callable. If given, it
    is called with the `src` parameter, which is the directory
    being visited by copytree(), and `names` which is the list of
    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be
    called once for each directory that is copied. It returns a
    list of names relative to the `src` directory that should
    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error, err:
            errors.extend(err.args[0])
        except EnvironmentError, why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError, why:
        if WindowsError is not None and isinstance(why, WindowsError):
            # Copying file access times may fail on Windows
            pass
        else:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error, errors
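The copytree() call with ignore_patterns() shown earlier can be tried safely in a temporary directory. In this sketch the file and folder names (main.py, pkg/util.py, and so on) are invented test data.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "source")
    destination = os.path.join(tmp_dir, "destination")  # must not exist yet

    # Build a small tree with files that should and should not be copied.
    os.makedirs(os.path.join(source, "pkg"))
    for relative in ("main.py", "main.pyc", "tmp_notes.txt",
                     os.path.join("pkg", "util.py"), os.path.join("pkg", "util.pyc")):
        with open(os.path.join(source, relative), "w") as f:
            f.write("demo\n")

    # Skip compiled files and anything whose name starts with "tmp".
    shutil.copytree(source, destination,
                    ignore=shutil.ignore_patterns("*.pyc", "tmp*"))

    # Collect the copied files as paths relative to the destination.
    copied = sorted(
        os.path.relpath(os.path.join(root, name), destination).replace(os.sep, "/")
        for root, _dirs, files in os.walk(destination)
        for name in files
    )

print(copied)
```

The ignore callable is applied in every directory that copytree() visits, which is why pkg/util.pyc is skipped as well as main.pyc.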

 

shutil.rmtree(path[, ignore_errors[, onerror]]): recursively deletes a directory tree

# Source implementation
def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info().  If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error, err:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error, err:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())

shutil.move(src, dst): recursively moves a file or directory

# Source implementation
def move(src, dst):
    """Recursively move a file or directory to another location. This is
    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source
    is moved inside the directory. The destination path must not already
    exist.

    If the destination already exists but is not a directory, it may be
    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.
    Otherwise, src is copied to the destination and then removed.
    A lot more could be done here...  A look at a mv.c shows a lot of
    the issues this implementation glosses over.

    """
    real_dst = dst
    if os.path.isdir(dst):
        if _samefile(src, dst):
            # We might be on a case insensitive filesystem,
            # perform the rename anyway.
            os.rename(src, dst)
            return

        real_dst = os.path.join(dst, _basename(src))
        if os.path.exists(real_dst):
            raise Error, "Destination path '%s' already exists" % real_dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.isdir(src):
            if _destinsrc(src, dst):
                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
            copytree(src, real_dst, symlinks=True)
            rmtree(src)
        else:
            copy2(src, real_dst)
            os.unlink(src)

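shutil.make_archive(base_name, format, ...): creates an archive (for example zip or tar) and returns its path

base_name: the file name of the archive, optionally with a path. With a bare file name the archive is saved in the current directory; otherwise it is saved at the given path
e.g. www          => saved in the current directory
e.g. /Users/www   => saved in /Users/
format: the archive type: "zip", "tar", "bztar", or "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: owner, defaults to the current user
group: group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/Downloads/test into the program's current directory
import shutil
ret = shutil.make_archive("www", 'gztar', root_dir='/Users/Downloads/test')

# Archive the files under /Users/Downloads/test into /Users/
import shutil
ret = shutil.make_archive("/Users/www", 'gztar', root_dir='/Users/Downloads/test')

# Source implementation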
def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
                 dry_run=0, owner=None, group=None, logger=None):
    """Create an archive file (eg. zip or tar).

    'base_name' is the name of the file to create, minus any format-specific
    extension; 'format' is the archive format: one of "zip", "tar", "bztar"
    or "gztar".

    'root_dir' is a directory that will be the root directory of the
    archive; ie. we typically chdir into 'root_dir' before creating the
    archive.  'base_dir' is the directory where we start archiving from;
    ie. 'base_dir' will be the common prefix of all files and
    directories in the archive.  'root_dir' and 'base_dir' both default
    to the current directory.  Returns the name of the archive file.

    'owner' and 'group' are used when creating a tar archive. By default,
    uses the current owner and group.
    """
    save_cwd = os.getcwd()
    if root_dir is not None:
        if logger is not None:
            logger.debug("changing into '%s'", root_dir)
        base_name = os.path.abspath(base_name)
        if not dry_run:
            os.chdir(root_dir)

    if base_dir is None:
        base_dir = os.curdir

    kwargs = {'dry_run': dry_run, 'logger': logger}

    try:
        format_info = _ARCHIVE_FORMATS[format]
    except KeyError:
        raise ValueError, "unknown archive format '%s'" % format

    func = format_info[0]
    for arg, val in format_info[1]:
        kwargs[arg] = val

    if format != 'zip':
        kwargs['owner'] = owner
        kwargs['group'] = group

    try:
        filename = func(base_name, base_dir, **kwargs)
    finally:
        if root_dir is not None:
            if logger is not None:
                logger.debug("changing back to '%s'", save_cwd)
            os.chdir(save_cwd)

    return filename
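
The behaviour described above (extension added automatically, `root_dir` becoming the archive root) can be checked with a small round trip. This is a minimal sketch, assuming a throwaway temporary directory; the names `work`, `src` and `out` are made up for illustration and are not part of the original example.

```python
import os
import shutil
import tempfile

# Hypothetical scratch layout; nothing here touches real user directories.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "test")
    os.makedirs(src)
    with open(os.path.join(src, "a.txt"), "w", encoding="utf-8") as f:
        f.write("hello")

    # base_name has no extension; make_archive appends ".tar.gz" for 'gztar'.
    archive = shutil.make_archive(os.path.join(work, "www"), "gztar", root_dir=src)
    archive_name = os.path.basename(archive)

    # Unpack into a fresh directory to confirm the contents survived.
    out = os.path.join(work, "out")
    shutil.unpack_archive(archive, out)
    extracted = sorted(os.listdir(out))

print(archive_name)
print(extracted)
```

Because `root_dir` is the archive root, the unpacked directory contains `a.txt` directly rather than a nested `test/` folder.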

shutil handles archives by calling the zipfile and tarfile modules.

import zipfile

# Compress files
z = zipfile.ZipFile('test.zip', 'w')
# z.write('a.txt', 'b.txt')  # this would store a.txt in the archive under the name b.txt
z.write('a.txt')
z.write('b.txt')
z.close()

# Extract files
z = zipfile.ZipFile('test.zip', 'r')
# z.extractall()    # extract everything
z.extract('a.txt')  # extract a single file
z.close()

import tarfile

# Pack files into a tar archive
tar = tarfile.open('test.tar', 'w')
tar.add('a.txt')
tar.add('bbs2.zip', arcname='bbs2.zip')  # arcname sets the name stored in the archive
tar.add('cmdb.zip', arcname='cmdb.zip')
tar.close()

# Extract a single file
tar = tarfile.open('test.tar', 'r')
a = tar.getmember('a.txt')  # getmember() takes a name; getmembers() takes no arguments
tar.extract(a)
tar.extractall()  # extract everything; a target directory can be given with path=
tar.close()

Source implementation: see tarfile.py and zipfile.py
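
Both modules also work as context managers, which closes the archive even if an error occurs. The following is a minimal sketch under the assumption of a temporary working directory; the file names are placeholders chosen for illustration.

```python
import os
import tarfile
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "a.txt")
    with open(src, "w", encoding="utf-8") as f:
        f.write("hello")

    # zipfile: the second argument of write() is the name used inside the archive.
    zip_path = os.path.join(work, "test.zip")
    with zipfile.ZipFile(zip_path, "w") as z:
        z.write(src, arcname="a.txt")
        z.write(src, arcname="renamed.txt")
    with zipfile.ZipFile(zip_path) as z:
        zip_names = z.namelist()

    # tarfile: tarfile.open() is the documented entry point.
    tar_path = os.path.join(work, "test.tar")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src, arcname="a.txt")
    with tarfile.open(tar_path) as tar:
        tar_names = tar.getnames()
        member = tar.getmember("a.txt")
        content = tar.extractfile(member).read().decode("utf-8")

print(zip_names, tar_names, content)
```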

17.11 xml

XML is a format for exchanging data between different languages or programs. An XML file looks like this:

<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2023</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2026</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2026</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>

1. Parsing XML

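# Use ElementTree.XML to parse a string into an XML object
from xml.etree import ElementTree as ET

# Open the file and read the XML content
with open('xo.xml', 'r', encoding='utf-8') as f:
    str_xml = f.read()
# Parse the string into an Element; root refers to the root node of the XML file
root = ET.XML(str_xml)


# Use ElementTree.parse to parse a file directly into an XML object
from xml.etree import ElementTree as ET

# Parse the XML file directly
tree = ET.parse("xo.xml")
# Get the root node of the XML file
root = tree.getroot()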
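
The two parsing routes can be compared without a file on disk. A minimal sketch, assuming a shortened inline copy of the sample document (the `xml_text` variable is introduced here for illustration only):

```python
import io
from xml.etree import ElementTree as ET

xml_text = """<data>
    <country name="Liechtenstein"><rank updated="yes">2</rank></country>
    <country name="Panama"><rank updated="yes">69</rank></country>
</data>"""

# Route 1: parse a string; the result is the root Element itself.
root_from_string = ET.fromstring(xml_text)  # ET.XML(xml_text) behaves the same way

# Route 2: parse a file-like object; the result is an ElementTree wrapper.
tree = ET.parse(io.StringIO(xml_text))
root_from_tree = tree.getroot()

names = [country.get("name") for country in root_from_string]
print(root_from_string.tag, root_from_tree.tag, names)
```

The practical difference is that only the ElementTree object has a `write()` method, so a root obtained from a string has to be wrapped in `ET.ElementTree(root)` before saving.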

2. Working with XML

XML is a structure of nodes nested inside nodes. Every node offers the following features for working with it:

class Element:
    """An XML element.

    This class is the reference implementation of the Element interface.

    An element's length is its number of subelements.  That means if you
    want to check if an element is truly empty, you should check BOTH
    its length AND its text attribute.

    The element tag, attribute names, and attribute values can be either
    bytes or strings.

    *tag* is the element name.  *attrib* is an optional dictionary containing
    element attributes. *extra* are additional element attributes given as
    keyword arguments.

    Example form:
        <tag attrib>text<child/>...</tag>tail

    """

    # Tag name of the current node
    tag = None
    """The element's name."""

    # Attributes of the current node

    attrib = None
    """Dictionary of the element's attributes."""

    # Text content of the current node
    text = None
    """
    Text before first subelement. This is either a string or the value None.
    Note that if there is no text, this attribute may be either
    None or the empty string, depending on the parser.

    """

    tail = None
    """
    Text after this element's end tag, but before the next sibling element's
    start tag.  This is either a string or the value None.  Note that if there
    was no text, this attribute may be either None or an empty string,
    depending on the parser.

    """

    def __init__(self, tag, attrib={}, **extra):
        if not isinstance(attrib, dict):
            raise TypeError("attrib must be dict, not %s" % (
                attrib.__class__.__name__,))
        attrib = attrib.copy()
        attrib.update(extra)
        self.tag = tag
        self.attrib = attrib
        self._children = []

    def __repr__(self):
        return "<%s %r at %#x>" % (self.__class__.__name__, self.tag, id(self))

    def makeelement(self, tag, attrib):
        # Create a new node
        """Create a new element with the same type.

        *tag* is a string containing the element name.
        *attrib* is a dictionary containing the element attributes.

        Do not call this method, use the SubElement factory function instead.

        """
        return self.__class__(tag, attrib)

    def copy(self):
        """Return copy of current element.

        This creates a shallow copy. Subelements will be shared with the
        original tree.

        """
        elem = self.makeelement(self.tag, self.attrib)
        elem.text = self.text
        elem.tail = self.tail
        elem[:] = self
        return elem

    def __len__(self):
        return len(self._children)

    def __bool__(self):
        warnings.warn(
            "The behavior of this method will change in future versions.  "
            "Use specific 'len(elem)' or 'elem is not None' test instead.",
            FutureWarning, stacklevel=2
            )
        return len(self._children) != 0 # emulate old behaviour, for now

    def __getitem__(self, index):
        return self._children[index]

    def __setitem__(self, index, element):
        # if isinstance(index, slice):
        #     for elt in element:
        #         assert iselement(elt)
        # else:
        #     assert iselement(element)
        self._children[index] = element

    def __delitem__(self, index):
        del self._children[index]

    def append(self, subelement):
        # Append a child node to the current node
        """Add *subelement* to the end of this element.

        The new element will appear in document order after the last existing
        subelement (or directly after the text, if it's the first subelement),
        but before the end tag for this element.

        """
        self._assert_is_element(subelement)
        self._children.append(subelement)

    def extend(self, elements):
        # Extend the current node with n child nodes
        """Append subelements from a sequence.

        *elements* is a sequence with zero or more elements.

        """
        for element in elements:
            self._assert_is_element(element)
        self._children.extend(elements)

    def insert(self, index, subelement):
        # Insert a node among the current node's children at the given position
        """Insert *subelement* at position *index*."""
        self._assert_is_element(subelement)
        self._children.insert(index, subelement)

    def _assert_is_element(self, e):
        # Need to refer to the actual Python implementation, not the
        # shadowing C implementation.
        if not isinstance(e, _Element_Py):
            raise TypeError('expected an Element, not %s' % type(e).__name__)

    def remove(self, subelement):
        # Remove a node from the current node's children
        """Remove matching subelement.

        Unlike the find methods, this method compares elements based on
        identity, NOT ON tag value or contents.  To remove subelements by
        other means, the easiest way is to use a list comprehension to
        select what elements to keep, and then use slice assignment to update
        the parent element.

        ValueError is raised if a matching element could not be found.

        """
        # assert iselement(element)
        self._children.remove(subelement)

    def getchildren(self):
        # Get all child nodes (deprecated)
        """(Deprecated) Return all subelements.

        Elements are returned in document order.

        """
        warnings.warn(
            "This method will be removed in future versions.  "
            "Use 'list(elem)' or iteration over elem instead.",
            DeprecationWarning, stacklevel=2
            )
        return self._children

    def find(self, path, namespaces=None):
        # Get the first matching child node
        """Find first matching element by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return the first matching element, or None if no element was found.

        """
        return ElementPath.find(self, path, namespaces)

    def findtext(self, path, default=None, namespaces=None):
        # Get the text of the first matching child node
        """Find text for first matching element by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *default* is the value to return if the element was not found,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return text content of first matching element, or default value if
        none was found.  Note that if an element is found having no text
        content, the empty string is returned.

        """
        return ElementPath.findtext(self, path, default, namespaces)

    def findall(self, path, namespaces=None):
        # Get all matching child nodes
        """Find all matching subelements by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Returns list containing all matching elements in document order.

        """
        return ElementPath.findall(self, path, namespaces)

    def iterfind(self, path, namespaces=None):
        # Get all matching nodes as an iterator (usable in a for loop)
        """Find all matching subelements by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return an iterable yielding all matching elements in document order.

        """
        return ElementPath.iterfind(self, path, namespaces)

    def clear(self):
        # Clear the node
        """Reset element.

        This function removes all subelements, clears all attributes, and sets
        the text and tail attributes to None.

        """
        self.attrib.clear()
        self._children = []
        self.text = self.tail = None

    def get(self, key, default=None):
        # Get an attribute value of the current node
        """Get element attribute.

        Equivalent to attrib.get, but some implementations may handle this a
        bit more efficiently.  *key* is what attribute to look for, and
        *default* is what to return if the attribute was not found.

        Returns a string containing the attribute value, or the default if
        attribute was not found.

        """
        return self.attrib.get(key, default)

    def set(self, key, value):
        # Set an attribute value on the current node
        """Set element attribute.

        Equivalent to attrib[key] = value, but some implementations may handle
        this a bit more efficiently.  *key* is what attribute to set, and
        *value* is the attribute value to set it to.

        """
        self.attrib[key] = value

    def keys(self):
        # Get all attribute keys of the current node

        """Get list of attribute names.

        Names are returned in an arbitrary order, just like an ordinary
        Python dict.  Equivalent to attrib.keys()

        """
        return self.attrib.keys()

    def items(self):
        # Get all attributes of the current node as key-value pairs
        """Get element attributes as a sequence.

        The attributes are returned in arbitrary order.  Equivalent to
        attrib.items().

        Return a list of (name, value) tuples.

        """
        return self.attrib.items()

    def iter(self, tag=None):
        # Find all nodes with the given tag among the current node and its descendants and return an iterator (usable in a for loop).
        """Create tree iterator.

        The iterator loops over the element and all subelements in document
        order, returning all elements with a matching tag.

        If the tree structure is modified during iteration, new or removed
        elements may or may not be included.  To get a stable set, use the
        list() function on the iterator, and loop over the resulting list.

        *tag* is what tags to look for (default is to return all elements)

        Return an iterator containing all the matching elements.

        """
        if tag == "*":
            tag = None
        if tag is None or self.tag == tag:
            yield self
        for e in self._children:
            yield from e.iter(tag)

    # compatibility
    def getiterator(self, tag=None):
        # Change for a DeprecationWarning in 1.4
        warnings.warn(
            "This method will be removed in future versions.  "
            "Use 'elem.iter()' or 'list(elem.iter())' instead.",
            PendingDeprecationWarning, stacklevel=2
        )
        return list(self.iter(tag))

    def itertext(self):
        # Iterate over the text of the current node and all of its descendants (usable in a for loop).
        """Create text iterator.

        The iterator loops over the element and all subelements in document
        order, returning all inner text.

        """
        tag = self.tag
        if not isinstance(tag, str) and tag is not None:
            return
        if self.text:
            yield self.text
        for e in self:
            yield from e.itertext()
            if e.tail:
                yield e.tail

Every node has the methods above, and the parsing step already gave us root (the root node of the XML file), so these methods can be used to work with the XML file.
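
Several of the methods listed above can be tried on a node built in memory. A minimal sketch, assuming made-up element names and attribute values chosen only for illustration:

```python
from xml.etree import ElementTree as ET

# Build a small node by hand instead of parsing a file.
country = ET.Element("country", {"name": "Panama"})
rank = ET.SubElement(country, "rank")
rank.text = "69"

country.insert(0, ET.Element("year"))   # insert() places a child at an index
country.set("region", "Americas")       # set() adds or overwrites an attribute

tags = [child.tag for child in country]
attr_keys = sorted(country.keys())
rank_text = country.findtext("rank")
missing = country.findtext("gdppc", default="n/a")  # default when nothing matches

country.remove(country.find("year"))    # remove() compares by identity
remaining = len(country)                # len() counts direct children

print(tags, attr_keys, rank_text, missing, remaining)
```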

Operation 1: traverse the entire XML document

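from xml.etree import ElementTree as ET

############ Parsing method 1 ############
"""
# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an Element; root refers to the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()


### Operations

# Top-level tag
print(root.tag)


# Traverse the second level of the XML document
for child in root:
    # Tag name and attributes of the second-level node
    print(child.tag, child.attrib)
    # Traverse the third level of the XML document
    for i in child:
        # Tag name and text of the third-level node
        print(i.tag, i.text)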

Operation 2: traverse specific nodes in the XML

from xml.etree import ElementTree as ET

############ Parsing method 1 ############
"""
# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an Element; root refers to the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

### Operations

# Top-level tag
print(root.tag)

# Traverse all year nodes in the XML
for node in root.iter('year'):
    # Tag name and text of the node
    print(node.tag, node.text)
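
The difference between iter(), which walks all descendants, and find/findall, which follow a path from the current node, is easy to miss. A minimal sketch, assuming a shortened inline copy of the sample document:

```python
from xml.etree import ElementTree as ET

xml_text = """<data>
    <country name="Liechtenstein">
        <year>2023</year>
        <neighbor direction="E" name="Austria" />
    </country>
    <country name="Singapore">
        <year>2026</year>
        <neighbor direction="N" name="Malaysia" />
    </country>
</data>"""

root = ET.fromstring(xml_text)

# iter() searches the whole subtree.
all_years = [node.text for node in root.iter("year")]

# findall("year") only looks at direct children of root, so it finds nothing here.
direct_years = root.findall("year")

# A path with an attribute predicate selects one specific country.
singapore_year = root.findtext("./country[@name='Singapore']/year")

# ".//" makes findall search at any depth.
neighbors = [n.get("name") for n in root.findall(".//neighbor")]

print(all_years, direct_years, singapore_year, neighbors)
```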

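Operation 3: modify node content

  Nodes are modified in memory only, so the file on disk is not affected. To make a change permanent, the in-memory content has to be written back to a file.

Method 1: parse from a string, modify and save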

from xml.etree import ElementTree as ET

############ Parsing method 1 ############

# Open the file and read the XML content
with open('xo.xml', 'r', encoding='utf-8') as f:
    str_xml = f.read()

# Parse the string into an Element; root refers to the root node of the XML file
root = ET.XML(str_xml)

############ Operations ############

# Top-level tag
print(root.tag)

# Loop over all year nodes
for node in root.iter('year'):
    # Increment the year node's text by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # Set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # Delete an attribute
    del node.attrib['name']


############ Save the file ############
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')


Method 2: parse from a file, modify and save

from xml.etree import ElementTree as ET

############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

############ Operations ############

# Top-level tag
print(root.tag)

# Loop over all year nodes
for node in root.iter('year'):
    # Increment the year node's text by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # Set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # Delete an attribute
    del node.attrib['name']


############ Save the file ############
tree.write("newnew.xml", encoding='utf-8')
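
To see the effect of such a modification without writing a file, the tree can be serialized to a string instead. A minimal sketch, assuming a tiny inline document and a made-up attribute name `checked`:

```python
from xml.etree import ElementTree as ET

root = ET.fromstring("<data><year>2023</year><year>2026</year></data>")

for node in root.iter("year"):
    # Element text is always a string, so convert before doing arithmetic.
    node.text = str(int(node.text) + 1)
    node.set("checked", "yes")

# encoding="unicode" returns a str instead of bytes and omits the XML declaration.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```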


Operation 4: delete nodes

Method 1: parse from a string, delete and save

from xml.etree import ElementTree as ET

############ Parse from a string ############

# Open the file and read the XML content
with open('xo.xml', 'r', encoding='utf-8') as f:
    str_xml = f.read()

# Parse the string into an Element; root refers to the root node of the XML file
root = ET.XML(str_xml)

############ Operations ############

# Top-level tag
print(root.tag)

# Traverse all country nodes under data
for country in root.findall('country'):
    # Get the text of the rank node under each country node
    rank = int(country.find('rank').text)

    if rank > 50:
        # Delete the matching country node
        root.remove(country)

############ Save the file ############
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')


Method 2: parse from a file, delete and save

from xml.etree import ElementTree as ET

############ Parse from a file ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

############ Operations ############

# Top-level tag
print(root.tag)

# Traverse all country nodes under data
for country in root.findall('country'):
    # Get the text of the rank node under each country node
    rank = int(country.find('rank').text)

    if rank > 50:
        # Delete the matching country node
        root.remove(country)

############ Save the file ############
tree.write("newnew.xml", encoding='utf-8')
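
The loops above iterate over root.findall('country') rather than over root itself. That matters: findall() returns a separate list, whereas removing children while iterating directly over the parent can skip siblings, much like mutating a list during iteration. A minimal sketch, assuming made-up country names and a hypothetical helper `remove_above`:

```python
from xml.etree import ElementTree as ET

xml_text = """<data>
    <country name="A"><rank>2</rank></country>
    <country name="B"><rank>69</rank></country>
    <country name="C"><rank>70</rank></country>
    <country name="D"><rank>5</rank></country>
</data>"""


def remove_above(root, threshold):
    """Remove every country whose rank exceeds threshold (hypothetical helper)."""
    # findall() returns a new list, so removing from root inside the loop is safe.
    for country in root.findall("country"):
        if int(country.findtext("rank")) > threshold:
            root.remove(country)
    return [c.get("name") for c in root]


root = ET.fromstring(xml_text)
kept = remove_above(root, 50)
print(kept)
```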


3. Creating XML

Creation method 1:

from xml.etree import ElementTree as ET


# Create the root node
root = ET.Element("family")


# Create the elder son node
son1 = ET.Element('son', {'name': 'son1'})
# Create the younger son
son2 = ET.Element('son', {"name": 'son2'})

# Create two grandsons under the elder son
grandson1 = ET.Element('grandson', {'name': 'grandson11'})
grandson2 = ET.Element('grandson', {'name': 'grandson12'})
son1.append(grandson1)
son1.append(grandson2)


# Add both sons to the root node
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml',encoding='utf-8', short_empty_elements=False)

Creation method 2:

from xml.etree import ElementTree as ET

# Create the root node
root = ET.Element("family")


# Create the elder son
# son1 = ET.Element('son', {'name': 'son1'})
son1 = root.makeelement('son', {'name': 'son1'})
# Create the younger son
# son2 = ET.Element('son', {"name": 'son2'})
son2 = root.makeelement('son', {"name": 'son2'})

# Create two grandsons under the elder son
# grandson1 = ET.Element('grandson', {'name': 'grandson11'})
grandson1 = son1.makeelement('grandson', {'name': 'grandson11'})
# grandson2 = ET.Element('grandson', {'name': 'grandson12'})
grandson2 = son1.makeelement('grandson', {'name': 'grandson12'})

son1.append(grandson1)
son1.append(grandson2)


# Add both sons to the root node
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml',encoding='utf-8', short_empty_elements=False)

Creation method 3:

from xml.etree import ElementTree as ET


# Create the root node
root = ET.Element("family")


# Create the elder son node
son1 = ET.SubElement(root, "son", attrib={'name': 'son1'})
# Create the younger son
son2 = ET.SubElement(root, "son", attrib={"name": "son2"})

# Create one grandson under the elder son
grandson1 = ET.SubElement(son1, "age", attrib={'name': 'grandson11'})
grandson1.text = 'grandson'


et = ET.ElementTree(root)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True, short_empty_elements=False)

XML saved this way has no indentation by default. To get indented output, the way the document is saved has to change:

from xml.etree import ElementTree as ET
from xml.dom import minidom


def prettify(elem):
    """Convert a node to a string and add indentation.
    """
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")

# Create the root node
root = ET.Element("family")


# Create the elder son
# son1 = ET.Element('son', {'name': 'son1'})
son1 = root.makeelement('son', {'name': 'son1'})
# Create the younger son
# son2 = ET.Element('son', {"name": 'son2'})
son2 = root.makeelement('son', {"name": 'son2'})

# Create two grandsons under the elder son
# grandson1 = ET.Element('grandson', {'name': 'grandson11'})
grandson1 = son1.makeelement('grandson', {'name': 'grandson11'})
# grandson2 = ET.Element('grandson', {'name': 'grandson12'})
grandson2 = son1.makeelement('grandson', {'name': 'grandson12'})

son1.append(grandson1)
son1.append(grandson2)


# Add both sons to the root node
root.append(son1)
root.append(son2)


raw_str = prettify(root)

with open("xxxoo.xml", 'w', encoding='utf-8') as f:
    f.write(raw_str)
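
On Python 3.9 and later, ElementTree ships its own indent() function, which avoids the round trip through minidom. This is a sketch of that alternative rather than the approach used above, and it assumes a Python version where ET.indent is available:

```python
from xml.etree import ElementTree as ET

root = ET.Element("family")
son1 = ET.SubElement(root, "son", {"name": "son1"})
ET.SubElement(son1, "grandson", {"name": "grandson11"})
ET.SubElement(root, "son", {"name": "son2"})

# indent() rewrites text/tail whitespace in place (requires Python 3.9+).
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
lines = pretty.splitlines()
print(pretty)
```

Since indent() modifies the tree itself, a later `ElementTree(root).write(...)` also produces indented output.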

4. Namespaces

http://www.w3school.com.cn/xml/xml_namespaces.asp
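
In ElementTree, a namespaced tag is stored as `{uri}localname`, and the find methods accept a prefix-to-URI mapping. A minimal sketch, assuming an illustrative document with two made-up namespace URIs (they are placeholders, not values required by any service):

```python
from xml.etree import ElementTree as ET

xml_text = """<root xmlns:h="http://example.com/html"
      xmlns:f="http://example.com/furniture">
    <h:table><h:td>Apples</h:td></h:table>
    <f:table><f:name>African Coffee Table</f:name></f:table>
</root>"""

root = ET.fromstring(xml_text)

# The prefixes used in queries are local to this mapping; only the URIs must match.
ns = {"h": "http://example.com/html", "f": "http://example.com/furniture"}

cell = root.findtext("h:table/h:td", namespaces=ns)
name = root.findtext("f:table/f:name", namespaces=ns)
first_tag = root[0].tag  # the expanded {uri}localname form

print(cell, name, first_tag)
```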

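17.12 requests

  Requests is an HTTP library written in Python and released under the Apache 2.0 license. It wraps Python's built-in modules at a high level, which makes sending network requests much more pleasant; with Requests it is easy to perform almost anything a browser can do.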

1. Installing the module

pip3 install requests

2. Using the module

# GET requests
# Example without parameters
import requests
 
ret = requests.get('https://github.com/timeline.json')
 
print(ret.url)
print(ret.text)

# Example with parameters
import requests
 
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.get("http://httpbin.org/get", params=payload)
 
print(ret.url)
print(ret.text)

# POST requests

# 1. Basic POST example
import requests
 
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)
 
print(ret.text)
 
# 2. Example sending request headers and data
import requests
import json
 
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
 
ret = requests.post(url, data=json.dumps(payload), headers=headers)
 
print(ret.text)
print(ret.cookies)

# Other request methods
requests.get(url, params=None, **kwargs)
requests.post(url, data=None, json=None, **kwargs)
requests.put(url, data=None, **kwargs)
requests.head(url, **kwargs)
requests.delete(url, **kwargs)
requests.patch(url, data=None, **kwargs)
requests.options(url, **kwargs)
 
# All of the methods above are built on top of this one
requests.request(method, url, **kwargs)

Further reading: http://cn.python-requests.org/zh_CN/latest/

3. HTTP request and XML examples

Example 1: check whether a QQ account is online

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and fetch the XML content
"""
f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = f.read().decode('utf-8')
"""


# Send the HTTP request with the third-party requests module and fetch the XML content
r = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
result = r.text

# Parse the XML content
node = ET.XML(result)

# Read the result
if node.text == "Y":
    print("online")
else:
    print("offline")
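
The check above depends only on the text of the root element, so the parsing step can be isolated and tried without a network call. This is an offline sketch: the helper `is_online` is hypothetical, and the sample payloads assume the service answers with a single namespaced `<string>` element; they are made up for illustration, not captured responses.

```python
from xml.etree import ElementTree as ET


def is_online(xml_text):
    """Return True when the root element's text is "Y" (hypothetical helper)."""
    node = ET.XML(xml_text)
    # Strip whitespace so a pretty-printed response still compares equal.
    return (node.text or "").strip() == "Y"


# Assumed response shapes, for illustration only.
sample_online = '<string xmlns="http://WebXml.com.cn/">Y</string>'
sample_offline = '<string xmlns="http://WebXml.com.cn/">N</string>'

print(is_online(sample_online))
print(is_online(sample_offline))
```

The default namespace changes the root's tag to the `{uri}string` form but does not affect its text, which is why the comparison needs no namespace handling.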

Example 2: look up train stop information

import urllib.request
import requests
from xml.etree import ElementTree as ET

# Send the HTTP request with the built-in urllib module and fetch the XML content
"""
f = urllib.request.urlopen('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = f.read().decode('utf-8')
"""

# Send the HTTP request with the third-party requests module and fetch the XML content
r = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
result = r.text

# Parse the XML content
root = ET.XML(result)
for node in root.iter('TrainDetailInfo'):
    print(node.find('TrainStation').text,node.find('StartTime').text,node.tag,node.attrib)

More examples: http://www.cnblogs.com/wupeiqi/archive/2012/11/18/2776014.html

17.13 ConfigParser

configparser handles files in a specific (INI-style) format; under the hood it uses open to work with the file.

# comment 1
; comment 2

[section1]    # section
k1 = v1
k2:v2

[section2]    # section
k1 = v1

# Get all sections
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
ret = config.sections()
print(ret)

# Check whether a section exists
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
has_se = config.has_section('section1')
print(has_se)

# Add a section
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
config.add_section('section3')
with open(config_file, 'w', encoding='utf-8') as f:
    config.write(f)

# Remove a section
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
config.remove_section('section3')
with open(config_file, 'w', encoding='utf-8') as f:
    config.write(f)

# Get all key-value pairs in a given section
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
kv = config.items('section2')

# Get all keys in a given section
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
ret = config.options('section1')

# Get the value of a given key in a given section
import configparser

config_file = 'test.ini'
config = configparser.ConfigParser()
config.read(config_file, encoding='utf-8')
v = config.get('section1', 'k1')
# v = config.getint('section1', 'k1')
# v = config.getfloat('section1', 'k1')
# v = config.getboolean('section1', 'k1')
print(v)

# Check whether a key exists in a given section
has_opt = config.has_option('section1', 'k1')
print(has_opt)

# Remove a key from a given section
config.remove_option('section1', 'k1')
with open('xxxooo', 'w', encoding='utf-8') as f:
    config.write(f)

# Set a key-value pair in a given section
config.set('section1', 'k10', "123")
with open('xxxooo', 'w', encoding='utf-8') as f:
    config.write(f)
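
The same operations can be tried entirely in memory with read_string() and an io.StringIO buffer, which is convenient for experiments and tests. A minimal sketch, assuming an inline configuration whose values (`10`, `v2`) are made up for illustration:

```python
import configparser
import io

ini_text = """
[section1]
k1 = 10
k2: v2

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(ini_text)  # parse from a string instead of a file

sections = config.sections()
k1 = config.getint("section1", "k1")                         # typed getter
missing = config.get("section1", "nope", fallback="default")  # no exception on a missing key

config.set("section1", "k10", "123")
config.remove_option("section1", "k2")

# write() accepts any text file object, including an in-memory buffer.
buffer = io.StringIO()
config.write(buffer)
written = buffer.getvalue()
print(written)
```

Values are always stored as strings; getint, getfloat and getboolean convert on read, and `fallback` avoids a NoOptionError when a key is absent.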

17.14 xml convert json

#!/usr/bin/env python
# encoding: utf-8

# Convert between XML and JSON
import xmltodict
import json


def python_conver_xml_to_json():
    """
    demo Python conversion between xml and json
    :return:
    """
    xml = """
    <data>
        <country name="Liechtenstein">
            <rank updated="yes">2</rank>
            <year>2023</year>
            <gdppc>141100</gdppc>
            <neighbor direction="E" name="Austria" />
            <neighbor direction="W" name="Switzerland" />
        </country>
        <country name="Singapore">
            <rank updated="yes">5</rank>
            <year>2026</year>
            <gdppc>59900</gdppc>
            <neighbor direction="N" name="Malaysia" />
        </country>
        <country name="Panama">
            <rank updated="yes">69</rank>
            <year>2026</year>
            <gdppc>13600</gdppc>
            <neighbor direction="W" name="Costa Rica" />
            <neighbor direction="E" name="Colombia" />
        </country>
    </data>
    """
    dic = xmltodict.parse(xml)
    # json_str = json.dumps(dic)  # no formatting by default
    json_str = json.dumps(dic, indent=1)  # pretty-printed with an indent of 1
    print(json_str)


def python_conver_json_to_xml():
    """
    demo Python conversion between xml and json
    :return:
    """
    dic =  {
        'page': {
            'title': 'King Crimson',
            'ns': 0,
            'revision': {
                'id': 547909091,
            }
        }
    }
    xml = xmltodict.unparse(dic)
    print(xml)


if __name__ == '__main__':
    python_conver_xml_to_json()
    python_conver_json_to_xml()
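
xmltodict is a third-party package. Where installing it is not an option, a rough equivalent for simple documents can be written with the standard library alone. The sketch below is not xmltodict's algorithm: `element_to_dict` is a hypothetical helper that borrows the `@attribute` and `#text` key convention and ignores namespaces, tails and mixed content.

```python
import json
from xml.etree import ElementTree as ET


def element_to_dict(elem):
    """Convert an Element into dicts, lists and strings (simplified mapping)."""
    node = {"@" + key: value for key, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag is collected into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if text:
        if not node:
            return text          # leaf with text only
        node["#text"] = text     # text alongside attributes or children
    return node or None


xml_text = """<data>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2026</year>
        <neighbor name="Costa Rica" />
        <neighbor name="Colombia" />
    </country>
</data>"""

root = ET.fromstring(xml_text)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=1))
```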

Exercises

1. Fetch TV programme listings using an HTTP request and XML

     API: http://www.webxml.com.cn/webservices/ChinaTVprogramWebService.asmx

2. Fetch weather conditions using an HTTP request and JSON

     API: http://wthrcdn.etouch.cn/weather_mini?city=北京

