Python's subprocess module: basic usage and pitfalls

Reposted article. 2014-03-27 00:38. Tags: python, python tricks

subprocess is a module introduced in Python 2.4, mainly intended to replace the following modules and functions:

os.system
os.spawn*
os.popen*
popen2.*
commands.*

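See PEP 324 for details: http://legacy.python.org/dev/peps/pep-0324/

It is a module for invoking external commands. It replaces several older modules and offers a friendlier, more uniform interface.

Three convenience functions

When using the three functions below, keep two things in mind: 1. shell=True and shell=False parse the command in different ways. 2. Be careful with PIPE, which can cause the program to hang.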

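subprocess.call runs the command, waits for it to finish, and returns its returncode.

subprocess.check_call runs the command and waits for it to finish. If the return code is 0, it returns that value; otherwise it raises a CalledProcessError carrying the returncode.

subprocess.check_output is similar to check_call: it checks whether the return code is 0 and, if so, returns the command's stdout; otherwise it raises CalledProcessError.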
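The three functions can be compared side by side. This is a minimal sketch, not taken from the original article: it uses the current interpreter (sys.executable) as a stand-in child process, and the names ok_cmd and fail_cmd are made up for illustration.

```python
import subprocess
import sys

# Assumption: the current Python interpreter stands in for "an external
# command", so the sketch does not depend on any particular program.
ok_cmd = [sys.executable, "-c", "print('hello')"]
fail_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

# call: run, wait, and return the exit status; a non-zero status is not an error.
status = subprocess.call(fail_cmd)

# check_call: returns 0 on success, raises CalledProcessError otherwise.
try:
    subprocess.check_call(fail_cmd)
    check_call_status = 0
except subprocess.CalledProcessError as exc:
    check_call_status = exc.returncode

# check_output: returns the child's stdout, raises on a non-zero exit status.
output = subprocess.check_output(ok_cmd, universal_newlines=True)
```

Passing the command as a list keeps shell=False, so no shell parsing is involved.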

Common causes of hangs

When using this module, the child process may get stuck or hang. This usually happens with code like the following.

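import subprocess
import shlex

cmd = "ls -l"  # placeholder command; the original leaves cmd undefined

# Hang-prone pattern: stdout and stderr are both pipes, but only stdout is read.
proc = subprocess.Popen(shlex.split(cmd), stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        shell=False, universal_newlines=True)
print(proc.stdout.read())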

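This code reads the Popen object's stdout directly while using subprocess.PIPE. The hang happens because an OS pipe buffer fills up: the child can no longer write to it and blocks, while nothing on the parent side is reading that data out. For example, read() waits for EOF on stdout while a child that writes a lot to stderr is stuck on the full stderr pipe.

The official documentation says this in the Warning on the Popen.wait() method: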

This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

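To solve this problem, Popen provides a communicate method. It reads the stdout and stderr data into memory and keeps reading until both streams reach end-of-file (EOF). communicate also accepts an input argument, which is the data to write to the child's stdin, so it can be used to pass data between processes.

Note: the official documentation points out that the data read is buffered in memory, so do not use communicate when the amount of data is very large or unbounded, since that can lead to an OOM.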
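A minimal sketch of communicate, under stated assumptions: the child is the current interpreter running made-up code (child_code), and the 200000-byte outputs are just meant to exceed a typical pipe buffer, which is the case where reading only one stream could hang.

```python
import subprocess
import sys

# Hypothetical child: writes more to stderr and stdout than a typical
# OS pipe buffer holds.
child_code = (
    "import sys\n"
    "sys.stderr.write('e' * 200000)\n"
    "sys.stdout.write('o' * 200000)\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)

# communicate() drains both pipes until EOF, then waits for the child.
out, err = proc.communicate()

# The input argument is written to the child's stdin, which must be a PIPE.
upper = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, universal_newlines=True)
echoed, _ = upper.communicate(input="hello")
```

Both results are held entirely in memory, which is why this approach only suits bounded output.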

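In general, when stdout and stderr carry only a modest amount of data, communicate causes no problems. When the volume is very large, consider using a file instead of PIPE, for example stdout=open("stdout", "w") [Reference 1].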
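The file-based approach might look like the sketch below. It uses a temporary file rather than a file named "stdout"; that choice, the made-up child_code, and the chunk size are assumptions for illustration only.

```python
import subprocess
import sys
import tempfile

# Hypothetical child that produces a large amount of output.
child_code = "import sys; sys.stdout.write('x' * 500000)"

# The child writes straight to the file, so no pipe buffer can fill up.
with tempfile.TemporaryFile(mode="w+") as out_file:
    returncode = subprocess.call([sys.executable, "-c", child_code],
                                 stdout=out_file)
    # The child shares the file offset, so rewind before reading.
    out_file.seek(0)
    # Read back in chunks instead of loading everything at once.
    size = sum(len(chunk)
               for chunk in iter(lambda: out_file.read(65536), ""))
```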

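Reference 2 offers another approach: use select to read data from Popen's stdout and stderr. This use of select is not supported on Windows, but it allows the data to be read in close to real time.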
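A rough sketch of the select approach, assuming a Unix-like system. It is not the code from Reference 2; the child code, the 4096-byte read size, and the variable names are illustrative choices.

```python
import os
import select
import subprocess
import sys

# Hypothetical child that writes to both streams.
child_code = (
    "import sys\n"
    "for i in range(3):\n"
    "    sys.stdout.write('out %d\\n' % i)\n"
    "    sys.stderr.write('err %d\\n' % i)\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

out_fd = proc.stdout.fileno()
err_fd = proc.stderr.fileno()
chunks = {out_fd: [], err_fd: []}
open_fds = {out_fd, err_fd}

while open_fds:
    # Block until at least one pipe has data or has reached EOF.
    readable, _, _ = select.select(list(open_fds), [], [])
    for fd in readable:
        data = os.read(fd, 4096)
        if data:
            chunks[fd].append(data)   # could be processed right here instead
        else:
            open_fds.discard(fd)      # empty read means EOF on this pipe

proc.wait()
proc.stdout.close()
proc.stderr.close()
stdout_data = b"".join(chunks[out_fd])
stderr_data = b"".join(chunks[err_fd])
```

Because both pipes are serviced as soon as they become readable, neither can fill up and block the child.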

Meaning of the shell parameter in Popen

The official documentation recommends shell=False because it is safer. Here is the example given in the documentation.

>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / #
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...

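Building the command by string concatenation like this creates a security problem: much like SQL injection, a user can perform shell injection and delete every file on the system, which is extremely dangerous.

shell=False disables many shell features, so it avoids the security problem above. Pay particular attention to this issue when the functionality is exposed to users.

With shell=True, the command is parsed and run the way a shell would.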
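To illustrate the difference, the sketch below passes a hostile-looking string as one element of an argument list with the default shell=False. The child program (the current interpreter echoing its first argument) and the variable names are assumptions made for the example.

```python
import subprocess
import sys

# The kind of input a user might supply in an injection attempt.
filename = "non_existent; echo injected"

# Hypothetical child: prints its first argument exactly as received.
printer = "import sys; sys.stdout.write(sys.argv[1])"

# With a list and shell=False (the default), no shell parses the string,
# so ';' has no special meaning and the value stays a single argument.
received = subprocess.check_output(
    [sys.executable, "-c", printer, filename],
    universal_newlines=True)
```

The child receives the whole string as a single argument; nothing after the semicolon is executed as a command.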

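Some simple tuning ideas for Popen

Popen takes a bufsize parameter. In Python 2 it defaults to 0, meaning unbuffered; 1 means line-buffered; any other positive value is the buffer size to use; and a negative value such as -1 means the system default buffer size. (In Python 3.3.1 and later the default is -1.)

When performance is a problem, the cause may be that buffering is disabled or the buffer is too small, which stalls the child process and hurts throughput. Consider setting bufsize to -1 or 4096.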
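Setting bufsize might look like the sketch below. The child code and names are made up for illustration. As far as the documentation describes it, bufsize controls the buffering of the pipe file objects created on the parent side, so any gain depends on how the parent reads.

```python
import subprocess
import sys

# Hypothetical child that emits many short lines.
child_code = (
    "import sys\n"
    "for i in range(1000):\n"
    "    sys.stdout.write('line %d\\n' % i)\n"
)

# bufsize=-1 asks for the system default buffer size on the pipe object.
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, bufsize=-1,
                        universal_newlines=True)

# Only stdout is a pipe here, so reading it to EOF cannot deadlock.
line_count = 0
for line in proc.stdout:
    line_count += 1

proc.stdout.close()
returncode = proc.wait()
```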

If stdout and stderr need to be read in real time, see [Reference 2], which uses select to read the stdout and stderr streams.

References:

1. Beware of subprocess PIPE hanging your Python program: http://noops.me/?p=92
2. pipe large amount of data to stdin while using subprocess.Popen: http://stackoverflow.com/questions/5911362/pipe-large-amount-of-data-to-stdin-while-using-subprocess-popen
3. python docs: http://docs.python.org/2/library/subprocess.html
