Extracting attachments from Outlook .msg files with Python's extract_msg module


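I have a number of emails saved from Outlook as .msg files, and I need to extract the attachments from them.

Outlook itself can save the attachments of a message, of course, but by default it does not support extracting attachments from many messages in bulk.

I considered several approaches. One of them is the extract_msg module for Python, which I describe here.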

 

1. Install the extract_msg module with pip install extract-msg. At the time of writing, the latest version was extract-msg 0.27.4,

     released Sep 3, 2020. Project page: https://pypi.org/project/extract-msg

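2. Once installed, the simplest usage is a single command on the command line, which unpacks the contents of the msg file into a subdirectory of the current directory (the directory name is derived from the message's details).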

# Creates a directory under the current directory, then unpacks the attachments and message.txt from the msg file into it
python -m extract_msg qq_5201351.msg
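
Since the goal is batch extraction, the same command can be repeated for every msg file in a folder. The following is a minimal sketch, assuming extract-msg is installed for the interpreter running the script. The helper names build_command and run_on_directory are made up for illustration and are not part of extract_msg.

```python
import subprocess
import sys
from pathlib import Path


def build_command(msg_path):
    """Return the argument list equivalent to: python -m extract_msg <msg_path>."""
    return [sys.executable, "-m", "extract_msg", str(msg_path)]


def run_on_directory(msg_dir):
    """Run the command-line extractor once per .msg file; return the names that failed."""
    failed = []
    for msg_path in sorted(Path(msg_dir).glob("*.msg")):
        # Each run unpacks into a subdirectory of the current working directory.
        result = subprocess.run(build_command(msg_path))
        if result.returncode != 0:
            failed.append(msg_path.name)
    return failed
```

Calling run_on_directory(".") processes every msg file in the current directory and returns the names of the files for which the command exited with a non-zero status.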

3. In a .py file, the following extracts only the attachments (the directory the attachments are saved to must be created first):

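import os

import extract_msg

save_dir = "./qq_5201351_dir"
os.makedirs(save_dir, exist_ok=True)  # the target directory must exist before saving

msg = extract_msg.Message("qq_5201351.msg")

# msg.attachments holds the message's attachments; the loop does nothing when there are none
for attachment in msg.attachments:
    attachment.save(customPath=save_dir)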
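
To cover a whole folder of msg files, the same calls can be wrapped in a loop. This is a sketch under stated assumptions, not a definitive implementation: extract_all, msg_dir, out_root and open_msg are hypothetical names introduced here, and the customPath keyword is the one used above with extract-msg 0.27.4 (other releases may behave differently). Each message gets its own subdirectory named after the file, so attachments with the same name do not overwrite each other.

```python
from pathlib import Path


def extract_all(msg_dir, out_root, open_msg=None):
    """Save the attachments of every .msg file in msg_dir under out_root/<file stem>/.

    Returns a dict mapping each msg file name to its number of attachments.
    """
    if open_msg is None:
        # Imported lazily so the helper can be exercised without the package installed.
        import extract_msg
        open_msg = extract_msg.Message

    counts = {}
    for msg_path in sorted(Path(msg_dir).glob("*.msg")):
        msg = open_msg(str(msg_path))
        attachments = msg.attachments
        if attachments:
            # Create the target directory only when there is something to save.
            target = Path(out_root) / msg_path.stem
            target.mkdir(parents=True, exist_ok=True)
            for attachment in attachments:
                attachment.save(customPath=str(target))
        counts[msg_path.name] = len(attachments)
    return counts
```

For example, extract_all("./mail", "./attachments") would save the attachments of ./mail/qq_5201351.msg into ./attachments/qq_5201351.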

 

++++++ Unresolved problem >>>>:

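1. The method above extracts the attachments and the message body from most msg files without trouble, but some of my msg files fail with the error below.

     I have not found a solution so far. If you have found one, please leave a comment below. Thank you very much!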

Traceback (most recent call last):
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 422, in named
    return self.__namedProperties
AttributeError: 'Message' object has no attribute '_MSGFile__namedProperties'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\QQ5201351\Desktop\mail\test\test.py", line 5, in <module>
    msg = extract_msg.Message("Important_msg_from_qq5201351.msg")
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message.py", line 28, in __init__
    MessageBase.__init__(self, path, prefix, attachmentClass, filename, delayAttachments, overrideEncoding)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\message_base.py", line 61, in __init__
    self.named
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 424, in named
    self.__namedProperties = Named(self)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\named.py", line 63, in __init__
    self.__properties.append(StringNamedProperty(entry, names[entry['id']], msg._getTypedData(streamID)) if entry['pkind'] == constants.STRING_NAMED else NumericalNamedProperty(entry, msg._getTypedData(streamID)))
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 177, in _getTypedData
    found, result = self._getTypedStream('__substg1.0_' + id, prefix, _type)
  File "C:\Users\QQ5201351\AppData\Local\Programs\Python\Python37\lib\site-packages\extract_msg\msg.py", line 246, in _getTypedStream
    raise NotImplementedError('The stream specified is of type {}. We don\'t currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream {}".'.format(_type, _type))
NotImplementedError: The stream specified is of type 1014. We don't currently understand exactly how this type works. If it is mandatory that you have the contents of this stream, please create an issue labled "NotImplementedError: _getTypedStream 1014".
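
The following does not fix the error. It is only a sketch of how a batch run might skip the affected files and report them, so that one unreadable message does not stop the rest. The names extract_with_skips, msg_paths, out_root and open_msg are hypothetical. The traceback shows the exception being raised while the Message object is constructed, so that call is the one wrapped in try/except. The value 1014 appears to be the hexadecimal property type 0x1014, which the MS-OXCDATA specification lists as PtypMultipleInteger64, but I have not verified that against the affected files.

```python
from pathlib import Path


def extract_with_skips(msg_paths, out_root, open_msg=None):
    """Extract attachments from each file, skipping files that raise NotImplementedError.

    Returns (saved, failed): a list of processed file names and a dict that maps
    each skipped file name to the error text.
    """
    if open_msg is None:
        # Imported lazily so the helper can be exercised without the package installed.
        import extract_msg
        open_msg = extract_msg.Message

    saved, failed = [], {}
    for msg_path in map(Path, msg_paths):
        try:
            msg = open_msg(str(msg_path))
        except NotImplementedError as exc:
            # Record the file and carry on with the remaining messages.
            failed[msg_path.name] = str(exc)
            continue
        target = Path(out_root) / msg_path.stem
        target.mkdir(parents=True, exist_ok=True)
        for attachment in msg.attachments:
            attachment.save(customPath=str(target))
        saved.append(msg_path.name)
    return saved, failed
```

The failed dict then gives a list of the messages that still have to be handled manually in Outlook.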

 

 

 

Please respect the work of others. If you repost this article, be sure to cite the source: https://www.cnblogs.com/5201351/p/13695389.html

 

