1. encode
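encode converts a str into a bytes object (shown with a b prefix). Called with no arguments it uses UTF-8, not ASCII; the result is an immutable sequence of bytes.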
>>> a0 = '哈哈'
>>> b = a0.encode()
>>> type(b)
<class 'bytes'>
>>> b
b'\xe5\x93\x88\xe5\x93\x88'
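As a minimal sketch of the point above, the following compares the default encode call with an explicit UTF-8 encode and an ASCII encode. The variable names are illustrative and not part of the original session.

```python
text = '哈哈'

# With no argument, str.encode() uses UTF-8.
utf8_bytes = text.encode()
assert utf8_bytes == text.encode('utf-8')
print(utf8_bytes)                   # b'\xe5\x93\x88\xe5\x93\x88'
print(len(text), len(utf8_bytes))   # 2 6  (2 characters, 6 bytes)

# ASCII cannot represent these characters, so the encode call raises.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)                 # True
```

Each of the two characters takes three bytes in UTF-8, which is why the bytes object is three times as long as the str.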
2. decode
A bytes object can be decoded back into a str with decode, which also defaults to UTF-8.
>>> a1 = b'\xe7\x8e\x8b\xe8\x80\x85\xe5\x86\x9c\xe8\x8d\xaf'
>>> b = a1.decode()
>>> b
'王者农药'
>>> type(b)
<class 'str'>
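A small sketch of how decode behaves when the bytes are not valid UTF-8, assuming a deliberately truncated input. The names data, truncated, and lenient are hypothetical and used only for illustration.

```python
data = b'\xe7\x8e\x8b\xe8\x80\x85'

# bytes.decode() defaults to UTF-8.
text = data.decode()
assert text == data.decode('utf-8')
print(text)  # 王者

# Dropping the last byte leaves an incomplete multi-byte sequence.
truncated = data[:-1]
try:
    truncated.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)  # True

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
lenient = truncated.decode('utf-8', errors='replace')
print(lenient)
```

The default error handler is 'strict'; passing errors='replace' or errors='ignore' is a trade-off between failing loudly and silently losing data.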
3. encode('raw_unicode_escape') and decode('raw_unicode_escape')
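Sometimes a str holds what are really byte values, e.g. a = '\xe7\x8e\x8b\xe8\x80\x85\xe5\x86\x9c\xe8\x8d\xaf', where every character has a code point below 256.
encode('raw_unicode_escape') turns such a str into bytes with the same values, which can then be decoded as UTF-8 into the intended str.
decode('raw_unicode_escape') goes the other way: it turns bytes into a str whose characters carry the byte values.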
>>> a = '\xe7\x8e\x8b\xe8\x80\x85\xe5\x86\x9c\xe8\x8d\xaf'
>>> b = a.encode('raw_unicode_escape')
>>> type(b)
<class 'bytes'>
>>> b
b'\xe7\x8e\x8b\xe8\x80\x85\xe5\x86\x9c\xe8\x8d\xaf'
>>> b.decode()
'王者农药'
>>> b.decode('raw_unicode_escape')
'ç\x8e\x8bè\x80\x85å\x86\x9cè\x8d¯'
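The round trip above can be wrapped in a small helper. This is only a sketch: repair_byte_string is a hypothetical name, and it assumes the input str really contains UTF-8 byte values stored one per character.

```python
def repair_byte_string(s: str) -> str:
    """Hypothetical helper: reinterpret a str whose characters are UTF-8 byte values."""
    return s.encode('raw_unicode_escape').decode('utf-8')


garbled = '\xe7\x8e\x8b\xe8\x80\x85'
fixed = repair_byte_string(garbled)
print(fixed)  # 王者

# For code points below 256, 'latin-1' produces the same bytes.
assert garbled.encode('raw_unicode_escape') == garbled.encode('latin-1')

# Characters above U+00FF are written out as literal \uXXXX escapes instead.
escaped = '王'.encode('raw_unicode_escape')
print(escaped)  # b'\\u738b'
```

If the input may contain characters above U+00FF, the helper would feed escape sequences to the UTF-8 decoder rather than byte values, so it is only appropriate for strings of the kind shown in this section.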
4. encode('unicode-escape') and decode('unicode-escape')
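If a string literal is written with Unicode escapes, e.g. s = '\u5403\u9e21\u6218\u573a', Python 3 resolves the escapes when it reads the literal, because str is a Unicode type, so s is simply '吃鸡战场'.
encode('unicode-escape') encodes this str as bytes whose content is the escaped form, i.e. the literal characters \u5403 and so on.
decode('unicode-escape') converts bytes containing such escapes back into the str they describe.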
>>> a = '\u5403\u9e21\u6218\u573a'
>>> b = a.encode('unicode-escape')
>>> type(b)
<class 'bytes'>
>>> b
b'\\u5403\\u9e21\\u6218\\u573a'
>>> b.decode('utf-8')
'\\u5403\\u9e21\\u6218\\u573a'
>>> c = b.decode('utf-8')
>>> c
'\\u5403\\u9e21\\u6218\\u573a'
>>> c.encode()
b'\\u5403\\u9e21\\u6218\\u573a'
>>> c.encode().decode('unicode-escape')
'吃鸡战场'
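A typical use of this codec is turning escaped text that arrived as plain characters (for example, read from a file or a log) back into readable text. The sketch below assumes the input is pure ASCII; escaped_text is an illustrative name.

```python
# A raw string: 24 ASCII characters, backslashes included, no escapes resolved.
escaped_text = r'\u5403\u9e21\u6218\u573a'
print(len(escaped_text))  # 24

# Encode to bytes first, then let the codec resolve the escapes.
decoded = escaped_text.encode('ascii').decode('unicode_escape')
print(decoded)            # 吃鸡战场

# Encoding with the same codec gives back the escaped form.
roundtrip = decoded.encode('unicode_escape')
assert roundtrip == escaped_text.encode('ascii')
```

The unicode_escape codec decodes its input as Latin-1, so bytes that mix escapes with UTF-8 encoded non-ASCII text would come out garbled; the ASCII-only assumption avoids that.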
5. Python 2 uses ASCII as its default encoding; Python 3 uses UTF-8
Python 2.7.16 (default, Aug 24 2019, 18:37:03)
[GCC 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.32.4) (-macos10.15-objc-s on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = u'\u5403\u9e21\u6218\u573a'
>>> s
u'\u5403\u9e21\u6218\u573a'

Python 3.7.4 (v3.7.4:e09359112e, Jul 8 2019, 14:54:52)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = u'\u5403\u9e21\u6218\u573a'
>>> s
'吃鸡战场'
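The default can be checked at runtime with sys.getdefaultencoding(). The sketch below assumes it is run under Python 3; under Python 2 the same call reports 'ascii'.

```python
import sys

# The encoding used by str.encode() and bytes.decode() when none is given.
default_encoding = sys.getdefaultencoding()
print(sys.version_info.major, default_encoding)

# In Python 3 the u prefix is accepted but changes nothing: both are str.
same_literal = (u'\u5403\u9e21' == '\u5403\u9e21')
print(same_literal)  # True
```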