Learning Python: pack and unpack examples with the struct module


import struct

pack, unpack, pack_into, unpack_from

# ref: http://blog.csdn.net/JGood/archive/2009/06/22/4290158.aspx
# Python 2 syntax (print statement)

import struct

# pack - unpack
print
print '===== pack - unpack ====='

str = struct.pack("ii", 20, 400)
print 'str:', str
print 'len(str):', len(str) # len(str): 8

a1, a2 = struct.unpack("ii", str)
print "a1:", a1  # a1: 20
print "a2:", a2  # a2: 400

print 'struct.calcsize:', struct.calcsize("ii") # struct.calcsize: 8


# unpack
print
print '===== unpack ====='

string = 'test astring'
format = '5s 4x 3s'   # 5-byte string, skip 4 pad bytes, 3-byte string
print struct.unpack(format, string) # ('test ', 'ing')

string = 'he is not very happy'
format = '2s 1x 2s 5x 4s 1x 5s'
print struct.unpack(format, string) # ('he', 'is', 'very', 'happy')


# pack
print
print '===== pack ====='

a = 20
b = 400

str = struct.pack("ii", a, b)
print 'length:', len(str) # length: 8
print str
print repr(str) # '\x14\x00\x00\x00\x90\x01\x00\x00'


# pack_into - unpack_from
print
print '===== pack_into - unpack_from ====='
from ctypes import create_string_buffer

buf = create_string_buffer(12)   # writable 12-byte buffer, zero-filled
print repr(buf.raw)

struct.pack_into("iii", buf, 0, 1, 2, -1)
print repr(buf.raw)

print struct.unpack_from("iii", buf, 0)

 

Output:

[work@db-testing-com06-vm3.db01.baidu.com Python]$ python struct_pack.py

===== pack - unpack =====
str: ?
len(str): 8
a1: 20
a2: 400
struct.calcsize: 8

===== unpack =====
('test ', 'ing')
('he', 'is', 'very', 'happy')

===== pack =====
length: 8
?
'\x14\x00\x00\x00\x90\x01\x00\x00'

===== pack_into - unpack_from =====
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
'\x01\x00\x00\x00\x02\x00\x00\x00\xff\xff\xff\xff'
(1, 2, -1)

 


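Python is a very concise language. Unlike other languages, it does not predefine a large number of data types (C#, for example, defines 8 integer types alone).

It defines only six basic types: string, integer, float, tuple, list, and dictionary (key/value).

These six types are enough for most work. But when Python has to talk to other platforms over a network, the data must be converted between Python's types and the types used by the other platform or language. For example: a client written in C++ sends an int variable (4 bytes) to a server written in Python. The Python side receives the 4 bytes that represent the integer; how does it parse them into an integer that Python understands? The standard library module struct exists to solve this problem.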
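
As a rough sketch of the scenario above: assuming the C++ client sends the int in little-endian byte order, the four received bytes could be decoded as shown below. The byte order and the helper name `parse_int32` are assumptions made for illustration, and the example uses Python 3 syntax, unlike the Python 2 listings in this article.

```python
import struct


def parse_int32(data):
    """Decode 4 bytes as a signed 32-bit integer (hypothetical helper).

    Assumes the sender used little-endian byte order ('<').
    """
    (value,) = struct.unpack("<i", data)
    return value


# The four bytes a little-endian client would send for the integer 400.
received = b"\x90\x01\x00\x00"
print(parse_int32(received))  # 400
```

Many network protocols use big-endian ("network") order instead, which struct selects with the `!` or `>` prefix. Whichever is chosen, both sides have to agree on it.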

 

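The struct module is small and not very difficult. The most commonly used functions are introduced below:

1. struct.pack
struct.pack converts Python values into a string according to a format string. (Python 2 has no separate byte type, so the string here can be thought of as a byte stream or byte array; in Python 3 the result is a bytes object.) Its prototype is struct.pack(fmt, v1, v2, ...), where fmt is the format string, described further below, and v1, v2, ... are the Python values to convert. The following example converts two integers into a string (byte stream):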

#!/usr/bin/env python

import struct

a = 20
b = 400
str = struct.pack("ii", a, b)
print 'length: ', len(str)          # length:  8
print str                           # unprintable binary data
print repr(str)                     # '\x14\x00\x00\x00\x90\x01\x00\x00'

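The format character "i" means convert to an int; 'ii' means two int values.

The converted result is 8 bytes long (an int occupies 4 bytes, so two ints occupy 8 bytes).

The printed result looks like garbage because it is binary data, not text.

The built-in function repr returns a readable representation, in which the hexadecimal values 0x00000014 and 0x00000190 represent 20 and 400 respectively. The bytes appear least significant first because the machine that produced this output is little-endian.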
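
The byte layout that repr shows depends on the machine's native byte order. The sketch below (Python 3 syntax, unlike the Python 2 listings in this article) uses the standard `<` and `>` format prefixes to pin the byte order explicitly, so the result is the same on any platform:

```python
import struct

little = struct.pack("<ii", 20, 400)   # little-endian, standard sizes
big = struct.pack(">ii", 20, 400)      # big-endian, standard sizes

print(little)  # b'\x14\x00\x00\x00\x90\x01\x00\x00'
print(big)     # b'\x00\x00\x00\x14\x00\x00\x01\x90'
```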

 

2. struct.unpack
struct.unpack does the opposite of struct.pack: it converts a byte stream into Python values. Its prototype is struct.unpack(fmt, string), and it returns a tuple.

Here is a simple example:

#!/usr/bin/env python

import struct

a = 20
b = 400

# pack
str = struct.pack("ii", a, b)
print 'length: ', len(str)          # length:  8
print str                           # unprintable binary data
print repr(str)                     # '\x14\x00\x00\x00\x90\x01\x00\x00'

# unpack
str2 = struct.unpack("ii", str)
print 'length: ', len(str2)          # length:  2
print str2                           # (20, 400)
print repr(str2)                     # (20, 400)
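
One detail worth illustrating: the buffer passed to unpack must be exactly as long as the format requires, otherwise struct.error is raised. A minimal Python 3 sketch (the variable names are chosen for illustration):

```python
import struct

packed = struct.pack("<ii", 20, 400)

# unpack always returns a tuple, which can be destructured directly.
first, second = struct.unpack("<ii", packed)

# A buffer whose length differs from calcsize(fmt) is rejected.
try:
    struct.unpack("<ii", packed[:6])
    mismatch_detected = False
except struct.error:
    mismatch_detected = True

print(first, second, mismatch_detected)  # 20 400 True
```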

3. struct.calcsize
struct.calcsize computes the length of the result that a format string corresponds to. For example, struct.calcsize('ii') returns 8, because two ints occupy 8 bytes.

import struct
print "len: ", struct.calcsize('i')       # len:  4
print "len: ", struct.calcsize('ii')      # len:  8
print "len: ", struct.calcsize('f')       # len:  4
print "len: ", struct.calcsize('ff')      # len:  8
print "len: ", struct.calcsize('s')       # len:  1
print "len: ", struct.calcsize('ss')      # len:  2
print "len: ", struct.calcsize('d')       # len:  8
print "len: ", struct.calcsize('dd')      # len:  16
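
The sizes above add up simply because each format repeats a single type. With mixed types, the default native mode may insert alignment padding, so calcsize is not always the plain sum of the field sizes. A small Python 3 sketch; the native result is platform dependent, so no fixed value is claimed for it:

```python
import struct

# Standard modes ('=', '<', '>', '!') use fixed sizes and no padding.
standard_size = struct.calcsize("=ci")   # 1 + 4 = 5

# Native mode ('@', the default) may pad after the char so the int is aligned.
native_size = struct.calcsize("@ci")     # platform dependent, commonly 8

print(standard_size, native_size)
```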

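4. struct.pack_into, struct.unpack_from
Both functions are described in the Python manual, but without usage examples. In practice they are not used very often. After a long search on Google I found one example, shared here: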

#!/usr/bin/env python

import struct
from ctypes import create_string_buffer

buf = create_string_buffer(12)   # writable 12-byte buffer, zero-filled
print repr(buf.raw)     # '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

struct.pack_into("iii", buf, 0, 1, 2, -1)   # write three ints starting at offset 0
print repr(buf.raw)     # '\x01\x00\x00\x00\x02\x00\x00\x00\xff\xff\xff\xff'

print struct.unpack_from("iii", buf, 0)     # (1, 2, -1)
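
The offset argument is what sets these two functions apart from pack and unpack. The following Python 3 sketch uses a plain bytearray as the writable buffer instead of ctypes (a substitution made here for illustration) and writes each value at its own offset:

```python
import struct

buf = bytearray(12)                       # writable, zero-filled buffer

struct.pack_into("<i", buf, 0, 1)         # write at offset 0
struct.pack_into("<i", buf, 4, 2)         # write at offset 4
struct.pack_into("<i", buf, 8, -1)        # write at offset 8

all_values = struct.unpack_from("<iii", buf, 0)
(last_value,) = struct.unpack_from("<i", buf, 8)   # read only the third int

print(all_values, last_value)  # (1, 2, -1) -1
```

Writing into a preallocated buffer avoids creating a new string for every field, which can help when assembling a larger message piece by piece.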

struct format characters

 

Format  C Type              Python type         Standard size  Notes
x       pad byte            no value
c       char                string of length 1  1
b       signed char         integer             1              (3)
B       unsigned char       integer             1              (3)
?       _Bool               bool                1              (1)
h       short               integer             2              (3)
H       unsigned short      integer             2              (3)
i       int                 integer             4              (3)
I       unsigned int        integer             4              (3)
l       long                integer             4              (3)
L       unsigned long       integer             4              (3)
q       long long           integer             8              (2), (3)
Q       unsigned long long  integer             8              (2), (3)
f       float               float               4              (4)
d       double              float               8              (4)
s       char[]              string              1
p       char[]              string
P       void *              integer                            (5), (3)

 

 

Notes:

  1. The '?' conversion code corresponds to the _Bool type defined by C99. If this type is not available, it is simulated using a char. In standard mode, it is always represented by one byte.

    New in version 2.6.

  2. The 'q' and 'Q' conversion codes are available in native mode only if the platform C compiler supports C long long, or, on Windows, __int64. They are always available in standard modes.

    New in version 2.2.

  3. When attempting to pack a non-integer using any of the integer conversion codes, if the non-integer has a __index__() method then that method is called to convert the argument to an integer before packing. If no __index__() method exists, or the call to __index__() raises TypeError, then the __int__() method is tried. However, the use of __int__() is deprecated, and will raise DeprecationWarning.

    Changed in version 2.7: Use of the __index__() method for non-integers is new in 2.7.

    Changed in version 2.7: Prior to version 2.7, not all integer conversion codes would use the __int__() method to convert, and DeprecationWarning was raised only for float arguments.

  4. For the 'f' and 'd' conversion codes, the packed representation uses the IEEE 754 binary32 (for 'f') or binary64 (for 'd') format, regardless of the floating-point format used by the platform.

  5. The 'P' format character is only available for the native byte ordering (selected as the default or with the '@' byte order character). The byte order character '=' chooses to use little- or big-endian ordering based on the host system. The struct module does not interpret this as native ordering, so the 'P' format is not available.

A format character may be preceded by an integral repeat count. For example, the format string '4h' means exactly the same as 'hhhh'.

Whitespace characters between formats are ignored; a count and its format must not contain whitespace though.
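
A quick way to check the repeat-count and whitespace rules is to compare the packed results directly. A minimal Python 3 sketch:

```python
import struct

values = (1, 2, 3, 4)

with_count = struct.pack("<4h", *values)
spelled_out = struct.pack("<hhhh", *values)
with_spaces = struct.pack("<2h 2h", *values)   # whitespace between formats is ignored

print(with_count == spelled_out == with_spaces)  # True
```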

For the 's' format character, the count is interpreted as the size of the string, not a repeat count like for the other format characters; for example, '10s' means a single 10-byte string, while '10c' means 10 characters. For packing, the string is truncated or padded with null bytes as appropriate to make it fit. For unpacking, the resulting string always has exactly the specified number of bytes. As a special case, '0s' means a single, empty string (while '0c' means 0 characters).
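
The difference between a size count ('s') and a repeat count ('c') can be seen in a short Python 3 sketch. Note that Python 3 requires bytes objects here, whereas the quoted text describes Python 2 strings:

```python
import struct

padded = struct.pack("10s", b"abc")          # padded with null bytes to 10 bytes
truncated = struct.pack("3s", b"abcdef")     # truncated to the first 3 bytes
chars = struct.unpack("3c", b"abc")          # three separate 1-byte values

print(padded)     # b'abc\x00\x00\x00\x00\x00\x00\x00'
print(truncated)  # b'abc'
print(chars)      # (b'a', b'b', b'c')
```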

The 'p' format character encodes a “Pascal string”, meaning a short variable-length string stored in a fixed number of bytes, given by the count. The first byte stored is the length of the string, or 255, whichever is smaller. The bytes of the string follow. If the string passed in to pack() is too long (longer than the count minus 1), only the leading count-1 bytes of the string are stored. If the string is shorter than count-1, it is padded with null bytes so that exactly count bytes in all are used. Note that for unpack(), the 'p' format character consumes count bytes, but that the string returned can never contain more than 255 characters.
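
The Pascal-string layout described above (a length byte, then the data, then null padding) can be sketched in Python 3 as follows:

```python
import struct

short = struct.pack("8p", b"abc")         # length byte 3, data, then null padding
clipped = struct.pack("4p", b"abcdef")    # only count-1 = 3 bytes are kept

print(short)                       # b'\x03abc\x00\x00\x00\x00'
print(clipped)                     # b'\x03abc'
print(struct.unpack("8p", short))  # (b'abc',)
```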

For the 'P' format character, the return value is a Python integer or long integer, depending on the size needed to hold a pointer when it has been cast to an integer type. A NULL pointer will always be returned as the Python integer 0. When packing pointer-sized values, Python integer or long integer objects may be used. For example, the Alpha and Merced processors use 64-bit pointer values, meaning a Python long integer will be used to hold the pointer; other platforms use 32-bit pointers and will use a Python integer.

For the '?' format character, the return value is either True or False. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.
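
The truth-value behaviour of '?' can be sketched in Python 3 as follows. The `<` prefix selects the standard one-byte representation so the result does not depend on the platform:

```python
import struct

packed_true = struct.pack("<?", "non-empty")   # truthy object packs as 1
packed_false = struct.pack("<?", [])           # falsy object packs as 0

print(packed_true, packed_false)               # b'\x01' b'\x00'
print(struct.unpack("<?", b"\x05"))            # (True,): any non-zero byte is True
```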

