Protocol Buffer

Reposted 2013-12-06 10:38. Tags: .net, Protocol Buffer

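Protocol Buffers is the structured-data serialization format used at Google. It uses a simple binary encoding that is smaller than XML or JSON and faster to encode and decode. The official project site describes the comparison with XML this way: protocol buffers "are 3 to 10 times smaller" and "are 20 to 100 times faster".

A comparison under .NET is available in "Results of Northwind database rows serialization benchmarks", which uses the .NET implementation ProtoBuf.net.

The protobuf project itself is written in C++. The .NET implementations are protobuf-net and protobuf-csharp-port. Another .NET project is Proto#, but its author no longer appears to maintain it.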
Usage overview
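First, define the message types:

    message Person {
      required string name = 1;
      required int32 id = 2;
      optional string email = 3;

      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }

      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }

      repeated PhoneNumber phone = 4;
    }

Field Rules: required marks a field that must be present, optional marks a field that may be omitted, and repeated marks a field that may occur any number of times.

Field Type: the data type of the field. The scalar value types include double, float, int32, int64, uint32, uint64, sint32, sint64, fixed32, fixed64, bool, string and bytes, among others. Enums and nested or referenced message types are also supported.

Field Tags: the number assigned to the field (for example the 1 in name = 1). It is a positive integer, and the serialized binary uses it to identify the field, which takes less space than the field name would.

For the full syntax, see the Language Guide on the official site.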
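Message types are defined in .proto files. protoc.exe generates C++, Java, Python and other class files from a .proto file. These classes define the objects that represent the messages, along with the methods that encode and decode them.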
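On size: as the message definition above shows, identifying fields by tag instead of by name already saves space, and the encoding protocol also represents each data type as compactly as it can. On speed: a binary protocol has the advantage over parsing a text-based format.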
Encoding overview - 2.3.0
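For the full encoding protocol, see Encoding on the official site.

Base 128 Varints

A 32-bit integer stored at fixed width takes 4 bytes, so even the value 1 takes 4 bytes, which wastes space. A varint stores an integer in a variable number of bytes and drops the high-order bytes that are all zero. In each byte that remains, the most significant bit is a continuation flag: 0 means this is the last byte of the integer, 1 means more bytes follow. The other 7 bits hold the value. The 7-bit groups are stored in little-endian order, least significant group first.

Examples: the integer 1 has the varint 0000 0001. One byte is enough, so the most significant bit is 0 and the remaining 7 bits hold the plain binary value of 1. The integer 300 needs 2 bytes, 1010 1100 0000 0010. The first byte has its most significant bit set to 1 and the last byte has it set to 0. Decoding proceeds as follows:

a) Drop the most significant bit of each byte: 010 1100 000 0010
b) Reorder the groups from little-endian: 000 0010 010 1100
c) The binary value 100101100 is 300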
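The varint rules above can be sketched in a few lines of Python. This is a minimal illustration and not the official library code; the function names `encode_varint` and `decode_varint` are made up for this example, and only non-negative integers are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base 128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        group = value & 0x7F          # lowest 7 bits go out first
        value >>= 7
        if value:
            out.append(group | 0x80)  # flag bit 1: more bytes follow
        else:
            out.append(group)         # flag bit 0: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode one varint starting at the first byte of data."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)  # drop the flag bit, place the group
        if not byte & 0x80:
            break
    return result


print(encode_varint(1).hex())                # 01
print(encode_varint(300).hex())              # ac02
print(decode_varint(bytes.fromhex("ac02")))  # 300
```

The output for 300, ac 02, is the same 1010 1100 0000 0010 pattern worked through by hand above.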
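ZigZag encoding

Varints work well for unsigned integers but cannot compress negative numbers: in two's complement a negative value has its high bits set, so it always takes the maximum number of bytes. For the signed types sint32 and sint64, Protocol Buffers therefore applies ZigZag encoding first and then stores the result as a varint. For 32-bit signed numbers the ZigZag formula is (n << 1) ^ (n >> 31), and for 64-bit signed numbers it is (n << 1) ^ (n >> 63).

Note: the right shift is arithmetic. Shifting a 32-bit signed number right by 31 gives all 0 bits for a non-negative number and all 1 bits for a negative number.

The resulting mapping is 0=>0, -1=>1, 1=>2, -2=>3, 2=>4 and so on. Signed numbers are mapped to unsigned numbers, and values with a small absolute value get small codes, so the varint encoding can do its job.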
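A short Python sketch of the ZigZag mapping follows. The function names are hypothetical. Python integers are unbounded, so the result is masked to the chosen width, which is an adaptation of the fixed-width formula and not part of it.

```python
def zigzag_encode(n: int, bits: int = 32) -> int:
    """Map a signed integer of the given width to an unsigned one."""
    mask = (1 << bits) - 1
    # Python's >> on a negative int is an arithmetic shift, matching the
    # signed shift the formula relies on.
    return ((n << 1) ^ (n >> (bits - 1))) & mask


def zigzag_decode(z: int) -> int:
    """Inverse mapping: even codes are non-negative, odd codes are negative."""
    return (z >> 1) ^ -(z & 1)


print([zigzag_encode(n) for n in (0, -1, 1, -2, 2)])  # [0, 1, 2, 3, 4]
print(zigzag_decode(3))                               # -2
```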
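Protocol Buffers uses 32 bits for float and fixed32, and 64 bits for double and fixed64.

Strings, bytes, embedded messages and similar data are length-delimited: the data begins with a varint giving its length in bytes, followed by the data itself.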
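As a rough illustration of the two layouts, the sketch below prefixes a UTF-8 string with its varint length and packs fixed-width values with Python's `struct` module. The helper names are made up for this example. The fixed-width values are packed little-endian, which is how the protobuf wire format stores them.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Base 128 varint for a non-negative integer."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_length_delimited(payload: bytes) -> bytes:
    """Varint byte length followed by the raw bytes."""
    return encode_varint(len(payload)) + payload


# A 7-byte string: one length byte, then the UTF-8 bytes.
print(encode_length_delimited("testing".encode("utf-8")).hex())  # 0774657374696e67

# Fixed-width values always take 4 or 8 bytes, however small the number.
print(struct.pack("<I", 1).hex())    # fixed32: 01000000
print(struct.pack("<f", 1.0).hex())  # float:   0000803f
print(struct.pack("<d", 1.0).hex())  # double:  000000000000f03f
```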
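Message structure

All fields of a message are serialized into a byte stream of key-value pairs. The stream contains neither the field names nor the declared types; that information has to come from the message definition.

A key contains two things: the field tag assigned to the field in the message definition, and the wire type used by the Protocol Buffers encoding. The field tag is a positive integer and the wire type is a small non-negative integer. The two are combined as (field_tag << 3) | wire_type, and the resulting integer is written as a base 128 varint. The field value follows the key. The wire types are: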

Type	Meaning	Used For
0	Varint	int32, int64, uint32, uint64, sint32, sint64, bool, enum
1	64-bit	fixed64, sfixed64, double
2	Length-delimited	string, bytes, embedded messages, packed repeated fields
3	Start group	groups (deprecated)
4	End group	groups (deprecated)
5	32-bit	fixed32, sfixed32, float
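The key formula and the table above can be tried out with a small Python sketch. The names `make_key`, `split_key` and `WIRE_TYPES` are made up for illustration. The key is itself written as a varint, but for field tags below 16 it fits in a single byte, so the varint step is left out here.

```python
# Wire type numbers and meanings, copied from the table above.
WIRE_TYPES = {
    0: "Varint",
    1: "64-bit",
    2: "Length-delimited",
    3: "Start group",
    4: "End group",
    5: "32-bit",
}


def make_key(field_tag: int, wire_type: int) -> int:
    """Combine a field tag and a wire type into a key."""
    return (field_tag << 3) | wire_type


def split_key(key: int) -> tuple:
    """Recover (field_tag, wire_type) from a key."""
    return key >> 3, key & 0x7


print(hex(make_key(1, 0)))  # 0x8
print(hex(make_key(3, 2)))  # 0x1a
tag, wire_type = split_key(0x1A)
print(tag, WIRE_TYPES[wire_type])  # 3 Length-delimited
```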

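Example: take the message type

    message Test1 { required int32 attr = 1; }

Create a Test1 object and set its attr field to 150. The object is encoded as follows. The field type is int32, whose wire type is 0, so the key is (1 << 3) | 0 => 0000 1000. The value 150 is varint-encoded:

    150 => 10010110             // binary
        => 000 0001 001 0110    // split into 7-bit groups
        => 001 0110 000 0001    // little-endian order
        => 1001 0110 0000 0001  // set the continuation bits
        => 96 01                // hexadecimal

So the encoded Test1 object is, in hexadecimal: 08 96 01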
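The hand calculation above can be reproduced with a short Python sketch. The helper names are hypothetical, and the sketch covers only non-negative values of a varint field (wire type 0), which is all this example needs.

```python
def encode_varint(value: int) -> bytes:
    """Base 128 varint for a non-negative integer."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # continuation bit set
        value >>= 7
    out.append(value)                      # last byte, continuation bit clear
    return bytes(out)


def encode_varint_field(field_tag: int, value: int) -> bytes:
    """Key (tag and wire type 0) followed by the varint value."""
    key = (field_tag << 3) | 0
    return encode_varint(key) + encode_varint(value)


# Test1 with attr = 150, where attr has field tag 1.
print(encode_varint_field(1, 150).hex())  # 089601
```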
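Now take an embedded message type defined as

    message Test3 { required Test1 c = 3; }

Its encoded form in hexadecimal looks like 1A 03 08 96 01. The key 1A is (3 << 3) | 2, field tag 3 with wire type 2. The bytes 08 96 01 are the Test1 object from the example above; inside Test3 it is handled the same way as a string, and the 03 in front of it is the varint giving its length.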
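Going the other way, a decoder can walk the byte stream key by key. The sketch below is an illustrative reader with made-up function names that handles only wire types 0 and 2, the two that appear in this example. Because the stream carries no type information, the reader has to be told that field 3 holds an embedded message before parsing its payload again.

```python
def read_varint(data: bytes, pos: int):
    """Read one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_fields(data: bytes):
    """Return a list of (field_tag, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_tag, wire_type = key >> 3, key & 0x7
        if wire_type == 0:                      # varint value
            value, pos = read_varint(data, pos)
        elif wire_type == 2:                    # length-delimited value
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        fields.append((field_tag, wire_type, value))
    return fields


outer = read_fields(bytes.fromhex("1a03089601"))
print(outer)                      # [(3, 2, b'\x08\x96\x01')]
print(read_fields(outer[0][2]))   # [(1, 0, 150)]
```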
Using protobuf-csharp-port
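protobuf-csharp-port is used the same way as protobuf itself: during development, protoc.exe and ProtoGen.exe generate the message classes used for serialization and deserialization, and at run time encoding and decoding go through those classes.

Download the project source code from GitHub (there is currently no release package). The project includes the AddressBook sample.

Generating the C# classes used for messaging takes 2 steps.

Step 1: use protoc.exe from the lib directory to generate the binary representation:

    protoc --descriptor_set_out=addressbook.protobin --proto_path=..\protos --include_imports ..\protos\tutorial\addressbook.proto

Step 2: use the compiled ProtoGen.exe to generate the C# classes from the binary representation:

    ProtoGen.exe addressbook.protobin

This produces several .cs files, including AddressBookProtos.cs, which contains the message types defined in addressbook.proto.

The run-time project needs to reference the compiled Google.ProtocolBuffers.dll and uses AddressBookProtos.cs to encode and decode. See the AddressBook sample project for detailed usage.

Running AddressBook.exe looks like this:

[Screenshot]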

Once the entered objects are serialized to binary, they are saved by default to the file addressbook.data. ProtoDump.exe can read this binary file:

[Screenshot]

Using protobuf-net - r282
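protobuf-net is used quite differently from Google's protobuf. It follows .NET programming conventions, which makes it convenient in .NET serialization scenarios, and it supports WCF's DataContract, so a WCF program needs almost no changes to use protobuf-net.

Download protobuf-net, reference protobuf-net.dll from the project, and define the test objects as follows: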

 
    using System;
    using System.Collections.Generic;
    using System.Text;
    using ProtoBuf;

    [ProtoContract]
    public class TestObject
    {
        [ProtoMember(1)]
        public string StringAttr1 { get; set; }

        [ProtoMember(2)]
        public string StringAttr2 { get; set; }

        [ProtoMember(3)]
        public int IntAttr { get; set; }

        [ProtoMember(4)]
        public long LongAttr { get; set; }

        [ProtoMember(5)]
        public decimal DecimalAttr { get; set; }

        [ProtoMember(6)]
        public float FloatAttr { get; set; }

        [ProtoMember(7)]
        public int[] ArrayAttr { get; set; }

        [ProtoMember(8)]
        public IList<string> ListAttr { get; set; }

        [ProtoMember(9)]
        public InnerObject EmbeddedAttr { get; set; }

        // Human-readable dump used to check the round trip.
        public override string ToString()
        {
            StringBuilder sb = new StringBuilder()
                .Append("TestObject {\r\n")
                .Append("   StringAttr1: \"").Append(this.StringAttr1).Append("\",\r\n")
                .Append("   StringAttr2: \"").Append(this.StringAttr2).Append("\",\r\n")
                .Append("   IntAttr: ").Append(this.IntAttr).Append(",\r\n")
                .Append("   LongAttr: ").Append(this.LongAttr).Append(",\r\n")
                .Append("   DecimalAttr: ").Append(this.DecimalAttr).Append(",\r\n")
                .Append("   FloatAttr: ").Append(this.FloatAttr).Append(",\r\n");
            if (this.ArrayAttr != null)
            {
                sb.Append("   ArrayAttr: [ ");
                foreach (int i in this.ArrayAttr) sb.Append(i).Append(", ");
                sb.Remove(sb.Length - 2, 2); // drop the trailing ", "
                sb.Append(" ],\r\n");
            }
            if (this.ListAttr != null)
            {
                sb.Append("   ListAttr: [ ");
                foreach (string s in this.ListAttr) sb.Append('"').Append(s).Append("\", ");
                sb.Remove(sb.Length - 2, 2); // drop the trailing ", "
                sb.Append(" ],\r\n");
            }
            if (this.EmbeddedAttr != null)
                sb.Append("   EmbeddedAttr: ").Append(this.EmbeddedAttr.ToString()).Append("\r\n");
            return sb.Append("}").ToString();
        }
    }

    [ProtoContract]
    public class InnerObject
    {
        [ProtoMember(1)]
        public string Attr1 { get; set; }

        [ProtoMember(2)]
        public DateTime Attr2 { get; set; }

        [ProtoMember(3)]
        public bool Attr3 { get; set; }

        // Tags need not be consecutive, only unique within the type.
        [ProtoMember(6)]
        public byte Attr4 { get; set; }

        [ProtoMember(9)]
        public sbyte Attr5 { get; set; }

        public override string ToString()
        {
            return new StringBuilder()
                .Append("{\r\n")
                .Append("      Attr1: \"").Append(this.Attr1).Append("\",\r\n")
                .Append("      Attr2: \"").Append(this.Attr2.ToString("yyyy-MM-dd")).Append("\",\r\n")
                .Append("      Attr3: ").Append(this.Attr3).Append(",\r\n")
                .Append("      Attr4: ").Append(this.Attr4).Append(",\r\n")
                .Append("      Attr5: ").Append(this.Attr5).Append("\r\n")
                .Append("   }").ToString();
        }
    }

The test code is as follows:

 
    using (MemoryStream ms = new MemoryStream())
    {
        TestObject obj = new TestObject()
        {
            StringAttr1 = "string 1",
            StringAttr2 = "string 2",
            IntAttr = 300,
            LongAttr = 1,
            DecimalAttr = 34.10091M,
            FloatAttr = 12.3f,
            ArrayAttr = new int[] { 600, -9, 0 },
            ListAttr = new List<string> { "string 3", "string 5" },
            EmbeddedAttr = new InnerObject()
            {
                Attr1 = "string 6",
                Attr2 = new DateTime(2010, 2, 1),
                Attr3 = false,
                Attr4 = 8,
                Attr5 = -63
            }
        };
        Serializer.Serialize<TestObject>(ms, obj);
        ms.Flush();
        ms.Position = 0;
        TestObject obj2 = Serializer.Deserialize<TestObject>(ms);
        Console.WriteLine(obj2);
        Console.ReadKey();
    }

Result of the run:

[Screenshot]

Appendix
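Sign-magnitude, one's complement and two's complement

For a signed number the highest bit is the sign bit. For a positive number the sign-magnitude, one's complement and two's complement forms are all the same, namely the number itself. For a negative number, the one's complement form is the sign-magnitude form with the value bits inverted, and the two's complement form is the one's complement form plus 1. For example, in 8 bits -1 is 1000 0001 in sign-magnitude, 1111 1110 in one's complement and 1111 1111 in two's complement. Negative numbers are stored in two's complement. To get the binary representation of a negative number from the corresponding positive number, invert every bit (including the sign bit) and add 1.

There are two kinds of complement: one's complement (1's complement) and two's complement (2's complement). By definition, the one's complement inverts every bit, and the two's complement inverts every bit and then adds 1. For example, on an 8-bit processor 9 is 0000 1001 in binary, its one's complement is 1111 0110 and its two's complement is 1111 0111.

When negative numbers are represented in one's complement there is both a positive zero (0x00) and a negative zero (0xff), and adding signed numbers requires an end-around carry, for example:

[Figure: end-around carry addition example]

If the addition overflows, the carry-out bit has to be added back to the lowest bit, so signed addition and unsigned addition need different algorithms. Two's complement has none of these problems.

For more on two's complement, see Ruan Yifeng's article 關於2的補碼 ("On two's complement"). A more rigorous treatment is Wikipedia's Method of complements: in binary, the radix complement is called the two's complement and the diminished radix complement is called the one's complement; in decimal, the radix complement is called the ten's complement and the diminished radix complement is called the nine's complement.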
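To make the difference between the two representations concrete, the Python sketch below works in 8 bits and computes 5 + (-3) both ways. The function names and the operands are chosen for illustration and are not taken from the original figure.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF


def ones_complement(x: int) -> int:
    """Invert every bit."""
    return ~x & MASK


def twos_complement(x: int) -> int:
    """Invert every bit, then add 1."""
    return (~x + 1) & MASK


def add_ones_complement(a: int, b: int) -> int:
    """Add two 8-bit one's complement numbers with end-around carry."""
    total = a + b
    if total > MASK:                 # a carry left the top bit
        total = (total & MASK) + 1   # add it back at the lowest bit
    return total & MASK


print(format(ones_complement(9), "08b"))  # 11110110
print(format(twos_complement(9), "08b"))  # 11110111

# 5 + (-3) in one's complement: the carry has to wrap around.
print(format(add_ones_complement(5, ones_complement(3)), "08b"))  # 00000010

# 5 + (-3) in two's complement: plain addition, the carry is discarded.
print(format((5 + twos_complement(3)) & MASK, "08b"))             # 00000010
```

In the two's complement case the adder is the same one used for unsigned numbers, which is the consistency the text refers to.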

Source: http://www.cnblogs.com/RicCC/archive/2010/03/10/protocol-buffers.html
