gRPC series: protobuf in detail

Reposted article. 2020-12-30 13:46. Tags: protobuf, golang, go protobuf, go, Protocol Buffer, grpc

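Protocol Buffers is a language-neutral, platform-neutral, extensible way of serializing structured data, commonly used for communication protocols, data storage, and similar tasks. Compared with JSON and XML it is smaller, faster, and simpler, which is why developers tend to favor it.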

Basic syntax

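syntax = "proto3";
package model;

// Placeholder request/reply types so the service definition compiles;
// the original snippet does not show their fields.
message Request {}
message Reply {}

service MyServ {
  rpc Query(Request) returns (Reply);
}

message Student {
  int64 id = 1;
  string name = 2;
  int32 age = 3;
}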

Once the proto file is defined, generate the code for the target language:

protoc --proto_path=. --go_out=plugins=grpc,paths=source_relative:. xxxx.proto

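--proto_path (or -I) sets the search path for proto files, both the ones compiled directly and the ones they import.
--go_out takes comma-separated parameters, followed by a colon and the output directory. With --go_out=plugins=grpc the gRPC service code is generated as well; without plugins=grpc only the message code is generated. The plugins=grpc parameter belongs to the protoc-gen-go shipped with github.com/golang/protobuf; the newer google.golang.org/protobuf generator does not accept it, and gRPC code is generated by the separate protoc-gen-go-grpc plugin via --go-grpc_out.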

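For example: --go_out=plugins=grpc,paths=import:. Note the paths parameter, which has two options, import and source_relative. The default is import, which creates the output directory hierarchy from the full import path of the generated Go package; source_relative creates it from the directory hierarchy of the proto source files. Directories that already exist are reused.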
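protoc supports different languages through a plugin mechanism. Given a --xxx_out argument, protoc first checks whether it has a built-in xxx generator; if not, it looks for an executable named protoc-gen-xxx on the system PATH.
For example, to generate C++ code: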

protoc -I . --grpc_out=. --plugin=protoc-gen-grpc=`which grpc_cpp_plugin` --cpp_out=. *.proto
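The naming convention described above can be sketched in Go. This is only an illustration of how a --xxx_out flag maps to a protoc-gen-xxx executable name, not protoc's actual implementation; the helper pluginFor is hypothetical, and built-in generators such as cpp never reach this lookup.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// pluginFor (hypothetical helper) derives the executable name that would be
// searched for, given an output flag such as "--go_out=plugins=grpc:.".
// It returns "" if the flag is not of the form --NAME_out[=...].
func pluginFor(flag string) string {
	name := flag
	if i := strings.IndexByte(flag, '='); i >= 0 {
		name = flag[:i]
	}
	if !strings.HasPrefix(name, "--") || !strings.HasSuffix(name, "_out") {
		return ""
	}
	lang := strings.TrimSuffix(strings.TrimPrefix(name, "--"), "_out")
	if lang == "" {
		return ""
	}
	return "protoc-gen-" + lang
}

func main() {
	plugin := pluginFor("--go_out=plugins=grpc:.")
	fmt.Println("plugin name:", plugin)

	// Look the executable up on PATH, much as a shell would.
	if path, err := exec.LookPath(plugin); err == nil {
		fmt.Println("found at", path)
	} else {
		fmt.Println("not found on PATH")
	}
}
```

The --plugin=protoc-gen-grpc=... form in the C++ command bypasses the PATH search by naming the executable explicitly.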

Importing dependent `proto` files

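For convenience, commonly shared fields are kept in one proto file and imported wherever they are needed. In my current layout, as the import paths below show, common.proto lives in the protos directory, with students.proto under protos/model and student_api.proto under protos/api.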

common.proto contains just one simple message:

syntax = "proto3";
package protos;
option go_package = "protos";
option java_package = "com.proto";


message Result {
  string code = 1;
  string desc = 2;
  bytes data = 3;
}

student_api.proto in the api directory
This file imports common.proto along with the other files it needs:

syntax = "proto3";
package api;
option go_package = "protos/api";
option java_package = "com.proto.api";

import "protos/common.proto";
import "protos/model/students.proto";
import "google/protobuf/empty.proto";


service StudentSrv {
  rpc NewStudent(model.Student) returns (protos.Result);
  rpc StudentByID(QueryStudent) returns (QueryStudentResponse);

  rpc AllStudent(google.protobuf.Empty) returns(stream QueryStudentResponse);
  rpc StudentInfo(stream QueryStudent) returns(stream QueryStudentResponse);
}

message QueryStudent {
  int64 id = 1;
}

message QueryStudentResponse {
  repeated model.Student studentList = 1;
}

When running protoc we have to tell it where to find these imported files. Run protoc from the project root to generate the code:

protoc  -I=. --go_out=plugins=grpc:. --go_opt=paths=source_relative protos/api/*.proto

The -I above points at the current directory, meaning proto files are resolved starting from the current directory.

What protoc generates

Take students.proto as an example:

syntax = "proto3";
package model;
option go_package = "protos/model";
option java_package = "com.proto.model";

message Student {
  int64 id = 1;
  string name = 2;
  int32 age = 3;
}

message StudentList {
  string class = 1;
  repeated Student students = 2;
  string teacher = 3;
  repeated int64 score = 4;
}

After running protoc, take a quick look at the generated Go file:

type Student struct {
	state         protoimpl.MessageState
	sizeCache     protoimpl.SizeCache
	unknownFields protoimpl.UnknownFields

	Id   int64  `protobuf:"varint,1,opt,name=id,proto3" json:"id,omitempty"`
	Name string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
	Age  int32  `protobuf:"varint,3,opt,name=age,proto3" json:"age,omitempty"`
}

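state holds the reflection information for the message, sizeCache caches the total length of the serialized data, and unknownFields holds fields that could not be parsed.
The remaining fields are the ones defined in our message. The tags are the interesting part.
protobuf:"varint,1,opt,name=id,proto3" json:"id,omitempty" says that this field uses protobuf's varint encoding, its index (field number) is 1, it is optional (opt), its name is id, and it uses proto3 syntax.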
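To make the tag layout concrete, here is a minimal sketch that reads the protobuf struct tag with the standard reflect package and splits it on commas. The Student type below is a hand-written copy of the exported fields only (the protoimpl bookkeeping fields are omitted so the example needs no third-party packages), and protoTag is a hypothetical helper, not part of the protobuf library.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Student mirrors the exported fields of the generated struct.
type Student struct {
	Id   int64  `protobuf:"varint,1,opt,name=id,proto3" json:"id,omitempty"`
	Name string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
	Age  int32  `protobuf:"varint,3,opt,name=age,proto3" json:"age,omitempty"`
}

// protoTag (hypothetical helper) returns the comma-separated parts of the
// protobuf tag on the named field, or nil if the field does not exist.
func protoTag(fieldName string) []string {
	f, ok := reflect.TypeOf(Student{}).FieldByName(fieldName)
	if !ok {
		return nil
	}
	return strings.Split(f.Tag.Get("protobuf"), ",")
}

func main() {
	for _, name := range []string{"Id", "Name", "Age"} {
		parts := protoTag(name)
		// parts[0] is the wire encoding, parts[1] the field number.
		fmt.Printf("%-4s encoding=%s number=%s rest=%v\n",
			name, parts[0], parts[1], parts[2:])
	}
}
```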
There is also a byte slice named file_protos_model_students_proto_rawDesc.

At first glance it is baffling: what is this blob? The nice thing about open source is that we can see exactly what it does.

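file_protos_model_students_proto_rawDesc is the descriptor of the proto file in serialized form: the path of the proto file, the package name, the message definitions, and so on.
What is this descriptor used for?
The Go runtime builds its type information from it, and calls such as proto.Marshal rely on that information: each message field's index and data type must agree with file_protos_model_students_proto_rawDesc. A mismatch means something is wrong.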

Data types supported by protobuf

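The protobuf wire format currently uses four wire types: Varint (0), 64-bit (1), Length-delimited (2), and 32-bit (5). Two more, Start group (3) and End group (4), are deprecated. protobuf is language-neutral: whatever data types a particular language supports, they are all converted to these wire types on marshal and converted back to the language's own types on unmarshal.
Let's convert a struct to JSON: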

Student {
	Id:   1,
	Name: "孙悟空",
	Age:  300,
}

{
  "id": 1,
  "name": "孙悟空",
  "age": 300
}
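For reference, the JSON form can be produced with Go's standard encoding/json package. This is a minimal sketch: the Student type here carries only the json tags from the generated struct, and toJSON is a hypothetical helper. json.Marshal emits compact output, without the indentation shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Student carries only the json tags of the generated struct.
type Student struct {
	Id   int64  `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
	Age  int32  `json:"age,omitempty"`
}

// toJSON (hypothetical helper) marshals a Student to compact JSON.
func toJSON(s Student) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	out := toJSON(Student{Id: 1, Name: "孙悟空", Age: 300})
	// len counts bytes, so each Chinese character contributes 3.
	fmt.Println(out, len(out))
}
```

The compact JSON is 37 bytes long, which is the figure to compare against the protobuf encoding of the same data.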

Converted to the protobuf wire format (in binary):

1000 1 10010 1001 11100101 10101101 10011001 11100110 10000010 10011111 11100111 10101001 10111010 11000 10101100 10

In decimal:

8 1 18 9 229 173 153 230 130 159 231 169 186 24 172 2

The JSON is readable at a glance, while the protobuf bytes are not. The sections below explain what each of them means.

Index and type

Start with the first byte, 1000. It encodes the field's index and its type.
protobuf packs a field's index and wire type together:

(field_number << 3) | wire_type

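The lowest 3 bits hold the wire type and the bits above them hold the index.
0000 1000: the most significant bit is the continuation flag, the index is 1, and the last three bits give wire_type 0 (Varint). Likewise, 0001 0010 has index 2 and wire_type 2 (Length-delimited).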
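The bit layout above can be checked with a few lines of Go. This is a minimal sketch for single-byte tags only; splitTag and makeTag are hypothetical helper names, and tags for field numbers above 15 span several varint bytes, which this sketch does not handle.

```go
package main

import "fmt"

// splitTag separates a single-byte tag into its index (field number)
// and wire type.
func splitTag(tag byte) (index int, wireType int) {
	return int(tag >> 3), int(tag & 0x07)
}

// makeTag is the inverse: (field_number << 3) | wire_type.
func makeTag(index, wireType int) byte {
	return byte(index<<3 | wireType)
}

func main() {
	// The three tag bytes that appear in the encoded Student message.
	for _, b := range []byte{0x08, 0x12, 0x18} {
		idx, wt := splitTag(b)
		fmt.Printf("%08b -> index=%d wire_type=%d\n", b, idx, wt)
	}
}
```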

Varint type

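In a Varint, the most significant bit (msb) of each byte is a flag: 1 means more bytes follow, 0 means this is the last byte. The remaining 7 bits store the value.
Id: 1 is encoded as 0000 0001, which is easy to understand.
Age: 300 is encoded as 1010 1100 0000 0010. How is that computed?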

protobuf data        1010 1100 0000 0010
drop the msb          010 1100  000 0010
reverse the groups    000 0010  010 1100
concatenate          100101100
compute              256 + 32 + 8 + 4 = 300
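The same steps can be written as a small encoder and decoder. This is a minimal sketch of base-128 varints for unsigned values, with hypothetical function names; it does not guard against inputs longer than 10 bytes the way a production decoder would.

```go
package main

import "fmt"

// encodeVarint emits 7 payload bits per byte, least significant group
// first, with the msb set on every byte except the last.
func encodeVarint(v uint64) []byte {
	var out []byte
	for v >= 0x80 {
		out = append(out, byte(v)|0x80)
		v >>= 7
	}
	return append(out, byte(v))
}

// decodeVarint reads one varint from b and returns the value and the
// number of bytes consumed; n == 0 means the input was truncated.
func decodeVarint(b []byte) (v uint64, n int) {
	var shift uint
	for i, c := range b {
		v |= uint64(c&0x7f) << shift
		if c&0x80 == 0 {
			return v, i + 1
		}
		shift += 7
	}
	return 0, 0
}

func main() {
	enc := encodeVarint(300)
	fmt.Printf("%08b\n", enc)

	v, n := decodeVarint(enc)
	fmt.Println(v, n)
}
```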

Length-delimited type

A string is stored as its UTF-8 bytes; each Chinese character here takes 3 bytes.
Here are the bytes of "孙悟空":

11100101 10101101 10011001 11100110 10000010 10011111 11100111 10101001 10111010

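The byte just before "孙悟空" is 1001.
What does it mean? Right: it is the length of the string in bytes, 9.
For Length-delimited data, the length comes first and the actual payload follows.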
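Putting the three pieces together (tag, Varint, Length-delimited), the whole Student message can be encoded by hand. The sketch below uses only the standard library and hypothetical helper names; it is not how the protobuf runtime is implemented, and unlike proto3 it does not skip zero-valued fields.

```go
package main

import "fmt"

// appendVarint appends v as a base-128 varint.
func appendVarint(b []byte, v uint64) []byte {
	for v >= 0x80 {
		b = append(b, byte(v)|0x80)
		v >>= 7
	}
	return append(b, byte(v))
}

// appendTag appends (field_number << 3) | wire_type as a varint.
func appendTag(b []byte, index, wireType uint64) []byte {
	return appendVarint(b, index<<3|wireType)
}

// encodeStudent hand-encodes the three Student fields in index order.
func encodeStudent(id int64, name string, age int32) []byte {
	var b []byte
	b = appendTag(b, 1, 0) // id: Varint
	b = appendVarint(b, uint64(id))
	b = appendTag(b, 2, 2)                 // name: Length-delimited
	b = appendVarint(b, uint64(len(name))) // length in bytes, not characters
	b = append(b, name...)
	b = appendTag(b, 3, 0) // age: Varint
	b = appendVarint(b, uint64(age))
	return b
}

func main() {
	fmt.Println(encodeStudent(1, "孙悟空", 300))
}
```

The result is 16 bytes: two bytes for id, eleven for name (tag, length, nine bytes of UTF-8), and three for age.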
