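1. Introduction

TensorFlow has two main packages for model quantization and compression: tensorflow/tensorflow/lite and tensorflow/tensorflow/contrib/quantize. Both used to live under contrib; lite was moved out to the main directory in a recent update, so the lite version should now be the more official of the two.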

2. Differences

As described in tensorflow/tensorflow/lite/tutorials/post_training_quant.ipynb (where "quantization aware training" refers to the quantize package):

In contrast to quantization aware training, the weights are quantized post training and the activations are quantized dynamically at inference in this method. Therefore, the model weights are not retrained to compensate for quantization induced errors. It is important to check the accuracy of the quantized model to ensure that the degradation is acceptable.

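The difference between the two is:

lite: quantizes after training has finished. The quantized model cannot be fine-tuned, so you have to decide whether the loss of accuracy is acceptable;
quantize: simulates quantization during training, so the model can still be fine-tuned after quantization is applied.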
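
To make the accuracy concern concrete, here is a minimal pure-Python sketch of quantizing weights after training. It assumes symmetric, per-tensor 8-bit quantization, which is an illustrative choice and not necessarily the scheme the lite converter uses; `quantize_weights` and `dequantize_weights` are hypothetical names, not TensorFlow functions.

```python
def quantize_weights(weights, num_bits=8):
    """Symmetric per-tensor quantization: floats -> signed integers plus one scale."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_weights(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical weights from an already trained layer.
weights = [0.82, -1.27, 0.003, 0.45, -0.6]
q, scale = quantize_weights(weights)
restored = dequantize_weights(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight moves by at most half a quantization step. Because the weights are not retrained afterwards, nothing compensates for that error, which is why the accuracy of the quantized model has to be checked.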

3. About quantize

Documentation for the quantize package is sparse. The API page only describes four functions (defined in quantize_graph.py):

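create_eval_graph(input_graph=None)
Rewrites an eval input_graph in place for simulated quantization.
create_training_graph(input_graph=None, quant_delay=0)
Rewrites a training input_graph in place for simulated quantization. This function must be called before gradient ops are added to the graph. For a model that has already been trained, the default quant_delay is recommended. For a model trained from scratch, quant_delay should be set to the number of steps the model needs to converge; quantization is switched on at that step and in effect fine-tunes the model. If quant_delay is not provided when training from scratch, training is likely to fail.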
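experimental_create_eval_graph(input_graph=None, weight_bits=8, activation_bits=8, quant_delay=None, scope=None)
I do not yet know how the experimental variants differ from the functions above, beyond the extra parameters visible in their signatures (weight_bits, activation_bits, scope, and freeze_bn_delay for training). To be filled in later.
experimental_create_training_graph(input_graph=None, weight_bits=8, activation_bits=8, quant_delay=0, freeze_bn_delay=None, scope=None)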
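
The "simulated quantization" these functions set up, and the role of quant_delay, can be sketched in plain Python. This is a conceptual illustration only, not the quantize package's implementation: `fake_quant` and `maybe_fake_quant` are hypothetical names, and the fixed [range_min, range_max] interval is an assumption made to keep the example short.

```python
def fake_quant(x, range_min, range_max, num_bits=8):
    """Snap x to one of 2**num_bits levels in [range_min, range_max],
    then return it as a float again (quantize followed by dequantize)."""
    levels = 2 ** num_bits - 1
    scale = (range_max - range_min) / levels
    clipped = min(max(x, range_min), range_max)
    return range_min + round((clipped - range_min) / scale) * scale


def maybe_fake_quant(x, step, quant_delay, **kwargs):
    """Leave x untouched until the training step reaches quant_delay."""
    if step < quant_delay:
        return x
    return fake_quant(x, **kwargs)


x = 0.3337  # a hypothetical weight or activation value
for step in (0, 999, 1000, 5000):
    y = maybe_fake_quant(x, step, quant_delay=1000, range_min=-1.0, range_max=1.0)
    print(step, y)
```

In this sketch, quant_delay=0 means quantization is simulated from the first step, which matches the recommendation for fine-tuning an already trained model. A larger quant_delay lets the floating-point model converge before the rounding error is introduced.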

4. To be added

What has to come will come eventually.
RTFS.
