1. Introduction
TensorFlow has two main packages related to model quantization and compression: tensorflow/tensorflow/lite and tensorflow/tensorflow/contrib/quantize. Both used to live under contrib; only in a recent update was the lite package moved out into the main directory, so the current lite package should be fairly official by now.
2. Differences
As described in tensorflow/tensorflow/lite/tutorials/post_training_quant.ipynb (where "quantization aware training" refers to the quantize package):
In contrast to quantization aware training, the weights are quantized post training and the activations are quantized dynamically at inference in this method. Therefore, the model weights are not retrained to compensate for quantization induced errors. It is important to check the accuracy of the quantized model to ensure that the degradation is acceptable.
The differences between the two are:
- lite: quantizes after training is finished; the quantized model cannot be fine-tuned, so you need to check whether the accuracy degradation is acceptable (see the conversion sketch after this list);
- quantize: allows fine-tuning after quantization.
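A minimal sketch of the lite post-training route, assuming a trained model already exported as a SavedModel at a hypothetical saved_model_dir; the exact converter flags differ slightly across TF 1.x releases (older ones use converter.post_training_quantize = True instead of the optimizations list):

```python
import tensorflow as tf

# Post-training quantization with the lite converter: the trained weights are
# quantized during conversion, and no retraining/fine-tuning happens afterwards.
# "saved_model_dir" and "model_quant.tflite" are illustrative paths.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable weight quantization
tflite_quant_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)
```

Since the weights are not retrained to compensate for quantization error, it is worth re-evaluating the resulting .tflite model before deploying it.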
3. The quantize package
Documentation for the quantize package is rather sparse; the only material is the description of 4 functions on the API page (defined in quantize_graph.py):
create_eval_graph(input_graph=None)
Rewrites the eval input_graph in place to simulate quantization.

create_training_graph(input_graph=None, quant_delay=0)
Rewrites the training input_graph in place to simulate quantization. This function must be called before gradient ops are inserted into the graph. For a model that has already been trained, the default quant_delay is recommended; for a model trained from scratch, quant_delay should be set to the number of steps the model needs to converge, so that quantization starts at that step and the model is then fine-tuned. If quant_delay is not provided, training will very likely fail. A usage sketch for these two functions is given after this list.

experimental_create_eval_graph(input_graph=None, weight_bits=8, activation_bits=8, quant_delay=None, scope=None)
experimental_create_training_graph(input_graph=None, weight_bits=8, activation_bits=8, quant_delay=0, freeze_bn_delay=None, scope=None)
For now I am not sure how the experimental variants differ from the ones above, beyond exposing weight_bits, activation_bits, and scope in their signatures... to be filled in later.
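A minimal usage sketch of create_training_graph / create_eval_graph; the toy model, the quant_delay value, and the export comment are illustrative assumptions, not taken from the API docs:

```python
import tensorflow as tf

# 1) Build the forward pass first (toy single-layer classifier for illustration).
x = tf.placeholder(tf.float32, [None, 784], name="input")
logits = tf.layers.dense(x, 10, name="logits")
labels = tf.placeholder(tf.int64, [None], name="labels")
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

# 2) Rewrite the graph in place with fake-quantization ops BEFORE any gradient
#    ops exist. quant_delay=20000 is an arbitrary example: quantization only
#    kicks in once the float model has roughly converged, then fine-tuning runs.
tf.contrib.quantize.create_training_graph(
    input_graph=tf.get_default_graph(), quant_delay=20000)

# 3) Only now add the optimizer, which inserts the gradient ops.
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# For inference/export, rebuild the same forward pass in a fresh graph and call
#   tf.contrib.quantize.create_eval_graph(input_graph=tf.get_default_graph())
# before freezing the graph and handing it to the TFLite converter.
```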
4. To be added
What needs to come will come eventually.
RTFS.