Hot deployment of models with docker + tensorflow-serving


Deploying multiple models

(1) Deploy two models, faster-rcnn and retina, at the same time by building a folder that holds both exported models.
The folder structure is:

mutimodel
├── faster_rcnn
│   └── 1
│       ├── assets
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
├── model.config
└── retina
    └── 1
        ├── assets
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index

The contents of model.config are:

model_config_list {
  config {
    name: 'rcnn',
    model_platform: "tensorflow",
    base_path: '/models/mutimodel/faster_rcnn'  # base_path is the mapped path inside the docker container
  },
  config {
    name: 'retina',
    model_platform: "tensorflow",
    base_path: '/models/mutimodel/retina'  # base_path is the mapped path inside the docker container
  }
}

(2) Start docker
sudo docker run -p 8501:8501 -p 8500:8500 --mount type=bind,source=/home/techi/techi/code/model_saved_files/mutimodel,target=/models/mutimodel -t tensorflow/serving --model_config_file=/models/mutimodel/model.config
Here, target is the path where the host directory is mounted inside the container, and the path passed to --model_config_file must also be this container-side (mounted) path.
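Once the container is up, a quick sanity check is the REST model-status endpoint on port 8501 (this assumes the port mapping above and that you query from the same host):

curl http://localhost:8501/v1/models/rcnn
curl http://localhost:8501/v1/models/retina

Each call returns the load state of that model's versions. Because --file_system_poll_wait_seconds defaults to 1 (see the flag list further down), copying a new version directory such as faster_rcnn/2/ alongside faster_rcnn/1/ is enough for the running server to notice it, load it, and hot-swap to the new version, which is what makes the hot deployment in the title work.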

Write the client code. The example below queries a generic MNIST classifier over gRPC; in practice model_spec.name must match one of the name entries in model.config.

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

options = [('grpc.max_send_message_length', 1000 * 1024 * 1024),
           ('grpc.max_receive_message_length', 1000 * 1024 * 1024)]

channel = grpc.insecure_channel('xxx', options=options)

stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = tf.reshape(x_test, (10000, -1))
# print(x_test)
x_test = tf.cast(x_test, dtype=tf.float32)
x_test = (x_test - 127.5) / 127.5

request = predict_pb2.PredictRequest()
request.model_spec.name = "yoloV3"  # must match one of the `name` entries in model.config
request.model_spec.signature_name = 'serving_default'
request.inputs['input_1'].CopyFrom(tf.make_tensor_proto(x_test[:20], shape=(20, 784)))  # tensors are sent over the wire as a TensorProto
response = stub.Predict(request, 10.0)
output = tf.make_ndarray(response.outputs['dense_2'])  # input/output tensor names can be inspected with: saved_model_cli show --dir '<relative or absolute model dir>' --all
output = tf.argmax(output, axis=-1)
print(output)
print(y_test[:20])
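The same prediction can also be made over the REST endpoint on port 8501, which avoids the grpc dependency on the client side. A minimal sketch, assuming the requests package is installed and the server is reachable on localhost with the port mapping used above; the model name, signature name, and output key still have to match what saved_model_cli reports for your model:

import numpy as np
import requests
import tensorflow as tf

(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = (x_test.reshape(10000, -1).astype('float32') - 127.5) / 127.5

# POST /v1/models/<model name>:predict with the instances to score
payload = {'signature_name': 'serving_default', 'instances': x_test[:20].tolist()}
resp = requests.post('http://localhost:8501/v1/models/yoloV3:predict', json=payload)
predictions = np.array(resp.json()['predictions'])
print(np.argmax(predictions, axis=-1))
print(y_test[:20])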

There are four kinds of images on Docker Hub:

:latest
:latest-gpu
:latest-devel
:latest-devel-gpu
The two-way split (CPU vs. GPU, stable vs. devel) gives the four images. The stable images start tensorflow_model_server from the Dockerfile, so the service is up as soon as the container is created. With the devel images you have to enter the container after creating it and start tensorflow_model_server manually, as in the sketch below.
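For example, with the devel image the workflow looks roughly like this (a sketch; it assumes the tensorflow_model_server binary is already on the PATH inside tensorflow/serving:latest-devel):

sudo docker run -it -p 8500:8500 -p 8501:8501 --mount type=bind,source=/home/techi/techi/code/model_saved_files/mutimodel,target=/models/mutimodel tensorflow/serving:latest-devel bash
# inside the container, start the service by hand:
tensorflow_model_server --port=8500 --rest_api_port=8501 --model_config_file=/models/mutimodel/model.config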
If you need to change the startup parameters, you can use a command like the following:
docker run -p 8501:8501 -v /root/mutimodel/linear_model:/models/linear_model -t --entrypoint=tensorflow_model_server tensorflow/serving --port=8500 --model_name=linear_model --model_base_path=/models/linear_model --rest_api_port=8501
-t --entrypoint=tensorflow_model_server tensorflow/serving: with the stable image you cannot get into the container's bash environment after it starts; --entrypoint lets you "indirectly" enter the container and invoke the tensorflow_model_server command yourself, which is what makes it possible to pass the flags that follow.
The full list of tensorflow serving flags is:
Flags:
    --port=8500                         int32   Port to listen on for gRPC API
    --grpc_socket_path=""               string  If non-empty, listen to a UNIX socket for gRPC API on the given path. Can be either relative or absolute path.
    --rest_api_port=0                   int32   Port to listen on for HTTP/REST API. If set to zero HTTP/REST API will not be exported. This port must be different than the one specified in --port.
    --rest_api_num_threads=16           int32   Number of threads for HTTP/REST API processing. If not set, will be auto set based on number of CPUs.
    --rest_api_timeout_in_ms=30000      int32   Timeout for HTTP/REST API calls.
    --enable_batching=false             bool    enable batching
    --batching_parameters_file=""       string  If non-empty, read an ascii BatchingParameters protobuf from the supplied file name and use the contained values instead of the defaults.
    --model_config_file=""              string  If non-empty, read an ascii ModelServerConfig protobuf from the supplied file name, and serve the models in that file. This config file can be used to specify multiple models to serve and other advanced parameters including non-default version policy. (If used, --model_name, --model_base_path are ignored.)
    --model_name="default"              string  name of model (ignored if --model_config_file flag is set)
    --model_base_path=""                string  path to export (ignored if --model_config_file flag is set, otherwise required)
    --max_num_load_retries=5            int32   maximum number of times it retries loading a model after the first failure, before giving up. If set to 0, a load is attempted only once. Default: 5
    --load_retry_interval_micros=60000000   int64   The interval, in microseconds, between each servable load retry. If set negative, it doesn't wait. Default: 1 minute
    --file_system_poll_wait_seconds=1   int32   Interval in seconds between each poll of the filesystem for new model version. If set to zero poll will be exactly done once and not periodically. Setting this to negative value will disable polling entirely causing ModelServer to indefinitely wait for a new model at startup. Negative values are reserved for testing purposes only.
    --flush_filesystem_caches=true      bool    If true (the default), filesystem caches will be flushed after the initial load of all servables, and after each subsequent individual servable reload (if the number of load threads is 1). This reduces memory consumption of the model server, at the potential cost of cache misses if model files are accessed after servables are loaded.
    --tensorflow_session_parallelism=0  int64   Number of threads to use for running a Tensorflow session. Auto-configured by default.Note that this option is ignored if --platform_config_file is non-empty.
    --tensorflow_intra_op_parallelism=0 int64   Number of threads to use to parallelize the executionof an individual op. Auto-configured by default.Note that this option is ignored if --platform_config_file is non-empty.
    --tensorflow_inter_op_parallelism=0 int64   Controls the number of operators that can be executed simultaneously. Auto-configured by default.Note that this option is ignored if --platform_config_file is non-empty.
    --ssl_config_file=""                string  If non-empty, read an ascii SSLConfig protobuf from the supplied file name and set up a secure gRPC channel
    --platform_config_file=""           string  If non-empty, read an ascii PlatformConfigMap protobuf from the supplied file name, and use that platform config instead of the Tensorflow platform. (If used, --enable_batching is ignored.)
    --per_process_gpu_memory_fraction=0.000000  float   Fraction that each process occupies of the GPU memory space the value is between 0.0 and 1.0 (with 0.0 as the default) If 1.0, the server will allocate all the memory when the server starts, If 0.0, Tensorflow will automatically select a value.
    --saved_model_tags="serve"          string  Comma-separated set of tags corresponding to the meta graph def to load from SavedModel.
    --grpc_channel_arguments=""         string  A comma separated list of arguments to be passed to the grpc server. (e.g. grpc.max_connection_age_ms=2000)
    --enable_model_warmup=true          bool    Enables model warmup, which triggers lazy initializations (such as TF optimizations) at load time, to reduce first request latency.
    --version=false                     bool    Display version
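As an example of the flags above, --enable_batching=true together with --batching_parameters_file lets the server batch concurrent requests. A minimal batching_parameters_file might look like the following (a sketch; the values are placeholders, and the fields are written in the wrapper-message form used by the ascii BatchingParameters protobuf):

max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
num_batch_threads { value: 4 }
max_enqueued_batches { value: 100 }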

On assigning version labels (aliases):
Please note that labels can only be assigned to model versions that are already loaded and available for serving. Once a model version is available, one may reload the model config on the fly to assign a label to it. This can be achieved using a HandleReloadConfigRequest RPC or, if the server is set up to periodically poll the filesystem for the config file, as described above.

If you would like to assign a label to a version that is not yet loaded (for ex. by supplying both the model version and the label at startup time) then you must set the --allow_version_labels_for_unavailable_models flag to true, which allows new labels to be assigned to model versions that are not loaded yet.
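Putting this together, a model.config that pins two versions of the rcnn model and labels them might look like the following (a sketch; versions 1 and 2 are hypothetical and must exist as subdirectories of the base_path):

model_config_list {
  config {
    name: 'rcnn'
    base_path: '/models/mutimodel/faster_rcnn'
    model_platform: 'tensorflow'
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
    version_labels {
      key: 'stable'
      value: 1
    }
    version_labels {
      key: 'canary'
      value: 2
    }
  }
}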

docker run -p 8501:8501 -p 8500:8500 -v /root/mutimodel/:/models/mutimodel  -t tensorflow/serving --model_config_file=/models/mutimodel/model.config --allow_version_labels_for_unavailable_models=true
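On the client side, a label can then be selected instead of a hard-coded version number. A minimal sketch based on the gRPC client above ('stable' is the hypothetical label defined in the config sketch):

request = predict_pb2.PredictRequest()
request.model_spec.name = 'rcnn'
request.model_spec.version_label = 'stable'  # resolve through the label instead of a fixed version number
request.model_spec.signature_name = 'serving_default'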

