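Tuning Faster R-CNN / SSD Model Parameters in the TensorFlow Object Detection API

Reposted article, 2019-11-05 16:45

For setting up the TensorFlow Object Detection API, see the earlier article: https://becominghuman.ai/tensorflow-object-detection-api-tutorial-training-and-evaluating-custom-object-detector-ed2594afcf73

In this article I discuss how to change the configuration of a pretrained model. The goal is to let you configure TensorFlow/models for your own application, so that the API is no longer a black box.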

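Overview of this article:

- Understand Protocol Buffers and proto files.
- Use that knowledge of proto files to understand a model's configuration file.
- Follow 3 steps to update a model's parameters.

Other examples:

- Changing the weight initializer
- Changing the weight optimizer
- Evaluating the pretrained model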

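Protocol Buffers

To modify a model, we need to understand its internals. The TensorFlow Object Detection API uses Protocol Buffers, a language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think of XML, but smaller, faster, and simpler. The API uses the proto2 version of the Protocol Buffers language. I will try to explain as much of the language as is needed to update a preconfigured model. For more details, see the Protocol Buffers language documentation and its Python tutorial.

Working with Protocol Buffers takes three steps:

Define the message format in a .proto file. This file acts as a blueprint for all messages: it shows which parameters a message accepts, the data type of each parameter, whether it is required or optional, its tag number, its default value, and so on. The API's proto files live in research/object_detection/protos. To illustrate, I use the grid_anchor_generator.proto file.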

syntax = "proto2";

package object_detection.protos;

// Configuration proto for GridAnchorGenerator. See
// anchor_generators/grid_anchor_generator.py for details.
message GridAnchorGenerator {
   // Anchor height in pixels.
  optional int32 height = 1 [default = 256];

  // Anchor width in pixels.
  optional int32 width = 2 [default = 256];

  // Anchor stride in height dimension in pixels.
  optional int32 height_stride = 3 [default = 16];

  // Anchor stride in width dimension in pixels.
  optional int32 width_stride = 4 [default = 16];

  // Anchor height offset in pixels.
  optional int32 height_offset = 5 [default = 0];

  // Anchor width offset in pixels.
  optional int32 width_offset = 6 [default = 0];

  // At any given location, len(scales) * len(aspect_ratios) anchors are
  // generated with all possible combinations of scales and aspect ratios.

  // List of scales for the anchors.
  repeated float scales = 7;

  // List of aspect ratios for the anchors.
  repeated float aspect_ratios = 8;
}

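Lines 30-33 of the proto file show that scales and aspect_ratios are repeated fields with no default value, so in practice they have to be supplied for the GridAnchorGenerator message. The remaining parameters are optional and fall back to their default values when omitted.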
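To make the defaults concrete, here is a minimal pure-Python sketch that mirrors the fields of the listing above. The class `GridAnchorGeneratorSketch` is a hypothetical stand-in written for illustration, not the class that protoc generates; it only copies the default values from the proto file and the rule stated in its comment (len(scales) * len(aspect_ratios) anchors per location).

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class GridAnchorGeneratorSketch:
    """Hypothetical stand-in for the GridAnchorGenerator message (not generated code)."""
    height: int = 256            # optional, default taken from the proto file
    width: int = 256
    height_stride: int = 16
    width_stride: int = 16
    height_offset: int = 0
    width_offset: int = 0
    scales: List[float] = field(default_factory=list)         # repeated, no default
    aspect_ratios: List[float] = field(default_factory=list)  # repeated, no default

    def anchors_per_location(self) -> int:
        # Every scale is combined with every aspect ratio.
        return len(self.scales) * len(self.aspect_ratios)


# Values taken from faster_rcnn_resnet50_pets.config.
anchor_cfg = GridAnchorGeneratorSketch(
    scales=[0.25, 0.5, 1.0, 2.0],
    aspect_ratios=[0.5, 1.0, 2.0],
)
combinations = list(product(anchor_cfg.scales, anchor_cfg.aspect_ratios))
print(anchor_cfg.height, anchor_cfg.width)   # unset fields keep their defaults
print(anchor_cfg.anchors_per_location())
```

With the repeated fields left empty, the sketch yields zero anchors per location, which is why those two fields effectively have to be set.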

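After defining the message format, we need to compile the protocol buffers. The compiler generates classes from the .proto files. While installing the API we ran the following command, which performs this compilation: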

# From tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.
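As a rough illustration of what this command leaves behind: with --python_out, protoc writes one Python module per .proto file, named with a _pb2.py suffix. The helper below, `generated_module`, is a hypothetical name used only for this sketch; it computes the expected file name and does not call protoc.

```python
from pathlib import PurePosixPath


def generated_module(proto_path: str) -> str:
    """Return the module path protoc is expected to write for a .proto file."""
    path = PurePosixPath(proto_path)
    # foo.proto -> foo_pb2.py, in the same directory when --python_out=.
    return str(path.with_name(path.stem + "_pb2.py"))


print(generated_module("object_detection/protos/grid_anchor_generator.proto"))
```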

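After defining and compiling the protocol buffers, we use the Python protocol buffer API to write and read messages. In our case, the configuration file can be thought of as the front end to that API: it lets us write and read messages without dealing with the internals of the TensorFlow API. In other words, we can update a pretrained model's parameters by editing the configuration file appropriately.

Understanding the configuration file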

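Clearly, the configuration file lets us change a model's parameters as needed. The next question is how exactly to change them. This section and the next answer that question, and this is where knowledge of proto files comes in handy. For demonstration I use the faster_rcnn_resnet50_pets.config file.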

# Faster R-CNN with Resnet-50 (v1), configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet50'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  max_number_of_boxes: 50
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/pet_train.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
}

eval_config: {
  num_examples: 2000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/pet_val.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/pet_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
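One way to see how the blocks of this listing nest is to read a fragment of it into nested dictionaries. The parser below is a minimal sketch under stated assumptions: `parse_text_config` is a hypothetical helper, it handles only `name { ... }` blocks and `name: value` lines (one per line), and it is not the protobuf text-format parser that the API itself uses.

```python
import re


def parse_text_config(text):
    """Parse a small subset of protobuf text format into nested dicts.

    Every field maps to a list, because a field such as `schedule` may repeat.
    Assumes one field or brace per line and no '#' inside quoted strings.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and blanks
        if not line:
            continue
        if line == "}":                          # close the current message
            stack.pop()
            continue
        block = re.fullmatch(r"(\w+)\s*:?\s*\{", line)
        if block:                                # open a nested message
            child = {}
            stack[-1].setdefault(block.group(1), []).append(child)
            stack.append(child)
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        stack[-1].setdefault(name, []).append(value.strip("'\""))
    return root


sample = """
model {
  faster_rcnn {
    num_classes: 37
  }
}
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      momentum_optimizer_value: 0.9
    }
  }
}
"""
parsed = parse_text_config(sample)
print(parsed["model"][0]["faster_rcnn"][0]["num_classes"])
print(sorted(parsed["train_config"][0]))
```

The printed keys show the same parent and child relationships that the proto files define: num_classes sits inside faster_rcnn, which sits inside model, while batch_size and optimizer are siblings inside train_config.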

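Lines 7 to 10 of the config show that num_classes is one of the parameters of the faster_rcnn message, which is in turn a parameter of the model message. Likewise, optimizer is a child message of the parent train_config message, and batch_size is another parameter of train_config. We can verify this by checking the corresponding proto file: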

syntax = "proto2";

package object_detection.protos;

import "object_detection/protos/anchor_generator.proto";
import "object_detection/protos/box_predictor.proto";
import "object_detection/protos/hyperparams.proto";
import "object_detection/protos/image_resizer.proto";
import "object_detection/protos/losses.proto";
import "object_detection/protos/post_processing.proto";

// Configuration for Faster R-CNN models.
// See meta_architectures/faster_rcnn_meta_arch.py and models/model_builder.py
//
// Naming conventions:
// Faster R-CNN models have two stages: a first stage region proposal network
// (or RPN) and a second stage box classifier.  We thus use the prefixes
// `first_stage_` and `second_stage_` to indicate the stage to which each
// parameter pertains when relevant.
message FasterRcnn {

  // Whether to construct only the Region Proposal Network (RPN).
  optional int32 number_of_stages = 1 [default=2];

  // Number of classes to predict.
  optional int32 num_classes = 3;
  
  // Image resizer for preprocessing the input image.
  optional ImageResizer image_resizer = 4;

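Lines 20 and 26 of the proto file make it clear that num_classes is one of the optional parameters of the FasterRcnn message (faster_rcnn in the config). I hope the discussion so far helps in understanding how the configuration file is organized. Now it is time to actually update one of the model's parameters.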

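Step 1: Decide which parameter to update

Suppose we need to update the image_resizer parameter mentioned on line 10 of the faster_rcnn_resnet50_pets.config file.

Step 2: Search the repository for the parameter

The goal is to find the proto file that defines the parameter. To do so, we search the repository with the following query: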
```
parameter_name path:research/object_detection/protos
#in our case parameter_name="image_resizer" thus,
image_resizer path:research/object_detection/protos
```
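Here path:research/object_detection/protos restricts the search scope. GitHub's documentation on searching code has more information. The search results for image_resizer path:research/object_detection/protos make it clear that, to update the image_resizer parameter, we need to analyze the image_resizer.proto file.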
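If the repository is cloned locally, the same lookup can be done offline. The following is a minimal sketch, assuming the proto files sit in a single directory; `find_proto_files` is a hypothetical helper, and the demo runs against two throwaway files rather than the real repository.

```python
import re
import tempfile
from pathlib import Path


def find_proto_files(protos_dir, parameter_name):
    """Return names of .proto files that mention parameter_name as a whole word."""
    pattern = re.compile(rf"\b{re.escape(parameter_name)}\b")
    hits = []
    for path in sorted(Path(protos_dir).glob("*.proto")):
        if pattern.search(path.read_text(encoding="utf-8")):
            hits.append(path.name)
    return hits


# Demo on a temporary directory with made-up file contents (illustration only).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "image_resizer.proto").write_text(
        "// image_resizer options\nmessage ImageResizer {}\n", encoding="utf-8")
    Path(tmp, "optimizer.proto").write_text(
        "message Optimizer {}\n", encoding="utf-8")
    demo_hits = find_proto_files(tmp, "image_resizer")

print(demo_hits)
```

On a real checkout the first argument would be the research/object_detection/protos directory, and the search may return more than one file, just as the GitHub query can.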

Step 3: Analyze the proto file

syntax = "proto2";

package object_detection.protos;

// Configuration proto for image resizing operations.
// See builders/image_resizer_builder.py for details.
message ImageResizer {
  oneof image_resizer_oneof {
    KeepAspectRatioResizer keep_aspect_ratio_resizer = 1;
    FixedShapeResizer fixed_shape_resizer = 2;
  }
}

// Enumeration type for image resizing methods provided in TensorFlow.
enum ResizeType {
  BILINEAR = 0; // Corresponds to tf.image.ResizeMethod.BILINEAR
  NEAREST_NEIGHBOR = 1; // Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
  BICUBIC = 2; // Corresponds to tf.image.ResizeMethod.BICUBIC
  AREA = 3; // Corresponds to tf.image.ResizeMethod.AREA
}

// Configuration proto for image resizer that keeps aspect ratio.
message KeepAspectRatioResizer {
  // Desired size of the smaller image dimension in pixels.
  optional int32 min_dimension = 1 [default = 600];

  // Desired size of the larger image dimension in pixels.
  optional int32 max_dimension = 2 [default = 1024];

  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];

  // Whether to pad the image with zeros so the output spatial size is
  // [max_dimension, max_dimension]. Note that the zeros are padded to the
  // bottom and the right of the resized image.
  optional bool pad_to_max_dimension = 4 [default = false];

  // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
  optional bool convert_to_grayscale = 5 [default = false];

  // Per-channel pad value. This is only used when pad_to_max_dimension is True.
  // If unspecified, a default pad value of 0 is applied to all channels.
  repeated float per_channel_pad_value = 6;
}

// Configuration proto for image resizer that resizes to a fixed shape.
message FixedShapeResizer {
  // Desired height of image in pixels.
  optional int32 height = 1 [default = 300];

  // Desired width of image in pixels.
  optional int32 width = 2 [default = 300];

  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];

  // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
  optional bool convert_to_grayscale = 4 [default = false];
}
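The oneof block at the top of this listing means that at most one of its members is set at any time: in the protobuf API, setting one member of a oneof clears the others. A minimal pure-Python sketch of that behavior follows; `OneofSketch` is a hypothetical class for illustration, not the generated ImageResizer class.

```python
class OneofSketch:
    """Holds at most one named member at a time, like a proto2 oneof."""

    def __init__(self, *member_names):
        self._allowed = set(member_names)
        self._name = None
        self._value = None

    def set(self, name, value):
        if name not in self._allowed:
            raise KeyError(f"{name} is not a member of this oneof")
        # Assigning a member replaces whichever member was set before.
        self._name, self._value = name, value

    def which(self):
        """Name of the member currently set, or None."""
        return self._name

    def value(self):
        return self._value


resizer = OneofSketch("keep_aspect_ratio_resizer", "fixed_shape_resizer")
resizer.set("keep_aspect_ratio_resizer", {"min_dimension": 600, "max_dimension": 1024})
resizer.set("fixed_shape_resizer", {"height": 600, "width": 500})
print(resizer.which())
```

This is why a config file names exactly one resizer inside image_resizer: choosing fixed_shape_resizer means keep_aspect_ratio_resizer is no longer in effect.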

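Lines 8-10 show that we can resize images with either keep_aspect_ratio_resizer or fixed_shape_resizer. Looking at lines 23-44, we can see that the keep_aspect_ratio_resizer message has the parameters min_dimension, max_dimension, resize_method, pad_to_max_dimension, convert_to_grayscale, and per_channel_pad_value. The fixed_shape_resizer message has the parameters height, width, resize_method, and convert_to_grayscale. The proto file states the data type of every parameter. So, to change the image_resizer type, we can change the following lines in the config file.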

#before
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
#after
image_resizer {
  fixed_shape_resizer {
    height: 600
    width: 500
    resize_method: AREA
  }
}

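The configuration above resizes images to 500 x 600 (width x height) using the AREA resize method. The resize methods available in TensorFlow are the ones documented under tf.image.ResizeMethod.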
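To compare the two resizers, the sketch below computes the output size each one would produce. The keep-aspect-ratio arithmetic is an assumption derived from the field comments in image_resizer.proto (scale the smaller side to min_dimension unless that pushes the larger side past max_dimension); the actual implementation may round differently, and both function names are hypothetical.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Assumed behavior: fit the smaller side to min_dimension, cap the larger side."""
    scale = min_dimension / min(height, width)
    if round(max(height, width) * scale) > max_dimension:
        # The larger side would overflow, so fit it to max_dimension instead.
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


def fixed_shape_size(height=300, width=300):
    """fixed_shape_resizer ignores the input size and always returns (height, width)."""
    return height, width


print(keep_aspect_ratio_size(480, 640))    # smaller side scaled up to 600
print(keep_aspect_ratio_size(500, 2000))   # larger side capped at 1024
print(fixed_shape_size(height=600, width=500))
```

The practical difference is that fixed_shape_resizer distorts images whose aspect ratio differs from height:width, while keep_aspect_ratio_resizer yields variable output sizes.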

Other examples

Any parameter can be updated or added with the steps discussed in the previous section. Here I walk through a few frequently used examples.

Changing the weight initializer

- Decide to change the initializer parameter on line 35 of the faster_rcnn_resnet50_pets.config file.
- Search the repository for initializer path:research/object_detection/protos. The search results make it clear that we need to analyze the hyperparams.proto file.
- Lines 68-74 of hyperparams.proto describe the initializer configuration:

```
message Initializer {
  oneof initializer_oneof {
    TruncatedNormalInitializer truncated_normal_initializer = 1;
    VarianceScalingInitializer variance_scaling_initializer = 2;
    RandomNormalInitializer random_normal_initializer = 3;
  }
}
```
We can use random_normal_initializer instead of truncated_normal_initializer. To do so, we need to look at lines 99-102 of hyperparams.proto:

message RandomNormalInitializer {
  optional float mean = 1 [default = 0.0];
  optional float stddev = 2 [default = 1.0];
}

Clearly, random_normal_initializer has two parameters, mean and stddev. We can change the following lines in the config file to use random_normal_initializer.
```
#before
initializer {
  truncated_normal_initializer {
    stddev: 0.01
  }
}
#after
initializer {
  random_normal_initializer {
    mean: 1
    stddev: 0.5
  }
}
```
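To get a feel for how the two initializers differ, here is a minimal standard-library sketch. It assumes the usual TensorFlow convention that a truncated normal redraws any sample lying more than two standard deviations from the mean; the function names are hypothetical and nothing here calls TensorFlow.

```python
import random


def random_normal(mean, stddev, rng):
    """Plain Gaussian sample, as random_normal_initializer would draw."""
    return rng.gauss(mean, stddev)


def truncated_normal(mean, stddev, rng):
    """Gaussian sample, redrawn until it lies within 2 stddev of the mean (assumed rule)."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)  # fixed seed so the demo is repeatable
truncated_samples = [truncated_normal(0.0, 0.01, rng) for _ in range(1000)]
plain_samples = [random_normal(1.0, 0.5, rng) for _ in range(1000)]

print(max(abs(v) for v in truncated_samples) <= 0.02)
print(min(plain_samples), max(plain_samples))
```

The truncated samples never leave the band of plus or minus 0.02, whereas the plain normal with mean 1 and stddev 0.5 has no such bound, so the "after" config starts from noticeably larger and more spread-out weights.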
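
Changing the weight optimizer

- Decide to change the momentum_optimizer parameter of the parent message optimizer on line 87 of the faster_rcnn_resnet50_pets.config file.
- Search the repository for optimizer path:research/object_detection/protos. The search results make it clear that we need to analyze the optimizer.proto file.
- Lines 9-14 of optimizer.proto describe the optimizer configuration:

message Optimizer {
  oneof optimizer {
    RMSPropOptimizer rms_prop_optimizer = 1;
    MomentumOptimizer momentum_optimizer = 2;
    AdamOptimizer adam_optimizer = 3;
  }

Clearly, instead of momentum_optimizer we can use adam_optimizer, which has proven to be a good optimizer. To do so, we make the following changes in the faster_rcnn_resnet50_pets.config file.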
```
#before
optimizer {
  momentum_optimizer: {
    learning_rate: {
      manual_step_learning_rate {
        initial_learning_rate: 0.0003
        schedule {
          step: 900000
          learning_rate: .00003
        }
        schedule {
          step: 1200000
          learning_rate: .000003
        }
      }
    }
    momentum_optimizer_value: 0.9
  }
  use_moving_average: false
}
#after
optimizer {
  adam_optimizer: {
    learning_rate: {
      manual_step_learning_rate {
        initial_learning_rate: 0.0003
        schedule {
          step: 900000
          learning_rate: .00003
        }
        schedule {
          step: 1200000
          learning_rate: .000003
        }
      }
    }
  }
  use_moving_average: false
}
```
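Both variants keep the same manual_step_learning_rate schedule. The sketch below shows how such a schedule can be read: the rate stays at initial_learning_rate until a schedule step is reached, then switches to that entry's value. Treating the boundary step itself as already using the new rate is an assumption, and the function name is hypothetical.

```python
def manual_step_learning_rate(step, initial_learning_rate, schedule):
    """Return the learning rate in effect at a given global step.

    schedule is a list of (step, learning_rate) pairs, as in the config.
    """
    rate = initial_learning_rate
    for boundary, value in sorted(schedule):
        if step >= boundary:     # assumption: the new rate applies from the boundary on
            rate = value
    return rate


SCHEDULE = [(900000, 0.00003), (1200000, 0.000003)]

for step in (0, 200000, 900000, 1500000):
    print(step, manual_step_learning_rate(step, 0.0003, SCHEDULE))
```

With num_steps: 200000 in train_config, training stops long before step 900000, which matches the note in the config file that the learning rate never decays.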
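
Evaluating the pretrained model

The evaluation job waits 300 seconds between checks for an updated training checkpoint. If you have a capable GPU, you can train and evaluate at the same time. More often, though, doing so exhausts the available resources. To work around this, we can train the model first, save it to a directory, and evaluate it afterwards. To evaluate later, we make the following changes in the config file: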
```
#before
eval_config: {
  num_examples: 2000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
#after
eval_config: {
  num_examples: 10
  num_visualizations: 10
  eval_interval_secs: 0
}
```
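num_visualizations should equal the number of examples being evaluated. The more visualizations, the longer the evaluation takes. If your GPU is capable of training and evaluating at the same time, you can keep eval_interval_secs: 300. This parameter determines how often evaluation runs. I arrived at these settings by following the 3 steps discussed above.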
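Small edits like these can also be scripted. The following is a minimal sketch that rewrites one scalar field in the text of a config; `set_scalar_field` is a hypothetical helper based on a regular expression, it assumes the field appears exactly once on its own line, and it does not validate the result against the proto definitions.

```python
import re


def set_scalar_field(config_text, field_name, new_value):
    """Replace the value of a `field_name: value` line that occurs exactly once."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(field_name)}[ \t]*:[ \t]*)\S+", re.MULTILINE)
    updated, count = pattern.subn(
        lambda match: match.group(1) + str(new_value), config_text)
    if count != 1:
        raise ValueError(f"expected one '{field_name}' field, found {count}")
    return updated


EVAL_BEFORE = "eval_config: {\n  num_examples: 2000\n  max_evals: 10\n}\n"
eval_after = set_scalar_field(EVAL_BEFORE, "num_examples", 10)
print(eval_after)
```

Because the helper refuses to act when a field is missing or repeated, it is only suitable for simple scalar fields; adding new fields such as num_visualizations still means editing the block by hand or through the protobuf API.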
  
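In short, knowing Protocol Buffers helps us see that model parameters are passed as messages, and that the .proto files tell us which parameters can be updated. We covered 3 simple steps for finding the right .proto file for a given parameter.

Please mention in the comments any configuration-file parameter you would like to update or add.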