GitHub repo: Mask_RCNN
『Computer Vision』Mask-RCNN_Paper Study
『Computer Vision』Mask-RCNN_Project Documentation Translation
『Computer Vision』Mask-RCNN_Inference Network Part 1: Overview
『Computer Vision』Mask-RCNN_Inference Network Part 2: The Shared FPN Network Based on ResNet101
『Computer Vision』Mask-RCNN_Inference Network Part 3: RPN Anchor Processing and Proposal Generation
『Computer Vision』Mask-RCNN_Inference Network Part 4: Coupling FPN and ROIAlign
『Computer Vision』Mask-RCNN_Inference Network Part 5: Refining Detection Results
『Computer Vision』Mask-RCNN_Inference Network Part 6: Mask Generation
『Computer Vision』Mask-RCNN_Inference Network Finale: Running Inference with the detect Method
『Computer Vision』Mask-RCNN_Anchor Generation
『Computer Vision』Mask-RCNN_Training Network Part 1: The Dataset and the Dataset Class
『Computer Vision』Mask-RCNN_Training Network Part 2: train Network Structure & Loss Functions
『Computer Vision』Mask-RCNN_Training Network Part 3: Training the Model
I. Model Initialization
1. Create the model and load pre-trained weights
With the dataset prepared, we start building the model. The training network structure was covered in the previous post; here we look at how that training-mode graph is invoked during training.

As shown above, we first build the graph (see the previous post, 『Computer Vision』Mask-RCNN_Training Network Part 2: train Network Structure), and then choose how to initialize the weights.
The example (train_shape.ipynb) uses the COCO pre-trained weights; if you instead want to "find the last checkpoint file of the last trained model in the model directory", choose the "last" option.
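A rough sketch of the corresponding calls, in the spirit of the sample notebooks (config is the Config subclass instance from the previous post; MODEL_DIR and COCO_MODEL_PATH are placeholder paths, and both the import path and the return value of find_last() vary slightly between versions of the repo):

from mrcnn import model as modellib

# build the training-mode graph
model = modellib.MaskRCNN(mode="training", config=config, model_dir=MODEL_DIR)

init_with = "coco"  # or "imagenet", "last"
if init_with == "coco":
    # skip the layers whose shapes depend on the number of classes
    model.load_weights(COCO_MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
elif init_with == "last":
    # resume from the last checkpoint of the last trained model in model_dir
    model.load_weights(model.find_last(), by_name=True)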
The loading routine is shown below. Note a few operations we have not used much before:
- h5 files are read with the h5py module
- a Keras model's .layers attribute returns all of its layer objects as a list
- in keras.engine's saving module, load_weights_from_hdf5_group_by_name matches weights to layers by name, while load_weights_from_hdf5_group matches them by stored order
def load_weights(self, filepath, by_name=False, exclude=None):
    """Modified version of the corresponding Keras function with
    the addition of multi-GPU support and the ability to exclude
    some layers from loading.
    exclude: list of layer names to exclude
    """
    import h5py
    # Conditional import to support versions of Keras before 2.2
    # TODO: remove in about 6 months (end of 2018)
    try:
        from keras.engine import saving
    except ImportError:
        # Keras before 2.2 used the 'topology' namespace.
        from keras.engine import topology as saving

    if exclude:
        by_name = True

    if h5py is None:
        raise ImportError('`load_weights` requires h5py.')
    f = h5py.File(filepath, mode='r')
    if 'layer_names' not in f.attrs and 'model_weights' in f:
        f = f['model_weights']

    # In multi-GPU training, we wrap the model. Get layers
    # of the inner model because they have the weights.
    keras_model = self.keras_model
    layers = keras_model.inner_model.layers if hasattr(keras_model, "inner_model")\
        else keras_model.layers

    # Exclude some layers
    if exclude:
        layers = filter(lambda l: l.name not in exclude, layers)

    if by_name:
        saving.load_weights_from_hdf5_group_by_name(f, layers)
    else:
        saving.load_weights_from_hdf5_group(f, layers)
    if hasattr(f, 'close'):
        f.close()

    # Update the log directory
    self.set_log_dir(filepath)
2. A look at the loading scheme through the h5 file
Layers of the Keras model
A few notes on layer objects:
- layer.name: the layer object's node name
- layer.trainable: whether the layer is trainable
- for a TimeDistributed wrapper, the object returned by its .layer attribute is the layer we actually need to configure
Loading the model and inspecting its layers:

Inspecting the names: they are exactly the names we assigned to each layer in the build function, and, just like in TensorFlow, parameter loading relies on them. A small inspection sketch follows.
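The sketch below assumes model is the MaskRCNN object built above; the output is truncated for brevity:

for layer in model.keras_model.layers[:10]:
    # TimeDistributed wrappers keep the real layer in their .layer attribute
    inner = layer.layer if layer.__class__.__name__ == 'TimeDistributed' else layer
    print(layer.name, inner.__class__.__name__, "trainable:", layer.trainable)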

The h5 file's records
Load the h5 file and take a look: f.attrs holds three values, the first a list of strings and the other two plain strings. Probing "layer_names", we find that it stores each layer's name string (h5 stores everything in binary form, so decoding is needed).

In keras.engine's saving module we can see how the other two records are parsed; in practice one turns out to be the Keras version number and the other returns b'tensorflow'.
if 'keras_version' in f.attrs:
    original_keras_version = f.attrs['keras_version'].decode('utf8')
else:
    original_keras_version = '1'
if 'backend' in f.attrs:
    original_backend = f.attrs['backend'].decode('utf8')
else:
    original_backend = None
The strings stored under "layer_names" can be treated as indices into the h5 file: each one indexes a child h5 group, and each child group carries an attrs entry "weight_names", likewise a list of strings, which indexes into that child group; what it points to are the actual parameter values. A sketch:

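A minimal probing sketch (COCO_MODEL_PATH is a placeholder path; older h5py versions return the attribute strings as bytes, hence the decode):

import h5py

f = h5py.File(COCO_MODEL_PATH, mode='r')
layer_names = [n.decode('utf8') if isinstance(n, bytes) else n
               for n in f.attrs['layer_names']]
print(layer_names[:5])                        # first few layer names stored in the file

for name in layer_names:
    g = f[name]                               # child group for one layer
    weight_names = [n.decode('utf8') if isinstance(n, bytes) else n
                    for n in g.attrs['weight_names']]
    if weight_names:                          # first layer that actually stores weights
        print(name, weight_names)
        print(g[weight_names[0]].shape)       # the actual parameter array
        break
f.close()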
When it comes to actually loading the weights, the Keras API already wraps all of this: we do not have to match names in the h5 file against names in the network and update the values ourselves. We simply hand things over to saving.load_weights_from_hdf5_group_by_name(f, layers): given the file handle f and the layer objects to be loaded, it completes the by-name loading for us.
II. Model Training
This part does not go over the network structure; it focuses on the training procedure. For the structure see: 『Computer Vision』Mask-RCNN_Training Network Part 2: train Network Structure.
There are two training modes (example calls follow below the list):
- Only the heads. Here we're freezing all the backbone layers and training only the randomly initialized layers (i.e. the ones that we didn't use pre-trained weights from MS COCO). To train only the head layers, pass layers='heads' to the train() function.
- Fine-tune all layers. For this simple example it's not necessary, but we're including it to show the process. Simply pass layers="all" to train all layers.
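The two calls, as in the sample notebooks (dataset_train and dataset_val are the Dataset objects from the previous post; the epoch counts are arbitrary placeholders):

# stage 1: train only the randomly initialized head layers
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=1,
            layers='heads')

# stage 2 (optional): fine-tune all layers, typically with a smaller learning rate
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=2,
            layers="all")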

1. Documentation of the train method
The train method is declared as follows:
def train(self, train_dataset, val_dataset, learning_rate, epochs, layers,
          augmentation=None, custom_callbacks=None, no_augmentation_sources=None)
Its docstring reads:
"""Train the model.
train_dataset, val_dataset: Training and validation Dataset objects.
learning_rate: The learning rate to train with
epochs: Number of training epochs. Note that previous training epochs
    are considered to be done already, so this actually determines
    the epochs to train in total rather than in this particular
    call.
layers: Allows selecting which layers to train. It can be:
    - A regular expression to match layer names to train
    - One of these predefined values:
      heads: The RPN, classifier and mask heads of the network
      all: All the layers
      3+: Train Resnet stage 3 and up
      4+: Train Resnet stage 4 and up
      5+: Train Resnet stage 5 and up
augmentation: Optional. An imgaug (https://github.com/aleju/imgaug)
    augmentation. For example, passing imgaug.augmenters.Fliplr(0.5)
    flips images right/left 50% of the time. You can pass complex
    augmentations as well. This augmentation applies 50% of the
    time, and when it does it flips images right/left half the time
    and adds a Gaussian blur with a random sigma in range 0 to 5.
        augmentation = imgaug.augmenters.Sometimes(0.5, [
            imgaug.augmenters.Fliplr(0.5),
            imgaug.augmenters.GaussianBlur(sigma=(0.0, 5.0))
        ])
custom_callbacks: Optional. Add custom callbacks to be called
    with the keras fit_generator method. Must be list of type keras.callbacks.
no_augmentation_sources: Optional. List of sources to exclude for
    augmentation. A source is a string that identifies a dataset and is
    defined in the Dataset class.
"""
2. Model preparation & data preparation
First, prepare the model settings.
- When specifying which layers to train, you can pass either layer names (a regular expression) or one of the predefined strings; a predefined string is resolved by the rules in the first few lines below. Either way, the layers variable ends up holding the names (or a regular expression) of the layers to train.
- Next the data is prepared. The data_generator function involves a fairly involved preprocessing pipeline (see model.py); a quick sanity check of its output is sketched right after the code below.
- Finally, the directory for saving files is created.
assert self.mode == "training", "Create model in training mode."

# Pre-defined layer regular expressions
layer_regex = {
    # all layers but the backbone
    "heads": r"(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
    # From a specific Resnet stage and up
    "3+": r"(res3.*)|(bn3.*)|(res4.*)|(bn4.*)|(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
    "4+": r"(res4.*)|(bn4.*)|(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
    "5+": r"(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)",
    # All layers
    "all": ".*",
}
if layers in layer_regex.keys():
    layers = layer_regex[layers]

# Data generators
train_generator = data_generator(train_dataset, self.config, shuffle=True,
                                 augmentation=augmentation,
                                 batch_size=self.config.BATCH_SIZE,
                                 no_augmentation_sources=no_augmentation_sources)
val_generator = data_generator(val_dataset, self.config, shuffle=True,
                               batch_size=self.config.BATCH_SIZE)

# Create log_dir if it does not exist
if not os.path.exists(self.log_dir):
    os.makedirs(self.log_dir)
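A quick sanity check on the generator (a sketch; data_generator in model.py yields an (inputs, outputs) pair, where inputs is a list of numpy arrays, roughly in the order shown in the comment):

inputs, outputs = next(train_generator)
# inputs roughly: [images, image_meta, rpn_match, rpn_bbox,
#                  gt_class_ids, gt_boxes, gt_masks]
for x in inputs:
    print(x.shape)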
3. Model handling
The main steps here are:
- pass the trainable-layer names from the previous step to self.set_trainable(layers), which sets the trainable attribute of the matching layer objects to True
- self.compile sets up the optimizer, combines the individual losses into a single training objective, and finally compiles the model
# Callbacks
callbacks = [
    keras.callbacks.TensorBoard(log_dir=self.log_dir,
                                histogram_freq=0, write_graph=True, write_images=False),
    keras.callbacks.ModelCheckpoint(self.checkpoint_path,
                                    verbose=0, save_weights_only=True),
]

# Add custom callbacks to the list
if custom_callbacks:
    callbacks += custom_callbacks

# Train
log("\nStarting at epoch {}. LR={}\n".format(self.epoch, learning_rate))
log("Checkpoint Path: {}".format(self.checkpoint_path))
self.set_trainable(layers)
self.compile(learning_rate, self.config.LEARNING_MOMENTUM)

# Work-around for Windows: Keras fails on Windows when using
# multiprocessing workers. See discussion here:
# https://github.com/matterport/Mask_RCNN/issues/13#issuecomment-353124009
if os.name is 'nt':
    workers = 0
else:
    workers = multiprocessing.cpu_count()  # one worker per CPU core on non-Windows systems
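The custom_callbacks argument of train() is simply appended to this list; a hypothetical usage (any list of keras.callbacks objects works):

import keras

extra = [keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                           factor=0.5, patience=2)]
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=3,
            layers='heads',
            custom_callbacks=extra)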
The self.compile method
def compile(self, learning_rate, momentum):
    """Gets the model ready for training. Adds losses, regularization, and
    metrics. Then calls the Keras compile() function.
    """
    # Optimizer object
    optimizer = keras.optimizers.SGD(
        lr=learning_rate, momentum=momentum,
        clipnorm=self.config.GRADIENT_CLIP_NORM)

    # Add Losses
    # First, clear previously set losses to avoid duplication
    self.keras_model._losses = []
    self.keras_model._per_input_losses = {}
    loss_names = [
        "rpn_class_loss", "rpn_bbox_loss",
        "mrcnn_class_loss", "mrcnn_bbox_loss", "mrcnn_mask_loss"]
    for name in loss_names:
        layer = self.keras_model.get_layer(name)
        if layer.output in self.keras_model.losses:
            continue
        loss = (
            tf.reduce_mean(layer.output, keepdims=True)
            * self.config.LOSS_WEIGHTS.get(name, 1.))
        self.keras_model.add_loss(loss)

    # Add L2 Regularization
    # Skip gamma and beta weights of batch normalization layers.
    reg_losses = [
        keras.regularizers.l2(self.config.WEIGHT_DECAY)(w) / tf.cast(tf.size(w), tf.float32)
        for w in self.keras_model.trainable_weights
        if 'gamma' not in w.name and 'beta' not in w.name]
    self.keras_model.add_loss(tf.add_n(reg_losses))

    # Compile
    self.keras_model.compile(
        optimizer=optimizer,
        loss=[None] * len(self.keras_model.outputs))

    # Add metrics for losses
    for name in loss_names:
        if name in self.keras_model.metrics_names:
            continue
        layer = self.keras_model.get_layer(name)
        self.keras_model.metrics_names.append(name)
        loss = (
            tf.reduce_mean(layer.output, keepdims=True)
            * self.config.LOSS_WEIGHTS.get(name, 1.))
        self.keras_model.metrics_tensors.append(loss)
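The pattern used here, compiling with loss=[None, ...] and registering the real objectives through add_loss, is ordinary Keras usage rather than anything Mask-RCNN specific. A stripped-down toy sketch (assuming Keras 2.x with the TF1 backend, as the repo uses; the names are made up):

import tensorflow as tf
from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(4,))
y = Dense(1, name="toy_output")(x)
m = Model(x, y)

# register a tensor as the training objective instead of a per-output loss
m.add_loss(tf.reduce_mean(y, keepdims=True))
m.compile(optimizer="sgd", loss=[None] * len(m.outputs))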
The self.set_trainable method
def set_trainable(self, layer_regex, keras_model=None, indent=0, verbose=1):
    """Sets model layers as trainable if their names match
    the given regular expression.
    """
    # Print message on the first call (but not on recursive calls)
    if verbose > 0 and keras_model is None:
        log("Selecting layers to train")

    keras_model = keras_model or self.keras_model

    # In multi-GPU training, we wrap the model. Get layers
    # of the inner model because they have the weights.
    layers = keras_model.inner_model.layers if hasattr(keras_model, "inner_model")\
        else keras_model.layers

    for layer in layers:
        # Is the layer a model?
        if layer.__class__.__name__ == 'Model':  # layers belong to various classes, but nested models all share the single Model class
            print("In model: ", layer.name)
            self.set_trainable(
                layer_regex, keras_model=layer, indent=indent + 4)
            continue

        if not layer.weights:
            continue
        # Is it trainable?
        trainable = bool(re.fullmatch(layer_regex, layer.name))
        # Update layer. If layer is a container, update inner layer.
        if layer.__class__.__name__ == 'TimeDistributed':
            layer.layer.trainable = trainable
        else:
            layer.trainable = trainable
        # Print trainable layer names
        if trainable and verbose > 0:
            log("{}{:20}   ({})".format(" " * indent, layer.name,
                                        layer.__class__.__name__))
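To see how a predefined option maps onto actual layer names, a quick check against the "5+" pattern (the layer names below are typical of those assigned in the repo's ResNet/RPN build functions):

import re

pattern = r"(res5.*)|(bn5.*)|(mrcnn\_.*)|(rpn\_.*)|(fpn\_.*)"   # the "5+" entry
for name in ["res5a_branch2a", "bn4a_branch2a", "rpn_conv_shared", "conv1"]:
    print(name, bool(re.fullmatch(pattern, name)))
# -> True, False, True, False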
4. Training the model
This is the simplest step: just call the Keras interface to train. The callbacks defined in the previous step are also passed in here.
self.keras_model.fit_generator(
    train_generator,
    initial_epoch=self.epoch,
    epochs=epochs,
    steps_per_epoch=self.config.STEPS_PER_EPOCH,
    callbacks=callbacks,
    validation_data=val_generator,
    validation_steps=self.config.VALIDATION_STEPS,
    max_queue_size=100,
    workers=workers,
    use_multiprocessing=True,
)
self.epoch = max(self.epoch, epochs)
With that, the train method automatically kicks off the model's training.
