Objective: use a CNN to recognize color images of different sizes.
1. Problem Analysis
Using a convolutional neural network on color images raises two challenges:
1. The photos come in different sizes.
2. The images are in color rather than grayscale.
For the first problem, resize every photo to the same dimensions.
For the second problem, represent each photo as a 3-D array of height, width, and depth, where height and width give the image size and a depth of 3 holds the RGB channels.
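To make the resizing concrete, here is a minimal NumPy sketch (illustrative only, not part of the tutorial's pipeline) that brings differently sized photos to a common (150, 150, 3) shape with nearest-neighbour sampling; in the Keras code below, `flow_from_directory`'s `target_size` does this for you.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize; the depth (RGB) axis is untouched."""
    in_h, in_w, _ = img.shape
    rows = (np.arange(out_h) * in_h / out_h).astype(int)
    cols = (np.arange(out_w) * in_w / out_w).astype(int)
    return img[rows][:, cols]

# Two photos of different sizes both become (150, 150, 3)
a = np.random.randint(0, 256, (200, 320, 3), dtype=np.uint8)
b = np.random.randint(0, 256, (90, 120, 3), dtype=np.uint8)
print(resize_nearest(a, 150, 150).shape)  # (150, 150, 3)
print(resize_nearest(b, 150, 150).shape)  # (150, 150, 3)
```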
The convolution step
Figure 1-1: The convolution process
A color image represents color with RGB, so its depth is 3. Each color channel has its own kernel (filter); the three per-channel results are summed and the bias is added.
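A hand-rolled NumPy sketch of this per-channel convolution (purely illustrative; Keras's `Conv2D` implements it efficiently) makes the "one kernel per channel, sum, then add bias" rule explicit:

```python
import numpy as np

def conv2d_single(img, kernels, bias):
    """One output feature map: convolve each RGB channel with its own
    kernel, sum the three channel results, then add the bias."""
    h, w, depth = img.shape
    kh, kw = kernels.shape[1:]            # kernels: (depth, kh, kw)
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + kh, j:j + kw, :]   # (kh, kw, depth)
            out[i, j] = sum(
                np.sum(patch[:, :, c] * kernels[c]) for c in range(depth)
            ) + bias
    return out

img = np.random.rand(5, 5, 3)
kernels = np.random.rand(3, 3, 3)         # one 3x3 kernel per channel
print(conv2d_single(img, kernels, bias=0.1).shape)  # (3, 3)
```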
The MaxPooling step
The process is analogous: the three-dimensional convolution output is pooled over each channel separately.
Figure 1-2: The MaxPooling process
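The per-channel pooling can likewise be sketched in a few lines of NumPy (illustrative only; Keras's `MaxPooling2D` does the same thing):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """2x2 max pooling applied to each channel independently."""
    h, w, depth = fmap.shape
    out_h, out_w = h // stride, w // stride
    out = np.zeros((out_h, out_w, depth))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * stride:i * stride + size,
                          j * stride:j * stride + size, :]
            out[i, j, :] = window.max(axis=(0, 1))   # max per channel
    return out

fmap = np.random.rand(4, 4, 3)
print(max_pool(fmap).shape)  # (2, 2, 3)
```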
2. Coding
Step 1: Import packages
```python
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Import TensorFlow Datasets
import tensorflow_datasets as tfds
tfds.disable_progress_bar()

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import logging

logger = tf.get_logger()
logger.setLevel(logging.ERROR)
```
Note: understanding the ImageDataGenerator interface
ImageDataGenerator can label each image automatically, inferring its class from the subdirectory it is read from.
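The labeling rule is simple to reproduce by hand: `flow_from_directory` assigns integer labels from the alphabetically sorted subdirectory names. A minimal stdlib-only sketch (the directory layout mirrors the dataset downloaded below):

```python
import os
import tempfile

# Hypothetical layout mirroring the dataset: <root>/cats/... and <root>/dogs/...
root = tempfile.mkdtemp()
for cls in ('cats', 'dogs'):
    os.makedirs(os.path.join(root, cls))
    open(os.path.join(root, cls, 'img0.jpg'), 'w').close()

# Labels come from the sorted subdirectory names
classes = sorted(os.listdir(root))
class_indices = {name: idx for idx, name in enumerate(classes)}
print(class_indices)  # {'cats': 0, 'dogs': 1}
```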
Step 2: Download and load the data
```python
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'

zip_dir = tf.keras.utils.get_file('cats_and_dogs_filterted.zip', origin=_URL, extract=True)
zip_dir_base = os.path.dirname(zip_dir)

base_dir = os.path.join(os.path.dirname(zip_dir), 'cats_and_dogs_filtered')
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')            # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')            # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures

num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))
num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))

total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val

print('total training cat images:', num_cats_tr)
print('total training dog images:', num_dogs_tr)
print('total validation cat images:', num_cats_val)
print('total validation dog images:', num_dogs_val)
print("--")
print("Total training images:", total_train)
print("Total validation images:", total_val)
```
Output:
```
Downloading data from https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip
68608000/68606236 [==============================] - 20s 0us/step
total training cat images: 1000
total training dog images: 1000
total validation cat images: 500
total validation dog images: 500
--
Total training images: 2000
Total validation images: 1000
```
Step 3: Set parameters
```python
BATCH_SIZE = 100  # Number of training examples to process before updating our model's variables
IMG_SHAPE = 150   # Our training data consists of images with width of 150 pixels and height of 150 pixels
```
Step 4: Prepare the data
This involves the following steps:
* Read the images from disk
* Decode the image contents into RGB pixel grids
* Convert them into floating-point tensors
* Rescale the RGB values from [0, 255] to [0, 1]
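The rescaling step can be sketched in NumPy (illustrative only; the `rescale=1./255` argument below applies the same transform inside the generator):

```python
import numpy as np

# Dividing by 255 maps uint8 pixel values in [0, 255] to floats in [0, 1],
# the same effect as ImageDataGenerator(rescale=1./255)
pixels = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = pixels / 255.0
print(scaled.min(), scaled.max())  # 0.0 1.0
```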
```python
train_image_generator = ImageDataGenerator(rescale=1./255)       # Generator for our training data
validation_image_generator = ImageDataGenerator(rescale=1./255)  # Generator for our validation data
```
The flow_from_directory method reads the data from disk:
```python
train_data_gen = train_image_generator.flow_from_directory(batch_size=BATCH_SIZE,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_SHAPE, IMG_SHAPE),  # (150, 150)
                                                           class_mode='binary')

# The validation generator is built the same way; it is used as
# val_data_gen when training the model later.
val_data_gen = validation_image_generator.flow_from_directory(batch_size=BATCH_SIZE,
                                                              directory=validation_dir,
                                                              shuffle=False,
                                                              target_size=(IMG_SHAPE, IMG_SHAPE),
                                                              class_mode='binary')
```
Step 5: Inspect the data
```python
sample_training_images, _ = next(train_data_gen)

# This function will plot images in the form of a grid with 1 row
# and 5 columns where images are placed in each column.
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20, 20))
    axes = axes.flatten()
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
    plt.tight_layout()
    plt.show()

plotImages(sample_training_images[:5])  # Plot images 0-4
```
Two common remedies for overfitting:
* Image augmentation
* Dropout
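These remedies are not applied in this tutorial's model. As a purely illustrative NumPy sketch, horizontal flipping (the effect of `ImageDataGenerator(horizontal_flip=True)`) shows how augmentation makes extra training examples without collecting new photos:

```python
import numpy as np

def horizontal_flip(img):
    """Mirror an image along its width axis; the label is unchanged,
    so each training photo yields a second, 'new' example."""
    return img[:, ::-1, :]

img = np.arange(12).reshape(2, 2, 3)   # a tiny 2x2 RGB "image"
print(horizontal_flip(img)[0, 0])      # [3 4 5] -- the pixel that was at [0, 1]
```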
Step 6: Define the model
The model consists of four convolution blocks, each followed by a max pooling layer.
These feed a dense layer of 512 units with a relu activation.
The output layer produces 2 class probabilities whose sum is 1.
```python
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')
])
```
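The "probabilities sum to 1" property comes from the softmax activation, which a small NumPy sketch (illustrative, not the tutorial's code) makes easy to check:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, -1.0])   # raw scores for the two classes
probs = softmax(logits)
print(probs)                     # two probabilities that sum to 1
```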
When tackling a binary classification problem, another common approach is to end the classifier with a Dense layer of one output unit and a sigmoid activation:
```python
tf.keras.layers.Dense(1, activation='sigmoid')
```
In that case the loss function is changed to 'binary_crossentropy':
```python
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```
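For reference, a minimal NumPy sketch (illustrative only) of what binary cross-entropy computes for a single sigmoid output:

```python
import numpy as np

def binary_crossentropy(y_true, p):
    """Loss for one example, where p = predicted P(class 1)."""
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054 -- confident and correct
print(round(binary_crossentropy(1, 0.1), 4))  # 2.3026 -- confident and wrong
```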
Step 7: Compile the model
```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
Step 8: View the model summary
```python
model.summary()
```
It prints the following:
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0
_________________________________________________________________
dense (Dense)                (None, 512)               3211776
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 1026
=================================================================
Total params: 3,453,634
Trainable params: 3,453,634
Non-trainable params: 0
_________________________________________________________________
```
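The parameter counts in the summary can be verified by hand: a Conv2D layer has kernel_height x kernel_width x input_channels x filters weights plus one bias per filter, and a Dense layer has inputs x units weights plus one bias per unit:

```python
# First Conv2D: 3x3 kernels over 3 input channels, 32 filters
first_conv = 3 * 3 * 3 * 32 + 32
print(first_conv)   # 896, matching conv2d in the summary

# Dense layer: the flattened 7*7*128 = 6272 inputs feed 512 units
dense = 7 * 7 * 128 * 512 + 512
print(dense)        # 3211776, matching the dense layer
```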
Step 9: Train the model
```python
EPOCHS = 100

# Note: fit_generator is deprecated in recent TensorFlow releases;
# there, model.fit accepts generators directly with the same arguments.
history = model.fit_generator(
    train_data_gen,
    steps_per_epoch=int(np.ceil(total_train / float(BATCH_SIZE))),
    epochs=EPOCHS,
    validation_data=val_data_gen,
    validation_steps=int(np.ceil(total_val / float(BATCH_SIZE)))
)
```
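The "20/20" shown per epoch in the log follows directly from the steps_per_epoch arithmetic:

```python
import numpy as np

total_train, total_val, BATCH_SIZE = 2000, 1000, 100

steps_per_epoch = int(np.ceil(total_train / float(BATCH_SIZE)))
validation_steps = int(np.ceil(total_val / float(BATCH_SIZE)))
print(steps_per_epoch, validation_steps)  # 20 10
```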
Output:
```
Epoch 96/100
20/20 [==============================] - 95s 5s/step - loss: 1.4625e-05 - accuracy: 1.0000 - val_loss: 1.9852 - val_accuracy: 0.7500
Epoch 97/100
20/20 [==============================] - 95s 5s/step - loss: 1.4207e-05 - accuracy: 1.0000 - val_loss: 1.9879 - val_accuracy: 0.7500
Epoch 98/100
20/20 [==============================] - 96s 5s/step - loss: 1.3850e-05 - accuracy: 1.0000 - val_loss: 1.9903 - val_accuracy: 0.7510
Epoch 99/100
20/20 [==============================] - 95s 5s/step - loss: 1.3508e-05 - accuracy: 1.0000 - val_loss: 1.9930 - val_accuracy: 0.7490
Epoch 100/100
20/20 [==============================] - 96s 5s/step - loss: 1.3158e-05 - accuracy: 1.0000 - val_loss: 1.9955 - val_accuracy: 0.7500
```
On overfitting: by the final epochs the training accuracy has reached 100% while the validation accuracy has stalled around 75%, a clear sign that the model has overfit the training set.
Step 10: Visualize the results
```python
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(EPOCHS)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.savefig('./foo.png')
plt.show()
```
Output: the training/validation accuracy and loss curves (also saved to `foo.png`).