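1. Preface
In industrial product defect inspection, defect classification based on traditional image features does not reach the accuracy that real production requires, so I decided to try a CNN for defect classification.
Traditional defect classification approach:
1. Defect image separation: apply fairly complex image-processing methods to separate the defect from the captured image.
2. Feature vector construction: analyze the characteristics of the different defect types and define the n-dimensional features to extract (for example defect length, width, contrast, texture features, entropy, gradient), which together form a feature vector describing the defect. Building the feature vector requires a deep analysis of the actual problem and solid image-processing knowledge; it is the hardest part of the traditional classification problem.
3. Feature vector normalization: the dimensions of the feature vector are measured on very different scales (for example a defect length of 50 pixels versus a contrast of 0.03), so the features must be scaled and normalized.
4. Manual defect labeling: store the defect images in manually labeled folders.
5. Classify the defects with an SVM; the classification accuracy is about 85%.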
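Step 3 of the traditional pipeline can be sketched as follows. This is a minimal illustration, assuming z-score standardization as the scaling method (the post does not say which method was used); the feature values and the `standardize` helper are hypothetical.

```python
import numpy as np


def standardize(features):
    """Z-score each feature column: zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (features - mean) / std


# Hypothetical feature vectors: [length_px, width_px, contrast]
raw = np.array([
    [50.0, 12.0, 0.03],
    [80.0, 20.0, 0.10],
    [20.0, 5.0, 0.01],
])
scaled = standardize(raw)
print(scaled)
```

After scaling, length in pixels and contrast contribute on comparable scales, so a distance-based classifier such as an SVM is not dominated by the feature with the largest raw values.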
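2. CNN network construction
Once the defect images have been separated and manually labeled, the CNN model can be built. Industrial inspection has strict real-time requirements, so I want a relatively simple network structure that keeps both training and inference fast.
Network construction: this post follows the basic idea of the LeNet architecture to build a simple network.
Figure 1: Network model as printed by TensorFlow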
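As a rough stand-in for the model summary, the layer shapes and parameter counts can be worked out by hand. The sketch below assumes the configuration used in the code at the end of this post (28x28 grayscale input, 20 and 50 filters, "same" padding, 2x2 pooling, a 500-unit dense layer, 3 classes) together with the original 5x5 kernels; `lenet_summary` is a hypothetical helper, not part of Keras, and the ReLU layers are omitted because they change neither shape nor parameter count.

```python
def lenet_summary(height=28, width=28, depth=1, classes=3, kernel=5):
    """Return (layer_name, output_shape, n_params) rows for the LeNet-style stack."""
    rows = []
    h, w, c = height, width, depth
    for i, filters in enumerate((20, 50), start=1):
        # "same" padding keeps height and width; each filter has
        # kernel*kernel*c weights plus one bias
        params = (kernel * kernel * c + 1) * filters
        c = filters
        rows.append((f"conv{i}", (h, w, c), params))
        # 2x2 max pooling with stride 2 halves height and width
        h, w = h // 2, w // 2
        rows.append((f"pool{i}", (h, w, c), 0))
    flat = h * w * c
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense1", (500,), (flat + 1) * 500))
    rows.append(("dense2", (classes,), (500 + 1) * classes))
    return rows


rows = lenet_summary()
for name, shape, params in rows:
    print(f"{name:8s} {str(shape):14s} {params:>9,d}")
total = sum(params for _, _, params in rows)
print("total parameters:", total)
```

With 5x5 kernels the total is 1,252,573 parameters, almost all of them in the first dense layer, which is why the network stays small enough for fast inference.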
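3. Model training and testing
3.1 Testing the original model
At first I expected the model to overfit, but the accuracy and loss curves show no sign of that. Instead, the model got stuck in a local loop during the first iterations. Possible causes are that it had not yet learned particularly good features, that the randomly selected training data was not fully shuffled, or that the number of training iterations was too small. Accuracy on the training set is somewhat low, so a better model is needed, but how should the model be changed? Although a CNN learns its own filters, it is still hard to see clearly what an image looks like after filtering (Figures 2 and 3), which is rather frustrating for someone who has always worked on low-level image algorithms.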
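Figure 2: Output of the first convolution layer
Figure 3: Output of the ReLU activation layer
Figure 2 shows that the filtering works fairly well overall: the defective areas all show up clearly. However, the defects in the input are concave (dented downward), while many of the filtered defects appear convex (raised), which does not match reality.
Figure 3 shows that after the ReLU activation only the feature maps with clearly concave defects remain, but there are too few effective feature maps (FeatureMap): only 2.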
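Counting the effective feature maps by eye can be replaced by a small check. The sketch below is only an illustration: `count_active_maps` and its threshold are hypothetical, and the activations are synthetic stand-ins rather than outputs of the trained model.

```python
import numpy as np


def count_active_maps(activation, threshold=1e-6):
    """Return the indices of channels whose peak response exceeds threshold.

    activation: array of shape (1, H, W, C), as returned by model.predict.
    """
    peak = activation[0].max(axis=(0, 1))  # strongest response per channel
    return np.flatnonzero(peak > threshold)


# Synthetic pre-activation output: 4 channels, only 2 with positive responses
rng = np.random.default_rng(0)
pre_relu = -np.abs(rng.normal(size=(1, 28, 28, 4)))  # every value <= 0
pre_relu[0, 10:14, 10:14, 1] = 0.8
pre_relu[0, 5:8, 20:24, 3] = 0.5

post_relu = np.maximum(pre_relu, 0.0)  # ReLU zeroes all negative responses
active = count_active_maps(post_relu)
print("effective feature maps:", active.tolist())
```

In the training script at the end of this post, the same function could be applied to `activations[1]`, the output of the first ReLU layer.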
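Figure 4: Convex image data
Figure 5: Concave image data
To obtain more feature maps that match the real defects, the defect edges need to stand out more so that they are not swamped by the large surrounding image area. I therefore made the convolution kernel smaller, changing it from the default 5x5 to 3x3.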
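One way to quantify "less surrounding area" is the receptive field, the patch of input pixels that influences one output value. The sketch below uses the standard recurrence for stacked convolution and pooling layers and assumes the conv, pool, conv, pool layout of the code at the end of this post; both helper functions are hypothetical.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride). Returns the RF size after each layer."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes


def lenet_layers(kernel):
    # conv (stride 1) followed by 2x2 max pooling (stride 2), twice
    return [(kernel, 1), (2, 2), (kernel, 1), (2, 2)]


rf5 = receptive_field(lenet_layers(5))
rf3 = receptive_field(lenet_layers(3))
print("5x5 kernels:", rf5)  # [5, 6, 14, 16]
print("3x3 kernels:", rf3)  # [3, 4, 8, 10]
```

On a 28x28 input, each second-layer response depends on a 14x14 patch with 5x5 kernels but only on an 8x8 patch with 3x3 kernels, so a small defect occupies a larger share of what each filter sees.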
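3.2 After optimizing the kernel size
Overall accuracy rose noticeably, and more effective feature maps survive the ReLU. One thing still puzzles me: accuracy on the validation set is 5 to 8 points higher than on the training set. A possible explanation is that training accuracy is measured on augmented batches while the weights are still changing within the epoch, whereas validation accuracy is measured on unaugmented images at the end of each epoch.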
4. Code
# -*- coding: utf-8 -*-
# @Time    : 18-7-25 2:33 PM
# @Author  : DuanBin
# @Email   : 20092758@cqu.edu.cn
# @File    : catl_train.py
# @Software: PyCharm

# USAGE
# python catl_train.py --dataset data --model catl.model
# NOTE: the paths are hard-coded below; the command-line arguments are not parsed.
# NOTE: written against Keras 2.x (2018). Newer releases rename `lr` to
# `learning_rate`, replace `fit_generator` with `fit`, and log
# "accuracy"/"val_accuracy" instead of "acc"/"val_acc".

# set the matplotlib backend before pyplot is imported so figures can be
# saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import img_to_array
from keras.utils import to_categorical
from keras.models import Model
from lenet import LeNet
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import random
import cv2
import os

dataPath = "data"
modelPath = "catl_5_5.model"
plotPath = "catl_plot_5_5_blog.png"

# initialize the number of epochs to train for, initial learning rate,
# and batch size
EPOCHS = 50
INIT_LR = 0.001
BS = 3
classNumber = 3
imageDepth = 1

# initialize the data and labels
print("[INFO] loading images...")
data = []
labels = []

# grab the image paths and randomly shuffle them
imagePaths = sorted(list(paths.list_images(dataPath)))  # args["dataset"])))
random.seed(42)
random.shuffle(imagePaths)

# loop over the input images
for imagePath in imagePaths:
    # load the image as grayscale, pre-process it, and store it in the data list
    image = cv2.imread(imagePath, 0)
    image = cv2.resize(image, (28, 28))
    image = img_to_array(image)
    data.append(image)

    # extract the class label from the image path (the parent folder name)
    # and update the labels list
    label = imagePath.split(os.path.sep)[-2]
    if label == "dity":
        label = 0
    elif label == "tan":
        label = 1
    elif label == "valley":
        label = 2
    labels.append(label)

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)

# partition the data into training and testing splits using 70% of
# the data for training and the remaining 30% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.3, random_state=42)
print(trainX.shape)

# convert the labels from integers to vectors
trainY = to_categorical(trainY, num_classes=classNumber)
testY = to_categorical(testY, num_classes=classNumber)
print(trainY.shape)
print(testX.shape)

# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")

# initialize the model
print("[INFO] compiling model...")
model = LeNet.build(width=28, height=28, depth=imageDepth, classes=classNumber)
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
model.summary()

# train the network; only the training batches are augmented
print("[INFO] training network...")
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
                        validation_data=(testX, testY),
                        steps_per_epoch=len(trainX) // BS,
                        epochs=EPOCHS, verbose=1)

# save the model to disk
print("[INFO] serializing network...")
model.save(modelPath)  # args["model"])
model.save_weights("catl_5_5_wight.h5")

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
N = EPOCHS
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(plotPath)  # args["plot"])

# build a model that returns the output of every layer for one test image
layer_outputs = [layer.output for layer in model.layers]
activation_model = Model(inputs=model.input, outputs=layer_outputs)
activations = activation_model.predict(testX[0].reshape(1, 28, 28, 1))


def display_activation(activations, col_size, row_size, act_index):
    # draw the feature maps of layer `act_index` on a row_size x col_size grid
    activation = activations[act_index]
    activation_index = 0
    fig, ax = plt.subplots(row_size, col_size, figsize=(row_size * 2.5, col_size * 1.5))
    for row in range(0, row_size):
        for col in range(0, col_size):
            ax[row][col].imshow(activation[0, :, :, activation_index], cmap='gray')
            activation_index += 1
    # plt.show() does nothing under the Agg backend, so save the figure
    # instead (the file name is an arbitrary choice)
    fig.savefig("catl_activation_%d.png" % act_index)


# layer 1 is the first ReLU layer: 20 feature maps on a 5 x 4 grid
display_activation(activations, 4, 5, 1)
# File: lenet.py (imported by catl_train.py)
# import the necessary packages
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras import backend as K


class LeNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model with a "channels last" input shape
        model = Sequential()
        inputShape = (height, width, depth)

        # if we are using "channels first", update the input shape
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

        # first set of CONV => RELU => POOL layers
        model.add(Conv2D(20, (3, 3), padding="same", input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # second set of CONV => RELU => POOL layers
        model.add(Conv2D(50, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(500))
        model.add(Activation("relu"))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model