Disclaimer: these are my personal study notes. Experienced readers are welcome to comment and offer guidance; if it's not your thing, please don't flame. The implementation follows the blog and code of 夜雨飄零 (yeyupiaoling). The framework is Baidu's open-source PaddlePaddle.
1. Preliminaries
This is a project I left unfinished last semester. I originally wanted to solve it with traditional machine learning methods, but never got it done. Summer break is a rare chance to be home, so I decided it was time to finish what I started.
The code is complete now, though problems remain: a husky, for example, still gets classified as a cat...
The platform I rely on is Baidu's AI Studio, because the 960M GPU in my local machine can't take the load.
The environment configuration is shown in the figure below.
1.1 Dataset preparation
I used the Cats vs. Dogs dataset, downloaded from the official site:
https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765
After downloading, extract it to a suitable location. On AI Studio the unzip command is:
!unzip -qo /home/aistudio/data/data10141/origin.zip -d /home/aistudio/data/datas
The resulting folder structure is:
PetImages
┣ Cat
┗ Dog
1.2 Data preprocessing
1.2.1 Deleting unusable images
Many things in the raw cat/dog data can hurt classification, so before training we preprocess the images to improve accuracy.
The general idea:
- Delete any image whose format is neither JPEG nor PNG
- Delete grayscale images
- Delete any image whose size is 0 (the Cats vs. Dogs dataset contains some 0-byte files)
The implementation is as follows.
import os
import imghdr
import numpy as np
from PIL import Image

# Delete images that are neither JPEG nor PNG
def delete_error_image(father_path):
    print(father_path)
    # List all files and folders under the parent directory
    try:
        image_dirs = os.listdir(father_path)
        for image_dir in image_dirs:
            image_dir = os.path.join(father_path, image_dir)
            # If it is a folder, walk the images inside it
            if os.path.isdir(image_dir):
                images = os.listdir(image_dir)
                for image in images:
                    image = os.path.join(image_dir, image)
                    try:
                        # Detect the image type
                        image_type = imghdr.what(image)
                        # Delete the image if it is neither JPEG nor PNG
                        if image_type != 'jpeg' and image_type != 'png':
                            os.remove(image)
                            print('Deleted: %s' % image)
                            continue
                        # Delete grayscale images
                        img = np.array(Image.open(image))
                        if len(img.shape) == 2:
                            os.remove(image)
                            print('Deleted: %s' % image)
                        # Delete empty images (the dataset contains 0-byte files)
                        if img.size == 0:
                            os.remove(image)
                            print('Deleted: %s' % image)
                    except:
                        os.remove(image)
                        print('Deleted: %s' % image)
    except:
        pass
The main libraries used are numpy, PIL, and os (plus imghdr for format detection).
The output looks like this:
Since this copy of the dataset has already been cleaned, no images matching the cases above remain.
1.2.2 Resizing
Inspecting the data shows the images come in all different sizes, so we adjust them in code. Here I resize everything to 224×224. (This assumes the unusable images were deleted successfully first; otherwise this step will raise errors on them.)
# Preprocess an image
def load_image(file, f):
    img = Image.open(file)
    # Resize to a uniform size
    img = img.resize((224, 224), Image.ANTIALIAS)
    # Log the processing, to aid debugging
    f.write("the file path is " + str(file) + ",the size is " + str(img.size) + "\n")
    img = img.convert("RGB")
    img.save(file)
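To apply load_image across the whole dataset, one still needs a driver that walks the class folders. A minimal sketch, where the helper name iter_image_paths and the extension filter are my own additions rather than part of the original code:

```python
import os

def iter_image_paths(root):
    """Yield the path of every JPEG/PNG file under root, recursively.
    Each yielded path could then be passed to load_image above."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(('.jpg', '.jpeg', '.png')):
                yield os.path.join(dirpath, name)
```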
1.2.3 Splitting the dataset
Copy the first 8,000 images of each class into the training set (a larger training set improves model accuracy!).
def __load_data_set():
    srcDog = os.listdir(dog_origin_path)
    # Copy the first 8000 images into the training set
    fnames = ['{}.jpg'.format(i) for i in range(0, 8000)]
    i = 0
    for fname in fnames:
        src = os.path.join(dog_origin_path, srcDog[i])
        dst = os.path.join(dog_train_path, "dog." + fname)
        shutil.copyfile(src, dst)
        i += 1
    srcCat = os.listdir(cat_origin_path)
    j = 0
    for fname in fnames:
        src = os.path.join(cat_origin_path, srcCat[j])
        dst = os.path.join(cat_train_path, "cat." + fname)
        shutil.copyfile(src, dst)
        j += 1
    print('total training cat images:', len(os.listdir(cat_train_path)))
    print('total training dog images:', len(os.listdir(dog_train_path)))
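One caveat with the split above: os.listdir returns entries in arbitrary, filesystem-dependent order, so "the first 8,000 images" is not a reproducible selection across runs or machines. A small sketch of a deterministic variant (first_n_sorted is my own helper name, not from the original code):

```python
import os

def first_n_sorted(path, n):
    # Sorting the directory listing makes the "first n" selection
    # reproducible, unlike a raw os.listdir call.
    return sorted(os.listdir(path))[:n]
```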
Output:
The training set has been generated.
1.2.4 Creating the data lists
# Create the data list files
import json
import os

def create_data_list(data_root_path):
    with open(data_root_path + "test.list", 'w') as f:
        pass
    with open(data_root_path + "train.list", 'w') as f:
        pass
    # Information about every class
    class_detail = []
    # List all classes
    class_dirs = os.listdir(data_root_path)
    # Class label
    class_label = 0
    # Derive the dataset name from the path
    father_paths = data_root_path.split('/')
    while True:
        if father_paths[len(father_paths) - 1] == '':
            del father_paths[len(father_paths) - 1]
        else:
            break
    father_path = father_paths[len(father_paths) - 1]
    all_class_images = 0
    other_file = 0
    # Walk every class
    for class_dir in class_dirs:
        if class_dir == 'test.list' or class_dir == "train.list" or class_dir == 'readme.json':
            other_file += 1
            continue
        print('Reading class: %s' % class_dir)
        # Information about this class
        class_detail_list = {}
        test_sum = 0
        trainer_sum = 0
        # Count of images in this class
        class_sum = 0
        # Path of this class
        path = data_root_path + "/" + class_dir
        # All images of this class
        img_paths = os.listdir(path)
        for img_path in img_paths:
            # Path of each image, relative to the data root
            name_path = class_dir + '/' + img_path
            # Create the folder if it does not exist
            if not os.path.exists(data_root_path):
                os.makedirs(data_root_path)
            # Every 10th image goes to the test set
            if class_sum % 10 == 0:
                test_sum += 1
                with open(data_root_path + "test.list", 'a') as f:
                    f.write(name_path + "\t%d" % class_label + "\n")
            else:
                trainer_sum += 1
                with open(data_root_path + "train.list", 'a') as f:
                    f.write(name_path + "\t%d" % class_label + "\n")
            class_sum += 1
            all_class_images += 1
        # class_detail entry for the readme.json file
        class_detail_list['class_name'] = class_dir
        class_detail_list['class_label'] = class_label
        class_detail_list['class_test_images'] = test_sum
        class_detail_list['class_trainer_images'] = trainer_sum
        class_detail.append(class_detail_list)
        class_label += 1
    # Number of classes
    all_class_sum = len(class_dirs) - other_file
    # Top-level information for readme.json
    readjson = {}
    readjson['all_class_name'] = father_path
    readjson['all_class_sum'] = all_class_sum
    readjson['all_class_images'] = all_class_images
    readjson['class_detail'] = class_detail
    jsons = json.dumps(readjson, sort_keys=True, indent=4, separators=(',', ': '))
    with open(data_root_path + "readme.json", 'w') as f:
        f.write(jsons)
    print('Image lists generated')
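The readers used later in section 2.2 consume these list files, where each line is a relative image path and an integer label separated by a tab (e.g. "Cat/1.jpg" then a tab then "0"). A minimal parser sketch; parse_list_file is my own name, and the real train_reader/test_reader additionally load, resize, and normalize each image:

```python
def parse_list_file(list_path):
    """Parse a train.list / test.list file produced by create_data_list.
    Each non-empty line has the form "<relative image path>\t<label>".
    Returns a list of (path, label) tuples."""
    samples = []
    with open(list_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            path, label = line.split('\t')
            samples.append((path, int(label)))
    return samples
```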
Output:
2. Training
2.1 The model
We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks. We introduce two simple global hyperparameters that efficiently trade off between latency and accuracy. These hyperparameters allow the model builder to choose the right sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
The implementation code is off the shelf, taken directly from Baidu's official site.
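The parameter savings that make MobileNets lightweight can be checked with simple arithmetic: a standard k×k convolution with M input and N output channels costs k·k·M·N parameters, while a depthwise separable one costs k·k·M (depthwise) plus M·N (1×1 pointwise). A quick sketch of the counts (the function names and the 512-channel example are my own, for illustration):

```python
def conv_params(k, m, n):
    """Parameters of a standard k x k convolution, m -> n channels."""
    return k * k * m * n

def depthwise_separable_params(k, m, n):
    """Depthwise (k x k per input channel) + pointwise (1x1, m -> n)."""
    return k * k * m + m * n

# e.g. a 3x3 convolution with 512 input and 512 output channels:
# standard:  3*3*512*512 = 2,359,296 parameters
# separable: 3*3*512 + 512*512 = 266,752 parameters (~8.8x fewer)
```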
2.2 Defining the training program
The code below defines the training program. This part is largely boilerplate: every training job needs to define the same pieces.
# Define the input layer (grayscale images were removed, so inputs are 3-channel)
image = fluid.layers.data(name='image', shape=[3, crop_size, crop_size], dtype='float32')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
# Get the classifier (cat vs. dog is binary classification)
model = net(image, 2)
# Define the loss and accuracy functions
cost = fluid.layers.cross_entropy(input=model, label=label)
avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=model, label=label)
# Clone the main program for testing
test_program = fluid.default_main_program().clone(for_test=True)
# Define the optimizer (learning rate plus L2 regularization to curb overfitting)
optimizer = fluid.optimizer.AdamOptimizer(learning_rate=1e-3,
                                          regularization=fluid.regularizer.L2DecayRegularizer(1e-4))
opts = optimizer.minimize(avg_cost)
# Build the custom data readers
train_reader = paddle.batch(train_reader('data/datas/train/train.list', crop_size, resize_size), batch_size=64)
test_reader = paddle.batch(test_reader('data/datas/train/test.list', crop_size), batch_size=64)
# Define an executor on the GPU (the CPU is far too slow for images; tested on my laptop)
place = fluid.CUDAPlace(0)
# place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Initialize the parameters
exe.run(fluid.default_startup_program())
# Define how input data is fed
feeder = fluid.DataFeeder(place=place, feed_list=[image, label])
2.3 Training
The code:
# Train for 20 passes
for pass_id in range(20):
    # Training
    for batch_id, data in enumerate(train_reader()):
        train_cost, train_acc = exe.run(program=fluid.default_main_program(),
                                        feed=feeder.feed(data),
                                        fetch_list=[avg_cost, acc])
        # Print progress every 100 batches
        if batch_id % 100 == 0:
            print('Pass:%d, Batch:%d, Cost:%0.5f, Accuracy:%0.5f' %
                  (pass_id, batch_id, train_cost[0], train_acc[0]))
    # Testing
    test_accs = []
    test_costs = []
    for batch_id, data in enumerate(test_reader()):
        test_cost, test_acc = exe.run(program=test_program,
                                      feed=feeder.feed(data),
                                      fetch_list=[avg_cost, acc])
        test_accs.append(test_acc[0])
        test_costs.append(test_cost[0])
    # Average the test results
    test_cost = (sum(test_costs) / len(test_costs))
    test_acc = (sum(test_accs) / len(test_accs))
    print('Test:%d, Cost:%0.5f, Accuracy:%0.5f' % (pass_id, test_cost, test_acc))
    # Save the inference model
    save_path = 'infer_model/'
    # Delete any old model files
    shutil.rmtree(save_path, ignore_errors=True)
    # Create the model directory
    os.makedirs(save_path)
    # Save the inference model
    fluid.io.save_inference_model(save_path, feeded_var_names=[image.name], target_vars=[model], executor=exe)
Output:
Pass:0, Batch:0, Cost:0.67030, Accuracy:0.59375
Pass:0, Batch:100, Cost:0.73609, Accuracy:0.62500
Pass:0, Batch:200, Cost:0.65755, Accuracy:0.62500
Test:0, Cost:0.69298, Accuracy:0.61500
Pass:1, Batch:0, Cost:0.70980, Accuracy:0.56250
Pass:1, Batch:100, Cost:0.67554, Accuracy:0.54688
Pass:1, Batch:200, Cost:0.64920, Accuracy:0.56250
Test:1, Cost:0.69018, Accuracy:0.60250
Pass:2, Batch:0, Cost:0.51662, Accuracy:0.79688
Pass:2, Batch:100, Cost:0.62268, Accuracy:0.60938
Pass:2, Batch:200, Cost:0.58238, Accuracy:0.68750
Test:2, Cost:0.61693, Accuracy:0.67188
Pass:3, Batch:0, Cost:0.61814, Accuracy:0.65625
Pass:3, Batch:100, Cost:0.52823, Accuracy:0.76562
Pass:3, Batch:200, Cost:0.50346, Accuracy:0.75000
Test:3, Cost:0.56010, Accuracy:0.69437
Pass:4, Batch:0, Cost:0.51497, Accuracy:0.70312
Pass:4, Batch:100, Cost:0.54908, Accuracy:0.75000
Pass:4, Batch:200, Cost:0.44495, Accuracy:0.82812
Test:4, Cost:0.51263, Accuracy:0.73750
Pass:5, Batch:0, Cost:0.53596, Accuracy:0.76562
Pass:5, Batch:100, Cost:0.57464, Accuracy:0.75000
Pass:5, Batch:200, Cost:0.67699, Accuracy:0.65625
Test:5, Cost:0.53518, Accuracy:0.74000
Pass:6, Batch:0, Cost:0.46548, Accuracy:0.79688
Pass:6, Batch:100, Cost:0.54030, Accuracy:0.70312
Pass:6, Batch:200, Cost:0.48817, Accuracy:0.78125
Test:6, Cost:0.48508, Accuracy:0.77312
Pass:7, Batch:0, Cost:0.41523, Accuracy:0.84375
Pass:7, Batch:100, Cost:0.47442, Accuracy:0.73438
Pass:7, Batch:200, Cost:0.45649, Accuracy:0.76562
Test:7, Cost:0.44587, Accuracy:0.78375
Pass:8, Batch:0, Cost:0.42541, Accuracy:0.81250
Pass:8, Batch:100, Cost:0.38169, Accuracy:0.81250
Pass:8, Batch:200, Cost:0.54646, Accuracy:0.71875
Test:8, Cost:0.54019, Accuracy:0.74187
Pass:9, Batch:0, Cost:0.41468, Accuracy:0.82812
Pass:9, Batch:100, Cost:0.50506, Accuracy:0.78125
Pass:9, Batch:200, Cost:0.26215, Accuracy:0.93750
Test:9, Cost:0.44446, Accuracy:0.78875
Pass:10, Batch:0, Cost:0.45576, Accuracy:0.76562
Pass:10, Batch:100, Cost:0.35473, Accuracy:0.79688
Pass:10, Batch:200, Cost:0.45957, Accuracy:0.73438
Test:10, Cost:0.44609, Accuracy:0.79812
Pass:11, Batch:0, Cost:0.43150, Accuracy:0.76562
Pass:11, Batch:100, Cost:0.48615, Accuracy:0.79688
Pass:11, Batch:200, Cost:0.25434, Accuracy:0.87500
Test:11, Cost:0.40623, Accuracy:0.82125
Pass:12, Batch:0, Cost:0.31509, Accuracy:0.89062
Pass:12, Batch:100, Cost:0.35438, Accuracy:0.90625
Pass:12, Batch:200, Cost:0.44042, Accuracy:0.82812
Test:12, Cost:0.38933, Accuracy:0.82688
Pass:13, Batch:0, Cost:0.35025, Accuracy:0.84375
Pass:13, Batch:100, Cost:0.39380, Accuracy:0.82812
Pass:13, Batch:200, Cost:0.29557, Accuracy:0.85938
Test:13, Cost:0.40181, Accuracy:0.83000
Pass:14, Batch:0, Cost:0.22922, Accuracy:0.90625
Pass:14, Batch:100, Cost:0.49781, Accuracy:0.84375
Pass:14, Batch:200, Cost:0.23470, Accuracy:0.85938
Test:14, Cost:0.44674, Accuracy:0.81375
Pass:15, Batch:0, Cost:0.32143, Accuracy:0.85938
Pass:15, Batch:100, Cost:0.31085, Accuracy:0.87500
Pass:15, Batch:200, Cost:0.36961, Accuracy:0.82812
Test:15, Cost:0.41548, Accuracy:0.82812
Pass:16, Batch:0, Cost:0.24269, Accuracy:0.90625
Pass:16, Batch:100, Cost:0.29280, Accuracy:0.82812
Pass:16, Batch:200, Cost:0.19174, Accuracy:0.92188
Test:16, Cost:0.32385, Accuracy:0.86375
Pass:17, Batch:0, Cost:0.28380, Accuracy:0.85938
Pass:17, Batch:100, Cost:0.30588, Accuracy:0.81250
Pass:17, Batch:200, Cost:0.32704, Accuracy:0.85938
Test:17, Cost:0.31492, Accuracy:0.86000
Pass:18, Batch:0, Cost:0.29551, Accuracy:0.85938
Pass:18, Batch:100, Cost:0.18694, Accuracy:0.90625
Pass:18, Batch:200, Cost:0.25631, Accuracy:0.85938
Test:18, Cost:0.29839, Accuracy:0.87125
Pass:19, Batch:0, Cost:0.16484, Accuracy:0.95312
Pass:19, Batch:100, Cost:0.11558, Accuracy:0.96875
Pass:19, Batch:200, Cost:0.17472, Accuracy:0.90625
Test:19, Cost:0.25351, Accuracy:0.88813
The final test accuracy after training is 0.88813, which is decent overall.
3. Inference
I prepared one cat image and one dog image for testing (don't use a husky; it doesn't work, and that is a problem to solve later).
The inference code is as follows.
import paddle.fluid as fluid
from PIL import Image
import numpy as np

# Create the executor
place = fluid.CUDAPlace(0)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

infer_image = 'work/train/test2.jpg'
# Path of the saved inference model
save_path = 'infer_model/'
# Load the inference program, input variable names, and classifier from the model
[infer_program, feeded_var_names, target_var] = fluid.io.load_inference_model(dirname=save_path, executor=exe)

# Preprocess an image
def load_image(file):
    img = Image.open(file)
    # Resize to a uniform size
    img = img.resize((224, 224), Image.ANTIALIAS)
    # Convert to a numpy array
    img = np.array(img).astype(np.float32)
    # Convert to CHW layout
    img = img.transpose((2, 0, 1))
    # Convert to BGR and scale to [0, 1]
    img = img[(2, 1, 0), :, :] / 255.0
    img = np.expand_dims(img, axis=0)
    return img

# Load the image data
img = load_image(infer_image)
# Run inference
result = exe.run(program=infer_program,
                 feed={feeded_var_names[0]: img},
                 fetch_list=target_var)
# Show the image and print the label with the highest score
lab = np.argsort(result)[0][0][-1]
names = ['cat', 'dog']
print('Predicted label: %d, name: %s, probability: %f' % (lab, names[lab], result[0][0][lab]))
infer_image_show = Image.open(infer_image)
infer_image_show.show()
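The label-picking line above can be puzzling at first: np.argsort sorts indices along the last axis, so the last entry is the index of the largest score. An illustration with a made-up result array (the probability values here are invented for the example):

```python
import numpy as np

# exe.run returns a list holding one array of shape (1, num_classes)
result = [np.array([[0.1, 0.9]], dtype=np.float32)]
lab = np.argsort(result)[0][0][-1]  # index of the largest probability
prob = result[0][0][lab]            # the probability itself
```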
Test image:
Result:
4. References
https://blog.csdn.net/qq_33200967/article/details/87895105 ("PaddlePaddle: From Beginner to Alchemy", Part 11: Recognizing a custom image dataset)
https://github.com/yeyupiaoling/LearnPaddle2/tree/master/note11 (GitHub repository)