Semantic Segmentation with OpenCV and Deep Learning (Part 2)

Reposted article, 2020-05-24 12:31

Semantic segmentation in images with OpenCV

Let's get started: open the segment.py file and insert the following code:

Semantic segmentation with OpenCV and deep learning

# import the necessary packages
import numpy as np
import argparse
import imutils
import time
import cv2

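We start by importing the necessary dependencies. For this script, OpenCV 3.4.1 or later is recommended. You can follow one of the installation tutorials; just make sure to specify which OpenCV version to download and install as you work through the steps. You also need imutils, a package of OpenCV convenience functions, which can be installed with pip: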

$ pip install --upgrade imutils

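If you are using a Python virtual environment, don't forget to run the workon command before installing imutils with pip! Next, let's parse the command line arguments: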

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", required=True,
    help="path to deep learning segmentation model")
ap.add_argument("-c", "--classes", required=True,
    help="path to .txt file containing class labels")
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-l", "--colors", type=str,
    help="path to .txt file containing colors for labels")
ap.add_argument("-w", "--width", type=int, default=500,
    help="desired width (in pixels) of input image")
args = vars(ap.parse_args())

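This script has five command line arguments, two of which are optional:

--model: The path to the deep learning semantic segmentation model.

--classes: The path to a text file containing the class labels.

--image: The path to the input image file.

--colors: Optional path to a colors text file. If no file is specified, a random color is assigned to each class.

--width: Optional desired image width. The default is 500 pixels.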
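To see how the required and optional flags behave, the following minimal sketch rebuilds the same parser and feeds it an explicit argument list instead of the real command line. The file paths are placeholders chosen for illustration, not files that ship with the article, and the helper name build_parser is hypothetical.

```python
import argparse

def build_parser():
    # the same five flags that segment.py defines
    ap = argparse.ArgumentParser()
    ap.add_argument("-m", "--model", required=True,
        help="path to deep learning segmentation model")
    ap.add_argument("-c", "--classes", required=True,
        help="path to .txt file containing class labels")
    ap.add_argument("-i", "--image", required=True,
        help="path to input image")
    ap.add_argument("-l", "--colors", type=str,
        help="path to .txt file containing colors for labels")
    ap.add_argument("-w", "--width", type=int, default=500,
        help="desired width (in pixels) of input image")
    return ap

# hypothetical paths, for illustration only
demo_args = vars(build_parser().parse_args([
    "--model", "model.net",
    "--classes", "classes.txt",
    "--image", "example.png",
]))

print(demo_args["colors"])  # None: no colors file was given
print(demo_args["width"])   # 500: the default width
```

Because --colors has no default, it comes back as None when omitted, which is what the script later tests to decide whether to generate random colors.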

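If you are not familiar with argparse and command line arguments, be sure to read the in-depth blog post on command line arguments. Next, let's parse the class label file and the colors: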

# load the class label names
CLASSES = open(args["classes"]).read().strip().split("\n")

# if a colors file was supplied, load it from disk
if args["colors"]:
    COLORS = open(args["colors"]).read().strip().split("\n")
    COLORS = [np.array(c.split(",")).astype("int") for c in COLORS]
    COLORS = np.array(COLORS, dtype="uint8")

# otherwise, we need to randomly generate RGB colors for each class
# label
else:
    # initialize a list of colors to represent each class label in
    # the mask (starting with 'black' for the background/unlabeled
    # regions)
    np.random.seed(42)
    COLORS = np.random.randint(0, 255, size=(len(CLASSES) - 1, 3),
        dtype="uint8")
    COLORS = np.vstack([[0, 0, 0], COLORS]).astype("uint8")

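The classes are loaded into memory from the supplied text file, whose path is contained in the command line args dictionary (Line 23).

If a colors text file with a pre-specified color for each class label (one per line) was supplied, we load the colors into memory (Lines 26-29). Otherwise, we randomly generate a color for each label (Lines 33-40).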
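Both branches can be tried without any files on disk. The sketch below is only an illustration under stated assumptions: the class names and the colors text are made-up stand-ins for the contents of the --classes and --colors files, and the helper name load_colors is hypothetical.

```python
import numpy as np

def load_colors(classes, colors_text=None, seed=42):
    """Return a (len(classes), 3) uint8 array with one color per class."""
    if colors_text:
        # one "R,G,B" triple per line, as in the colors file
        rows = [[int(v) for v in line.split(",")]
                for line in colors_text.strip().split("\n")]
        return np.array(rows, dtype="uint8")
    # otherwise generate random colors, keeping black for the first
    # (background/unlabeled) class
    rng = np.random.RandomState(seed)
    random_colors = rng.randint(0, 255, size=(len(classes) - 1, 3),
        dtype="uint8")
    return np.vstack([[0, 0, 0], random_colors]).astype("uint8")

# made-up sample data, for illustration only
classes = ["Unlabeled", "Road", "Sidewalk"]
colors_text = "0,0,0\n128,64,128\n244,35,232"

from_file = load_colors(classes, colors_text)
random_ones = load_colors(classes)
print(from_file.shape, from_file.dtype)
print(random_ones[0])
```

Note that randint excludes its upper bound, so randomly generated channel values fall in the range 0 to 254.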

For testing purposes (and since there are 20 classes), let's create a nice color lookup legend using OpenCV drawing functions:

# initialize the legend visualization
legend = np.zeros(((len(CLASSES) * 25) + 25, 300, 3), dtype="uint8")

# loop over the class names + colors
for (i, (className, color)) in enumerate(zip(CLASSES, COLORS)):
    # draw the class name + color on the legend
    color = [int(c) for c in color]
    cv2.putText(legend, className, (5, (i * 25) + 17),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.rectangle(legend, (100, (i * 25)), (300, (i * 25) + 25),
        tuple(color), -1)

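With a legend visualization, it is easy to associate class labels with colors at a glance. The legend consists of the class labels with a colored rectangle next to each one. It is built quickly by creating a canvas (Line 43) and filling it in dynamically with a loop (Lines 46-52). Drawing basics are covered in a separate blog post.

Here is the result: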

Figure 2: Our deep learning semantic segmentation class color legend generated with OpenCV.

The next block performs the deep learning segmentation:

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNet(args["model"])

# load the input image, resize it, and construct a blob from it,
# but keeping in mind that the original input image dimensions
# ENet was trained on were 1024x512
image = cv2.imread(args["image"])
image = imutils.resize(image, width=args["width"])
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (1024, 512), 0,
    swapRB=True, crop=False)

# perform a forward pass using the segmentation model
net.setInput(blob)
start = time.time()
output = net.forward()
end = time.time()

# show the amount of time inference took
print("[INFO] inference took {:.4f} seconds".format(end - start))

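To perform deep learning semantic segmentation on an image with Python and OpenCV, we take the following steps:

Load the model (Line 56).

Construct a blob (Lines 61-64). The ENet model used in this blog post was trained on input images at 1024×512 resolution, so we use the same size here. How OpenCV's blobs work is covered in more detail in a separate post.

Set the blob as the input to the network (Line 67) and perform a forward pass through the neural network (Line 69).

The forward pass statement is wrapped in timestamps, and the elapsed time is printed to the terminal on Line 73.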
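To make the blob step less opaque, here is a rough NumPy-only sketch of the transformation that cv2.dnn.blobFromImage applies with the arguments used above, leaving out the resize to 1024×512. It is an approximation for illustration, not OpenCV's implementation; the function name to_blob is hypothetical, and the input is a synthetic image that is already at the target size.

```python
import numpy as np

def to_blob(image_bgr, scale=1 / 255.0):
    # image_bgr: H x W x 3 uint8 array in BGR order, as cv2.imread returns
    rgb = image_bgr[:, :, ::-1]              # swapRB=True: BGR -> RGB
    scaled = rgb.astype("float32") * scale   # scale pixel values to [0, 1]
    chw = np.transpose(scaled, (2, 0, 1))    # H x W x C -> C x H x W
    return chw[np.newaxis, ...]              # batch axis: 1 x C x H x W

# synthetic test image (assumption): pure blue, stored in BGR order
image = np.zeros((512, 1024, 3), dtype="uint8")
image[..., 0] = 255

blob = to_blob(image)
print(blob.shape)  # (1, 3, 512, 1024)
```

The mean argument is 0 in the script, so no mean subtraction appears in the sketch. After the channel swap, the blue values end up in the last channel of the blob.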

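In the remaining lines of the script, we generate a color mask to overlay on the original image. Each pixel has a corresponding class label index, which lets us see the results of the semantic segmentation on screen.

First, we need to extract the volume dimension information from the output, then compute the class map and the color mask: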

# infer the total number of classes along with the spatial dimensions
# of the mask image via the shape of the output array
(numClasses, height, width) = output.shape[1:4]

# our output class ID map will be num_classes x height x width in
# size, so we take the argmax to find the class label with the
# largest probability for each and every (x, y)-coordinate in the
# image
classMap = np.argmax(output[0], axis=0)

# given the class ID map, we can map each of the class IDs to its
# corresponding color
mask = COLORS[classMap]
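The argmax and the color lookup in the listing above can be tried on a toy array. The scores and colors below are made up for illustration (three classes on a 2×2 image); only the two NumPy operations are taken from the script.

```python
import numpy as np

# toy "network output": 1 image, 3 classes, 2 x 2 pixels (made-up scores)
output = np.array([[
    [[0.7, 0.1], [0.2, 0.3]],   # scores for class 0
    [[0.2, 0.8], [0.1, 0.3]],   # scores for class 1
    [[0.1, 0.1], [0.7, 0.4]],   # scores for class 2
]])

# one color per class (made-up values)
COLORS = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0]], dtype="uint8")

# per-pixel index of the highest-scoring class
classMap = np.argmax(output[0], axis=0)

# integer array indexing: each class ID is replaced by its color
mask = COLORS[classMap]

print(classMap.tolist())  # [[0, 1], [2, 2]]
print(mask.shape)         # (2, 2, 3)
```

Indexing COLORS with a 2D array of class IDs returns an array with one extra trailing axis of length 3, so no explicit loop over pixels is needed.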

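On Line 77 we determine the spatial dimensions of the output volume. Next, we find the class label index with the largest probability for each (x, y)-coordinate of the output volume (Line 83). The result is the class map, which contains a class index for every pixel. Given the class ID indexes, we can use NumPy array indexing to "magically" (not to mention super efficiently) look up the visualization color for each pixel (Line 87). The color mask will be overlaid transparently on the original image. Let's finish the script: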

# resize the mask and class map such that its dimensions match the
# original size of the input image (we're not using the class map
# here for anything else but this is how you would resize it just in
# case you wanted to extract specific pixels/classes)
mask = cv2.resize(mask, (image.shape[1], image.shape[0]),
    interpolation=cv2.INTER_NEAREST)
classMap = cv2.resize(classMap, (image.shape[1], image.shape[0]),
    interpolation=cv2.INTER_NEAREST)

# perform a weighted combination of the input image with the mask to
# form an output visualization
output = ((0.4 * image) + (0.6 * mask)).astype("uint8")

# show the input and output images
cv2.imshow("Legend", legend)
cv2.imshow("Input", image)
cv2.imshow("Output", output)
cv2.waitKey(0)

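We resize the mask and the class map so that they have exactly the same dimensions as the input image (Lines 93-96). To preserve the original class ID/mask values, it is very important to use nearest-neighbor interpolation rather than cubic, bicubic, or similar interpolation. Now that the sizes are correct, we create a "transparent color overlay" by blending the mask with the original image (Line 100). This makes it easy to visualize the segmentation output. More information on transparent overlays and how to construct them can be found in a separate post. Finally, the legend and the original + output images are shown on screen on Lines 103-105.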
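Why nearest-neighbor matters for class IDs can be shown on a two-pixel class map. The sketch below is a simplified illustration, not OpenCV's resize: resize_nearest is a hypothetical helper that copies existing pixels, and np.interp stands in for an interpolating resize along one row.

```python
import numpy as np

def resize_nearest(class_map, new_h, new_w):
    # for every output pixel, copy the source pixel it falls into
    h, w = class_map.shape
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return class_map[rows[:, None], cols]

# made-up class map: one row holding class 1 next to class 3
class_map = np.array([[1, 3]])

nearest = resize_nearest(class_map, 1, 4)
print(nearest.tolist())  # [[1, 1, 3, 3]]

# linear interpolation between the two IDs produces in-between values
# that do not correspond to either original class
linear = np.interp(np.linspace(0, 1, 4), [0, 1], class_map[0])
print(linear)
```

The nearest-neighbor result contains only the IDs 1 and 3, while the interpolated row drifts through values near 2, which would be read as a different class altogether.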

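Single-image segmentation results

Before using the commands in this section, make sure you grab the "Downloads" for this blog post. For convenience, the model and its associated files, the images, and the Python scripts are provided in a zip file. The command line arguments supplied in the terminal are important for reproducing the results. If you are unfamiliar with command line arguments, read up on them first. When you are ready, open a terminal, navigate to the project, and execute the following command: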

$ python segment.py --model enet-cityscapes/enet-model.net \
    --classes enet-cityscapes/enet-classes.txt \
    --colors enet-cityscapes/enet-colors.txt \
    --image images/example_01.png
[INFO] loading model...
[INFO] inference took 0.2100 seconds

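Figure 3: Semantic segmentation with OpenCV reveals a road, sidewalk, person, bicycle, traffic sign, and more!

Notice how accurate the segmentation is: it cleanly separates the classes and accurately identifies the person and the bicycle (a safety issue for self-driving cars). The road, sidewalk, cars, and even foliage are identified.

Let's try another example by simply changing the --image command line argument to a different image: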

$ python segment.py --model enet-cityscapes/enet-model.net \
    --classes enet-cityscapes/enet-classes.txt \
    --colors enet-cityscapes/enet-colors.txt \
    --image images/example_02.jpg
[INFO] loading model...
[INFO] inference took 0.1989 seconds

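The result in Figure 4 demonstrates the accuracy and clarity of this semantic segmentation model. The cars, road, trees, and sky are all clearly marked. Here is another example: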

$ python segment.py --model enet-cityscapes/enet-model.net \
    --classes enet-cityscapes/enet-classes.txt \
    --colors enet-cityscapes/enet-colors.txt \
    --image images/example_03.png
[INFO] loading model...
[INFO] inference took 0.1992 seconds

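The figure above shows a more complex scene, but ENet can still segment the people walking in front of the car. Unfortunately, the model incorrectly classifies the road as sidewalk, though this may be because people are walking on the sidewalk. A final example: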

$ python segment.py --model enet-cityscapes/enet-model.net \
    --classes enet-cityscapes/enet-classes.txt \
    --colors enet-cityscapes/enet-colors.txt \
    --image images/example_04.png
[INFO] loading model...
[INFO] inference took 0.1916 seconds

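The final image we send through ENet shows how the model can clearly segment a truck from a car, among other scene classes such as road, sidewalk, foliage, and person.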
