YOLO --- YOLOv3 with OpenCV: Installation and Usage


YOLOv3 + OpenCV 3.4.2 installation notes

wp20180930

 

Table of Contents

1. Environment Requirements

(1) Checking the Python version

(2) Checking the OpenCV version

2. File Downloads

3. Testing on Your Own Data

4. Problems and Solutions

(1) ImportError: No module named 'cv2' (Python 3)

(2) Ubuntu: Python 2 / Python 3 coexistence and switching

(3) Reinstalling OpenCV 3.4.2

(4) Ubuntu 18.04: the OpenCV dependency libjasper-dev cannot be installed

(5) Ubuntu 18.04: CMake Error: The source directory ... when installing OpenCV

5. Full Source Code

(1) object_detection_yolo.py

(2) object_detection_yolo.cpp

(3) yolo_test3.py

 

Main Text

Note: this write-up starts from an already-trained model and simply tests it on my own data. The steps below record my own walkthrough. Main references:

1. https://blog.csdn.net/ling_xiobai/article/details/82082614

2. https://blog.csdn.net/haoqimao_hard/article/details/82081285

3. https://hk.saowen.com/a/8c0f58aa3914c3bef46fb29eb40c77522b25fd7c0672fc9eadb2b3fdc2a8fbfb

 

1. Environment Requirements

This walkthrough was tested on Ubuntu (CPU only) with OpenCV 3.4.2 or later and Python 3. Set up a matching environment if needed.

Whether you go with Python 2.7+ or Python 3+, you will need apt-get to install the package libraries OpenCV depends on. Before starting, decide which version to target; each has its pros and cons, and for this task there is no real difference, so pick whichever you prefer: choose Python 3+ if that is what feels comfortable, or Python 2.7+ if that is what you are used to. However, if you normally use Python for CS-related work such as machine learning, data mining, NLP, or deep learning, you may lean toward Python 2.7, at least as things stood at the time of writing: most of the relevant libraries and packages, e.g. NumPy, SciPy, and scikit-learn, target Python 2.7+, and although the community is working hard to migrate to Python 3+, some of them still only work stably under Python 2.7+.

(1) Checking the Python version

On my Ubuntu 18.04 system, both Python 2 and Python 3 were already installed for earlier work, so I need to switch freely between python2 and python3; see my separate notes or search online for details.

To check which Python versions are installed:

Method 1: $ ls /usr/bin/python*

Method 2: $ python2

          $ python3

 
   

 

(2) Checking the OpenCV version

Installing Python often pulls in some OpenCV-related dependencies. To find out whether OpenCV is already installed, and which version, run in a terminal: pkg-config --modversion opencv

To check whether Python can use OpenCV, start the interpreter (python2 or python3) and run import cv2; if the ">>>" prompt comes back without an error, the Python binding works.

The screenshot here (omitted) showed the result after reinstalling OpenCV 3.4.2 (see problem (3) in section 4 below); by then everything worked.

 
   

2. File Downloads

You need to download the yolov3.weights weight file, the yolov3.cfg network definition, coco.names, a test image (xxx.jpg) and test video (xxx.mp4), plus the object_detection_yolo.cpp and object_detection_yolo.py scripts.

Download links, for reference:

1. https://github.com/JackKoLing/opencv_deeplearning_practice/tree/master/pracice3_opencv_yolov3

2. https://pan.baidu.com/s/12tI6iKTzdwYdJSxgBiyayQ#list/path=%2F&parentPath=%2F (password: gfg1)
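Before running anything, it is worth confirming that every required file landed in the working directory. A minimal sketch (the file names follow the list above; bird.jpg and run.mp4 stand in for xxx.jpg / xxx.mp4):

import os

# Files the detection scripts expect to find in the working directory.
required = ["yolov3.weights", "yolov3.cfg", "coco.names",
            "bird.jpg", "run.mp4", "object_detection_yolo.py"]

missing = [name for name in required if not os.path.isfile(name)]
if missing:
    print("Missing files:", ", ".join(missing))
else:
    # A fully downloaded yolov3.weights is roughly 248 MB; a much smaller
    # file usually means the download was interrupted.
    print("All files present; yolov3.weights is %.0f MB"
          % (os.path.getsize("yolov3.weights") / 1e6))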

 
   

3. Testing on Your Own Data

After completing step 2 (the downloads), run the following commands:

$ cd /home/wp/opencv_DL/opencv3.4.2_yolov3

$ python3 object_detection_yolo.py --image=bird.jpg

$ python3 object_detection_yolo.py --video=run.mp4

Once the commands finish you can see the results, which are saved in the same directory as:

bird_yolo_out_py.jpg and run_yolo_out_py.avi

Video detection is fairly slow, so I made a small improvement: the modified script yolo_test3.py thins out the processing (roughly one frame in two), which raises the speed a little, as sketched after the command below.

$ python3 yolo_test3.py --video=run.mp4
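The frame-thinning idea itself is simple; below is a minimal sketch of it (my own illustration, not the exact yolo_test3.py, which is listed in section 5). It assumes the net, getOutputsNames, and postprocess definitions from object_detection_yolo.py are already in scope, and runs the expensive forward pass only on every second frame:

import cv2 as cv

cap = cv.VideoCapture("run.mp4")
frame_id = 0
last_outs = None   # cached network outputs from the most recent detected frame

while True:
    hasFrame, frame = cap.read()
    if not hasFrame:
        break
    # Run the expensive forward pass only on every second frame;
    # in-between frames reuse the cached detections.
    if frame_id % 2 == 0:
        blob = cv.dnn.blobFromImage(frame, 1/255, (416, 416), [0, 0, 0], 1, crop=False)
        net.setInput(blob)
        last_outs = net.forward(getOutputsNames(net))
    if last_outs is not None:
        postprocess(frame, last_outs)   # draws the boxes onto the current frame
    cv.imshow("frame-skipping demo", frame)
    if cv.waitKey(1) >= 0:
        break
    frame_id += 1
cap.release()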

4. Problems and Solutions

Problems that came up while configuring the YOLO / OpenCV / Python environment, and how they were solved.

(1) ImportError: No module named 'cv2' (Python 3)

The problem resembles https://stackoverflow.com/questions/45643650/importerror-no-module-named-cv2-python3, but the fixes suggested there did not work for me. I downloaded the OpenCV 3.4.2 source package and reinstalled and reconfigured it myself, after which everything worked. Oddly, pkg-config --modversion opencv still reported 3.2.0; most likely pkg-config reads the metadata of the system-wide (apt-installed) OpenCV, which is independent of the cv2 module Python actually loads.
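To see which build the Python binding actually comes from, independent of what pkg-config says, a quick check:

import cv2

print(cv2.__version__)  # version of the binding Python loaded, e.g. 3.4.2
print(cv2.__file__)     # path of the loaded module; reveals which install it belongs to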

 
   

(2) Ubuntu: Python 2 / Python 3 coexistence and switching

See https://blog.csdn.net/kan2016/article/details/81639292 and https://www.cnblogs.com/hwlong/p/9216653.html.

(2.1) If Python is not installed, install it with apt plus pip (Anaconda is an alternative).

Step 1: install pip (search for "ubuntu install pip" if needed):

# 1. Update system packages

sudo apt-get update

sudo apt-get upgrade

# 2. Install pip

sudo apt-get install python-pip

# 3. Check that pip installed successfully

pip -V

Next, install Python:

$ sudo apt install python      # install Python 2 (the system already has Python 3)

$ sudo apt install python-pip   # pip for Python 2, invoked as pip

$ sudo apt install python3-pip  # pip for Python 3, invoked as pip3

Then check that Python installed correctly:

 $ python --version

 $ python3 --version

 
   

(2.2) Switching the Python version on Ubuntu

We can use update-alternatives to change the Python version system-wide. See https://blog.csdn.net/cym_lmy/article/details/78315139 and https://www.cnblogs.com/hwlong/p/9216653.html (a good illustrated walkthrough). Distributions based on Ubuntu or Debian normally support this.

First, list all registered python alternatives:

$ sudo update-alternatives --list python

update-alternatives: error: no alternatives for python

If you see the error above, Python's alternative versions have not yet been registered with update-alternatives. To fix this, update the alternatives list by adding python2.7 and python3.6 to it.

 

Open a terminal and enter the following two commands:

$ sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 1

$ sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 2

 

To switch between versions later, just run:

$ sudo update-alternatives --config python

then choose the Python version you want by entering its number and pressing Enter.

Next, enter in the terminal:

$ python

If all went well, python should now default to Python 3 (the alternative with the higher priority).
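A quick way to confirm from inside the interpreter which binary and version python now resolves to (a generic sanity check, nothing specific to this guide):

import sys

print(sys.executable)          # e.g. /usr/bin/python3 after switching
print(sys.version.split()[0])  # the interpreter version string, e.g. 3.6.9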

 

A final note on removing an alternative: once a particular Python version no longer exists on the system, it can be deleted from the update-alternatives list. For example, to remove python2.7 from the list:

$ sudo update-alternatives --remove python /usr/bin/python2.7

update-alternatives: removing manually selected alternative - switching python to auto mode

update-alternatives: using /usr/bin/python3.4 to provide /usr/bin/python (python) in auto mode

 

 
   

 

(3) Reinstalling OpenCV 3.4.2

For the problem of OpenCV 2 and OpenCV 3 coexisting on Ubuntu, see

https://blog.csdn.net/Hansry/article/details/75309906 and https://blog.csdn.net/liuxiaodong400/article/details/81089058.

Here I reinstalled and reconfigured OpenCV 3.4.2 under Python 3 myself.

Step 1: download the OpenCV source.

All OpenCV releases are listed at https://opencv.org/releases.html (the official site). Click the Sources link; I downloaded version 3.4.2.

Step 2: unpack the OpenCV source.

In the directory containing the downloaded archive, run:

$ unzip opencv-3.4.2.zip

 

Open a terminal in the unpacked folder, then create and enter a build directory:

mkdir build

cd build

 

Step 3: install OpenCV's build dependencies.

This step can also be done before step 1 or step 2.

$ sudo apt-get update

$ sudo apt-get upgrade

$ sudo apt-get install build-essential

$ sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev

$ sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev # packages for image handling

$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev liblapacke-dev

$ sudo apt-get install libxvidcore-dev libx264-dev # packages for video handling

$ sudo apt-get install libatlas-base-dev gfortran # optimizations for OpenCV

$ sudo apt-get install ffmpeg

An explanation of each package is at https://blog.csdn.net/abcsunl/article/details/63686496.

 

Step 4: configure the OpenCV build with CMake.

First, CMake must be installed; skip this if it already is.

Following https://www.cnblogs.com/TooyLee/p/6052387.html, install it as follows:

Preparation: download cmake-3.11.4.tar.gz from the official site (https://cmake.org/download/), and mind which file you grab. The unpacked folder must contain a bootstrap file (several archives I tried lacked it; it turned out I had downloaded the wrong file, and the topmost source download is the correct one):

 
   

1. Unpack: tar -xvf cmake-3.11.4.tar.gz, then adjust permissions: chmod -R 777 cmake-3.11.4

2. Check whether gcc and g++ are installed; if not, install them with sudo apt-get install build-essential (or run sudo apt-get install gcc and sudo apt-get install g++ directly)

3. Enter the directory: cd cmake-3.11.4

4. Run sudo ./bootstrap

5. Run sudo make

6. Run sudo make install

7. Run cmake --version; if it prints the CMake version, the installation succeeded.

 
   

Next, configure the OpenCV build. The reference command (taken from a guide written for an NVIDIA CUDA machine) is:

cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/home/wp/opencv3.4.2/install \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib-3.2.0/modules \
    -D PYTHON3_EXECUTABLE=/usr/bin/python3 \
    -D PYTHON_INCLUDE_DIR=/usr/include/python3.6 \
    -D PYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.6m.so \
    -D PYTHON3_NUMPY_INCLUDE_DIRS=/usr/local/lib/python3.6/dist-packages/numpy/core/include \
    -D WITH_TBB=ON \
    -D WITH_V4L=ON \
    -D WITH_QT=ON \
    -D WITH_GTK=ON \
    -D WITH_OPENGL=ON \
    -D BUILD_EXAMPLES=ON ..

At this point, however, the configuration would not go through: it immediately failed with a source path not found error. The fix was to remove the space after each -D. (Note also that OPENCV_EXTRA_MODULES_PATH above points at opencv_contrib-3.2.0; the contrib checkout should match the OpenCV version being built, 3.4.2 here.) What I actually entered was:

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local -DPYTHON3_EXECUTABLE=/usr/bin/python3 -DPYTHON_INCLUDE_DIR=/usr/include/python3.6 -DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.6m.so -DPYTHON3_NUMPY_INCLUDE_DIRS=/usr/local/lib/python3.6/dist-packages/numpy/core/include ..

After a short wait, the configuration completes.

 

Step 5: build OpenCV.

$ cd build

$ sudo make -j8

$ sudo make install

Wait for the build and installation to finish.

 

Step 6: test OpenCV.

After the installation completes, reboot the machine.

If importing the cv2 module still fails, run the following (use pip3 instead if you target Python 3):

$ sudo pip install opencv-python

Method 1: open a Python console and check the OpenCV version:

import cv2

cv2.__version__

If it was installed correctly, this prints 3.4.2.

Method 2: create a file test.py with the following contents:

import cv2

if __name__ == '__main__':

    print(cv2.__version__)
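Beyond the bare version string, cv2.getBuildInformation() (a standard cv2 function) reports the full build configuration, which helps confirm that the self-compiled 3.4.2 is the one Python loads rather than an older apt build:

import cv2

print(cv2.__file__)                     # where the loaded module lives
print(cv2.getBuildInformation()[:600])  # header of the report: version, paths, key options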

 
   

(4) Ubuntu 18.04: the OpenCV dependency libjasper-dev cannot be installed

See https://blog.csdn.net/weixin_41053564/article/details/81254410 for the fix.

On Ubuntu 18.04, while installing OpenCV's dependencies, one package, libjasper-dev, fails. Running:

$ sudo apt-get install libjasper-dev

gives the error: E: Unable to locate package libjasper-dev

It can be resolved as follows:

$ sudo add-apt-repository "deb http://security.ubuntu.com/ubuntu xenial-security main"

$ sudo apt update

$ sudo apt install libjasper-dev

(If that does not work, use $ sudo apt install libjasper1 libjasper-dev instead.)

This solves the problem; libjasper1 is a dependency of libjasper-dev.

 

(5) Ubuntu 18.04: CMake Error: The source directory ... when installing OpenCV

When compiling OpenCV on Ubuntu, CMake may report "CMake Error: The source directory ... does not exist"; the fix is to remove the space after -D. See https://blog.csdn.net/sparkexpert/article/details/70941449 and https://blog.csdn.net/wangleiwavesharp/article/details/80610529.

 

5. Full Source Code

(5.1) object_detection_yolo.py

=============object_detection_yolo.py===========

# A detailed explanation of every step: https://www.learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/

# and https://hk.saowen.com/a/8c0f58aa3914c3bef46fb29eb40c77522b25fd7c0672fc9eadb2b3fdc2a8fbfb

# This code is written at BigVision LLC. It is based on the OpenCV project. It is subject to the license terms in the LICENSE file found in this distribution and at http://opencv.org/license.html

 

# Usage example:  python3 object_detection_yolo.py --video=run.mp4

#                 python3 object_detection_yolo.py --image=bird.jpg

 

import cv2 as cv

import argparse

import sys

import numpy as np

import os.path

 

# Initialize the parameters

confThreshold = 0.5  #Confidence threshold

nmsThreshold = 0.4   #Non-maximum suppression threshold

inpWidth = 416       #Width of network's input image

inpHeight = 416      #Height of network's input image

 

parser = argparse.ArgumentParser(description='Object Detection using YOLO in OPENCV')

parser.add_argument('--image', help='Path to image file.')

parser.add_argument('--video', help='Path to video file.')

args = parser.parse_args()

        

# Load names of classes

classesFile = "coco.names";
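# (coco.names holds the 80 COCO class labels, one per line)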

classes = None

with open(classesFile, 'rt') as f:

    classes = f.read().rstrip('\n').split('\n')

 

# Give the configuration and weight files for the model and load the network using them.

modelConfiguration = "yolov3.cfg";

modelWeights = "yolov3.weights";

 

net = cv.dnn.readNetFromDarknet(modelConfiguration, modelWeights)

net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)

net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)

 

# Get the names of the output layers

def getOutputsNames(net):

    # Get the names of all the layers in the network

    layersNames = net.getLayerNames()

    # Get the names of the output layers, i.e. the layers with unconnected outputs
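    # Note: in OpenCV 3.x getUnconnectedOutLayers() returns an Nx1 array, hence the i[0];
    # some newer OpenCV 4.x releases return a flat array, where layersNames[i - 1] is needed.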

    return [layersNames[i[0] - 1] for i in net.getUnconnectedOutLayers()]

 

# Draw the predicted bounding box

def drawPred(classId, conf, left, top, right, bottom):

    # Draw a bounding box.

    cv.rectangle(frame, (left, top), (right, bottom), (255, 178, 50), 3)

    

    label = '%.2f' % conf

        

    # Get the label for the class name and its confidence

    if classes:

        assert(classId < len(classes))

        label = '%s:%s' % (classes[classId], label)

 

    #Display the label at the top of the bounding box

    labelSize, baseLine = cv.getTextSize(label, cv.FONT_HERSHEY_SIMPLEX, 0.5, 1)

    top = max(top, labelSize[1])

    cv.rectangle(frame, (left, top - round(1.5*labelSize[1])), (left + round(1.5*labelSize[0]), top + baseLine), (255, 255, 255), cv.FILLED)

    cv.putText(frame, label, (left, top), cv.FONT_HERSHEY_SIMPLEX, 0.75, (0,0,0), 1)

 

# Remove the bounding boxes with low confidence using non-maxima suppression

def postprocess(frame, outs):

    frameHeight = frame.shape[0]

    frameWidth = frame.shape[1]

 

    # Scan through all the bounding boxes output from the network and keep only the
    # ones with high confidence scores. Assign the box's class label as the class with the highest score.
    classIds = []
    confidences = []
    boxes = []

    for out in outs:

        for detection in out:

            scores = detection[5:]

            classId = np.argmax(scores)

            confidence = scores[classId]

            if confidence > confThreshold:

                center_x = int(detection[0] * frameWidth)

                center_y = int(detection[1] * frameHeight)

                width = int(detection[2] * frameWidth)

                height = int(detection[3] * frameHeight)

                left = int(center_x - width / 2)

                top = int(center_y - height / 2)

                classIds.append(classId)

                confidences.append(float(confidence))

                boxes.append([left, top, width, height])

 

    # Perform non maximum suppression to eliminate redundant overlapping boxes with

    # lower confidences.

    indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)
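    # (under OpenCV 3.x NMSBoxes returns an Nx1 array, hence the i = i[0] unwrapping below;
    #  newer versions return a flat list, in which case that line must be dropped)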

    for i in indices:

        i = i[0]

        box = boxes[i]

        left = box[0]

        top = box[1]

        width = box[2]

        height = box[3]

        drawPred(classIds[i], confidences[i], left, top, left + width, top + height)

 

# Process inputs

winName = 'Deep learning object detection in OpenCV'

cv.namedWindow(winName, cv.WINDOW_NORMAL)

 

outputFile = "yolo_out_py.avi"

if (args.image):

    # Open the image file

    if not os.path.isfile(args.image):

        print("Input image file ", args.image, " doesn't exist")

        sys.exit(1)

    cap = cv.VideoCapture(args.image)

    outputFile = args.image[:-4]+'_yolo_out_py.jpg'

elif (args.video):

    # Open the video file

    if not os.path.isfile(args.video):

        print("Input video file ", args.video, " doesn't exist")

        sys.exit(1)

    cap = cv.VideoCapture(args.video)

    outputFile = args.video[:-4]+'_yolo_out_py.avi'

else:

    # Webcam input

    cap = cv.VideoCapture(0)

 

# Get the video writer initialized to save the output video

if (not args.image):
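    # MJPG codec at a fixed 30 fps; the output file is an .avi regardless of the input container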

    vid_writer = cv.VideoWriter(outputFile, cv.VideoWriter_fourcc('M','J','P','G'), 30, (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))

 

while cv.waitKey(1) < 0:

    

    # get frame from the video

    hasFrame, frame = cap.read()

    

    # Stop the program if reached end of video

    if not hasFrame:

        print("Done processing !!!")

        print("Output file is stored as ", outputFile)

        cv.waitKey(3000)

        break

 

    # Create a 4D blob from a frame.
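    # (1/255 rescales pixels to [0,1]; the positional 1 is swapRB, converting OpenCV's
    #  BGR channel order to the RGB order the Darknet model was trained with)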

    blob = cv.dnn.blobFromImage(frame, 1/255, (inpWidth, inpHeight), [0,0,0], 1, crop=False)

 

    # Sets the input to the network

    net.setInput(blob)

 

    # Runs the forward pass to get output of the output layers

    outs = net.forward(getOutputsNames(net))

 

    # Remove the bounding boxes with low confidence

    postprocess(frame, outs)

 

    # Put efficiency information. The function getPerfProfile returns the overall time for inference(t) and the timings for each of the layers(in layersTimes)

    t, _ = net.getPerfProfile()

    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv.getTickFrequency())

    cv.putText(frame, label, (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

 

    # Write the frame with the detection boxes

    if (args.image):

        cv.imwrite(outputFile, frame.astype(np.uint8));

    else:

        vid_writer.write(frame.astype(np.uint8))

 

    cv.imshow(winName, frame)

=============================== End ============================

 

(5.2) object_detection_yolo.cpp

=============object_detection_yolo.cpp=============

// This code is written at BigVision LLC. It is based on the OpenCV project. It is subject to the license terms in the LICENSE file found in this distribution and at http://opencv.org/license.html

 

// Usage example:  ./object_detection_yolo.out --video=run.mp4

//                 ./object_detection_yolo.out --image=bird.jpg

#include <fstream>

#include <sstream>

#include <iostream>

 

#include <opencv2/dnn.hpp>

#include <opencv2/imgproc.hpp>

#include <opencv2/highgui.hpp>

 

const char* keys =

"{help h usage ? | | Usage examples: \n\t\t./object_detection_yolo.out --image=dog.jpg \n\t\t./object_detection_yolo.out --video=run_sm.mp4}"

"{image i        |<none>| input image   }"

"{video v       |<none>| input video   }"

;

using namespace cv;

using namespace dnn;

using namespace std;

 

// Initialize the parameters

float confThreshold = 0.5; // Confidence threshold

float nmsThreshold = 0.4;  // Non-maximum suppression threshold

int inpWidth = 416;  // Width of network's input image

int inpHeight = 416; // Height of network's input image

vector<string> classes;

 

// Remove the bounding boxes with low confidence using non-maxima suppression

void postprocess(Mat& frame, const vector<Mat>& out);

 

// Draw the predicted bounding box

void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame);

 

// Get the names of the output layers

vector<String> getOutputsNames(const Net& net);

 

int main(int argc, char** argv)

{

    CommandLineParser parser(argc, argv, keys);

    parser.about("Use this script to run object detection using YOLO3 in OpenCV.");

    if (parser.has("help"))

    {

        parser.printMessage();

        return 0;

    }

    // Load names of classes

    string classesFile = "coco.names";

    ifstream ifs(classesFile.c_str());

    string line;

    while (getline(ifs, line)) classes.push_back(line);

    

    // Give the configuration and weight files for the model

    String modelConfiguration = "yolov3.cfg";

    String modelWeights = "yolov3.weights";

 

    // Load the network

    Net net = readNetFromDarknet(modelConfiguration, modelWeights);

    net.setPreferableBackend(DNN_BACKEND_OPENCV);

    net.setPreferableTarget(DNN_TARGET_CPU);

    

    // Open a video file or an image file or a camera stream.

    string str, outputFile;

    VideoCapture cap;

    VideoWriter video;

    Mat frame, blob;

    

    try {

        

        outputFile = "yolo_out_cpp.avi";

        if (parser.has("image"))

        {

            // Open the image file

            str = parser.get<String>("image");

            ifstream ifile(str);

            if (!ifile) throw("error");

            cap.open(str);

            str.replace(str.end()-4, str.end(), "_yolo_out_cpp.jpg");

            outputFile = str;

        }

        else if (parser.has("video"))

        {

            // Open the video file

            str = parser.get<String>("video");

            ifstream ifile(str);

            if (!ifile) throw("error");

            cap.open(str);

            str.replace(str.end()-4, str.end(), "_yolo_out_cpp.avi");

            outputFile = str;

        }

        // Open the webcam
        else cap.open(0);  // note: the original called parser.get<int>("device"), but no "device" key is defined in `keys`

        

    }

    catch(...) {

        cout << "Could not open the input image/video stream" << endl;

        return 0;

    }

    

    // Get the video writer initialized to save the output video

    if (!parser.has("image")) {

        video.open(outputFile, VideoWriter::fourcc('M','J','P','G'), 28, Size(cap.get(CAP_PROP_FRAME_WIDTH), cap.get(CAP_PROP_FRAME_HEIGHT)));

    }

    

    // Create a window

    static const string kWinName = "Deep learning object detection in OpenCV";

    namedWindow(kWinName, WINDOW_NORMAL);

 

    // Process frames.

    while (waitKey(1) < 0)

    {

        // get frame from the video

        cap >> frame;

 

        // Stop the program if reached end of video

        if (frame.empty()) {

            cout << "Done processing !!!" << endl;

            cout << "Output file is stored as " << outputFile << endl;

            waitKey(3000);

            break;

        }

        // Create a 4D blob from a frame.
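        // (cvSize comes from the legacy C API; on OpenCV 4, use Size(inpWidth, inpHeight) instead)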

        blobFromImage(frame, blob, 1/255.0, cvSize(inpWidth, inpHeight), Scalar(0,0,0), true, false);

        

        //Sets the input to the network

        net.setInput(blob);

        

        // Runs the forward pass to get output of the output layers

        vector<Mat> outs;

        net.forward(outs, getOutputsNames(net));

        

        // Remove the bounding boxes with low confidence

        postprocess(frame, outs);

        

        // Put efficiency information. The function getPerfProfile returns the overall time for inference(t) and the timings for each of the layers(in layersTimes)

        vector<double> layersTimes;

        double freq = getTickFrequency() / 1000;

        double t = net.getPerfProfile(layersTimes) / freq;

        string label = format("Inference time for a frame : %.2f ms", t);

        putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255));

        

        // Write the frame with the detection boxes

        Mat detectedFrame;

        frame.convertTo(detectedFrame, CV_8U);

        if (parser.has("image")) imwrite(outputFile, detectedFrame);

        else video.write(detectedFrame);

        

        imshow(kWinName, frame);

        

    }

    

    cap.release();

    if (!parser.has("image")) video.release();

 

    return 0;

}

 

// Remove the bounding boxes with low confidence using non-maxima suppression

void postprocess(Mat& frame, const vector<Mat>& outs)

{

    vector<int> classIds;

    vector<float> confidences;

    vector<Rect> boxes;

    

    for (size_t i = 0; i < outs.size(); ++i)

    {

        // Scan through all the bounding boxes output from the network and keep only the

        // ones with high confidence scores. Assign the box's class label as the class

        // with the highest score for the box.

        float* data = (float*)outs[i].data;

        for (int j = 0; j < outs[i].rows; ++j, data += outs[i].cols)

        {

            Mat scores = outs[i].row(j).colRange(5, outs[i].cols);

            Point classIdPoint;

            double confidence;

            // Get the value and location of the maximum score

            minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);

            if (confidence > confThreshold)

            {

                int centerX = (int)(data[0] * frame.cols);

                int centerY = (int)(data[1] * frame.rows);

                int width = (int)(data[2] * frame.cols);

                int height = (int)(data[3] * frame.rows);

                int left = centerX - width / 2;

                int top = centerY - height / 2;

                

                classIds.push_back(classIdPoint.x);

                confidences.push_back((float)confidence);

                boxes.push_back(Rect(left, top, width, height));

            }

        }

    }

    

    // Perform non maximum suppression to eliminate redundant overlapping boxes with

    // lower confidences

    vector<int> indices;

    NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, indices);

    for (size_t i = 0; i < indices.size(); ++i)

    {

        int idx = indices[i];

        Rect box = boxes[idx];

        drawPred(classIds[idx], confidences[idx], box.x, box.y,

                 box.x + box.width, box.y + box.height, frame);

    }

}

 

// Draw the predicted bounding box

void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame)

{

    //Draw a rectangle displaying the bounding box

    rectangle(frame, Point(left, top), Point(right, bottom), Scalar(0, 0, 255));

    

    //Get the label for the class name and its confidence

    string label = format("%.2f", conf);

    if (!classes.empty())

    {

        CV_Assert(classId < (int)classes.size());

        label = classes[classId] + ":" + label;

    }

    

    //Display the label at the top of the bounding box

    int baseLine;

    Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);

    top = max(top, labelSize.height);

    putText(frame, label, Point(left, top), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(255,255,255));

}

 

// Get the names of the output layers

vector<String> getOutputsNames(const Net& net)

{

    static vector<String> names;

    if (names.empty())

    {

        //Get the indices of the output layers, i.e. the layers with unconnected outputs

        vector<int> outLayers = net.getUnconnectedOutLayers();

        

        //get the names of all the layers in the network

        vector<String> layersNames = net.getLayerNames();

        

        // Get the names of the output layers in names

        names.resize(outLayers.size());

        for (size_t i = 0; i < outLayers.size(); ++i)

        names[i] = layersNames[outLayers[i] - 1];

    }

    return names;

}

========================== End =========================

 

(5.3) yolo_test3.py

=====================yolo_test3.py==============================

# -*- coding: utf-8 -*-

# This code is written at BigVision LLC. It is based on the OpenCV project. It is subject to the license terms in the LICENSE file found in this distribution and at http://opencv.org/license.html

 

# Usage example:  python3 yolo_test3.py --video=run.mp4
# (note: this variant ignores --image/--video and instead walks the image directory set below)

 

import cv2 as cv

import argparse

import sys

import numpy as np

import os.path

import os

import time

 

# Initialize the parameters

confThreshold = 0.5  #Confidence threshold

nmsThreshold = 0.4   #Non-maximum suppression threshold

inpWidth = 416       #Width of network's input image

inpHeight = 416      #Height of network's input image

 

parser = argparse.ArgumentParser(description='Object Detection using YOLO in OPENCV')

parser.add_argument('--image', help='Path to image file.')

parser.add_argument('--video', help='Path to video file.')

args = parser.parse_args()

        

# Load names of classes

classesFile = "coco.names";

classes = None

with open(classesFile, 'rt') as f:

    classes = f.read().rstrip('\n').split('\n')

 

# Give the configuration and weight files for the model and load the network using them.

modelConfiguration = "yolov3.cfg";

modelWeights = "yolov3.weights";

 

net = cv.dnn.readNetFromDarknet(modelConfiguration, modelWeights)

net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)

net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)

 

# Get the names of the output layers

def getOutputsNames(net):

    # Get the names of all the layers in the network

    layersNames = net.getLayerNames()

    # Get the names of the output layers, i.e. the layers with unconnected outputs

    return [layersNames[i[0] - 1] for i in net.getUnconnectedOutLayers()]

 

# Draw the predicted bounding box

def drawPred(classId, conf, left, top, right, bottom):

    # Draw a bounding box.

    cv.rectangle(frame, (left, top), (right, bottom), (255, 178, 50), 2)

    

    label = '%.2f' % conf

        

    # Get the label for the class name and its confidence

    if classes:

        assert(classId < len(classes))

        label = '%s:%s' % (classes[classId], label)

 

    #Display the label at the top of the bounding box

    labelSize, baseLine = cv.getTextSize(label, cv.FONT_HERSHEY_SIMPLEX, 0.5, 1)

    top = max(top, labelSize[1])

    #cv.rectangle(frame, (left, top - round(1.5*labelSize[1])), (left + round(1.5*labelSize[0]), top + baseLine), (255, 255, 255), cv.FILLED)

    #cv.putText(frame, label, (left, top), cv.FONT_HERSHEY_SIMPLEX, 0.75, (0,0,0), 1)

 

# Remove the bounding boxes with low confidence using non-maxima suppression

def postprocess(frame, outs):

    frameHeight = frame.shape[0]

    frameWidth = frame.shape[1]

 

    # Scan through all the bounding boxes output from the network and keep only the
    # ones with high confidence scores. Assign the box's class label as the class with the highest score.
    classIds = []
    confidences = []
    boxes = []

    for out in outs:

        for detection in out:

            scores = detection[5:]

            classId = np.argmax(scores)

            confidence = scores[classId]

            if confidence > confThreshold:

                center_x = int(detection[0] * frameWidth)

                center_y = int(detection[1] * frameHeight)

                width = int(detection[2] * frameWidth)

                height = int(detection[3] * frameHeight)

                left = int(center_x - width / 2)

                top = int(center_y - height / 2)

                classIds.append(classId)

                confidences.append(float(confidence))

                boxes.append([left, top, width, height])

 

    # Perform non maximum suppression to eliminate redundant overlapping boxes with

    # lower confidences.

    indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)

 

    for i in indices:

        i = i[0]

        box = boxes[i]

        left = box[0]

        top = box[1]

        width = box[2]

        height = box[3]

        drawPred(classIds[i], confidences[i], left, top, left + width, top + height)

 

# Process inputs

winName = 'Deep learning object detection in OpenCV'

cv.namedWindow(winName, cv.WINDOW_NORMAL)

 

outputFile = "yolo_out_py.jpg"

 

pic_number = 1

 

g = os.walk(r"./車輛圖片")  # walk the test-image directory ("車輛圖片" means vehicle images)

 

for path,dir_list,file_list in g:  

    for file_name in file_list:

 

        time.sleep(2)  # brief pause between images so each result stays visible

        path_name = os.path.join(path, file_name)

        print(path_name)

        print(file_name)

 

        frame = cv.imread(path_name)

        print(frame.shape)

 

        outputFile = str(pic_number) + '_yolo_out_py.jpg'

        pic_number += 1

 

        # Create a 4D blob from a frame.
        blob = cv.dnn.blobFromImage(frame, 1/255, (inpWidth, inpHeight), [0,0,0], 1, crop=False)

        # Sets the input to the network
        net.setInput(blob)

        # Runs the forward pass to get output of the output layers
        outs = net.forward(getOutputsNames(net))

        # Remove the bounding boxes with low confidence
        postprocess(frame, outs)

        # Put efficiency information. The function getPerfProfile returns the overall time for inference(t) and the timings for each of the layers(in layersTimes)
        t, _ = net.getPerfProfile()
        label = 'Inference time: %.2f ms' % (t * 1000.0 / cv.getTickFrequency())
        cv.putText(frame, label, (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

        # Write the frame with the detection boxes
        cv.imwrite(outputFile, frame.astype(np.uint8))
        cv.imshow(winName, frame)
        cv.waitKey(1)  # pump the GUI event loop so imshow actually renders

======================================= End =================================================

