[Deep Learning] Detecting and Identifying Objects in Video

This article is a repost. 2018-08-16 17:18 · Deep Learning

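I recently received a requirement to analyze the objects in a video, for example to determine whether a clip contains people, cars, and so on.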

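The first things that came to mind were deep learning and machine learning, but I had only skimmed those topics before and never studied them in depth, so working out an algorithm myself in a short time was unrealistic. I searched GitHub for existing solutions instead and found quite a few. The ones I found were all TensorFlow-based, YOLO implementations for example.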

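I tested a few of them and found them far too resource-hungry. For example, analyzing a 12 s 1080p video on a machine with 2 CPU cores, 4 GB of RAM and a Tesla P40 GPU took 18 s. Spending 18 s on a clip this short, on a GPU of that class, is clearly not commercially viable: it is too expensive.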

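Later I found an algorithm for analyzing still images and adapted it to process video. On a 4-core, 8 GB virtual machine with no GPU, it analyzes the same video in only 6 s. Its recognition rate is lower than YOLO's, but for now it looks good enough. Adding frame sampling, for example analyzing only one frame in every 10, should make it several times faster still. The full source code is shared below, in the hope that it helps others working on the same problem.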
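
Before the full script, here is a minimal sketch of the frame-sampling idea mentioned above. It is not part of the original script: the helper names `sample_every` and `read_frames` are hypothetical, and the 25 fps figure in the demo is an assumption used only to get a frame count for a 12 s clip.

```python
from itertools import islice


def sample_every(frames, step=10):
    """Return an iterator over every `step`-th item of `frames`, starting with the first.

    `frames` can be any iterable, so the helper can be tried without a video.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return islice(frames, 0, None, step)


def read_frames(cap):
    """Yield frames from an opened cv2.VideoCapture-like object until it runs out.

    Only `cap.read()` is used, which returns a (success, frame) pair.
    """
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        yield frame


# Stand-in for a 12 s clip at an assumed 25 fps: 300 frame indices instead of images
fake_frames = range(300)
picked = list(sample_every(fake_frames, step=10))
print(len(picked), picked[:3])  # 30 [0, 10, 20]
```

With a real video the loop would look like `for frame in sample_every(read_frames(cap), step=10): ...`. Every frame is still decoded by `cap.read()`, so the saving comes from running the network on a tenth of the frames, not from skipping the decoding work.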

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import numpy as np
import cv2
# Path of the video to analyze
video_path = "./demo.mp4"
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))
prototxt = 'demo.prototxt.txt'
model = 'demo.caffemodel'
video = video_path
video_name = video_path.split('/')[-1]
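# Path of the annotated output video (the result/ directory must already exist)
result_video = 'result/%s' % video_name
# Confidence threshold: only detections scoring above 0.2 are drawn; adjust as needed
confidence_input = 0.2
net = cv2.dnn.readNetFromCaffe(prototxt, model)
# Open the input video
cap = cv2.VideoCapture(video)
# Read the frame rate and frame size from the input video
fps_video = cap.get(cv2.CAP_PROP_FPS)
frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
              int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
# Set the video codec
fourcc = cv2.VideoWriter_fourcc(*"DIVX")
# Create the writer; its frame size must match the frames written to it
videoWriter = cv2.VideoWriter(result_video, fourcc, fps_video, frame_size)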
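while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        # End of the video (or a read failure)
        break
    image = frame
    (h, w) = image.shape[:2]
    # Resize to the 300x300 network input, then mean-subtract (127.5) and scale (0.007843) the pixels
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    # Each detections[0, 0, i] row is read as
    # [image_id, class_id, confidence, x1, y1, x2, y2], with box coordinates relative to the frame size
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > confidence_input:
            idx = int(detections[0, 0, i, 1])
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            # Convert to plain Python ints before handing the points to OpenCV
            (startX, startY, endX, endY) = box.astype("int").tolist()
            label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
            color = COLORS[idx].tolist()  # plain list of numbers for the drawing calls
            cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)
            # Put the label above the box, or inside it when the box touches the top edge
            y = startY - 15 if startY - 15 > 15 else startY + 15
            cv2.putText(image, label, (startX, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    videoWriter.write(image)
cap.release()
videoWriter.release()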
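
The script above only draws boxes on the output video. To answer the original question (does the clip contain a person or a car?), the detections can also be collected per frame and summarized. The following is a rough sketch under stated assumptions, not part of the original script: `decode_detections` and `classes_present` are hypothetical helper names, the sample rows are made up, and the row layout is the one the script itself reads (`class_id` at index 1, confidence at index 2, normalized box at indices 3 to 6).

```python
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]


def decode_detections(rows, width, height, threshold=0.2, classes=CLASSES):
    """Turn raw detection rows into (label, confidence, box) tuples.

    Each row is expected to look like
    [image_id, class_id, confidence, x1, y1, x2, y2], with the box
    coordinates normalized to the 0..1 range.
    """
    results = []
    for row in rows:
        confidence = float(row[2])
        if confidence <= threshold:
            continue  # same filter as the script: keep only confident detections
        label = classes[int(row[1])]
        x1, y1, x2, y2 = (float(v) for v in row[3:7])
        # Scale the normalized box to pixel coordinates, truncating to ints
        box = (int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height))
        results.append((label, confidence, box))
    return results


def classes_present(per_frame_results, wanted=("person", "car")):
    """Report, for each wanted label, whether it was detected in any frame."""
    seen = {label for frame in per_frame_results for (label, _, _) in frame}
    return {label: label in seen for label in wanted}


# Made-up rows for two frames of a 1920x1080 video
frame_a = [[0, 15, 0.91, 0.25, 0.5, 0.5, 0.75],   # person, above the threshold
           [0, 7, 0.12, 0.0, 0.0, 0.25, 0.25]]    # car, below the threshold
frame_b = [[0, 12, 0.55, 0.25, 0.25, 0.75, 0.75]]  # dog
decoded = [decode_detections(rows, 1920, 1080) for rows in (frame_a, frame_b)]
print(classes_present(decoded))  # {'person': True, 'car': False}
```

In the script, the rows for one frame would be `detections[0, 0]`, with `w` and `h` as the width and height. Raising `threshold` trades missed objects for fewer false positives, which matters more here than in the drawing case because a single spurious detection flips the answer for the whole clip.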

The two trained model files are available on Baidu Cloud.

Link: https://pan.baidu.com/s/1Ozg3wgXMwlBeX4_joFVhKQ Password: 75hw

Help yourself if you need them.

