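Shot segmentation is commonly used in scenarios such as intelligent video editing and keyframe extraction.
This article outlines one approach to the shot segmentation problem, in two steps: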
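1. Mark the split points in the video with a shot segmentation algorithm
The core is the shot segmentation algorithm. One idea, briefly: ratio = difference(current_frame_histogram, previous_frame_histogram) / average_difference(previous_frame_histograms). A suitable threshold for ratio is found through extensive experiments; if ratio exceeds the threshold, the video is split at the current frame. For copyright reasons, this article omits the concrete algorithm and its implementation. The code that uses cv2's calcHist to compute a frame's histogram for each of the three RGB channels is as follows: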
for id in range(3):
    # one 256-bin histogram per channel (cv2 stores frames in BGR order)
    self.current_hist_rgb[id] = cv2.calcHist([self.frame], [id], None, [256], [0, 256])
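The ratio test described above can be sketched as follows. This is not the omitted algorithm, only a minimal illustration of the idea under stated assumptions: histograms are flat sequences of bin counts, the difference is a sum of absolute per-bin differences, the average is taken over the differences seen so far in the current shot, and the names `hist_difference` and `RatioShotDetector` as well as the threshold of 3.0 are hypothetical choices.

```python
def hist_difference(hist_a, hist_b):
    """Sum of absolute per-bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


class RatioShotDetector:
    """Flags a cut when the current frame-to-frame difference is large
    relative to the average difference inside the current shot."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # assumed value; tune on real footage
        self.last_hist = None
        self.diffs = []             # differences seen in the current shot

    def update(self, hist):
        """Feed one frame histogram; return True if a new shot starts here."""
        if self.last_hist is None:
            self.last_hist = hist
            return False
        diff = hist_difference(hist, self.last_hist)
        self.last_hist = hist
        if self.diffs:
            average = sum(self.diffs) / len(self.diffs)
            # A tiny floor avoids dividing by zero after identical frames.
            ratio = diff / max(average, 1e-9)
            if ratio > self.threshold:
                self.diffs = []     # a new shot begins; reset the statistics
                return True
        self.diffs.append(diff)
        return False
```

With histograms produced by cv2.calcHist, each one would need to be flattened first (for example with `hist.flatten()`) before being passed to `update`.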
2. Perform the actual split at the marked points
This article uses ffmpeg to split the video (ffmpeg must be installed). The command is:
ffmpeg -ss starttime -i input.mp4 -t duration -codec copy output.mp4 -y
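The order of the options cannot be changed freely. -ss must be the first option (before -i), otherwise the split video may show a black screen; -t must come after -i, otherwise the split video may have the wrong duration. In practice, because the streams are copied rather than re-encoded, the split point does not fall exactly at the time given to -ss but at the nearest keyframe before it.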
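As a minimal sketch of the ordering rule above, the command can be assembled as an argument list so that `-ss` always precedes `-i` and `-t` always follows it. The helper names `build_cut_command` and `cut` are hypothetical, and passing a list to `subprocess.run` is a substitution for a shell command string, chosen so that paths containing spaces need no quoting.

```python
import subprocess


def build_cut_command(start, input_path, duration, output_path):
    """Return the ffmpeg argument list for a stream-copy cut."""
    return [
        "ffmpeg",
        "-ss", str(start),     # seek option placed before -i
        "-i", input_path,
        "-t", str(duration),   # length option placed after -i
        "-codec", "copy",      # copy the streams without re-encoding
        output_path,
        "-y",                  # overwrite the output file if it exists
    ]


def cut(start, input_path, duration, output_path):
    """Run ffmpeg and return its exit code (requires ffmpeg on PATH)."""
    cmd = build_cut_command(start, input_path, duration, output_path)
    return subprocess.run(cmd).returncode
```

Calling `build_cut_command(12.5, "input.mp4", 3, "output.mp4")` yields the same option order as the command shown above.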
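Finally, this article uses ffmpeg-python (installed with pip) to compute the video's pts; see the calcPTS method of VideoCutEngine, which returns the average frame rate (frame count divided by duration).
Implementation: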
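The frame-rate calculation in calcPTS can be sketched on a plain dictionary shaped like one entry of `probe['streams']`. The fallback to `avg_frame_rate` is an assumption added here for streams that carry no usable `nb_frames` value; the helper name `stream_fps` and the sample values are made up for illustration.

```python
from fractions import Fraction


def stream_fps(stream):
    """Average frames per second of one ffprobe video stream entry."""
    try:
        # Same formula as calcPTS: frame count divided by duration.
        return int(stream["nb_frames"]) / float(stream["duration"])
    except (KeyError, ValueError, ZeroDivisionError):
        # Assumed fallback: ffprobe reports avg_frame_rate as "num/den".
        return float(Fraction(stream["avg_frame_rate"]))


sample = {"nb_frames": "240", "duration": "10.000000"}
print(stream_fps(sample))                             # 24.0
print(stream_fps({"avg_frame_rate": "30000/1001"}))   # about 29.97
```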
import cv2
import ffmpeg
import numpy as np
import sys
import os
class VideoCutEngine():
    def __init__(self, input):
        self.input = input

    def calcPTS(self):
        # Returns (ok, frames per second) read from the container metadata.
        try:
            probe = ffmpeg.probe(self.input)
        except ffmpeg.Error as e:
            print(e.stderr, file=sys.stderr)
            return False, 0
        video_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'video'), None)
        if video_stream is None:
            return False, 1
        num_frames = int(video_stream['nb_frames'])
        duration = float(video_stream['duration'])
        return True, num_frames * 1.0 / duration

    def doCut(self, start, duration, output):
        # -ss goes before -i and -t after it; streams are copied, not re-encoded.
        cmd = 'ffmpeg -ss {} -i {} -t {} -codec copy {} -y'.format(start, self.input, duration, output)
        ret = os.system(cmd)
        return ret

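class SceneSplitEngine():
    def __init__(self):
        self.frame = None
        self.current_hist_rgb = [0, 0, 0]
        self.last_hist_rgb = [0, 0, 0]
        self.frame_count = 0
        self.current_shot_count = 0
        self.hist_diff = []

    def setFrmae(self, frame):
        self.frame = frame
        self.frame_count += 1
        self.current_shot_count += 1

    def doSplit(self):
        for id in range(3):
            # one 256-bin histogram per channel (cv2 stores frames in BGR order)
            self.current_hist_rgb[id] = cv2.calcHist([self.frame], [id], None, [256], [0, 256])
        # The concrete algorithm is omitted. The caller expects
        # (is_cut, start_frame, end_frame); the return below is only a
        # placeholder so that the script runs, and it never reports a cut.
        return False, 0, 0
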
input = '/data/test.mp4'
if __name__ == '__main__':
    sceneSpliter = SceneSplitEngine()
    videoCutter = VideoCutEngine(input)
    videoCapturer = cv2.VideoCapture(input)
    ok, fps = videoCutter.calcPTS()
    if not ok:
        fps = 24  # assume 24 fps when the metadata cannot be read
    while True:
        ret1, frame = videoCapturer.read()
        if ret1 == True:
            sceneSpliter.setFrmae(frame)
            ret2, start, end = sceneSpliter.doSplit()
            if ret2 == True:
                # convert frame indices to seconds; keep at least 1 second
                duration = max((end - start) / fps, 1)
                print(ret2, start / fps, duration)
                output = '/data/output{}.mp4'.format(start / fps)
                videoCutter.doCut(start / fps, duration, output)
        else:
            break
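
The main loop converts frame indices to seconds by dividing by a frame rate. A minimal sketch of that conversion for a whole list of cut positions follows; `cuts_to_segments` is a hypothetical helper, and the one-second minimum duration mirrors the `max(..., 1)` used in the loop above.

```python
def cuts_to_segments(cut_frames, total_frames, fps, min_duration=1.0):
    """Turn sorted cut frame indices into (start_seconds, duration_seconds)."""
    bounds = [0] + list(cut_frames) + [total_frames]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        if end <= start:
            continue  # skip empty or duplicate boundaries
        duration = max((end - start) / fps, min_duration)
        segments.append((start / fps, duration))
    return segments
```

Each resulting pair could then be passed as the start and duration arguments of doCut.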
