MXNet decord: reading and loading video


Decord Video Reader Example

import decord as de
from matplotlib import pyplot as plt
# using cpu in this example
ctx = de.cpu(0)
# example video
video = 'Javelin_standing_throw_drill.mkv'

vr = de.VideoReader(video, ctx=ctx)  # using default resolution
print('Video frames #:', len(vr))       # number of frames in the video
print('First frame shape:', vr[0].shape)      # shape of each frame: (height, width, channels)
Video frames #: 48
First frame shape: (240, 320, 3)

Controlling the frame size:

vr = de.VideoReader(video, ctx=ctx, width=120, height=240)
print('Frame shape:', vr[0].shape)
Frame shape: (240, 120, 3)
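
Note that width=120 with height=240 does not keep the 320x240 proportions of the source video, so the decoded frames are stretched. A minimal sketch of how a matching height could be computed for a chosen width; scaled_size is a hypothetical helper written for this illustration and is not part of decord:

```python
def scaled_size(orig_height, orig_width, target_width):
    """Return (height, width) for target_width, keeping the original aspect ratio.

    Hypothetical helper for illustration only; not a decord function.
    """
    scale = target_width / orig_width
    target_height = max(1, round(orig_height * scale))
    return target_height, target_width

# The first frame above had shape (240, 320, 3): height 240, width 320.
height, width = scaled_size(240, 320, 120)
print(height, width)
```

The resulting pair could then be passed as height= and width= to de.VideoReader.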

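Random access is inherently slow, but decord applies internal optimizations so that not too much effort is wasted on it.
The returned frames are DLPack-compatible NDArrays (usable in TVM, for example) and can be converted to numpy arrays.
decord also has a bridge system that automatically converts all outputs into arrays compatible with deep learning frameworks such as MXNet, PyTorch, and TensorFlow. Plain numpy arrays are always available as well.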

frame10 = vr[10].asnumpy()
plt.imshow(frame10)
plt.show()
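
Once a frame is a numpy array, it can be prepared for a framework by hand. The sketch below uses a zero-filled array as a stand-in for vr[10].asnumpy(), assuming uint8 pixels in (height, width, channels) layout as the shapes printed above suggest, and converts it to the channels-first float layout that, for example, PyTorch convolution layers expect:

```python
import numpy as np

# Stand-in for vr[10].asnumpy(): an all-black uint8 frame
# in (height, width, channels) layout. This is an assumption for
# illustration, so that the snippet runs without a video file.
frame = np.zeros((240, 120, 3), dtype=np.uint8)

# Move channels to the front and scale pixel values to [0, 1].
chw = np.transpose(frame, (2, 0, 1)).astype(np.float32) / 255.0
print(chw.shape, chw.dtype)
```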

Fetching many frames at once is just as easy:

frames = vr.get_batch(range(0, len(vr) - 1, 5))
print(frames.shape)
(10, 240, 120, 3)
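
The leading 10 in that shape is simply the number of indices produced by the range. A small check in plain Python, using the frame count of 48 printed earlier:

```python
num_frames = 48  # len(vr) reported above

# Same range as passed to get_batch: every 5th frame, stopping before the last index.
indices = list(range(0, num_frames - 1, 5))
print(indices)
print(len(indices))
```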

Decord Video Loader Example

import sys, os
import decord as de


# using cpu in this example
ctx = de.cpu(0)
# using batchsize = 2 and smaller resolution in this example
shape = (2, 480, 640, 3)
# using kinetics example videos
videos = ['Javelin_standing_throw_drill.mkv', 'flipping_a_pancake.mkv']
# using in-batch frame interval 5
interval = 5        # spacing between consecutive frames within one batch
# using inter-batch skip 3, i.e. 3 frames are skipped between consecutive batches
skip = 3      # spacing between consecutive batches


# first see how sequential read looks like
vl = de.VideoLoader(videos, ctx=ctx, shape=shape, interval=interval, skip=skip, shuffle=0)
print('num batches:', len(vl))
num batches: 9

Visualization:

def disp_batches(video_loader, max_disp=5):
    # in a Jupyter notebook, run `%matplotlib inline` once in its own cell beforehand
    from matplotlib import pyplot as plt
    import matplotlib.gridspec as gridspec
    cnt = 0
    video_loader.reset()  # restart iteration from the first batch
    for batch in video_loader:
        if cnt >= max_disp:
            break
        # batch[0] holds the frames, batch[1] the (file index, frame index) pairs
        print('batch data shape:', batch[0].shape)
        print('indices:', ', '.join(['(file: {} frame: {})'.format(x, y) for x, y in batch[1].asnumpy()]))
        print('----------')
        data = batch[0].asnumpy()
        columns = 4
        # ceiling division so that every frame in the batch gets a grid cell
        rows = max(1, (data.shape[0] + columns - 1) // columns)
        fig = plt.figure(figsize = (32,(16 // columns) * rows))
        gs = gridspec.GridSpec(rows, columns)
        for i in range(data.shape[0]):
            plt.subplot(gs[i])
            plt.axis("off")
            plt.imshow(data[i])
        cnt += 1
disp_batches(vl, 5)
batch data shape: (2, 480, 640, 3)
indices: (file: 0 frame: 0), (file: 0 frame: 7)      # 6 frames lie between frames 0 and 7; 3 frames are then skipped, so the next batch starts at frame 11
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 0 frame: 11), (file: 0 frame: 18)    # same pattern for frames 11 and 18; 3 frames are skipped, so the next batch starts at frame 22, and so on
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 0 frame: 22), (file: 0 frame: 29)
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 0), (file: 1 frame: 7)
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 11), (file: 1 frame: 18)
----------
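
The index pattern in this output can be reproduced with a few lines of plain Python. This is only a sketch of what the printed indices show, not decord's implementation: batch_frame_indices is a hypothetical helper, and frame_step=7 and gap=3 are values read off the output above rather than decord parameters.

```python
def batch_frame_indices(num_batches, batch_size=2, frame_step=7, gap=3):
    """Yield tuples of frame indices following the pattern in the printed output.

    Hypothetical helper: frame_step and gap are taken from the output,
    they are not decord arguments.
    """
    start = 0
    for _ in range(num_batches):
        frames = tuple(start + i * frame_step for i in range(batch_size))
        yield frames
        # skip `gap` frames after the last frame of this batch
        start = frames[-1] + gap + 1

first_three = list(batch_frame_indices(3))
print(first_three)
```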

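As the output shows, batches are taken from the first video first, then from the second, and so on. To shuffle them, create the loader as follows: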

vl = de.VideoLoader(videos, ctx=ctx, shape=shape, interval=interval, skip=skip, shuffle=2)
print('num batches:', len(vl))
disp_batches(vl, 5)
num batches: 8
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 33), (file: 1 frame: 40)      # taken from file 1
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 22), (file: 1 frame: 29)
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 0 frame: 11), (file: 0 frame: 18)       # taken from file 0
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 44), (file: 1 frame: 51)
----------
batch data shape: (2, 480, 640, 3)
indices: (file: 1 frame: 11), (file: 1 frame: 18)
----------

The batches are no longer taken in video order.
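
One way to picture the effect is as shuffling a list of batch positions while leaving the frames inside each batch untouched. The sketch below does exactly that with Python's random module on positions copied from the sequential output; it is an illustration under that assumption, not a description of how decord shuffles internally.

```python
import random

# (file, first frame, last frame) triples copied from the sequential output above.
batches = [(0, 0, 7), (0, 11, 18), (0, 22, 29), (1, 0, 7), (1, 11, 18)]

rng = random.Random(0)   # fixed seed so the order is reproducible
shuffled = batches[:]    # copy, leaving the sequential list untouched
rng.shuffle(shuffled)

for file_idx, first, last in shuffled:
    print('file: {} frames: {}, {}'.format(file_idx, first, last))
```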

 

