This article is an attempt at hardware H.265 (HEVC) encoding with Video Toolbox, using the iPhone rear camera as the video source. Last year I finished hardware H.264 decoding but never did encoding, which felt like a gap in my skills. I then noticed that the enum : CMVideoCodecType in CMFormatDescription.h provides a kCMVideoCodecType_HEVC value, so, against my better judgment, I tried HEVC hardware encoding on iOS 9.2.
Conclusion: HEVC (H.265) encoding is not available to developers; H.264 (AVC) works.
1. Reading the iPhone rear camera
Tip: the iPhone cannot open the front and rear cameras at the same time, because current SoCs usually have only one video channel. If two AVCaptureSessions are started one after another, the first stops automatically and the second keeps running. You might also think of adding the front and rear cameras to a single AVCaptureSession as two AVCaptureDeviceInputs, but that raises an exception. I have tried both approaches.
On iOS 8 and later, opening the camera requires the user's authorization.
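As a side note, a minimal sketch of requesting that permission could use +[AVCaptureDevice requestAccessForMediaType:completionHandler:] (available since iOS 7); how your app reacts to a denial is up to you:
// Request camera access before building the capture session.
[AVCaptureDevice requestAccessForMediaType:AVMediaTypeVideo
                         completionHandler:^(BOOL granted) {
    if (!granted) {
        // Without this permission the capture session delivers no frames.
        NSLog(@"Camera access was denied by the user.");
    }
}];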
1.1 Selecting a camera
I used an iPhone 6 Plus as the test device. It has two cameras, so the one to use must be selected explicitly; here the rear camera is the data source.
AVCaptureDevice *avCaptureDevice;
NSArray *cameras = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
for (AVCaptureDevice *device in cameras) {
    if (device.position == AVCaptureDevicePositionBack) {
        avCaptureDevice = device;
    }
}
If you simply want the rear camera, the code above can be simplified:
AVCaptureDevice *avCaptureDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
1.2 Opening the camera
Camera capture is managed entirely by the AVCaptureSession class, which keeps the programming model simple: the input is the camera, and the outputs are whatever channels you need, such as the screen.
NSError *error = nil;
AVCaptureDeviceInput *videoInput = [AVCaptureDeviceInput deviceInputWithDevice:avCaptureDevice error:&error];
if (!videoInput) {
    return;
}
AVCaptureSession *avCaptureSession = [[AVCaptureSession alloc] init];
avCaptureSession.sessionPreset = AVCaptureSessionPresetHigh; // AVCaptureSessionPresetHigh is the default, so this line may be omitted
[avCaptureSession addInput:videoInput];
With the input configured, we now configure the output, i.e. the camera's output data format. AVCaptureDevice.formats tells you which pixel formats the current device supports; on an iPhone 6 there are just two defaults, 420f and 420v. To get a specific output format such as 32BGRA, set kCVPixelBufferPixelFormatTypeKey in the AVCaptureVideoDataOutput's videoSettings. The values I have verified to work are listed below; a small sketch for checking what your own device reports follows the list.
- kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, i.e. 420v
- kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, i.e. 420f
- kCVPixelFormatType_32BGRA, for which iOS performs the YUV-to-BGRA conversion internally
YUV 4:2:0 is generally used for SD video and YUV 4:2:2 for HD video, so this restriction is surprising. On the other hand, under the same conditions YUV 4:2:0 takes less computation and less bandwidth than YUV 4:2:2.
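To verify what a particular device actually offers, here is a small diagnostic sketch (my own addition, not required for capture) that dumps AVCaptureDevice.formats; the AVCaptureVideoDataOutput created below also exposes an availableVideoCVPixelFormatTypes property listing the accepted videoSettings values:
// List the formats the selected camera supports (e.g. '420v', '420f').
for (AVCaptureDeviceFormat *format in avCaptureDevice.formats) {
    CMFormatDescriptionRef desc = format.formatDescription;
    FourCharCode subType = CMFormatDescriptionGetMediaSubType(desc);
    CMVideoDimensions dims = CMVideoFormatDescriptionGetDimensions(desc);
    NSLog(@"format %c%c%c%c %d x %d",
          (char)(subType >> 24), (char)(subType >> 16),
          (char)(subType >> 8), (char)subType,
          dims.width, dims.height);
}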
AVCaptureVideoDataOutput *avCaptureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
NSDictionary *settings = @{(__bridge id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)};
avCaptureVideoDataOutput.videoSettings = settings;
dispatch_queue_t queue = dispatch_queue_create("com.github.michael-lfx.back_camera_io", NULL);
[avCaptureVideoDataOutput setSampleBufferDelegate:self queue:queue];
[avCaptureSession addOutput:avCaptureVideoDataOutput];
Add a preview layer:
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:avCaptureSession];
previewLayer.frame = self.view.bounds;
previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
[self.view.layer addSublayer:previewLayer];
Start the session:
[avCaptureSession startRunning];
Launch the app and you should see the live camera image.
1.3 Getting camera data from the callback
By default the iPhone 6 Plus captures at 30 fps, which means the delegate method below is called 30 times per second. Let's start by simply printing some information about the camera's output data.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (CVPixelBufferIsPlanar(pixelBuffer)) {
        NSLog(@"kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange -> planar buffer");
    }
    CMVideoFormatDescriptionRef desc = NULL;
    CMVideoFormatDescriptionCreateForImageBuffer(NULL, pixelBuffer, &desc);
    CFDictionaryRef extensions = CMFormatDescriptionGetExtensions(desc);
    NSLog(@"extensions = %@", extensions);
    CFRelease(desc); // the description is created here, so release it to avoid leaking once per frame
}
The output looks like this:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange -> planar buffer
extensions = {
CVBytesPerRow = 2904;
CVImageBufferColorPrimaries = "ITU_R_709_2";
CVImageBufferTransferFunction = "ITU_R_709_2";
CVImageBufferYCbCrMatrix = "ITU_R_709_2";
Version = 2;
}
With my limited video background: ITU_R_709_2 is the HD video scheme, generally used with YUV 4:2:2, and its YUV-to-RGB conversion matrix differs from the SD one (usually ITU_R_601_4).
CVPixelBufferGetPixelFormatType() returns the camera's output pixel format, which matches the format specified above.
Running on an iPhone 6 with sessionPreset set to AVCaptureSessionPreset640x480 produces the following output:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange -> planar buffer
extensions = {
CVBytesPerRow = 964;
CVImageBufferColorPrimaries = "ITU_R_709_2";
CVImageBufferTransferFunction = "ITU_R_709_2";
CVImageBufferYCbCrMatrix = "ITU_R_601_4";
Version = 2;
}
Let's analyze CVBytesPerRow. The value 964 matches what CVPixelBufferGetBytesPerRow returns, and from the preset we know the Y plane is 640 pixels wide, which matches CVPixelBufferGetWidth and CVPixelBufferGetWidthOfPlane(0).
The CVPixelBufferGetBytesPerRow documentation says:
The number of bytes per row of the image data. For planar buffers, this function returns a rowBytes value such that bytesPerRow * height covers the entire image, including all planes.
So for a planar buffer CVPixelBufferGetBytesPerRow returns a combined rowBytes value covering all planes, here the Y and UV planes. Naively summing the plane widths gives Y + U + V = 640 + (640/2 + 640/2) = 1280, but that calculation is wrong. By the YUV 4:2:0 sampling rule each pixel takes 8 + 2 + 2 = 12 bits, so one row of the image actually occupies
640 x 12 / 8 = 960 bytes,
which still does not equal CVBytesPerRow. Next, let's compute the theoretical size of the whole frame.
CVPixelBufferGetHeight reports a height of 480, so the theoretical frame size is
640 x 480 + (640/2 x 480/2) + (640/2 x 480/2)
= 640 x 480 x 3/2
= 460800 bytes.
Yet CVPixelBufferGetDataSize returns 462728, which clearly does not match. FFmpeg, to speed up memory reads, adds padding to AVFrame.data so that AVFrame.linesize >= AVFrame.width. Does CVPixelBuffer do something similar?
size_t extraColumnsOnLeft;
size_t extraColumnsOnRight;
size_t extraRowsOnTop;
size_t extraRowsOnBottom;
CVPixelBufferGetExtendedPixels(pixelBuffer,
&extraColumnsOnLeft,
&extraColumnsOnRight,
&extraRowsOnTop,
&extraRowsOnBottom);
NSLog(@"extra (left, right, top, bottom) = (%ld, %ld, %ld, %ld)",
extraColumnsOnLeft,
extraColumnsOnRight,
extraRowsOnTop,
extraRowsOnBottom);
All four values printed above are 0, so there are no extended pixels. I leave this question open for now; the sketch below at least makes the per-plane layout visible.
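One plausible explanation is row padding for alignment, which the extended-pixel query does not report. A small diagnostic sketch (my own addition) that prints the per-plane geometry makes this easy to check:
// Dump the actual layout of each plane to see whether rows are padded.
size_t planeCount = CVPixelBufferGetPlaneCount(pixelBuffer);
for (size_t i = 0; i < planeCount; i++) {
    NSLog(@"plane %zu: width = %zu, height = %zu, bytesPerRow = %zu",
          i,
          CVPixelBufferGetWidthOfPlane(pixelBuffer, i),
          CVPixelBufferGetHeightOfPlane(pixelBuffer, i),
          CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, i));
}
NSLog(@"total: bytesPerRow = %zu, dataSize = %zu",
      CVPixelBufferGetBytesPerRow(pixelBuffer),
      CVPixelBufferGetDataSize(pixelBuffer));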
2. Trying HEVC and AVC encoding with VideoToolbox
The H.264 (AVC) profiles and levels supported by the iOS hardware encoder are described in VTCompressionProperties.h and can be summarized as follows:
- Baseline: 1.3, 3.0-3.2, 4.0-4.2, 5.0-5.2, plus an AutoLevel variant
- Main: 3.0-3.2, 4.0-4.2, 5.0-5.2, plus an AutoLevel variant
- Extended: 5.0, plus an AutoLevel variant
- High: 3.0-3.2, 4.0-4.2, 5.0-5.2, plus an AutoLevel variant
The VideoToolbox encoding workflow is:
- Create a compression session
- Prepare to encode
- Encode frame by frame
- Finish encoding
2.1 Creating the compression session
// Use the dimensions of the camera's output image for the encoder.
size_t width = CVPixelBufferGetWidth(pixelBuffer);
size_t height = CVPixelBufferGetHeight(pixelBuffer);
static VTCompressionSessionRef compressionSession;
OSStatus status = VTCompressionSessionCreate(NULL,
                                             (int32_t)width, (int32_t)height,
                                             kCMVideoCodecType_H264,
                                             NULL,  // encoderSpecification
                                             NULL,  // sourceImageBufferAttributes
                                             NULL,  // compressedDataAllocator
                                             &compressionOutputCallback,
                                             NULL,  // outputCallbackRefCon
                                             &compressionSession);
Changing kCMVideoCodecType_H264 to kCMVideoCodecType_HEVC and running on iOS 9.2.1 returns error -12908, kVTCouldNotFindVideoEncoderErr, on both an iPhone 6 Plus and an iPhone 6s Plus: no encoder could be found. It seems iOS 9.2 does not expose an HEVC encoder to developers.
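If you want to check programmatically which encoders a given iOS release ships, VTCopyVideoEncoderList (available since iOS 8) returns the installed encoders; the sketch below (my own addition) simply dumps the list, which should make the absence of an HEVC entry visible:
// Enumerate the video encoders VideoToolbox knows about (iOS 8+).
CFArrayRef encoderList = NULL;
OSStatus listStatus = VTCopyVideoEncoderList(NULL, &encoderList);
if (listStatus == noErr && encoderList != NULL) {
    NSLog(@"installed encoders = %@", (__bridge NSArray *)encoderList);
    CFRelease(encoderList);
}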
The encode output callback is defined as follows:
static void compressionOutputCallback(void * CM_NULLABLE outputCallbackRefCon, void * CM_NULLABLE sourceFrameRefCon,
                                      OSStatus status,
                                      VTEncodeInfoFlags infoFlags,
                                      CM_NULLABLE CMSampleBufferRef sampleBuffer ) {
    if (status != noErr) {
        NSLog(@"%s with status(%d)", __FUNCTION__, (int)status);
        return;
    }
    if (infoFlags == kVTEncodeInfo_FrameDropped) {
        NSLog(@"%s with frame dropped.", __FUNCTION__);
        return;
    }
    /* ------ debugging aids ------ */
    CMFormatDescriptionRef fmtDesc = CMSampleBufferGetFormatDescription(sampleBuffer);
    CFDictionaryRef extensions = CMFormatDescriptionGetExtensions(fmtDesc);
    NSLog(@"extensions = %@", extensions);
    CMItemCount count = CMSampleBufferGetNumSamples(sampleBuffer);
    NSLog(@"samples count = %ld", count);
    /* ====== debugging aids ====== */
    // Push to the network or write to a file here.
}
On a successful encode it prints information like this:
extensions = {
FormatName = "H.264";
SampleDescriptionExtensionAtoms = {
avcC = <014d0028 ffe1000b 274d0028 ab603c01 13f2a001 000428ee 3c30>;
};
}
samples count = 1
A sample count of 1 does not mean the slice count is 1; so far I have not found a parameter configuration that produces a multi-slice stream (multiple I and P slices). A detailed dump of the sampleBuffer looks like this:
CMSampleBuffer 0x126e9fd80 retainCount: 1 allocator: 0x1a227cb68
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
formatDescription = <CMVideoFormatDescription 0x126e9fd50 [0x1a227cb68]> {
mediaType:'vide'
mediaSubType:'avc1'
mediaSpecific: {
codecType: 'avc1' dimensions: 1920 x 1080
}
extensions: {<CFBasicHash 0x126e9eae0 [0x1a227cb68]>{type = immutable dict, count = 2, entries =>
0 : <CFString 0x19dd523e0 [0x1a227cb68]>{contents = "SampleDescriptionExtensionAtoms"} = <CFBasicHash 0x126e9e090 [0x1a227cb68]>{type = immutable dict, count = 1, entries =>
2 : <CFString 0x19dd57c20 [0x1a227cb68]>{contents = "avcC"} = <CFData 0x126e9e1b0 [0x1a227cb68]>{length = 26, capacity = 26, bytes = 0x014d0028ffe1000b274d0028ab603c01 ... a001000428ee3c30} }
2 : <CFString 0x19dd52440 [0x1a227cb68]>{contents = "FormatName"} = H.264} } }
sbufToTrackReadiness = 0x0
numSamples = 1
sampleTimingArray[1] = {
{PTS = {196709596065916/1000000000 = 196709.596}, DTS = {INVALID}, duration = {INVALID}},
}
sampleSizeArray[1] = {
sampleSize = 5707,
}
sampleAttachmentsArray[1] = {
sample 0: DependsOnOthers = false
}
dataBuffer = 0x126e9fc50
To make debugging easier, you can write the H.264 stream to a file and analyze it with tools such as VLC; that is the topic of the second article in this series: iOS VideoToolbox硬編H.265(HEVC)H.264(AVC):2 H264數據寫入文件.
Next, the role of avcC. The avcC data is placed into a CFDictionaryRef and passed to CMVideoFormatDescriptionCreate to create the video format description; with that you can create a decompression session and start decoding.
This also shows that the VideoToolbox encoder outputs H.264 in avcC form, and that VideoToolbox accepts only avcC-formatted H.264. If you receive Annex-B H.264 from the network (often called a raw H.264 stream or elementary stream), it is more convenient to build the format description with CMVideoFormatDescriptionCreateFromH264ParameterSets, and the Annex-B data must be converted to avcC before decoding. This is also why WWDC 2014 session 513, "Direct Access to Video Encoding and Decoding", says VideoToolbox only supports H.264 carried in an MP4 container: as far as I know, when data is written into MP4 the Annex-B start codes are replaced by length fields. This is the most error-prone spot in VideoToolbox hardware decoding; my decoding work last year took so long precisely because I did not yet understand these H.264 details.
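For reference, a sketch of pulling the SPS and PPS out of the encoder's format description with CMVideoFormatDescriptionGetH264ParameterSetAtIndex and prefixing each with an Annex-B start code might look like the following; this helper is my own illustration, not code from the project, and the next article covers the full file-writing path:
// Extract SPS/PPS from the format description and emit them as
// Annex-B NAL units (start code + parameter set).
static NSData *annexBParameterSets(CMFormatDescriptionRef fmtDesc) {
    static const uint8_t startCode[4] = {0x00, 0x00, 0x00, 0x01};
    NSMutableData *output = [NSMutableData data];
    size_t parameterSetCount = 0;
    CMVideoFormatDescriptionGetH264ParameterSetAtIndex(fmtDesc, 0, NULL, NULL,
                                                       &parameterSetCount, NULL);
    for (size_t i = 0; i < parameterSetCount; i++) {
        const uint8_t *parameterSet = NULL;
        size_t parameterSetSize = 0;
        OSStatus status = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(
            fmtDesc, i, &parameterSet, &parameterSetSize, NULL, NULL);
        if (status != noErr) return nil;
        [output appendBytes:startCode length:sizeof(startCode)];
        [output appendBytes:parameterSet length:parameterSetSize];
    }
    return output;
}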
2.2 Preparing to encode
Before encoding starts, you can configure the H.264 profile, level, keyframe interval and other settings; they ultimately show up in the SPS and PPS, which tell the decoder how to decode.
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Main_AutoLevel);
// ... plus a series of other properties
OSStatus status = VTCompressionSessionPrepareToEncodeFrames(compressionSession);
if (status != noErr) {
    // FAILED.
}
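As an illustration of what that series of properties might look like for a live-capture scenario, here is a hedged sketch; the keys are from VTCompressionProperties.h, but the values (a keyframe every 60 frames, 30 fps, 2 Mbps) are assumptions of mine, not recommendations:
// Typical live-capture settings: real-time mode, no B-frames,
// a keyframe every 60 frames, an expected 30 fps and a target bitrate.
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, (__bridge CFTypeRef)@(60));
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_ExpectedFrameRate, (__bridge CFTypeRef)@(30));
VTSessionSetProperty(compressionSession, kVTCompressionPropertyKey_AverageBitRate, (__bridge CFTypeRef)@(2 * 1024 * 1024));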
The second article in this series, iOS VideoToolbox硬編H.265(HEVC)H.264(AVC):2 H264數據寫入文件, explains SPS and PPS in more detail.
2.3 Encoding frame by frame
Before encoding a frame, the pixel buffer's base address is usually locked, and unlocked once the frame has been submitted. The presentation timestamp and duration also need to be supplied.
if (CVPixelBufferLockBaseAddress(pixelBuffer, 0) != kCVReturnSuccess) {
    // FAILED.
}
CMTime presentationTimeStamp = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
CMTime duration = CMSampleBufferGetOutputDuration(sampleBuffer);
status = VTCompressionSessionEncodeFrame(compressionSession, pixelBuffer, presentationTimeStamp, duration, NULL, pixelBuffer, NULL);
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
Unlike decoding, where VTDecodeFrameFlags can request a synchronous operation, the encoder's callback is asynchronous. Asynchrony improves throughput, but it brings extra work such as putting frames back in order, and makes things like keeping audio encoding in sync more complicated.
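One practical consequence: inside the callback, each sample's own timestamps and keyframe flag have to be read from the sample buffer before muxing or streaming, since the encoder may emit frames in decode order. A sketch of that (assuming it runs inside compressionOutputCallback above) could be:
// Read timing and keyframe information from the encoded sample.
CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CMTime dts = CMSampleBufferGetDecodeTimeStamp(sampleBuffer); // may be invalid when no reordering occurs
BOOL isKeyFrame = NO;
CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
if (attachments != NULL && CFArrayGetCount(attachments) > 0) {
    CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
    // kCMSampleAttachmentKey_NotSync is absent (or false) for keyframes.
    isKeyFrame = !CFDictionaryContainsKey(attachment, kCMSampleAttachmentKey_NotSync);
}
NSLog(@"keyframe = %d, pts = %.3f, dts = %.3f",
      isKeyFrame, CMTimeGetSeconds(pts), CMTimeGetSeconds(dts));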
2.4 Finishing encoding
When encoding ends, call VTCompressionSessionCompleteFrames to stop encoding and tell the encoder what to do with frames that are already encoded or still pending. Then call VTCompressionSessionInvalidate to tear down the session; skipping this can leave the hardware in a bad state that sometimes requires restarting the phone. Finally, release the VTCompressionSession.
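Put together, a minimal teardown sketch (assuming the static compressionSession from section 2.1) might be:
// Flush all pending frames, then tear the session down.
VTCompressionSessionCompleteFrames(compressionSession, kCMTimeInvalid); // kCMTimeInvalid = complete everything
VTCompressionSessionInvalidate(compressionSession);
CFRelease(compressionSession);
compressionSession = NULL;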
3. Discussion
WWDC 2014 session 513, "Direct Access to Video Encoding and Decoding", mentions that when real-time requirements are loose, multi-pass encoding can give better results. I have not tried it.