Preface
VideoToolbox is Apple's API, available since iOS 8, for hardware-accelerated encoding and decoding of H.264 and, from iOS 11 onward, H.265.
If you are not yet familiar with H264, make sure you read an introduction to H264 first.
Encoding workflow
We will implement a simple demo that grabs video data from the camera, encodes it into raw H264 data, and saves it to the app sandbox.
1. Create and initialize VideoToolbox
The core code is as follows:
- (void)initVideoToolBox {
    dispatch_sync(encodeQueue, ^{
        frameNO = 0;
        int width = 480, height = 640;
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &encodingSession);
        NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
        if (status != 0) {
            NSLog(@"H264: Unable to create a H264 session");
            return;
        }

        // Real-time encoded output (avoids latency)
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

        // Key-frame (GOP size) interval
        int frameInterval = 24;
        CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
        CFRelease(frameIntervalRef);

        // Expected frame rate
        int fps = 24;
        CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
        CFRelease(fpsRef);

        // Average bit rate, in bits per second
        int bitRate = width * height * 3 * 4 * 8;
        CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
        CFRelease(bitRateRef);

        // Hard data-rate limit: a CFArray pair of [byte count, duration in seconds]
        int bitRateLimit = width * height * 3 * 4;
        CFArrayRef dataRateLimits = (__bridge CFArrayRef)@[@(bitRateLimit), @1];
        VTSessionSetProperty(encodingSession, kVTCompressionPropertyKey_DataRateLimits, dataRateLimits);

        // Start encoding
        VTCompressionSessionPrepareToEncodeFrames(encodingSession);
    });
}
The initialization configures the codec type kCMVideoCodecType_H264, the 640 × 480 resolution (the code uses width 480 and height 640 because the capture connection below is set to portrait orientation), the frame rate, the GOP size, and the bit rate.
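For reference, the code above (and the capture code below) relies on some instance state that the article does not show. Here is a hedged sketch of what those declarations might look like; the exact names and setup in the demo may differ:

#import <AVFoundation/AVFoundation.h>
#import <VideoToolbox/VideoToolbox.h>

// Assumed declarations, inferred from the snippets in this article.
@interface ViewController () <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    VTCompressionSessionRef encodingSession;  // created in initVideoToolBox
    int frameNO;                              // running frame counter, used as the PTS value
    dispatch_queue_t encodeQueue;             // serial queue on which frames are encoded
    dispatch_queue_t captureQueue;            // serial queue for the capture delegate
}
@property (nonatomic, strong) AVCaptureSession *captureSession;
@property (nonatomic, strong) AVCaptureDeviceInput *captureDeviceInput;
@property (nonatomic, strong) AVCaptureVideoDataOutput *captureDeviceOutput;
@property (nonatomic, strong) NSFileHandle *h264FileHandle;
@end

// e.g. in viewDidLoad:
//   encodeQueue  = dispatch_queue_create("encodeQueue",  DISPATCH_QUEUE_SERIAL);
//   captureQueue = dispatch_queue_create("captureQueue", DISPATCH_QUEUE_SERIAL);
//   [self initVideoToolBox];
//   [self initCapture];
//   [self.captureSession startRunning];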
2. Capture video from the camera and hand it to VideoToolbox to encode as H264
The core code to initialize the capture side is as follows:
// Initialize the camera capture side
- (void)initCapture {
    self.captureSession = [[AVCaptureSession alloc] init];
    // Capture at 640 x 480
    self.captureSession.sessionPreset = AVCaptureSessionPreset640x480;
    AVCaptureDevice *inputCamera = [self cameraWithPostion:AVCaptureDevicePositionBack];
    self.captureDeviceInput = [[AVCaptureDeviceInput alloc] initWithDevice:inputCamera error:nil];
    if ([self.captureSession canAddInput:self.captureDeviceInput]) {
        [self.captureSession addInput:self.captureDeviceInput];
    }

    self.captureDeviceOutput = [[AVCaptureVideoDataOutput alloc] init];
    [self.captureDeviceOutput setAlwaysDiscardsLateVideoFrames:NO];
    // Output 4:2:0 bi-planar YUV (NV12, full range)
    [self.captureDeviceOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
    [self.captureDeviceOutput setSampleBufferDelegate:self queue:captureQueue];
    if ([self.captureSession canAddOutput:self.captureDeviceOutput]) {
        [self.captureSession addOutput:self.captureDeviceOutput];
    }

    // Set up the connection and rotate the output to portrait
    AVCaptureConnection *connection = [self.captureDeviceOutput connectionWithMediaType:AVMediaTypeVideo];
    [connection setVideoOrientation:AVCaptureVideoOrientationPortrait];
}
Note that the capture resolution must match the encoder's 640 × 480. The AVCaptureVideoDataOutput pixel format is 4:2:0 bi-planar YUV (NV12), i.e. kCVPixelFormatType_420YpCbCr8BiPlanarFullRange.
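To make the bi-planar layout concrete, here is a purely illustrative sketch (not part of the demo) that inspects a captured CVPixelBuffer; with the portrait connection above the buffer is typically 480 wide by 640 high:

// Illustrative only: log the layout of an NV12 (4:2:0 bi-planar) pixel buffer.
static void LogPixelBufferLayout(CVPixelBufferRef pixelBuffer) {
    CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    size_t width  = CVPixelBufferGetWidth(pixelBuffer);
    size_t height = CVPixelBufferGetHeight(pixelBuffer);
    size_t planes = CVPixelBufferGetPlaneCount(pixelBuffer); // 2: plane 0 = Y, plane 1 = interleaved CbCr
    NSLog(@"pixel buffer %zux%zu, %zu planes", width, height, planes);
    for (size_t i = 0; i < planes; i++) {
        NSLog(@"plane %zu: %zux%zu, bytesPerRow %zu",
              i,
              CVPixelBufferGetWidthOfPlane(pixelBuffer, i),
              CVPixelBufferGetHeightOfPlane(pixelBuffer, i),   // the chroma plane is half the height
              CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, i));
    }
    CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
}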
The camera data callback and the encode call:
- (void)captureOutput:(AVCaptureOutput *)output didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    dispatch_sync(encodeQueue, ^{
        [self encode:sampleBuffer];
    });
}

// Encode a sampleBuffer
- (void)encode:(CMSampleBufferRef)sampleBuffer {
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // Frame timestamp; if it is not set, the resulting timeline is stretched far too long
    CMTime presentationTimeStamp = CMTimeMake(frameNO++, 1000);
    VTEncodeInfoFlags flags;
    OSStatus statusCode = VTCompressionSessionEncodeFrame(encodingSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, NULL, &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
        VTCompressionSessionInvalidate(encodingSession);
        CFRelease(encodingSession);
        encodingSession = NULL;
        return;
    }
    NSLog(@"H264: VTCompressionSessionEncodeFrame Success");
}
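The PTS above is just the frame index over a timescale of 1000, which is enough for a demo. If you want timestamps that track real capture time, one possible variation (an assumption, not what the demo does) is to reuse the capture sample buffer's own timing inside -encode::

// Sketch of an alternative: take the PTS/duration from the captured sample buffer.
CMTime presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CMTime duration = CMSampleBufferGetDuration(sampleBuffer); // may be kCMTimeInvalid; VideoToolbox accepts that
VTEncodeInfoFlags flags;
OSStatus statusCode = VTCompressionSessionEncodeFrame(encodingSession,
                                                      imageBuffer,
                                                      presentationTimeStamp,
                                                      duration,
                                                      NULL, NULL, &flags);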
3. Data structures used by the framework
CMSampleBufferRef
Holds one or more compressed or uncompressed media samples. There are two flavors: an uncompressed sample buffer wrapping a CVPixelBuffer, and a compressed one wrapping a CMBlockBuffer.
CMTime
A 64-bit value over a 32-bit timescale; Core Media's time representation.
CMBlockBuffer
Here, the raw (compressed) data.
CVPixelBuffer
Holds uncompressed pixel data plus the image width, height, and so on.
pixelBufferAttributes
A CFDictionary describing width/height, pixel format (RGBA, YUV), and intended use (OpenGL ES, Core Animation).
CVPixelBufferPool
A buffer pool for CVPixelBuffers, because creating and destroying CVPixelBuffers is expensive.
CMVideoFormatDescription
The video format, including width/height, color space, and codec information; for H264 it also contains the SPS and PPS.
A short sketch after this list shows how a few of these types fit together.
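Here is that sketch (illustrative only; sampleBuffer stands for an encoded CMSampleBufferRef handed to the compression callback):

// CMTime: a value over a timescale.
CMTime t = CMTimeMake(90, 30);                 // 90/30 = 3.0 seconds
Float64 seconds = CMTimeGetSeconds(t);         // 3.0

// An encoded CMSampleBufferRef bundles the pieces described above:
CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer); // carries the SPS/PPS for H264
CMBlockBufferRef dataBuffer   = CMSampleBufferGetDataBuffer(sampleBuffer);        // the compressed bytes
CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);                // when to present the frame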
4. Writing the encoded data out as H264
When a frame finishes encoding, we first check whether it is an I-frame; if it is, we also need to read out the SPS and PPS parameter sets. Why?
Let's first look at how the NALUs of a raw H264 (elementary) stream are laid out.
In a raw H.264 stream there are no standalone SPS or PPS packets or frames; they are attached in front of the I-frame, and the stream is usually stored as
00 00 00 01 SPS 00 00 00 01 PPS 00 00 00 01 I-frame
The leading 00 00 00 01 bytes are the start code; they are not part of the SPS or PPS content.
SPS (Sequence Parameter Set) and PPS (Picture Parameter Set): the H.264 SPS and PPS contain the parameters a decoder needs to initialize itself, including the encoding profile and level, the picture width and height, the deblocking filter settings, and so on.
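As a concrete illustration of that layout: the NAL unit type is the low five bits of the first byte after each start code (7 = SPS, 8 = PPS, 5 = IDR slice, 1 = non-IDR slice). A small, purely illustrative scanner for a buffer that uses 4-byte start codes, like the file this demo writes, could look like this:

// Illustrative only: scan an Annex B buffer and log each NAL unit type.
static void LogNALUnitTypes(const uint8_t *buf, size_t len) {
    for (size_t i = 0; i + 4 < len; i++) {
        // Look for the 4-byte start code 00 00 00 01
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 0 && buf[i + 3] == 1) {
            uint8_t nalType = buf[i + 4] & 0x1F; // low 5 bits of the NAL header byte
            NSLog(@"NAL unit type: %u", nalType);
            i += 3; // together with the loop's i++, continue scanning after the start code
        }
    }
}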
As noted above, the SPS and PPS are carried in the CMFormatDescriptionRef, so we first have to pull them out of the CMFormatDescriptionRef and write them into the raw H264 stream.
With that in mind, the flow for writing the H264 file is straightforward.
The code is as follows:
// Encoding completion callback
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    NSLog(@"didCompressH264 called with status %d infoFlags %d", (int)status, (int)infoFlags);
    if (status != 0) {
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
    ViewController *encoder = (__bridge ViewController *)outputCallbackRefCon;

    // Check whether the current frame is a key frame
    bool keyframe = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);

    // Extract the SPS & PPS data
    if (keyframe) {
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        if (statusCode == noErr) {
            // Got the SPS; now get the PPS
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            if (statusCode == noErr) {
                // Wrap the SPS and PPS as NSData
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder) {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }

    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    // Get the data pointer and the total length; each NALU's length is stored in its first four bytes
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        // The first four bytes of each returned NALU are not the 00 00 00 01 start code
        // but the NALU length in big-endian byte order.
        // Loop over the NALUs
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NALU length
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data];

            // Move on to the next NALU
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

// Write the SPS and PPS
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps {
    NSLog(@"gotSpsPps %d %d", (int)[sps length], (int)[pps length]);
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    // Write the start code, then the SPS
    [self.h264FileHandle writeData:ByteHeader];
    [self.h264FileHandle writeData:sps];
    // Write the start code, then the PPS
    [self.h264FileHandle writeData:ByteHeader];
    [self.h264FileHandle writeData:pps];
}

// Write a NALU
- (void)gotEncodedData:(NSData *)data {
    NSLog(@"gotEncodedData %d", (int)[data length]);
    if (self.h264FileHandle != NULL) {
        const char bytes[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
        // Write the start code
        [self.h264FileHandle writeData:ByteHeader];
        // Write the NALU payload
        [self.h264FileHandle writeData:data];
    }
}
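gotSpsPps: and gotEncodedData: assume that self.h264FileHandle has already been opened for writing. A minimal sketch of creating it in the sandbox (the file name test.h264 is an assumption; the demo may use a different path):

// Sketch: create an empty .h264 file in Documents and open a file handle for writing.
- (void)setupFileHandle {
    NSString *documents = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES).firstObject;
    NSString *path = [documents stringByAppendingPathComponent:@"test.h264"]; // assumed file name
    [[NSFileManager defaultManager] removeItemAtPath:path error:nil];         // start from an empty file
    [[NSFileManager defaultManager] createFileAtPath:path contents:nil attributes:nil];
    self.h264FileHandle = [NSFileHandle fileHandleForWritingAtPath:path];
}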
When encoding is finished, tear down the session:
- (void)EndVideoToolBox {
    VTCompressionSessionCompleteFrames(encodingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(encodingSession);
    CFRelease(encodingSession);
    encodingSession = NULL;
}
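How and when this gets called is up to the demo; one plausible teardown sequence (an assumption, not shown in the article) is:

// Sketch: a possible "stop recording" path.
- (void)stopRecording {
    [self.captureSession stopRunning];   // stop delivering camera frames
    dispatch_sync(encodeQueue, ^{
        [self EndVideoToolBox];          // flush pending frames and destroy the session
    });
    [self.h264FileHandle closeFile];     // finish the .h264 file in the sandbox
    self.h264FileHandle = nil;
}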
That completes H264 encoding with VideoToolbox; the encoded H264 file can then be pulled out of the sandbox.
Summary
Just reading through the workflow without working through the code will not teach you the framework, so try writing it yourself!
Demo download: iOS-VideoToolBox-demo