[Date: 2016-07] [Status: Open]
[Keywords: ffmpeg, libavcodec, libavformat]
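I have been working with FFmpeg for a few years. The two libraries I use most are libavcodec and libavformat; occasionally I also use libswresample (mainly for converting audio PCM, for example between different channel counts, sample rates, and sample bit depths) and libswscale (for processing raw video data, for example scaling and converting between chroma formats and bit depths).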
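Main mechanisms of libavformat
libavformat mainly parses multimedia files and streams (which FFmpeg refers to internally as URLs). This covers reading the data, detecting the format, and reading packets. It also generates (muxes) multimedia files. It is built around the following main structures: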
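- AVFormatContext
The core structure. For each URL it holds a demuxer or muxer, an array of AVStream, and an AVIOContext.
- AVStream
Describes one stream contained in the media file, such as an audio, video, or data stream, and its type.
- AVInputFormat (demuxer)
The demuxer description. libavformat contains many instances of this structure, one per supported input format. Its interface is as follows: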
int(* read_probe )(AVProbeData *)
Tell if a given file has a chance of being parsed as this format.
int(* read_header )(struct AVFormatContext *)
Read the format header and initialize the AVFormatContext structure.
int(* read_packet )(struct AVFormatContext *, AVPacket *pkt)
Read one packet and put it in 'pkt'.
int(* read_close )(struct AVFormatContext *)
Close the stream.
int(* read_seek )(struct AVFormatContext *, int stream_index, int64_t timestamp, int flags)
Seek to a given timestamp relative to the frames in stream component stream_index.
int64_t(* read_timestamp )(struct AVFormatContext *s, int stream_index, int64_t *pos, int64_t pos_limit)
Get the next timestamp in stream[stream_index].time_base units.
- AVOutputFormat (muxer)
The muxer description. Its interface is as follows:
int(* write_header )(struct AVFormatContext *)
int(* write_packet )(struct AVFormatContext *, AVPacket *pkt)
int(* write_trailer )(struct AVFormatContext *)
int(* interleave_packet )(struct AVFormatContext *, AVPacket *out, AVPacket *in, int flush)
Currently only used to set pixel format if not YUV420P.
int(* query_codec )(enum AVCodecID id, int std_compliance)
Test if the given codec can be stored in this container.
void(* get_output_timestamp )(struct AVFormatContext *s, int stream, int64_t *dts, int64_t *wall)
int(* control_message )(struct AVFormatContext *s, int type, void *data, size_t data_size)
Allows sending messages from application to device.
int(* write_uncoded_frame )(struct AVFormatContext *, int stream_index, AVFrame **frame, unsigned flags)
Write an uncoded AVFrame.
int(* get_device_list )(struct AVFormatContext *s, struct AVDeviceInfoList *device_list)
Returns device list with its properties.
int(* create_device_capabilities )(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps)
Initialize device capabilities submodule.
int(* free_device_capabilities )(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps)
Free device capabilities submodule.
int(* init )(struct AVFormatContext *)
Initialize format.
void(* deinit )(struct AVFormatContext *)
Deinitialize format.
int(* check_bitstream )(struct AVFormatContext *, const AVPacket *pkt)
Set up any necessary bitstream filtering and extract any extra data needed for the global header.
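- AVIOContext, URLProtocol (protocol handling)
This is the input/output layer. Its behavior depends on the specific protocol, such as http, tcp, udp, rtp, or rtsp.
The URLProtocol interface is as follows: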
typedef struct URLProtocol {
    const char *name;
    int (*url_open)( URLContext *h, const char *url, int flags);
    /**
     * This callback is to be used by protocols which open further nested
     * protocols. options are then to be passed to ffurl_open()/ffurl_connect()
     * for those nested protocols.
     */
    int (*url_open2)(URLContext *h, const char *url, int flags, AVDictionary **options);
    int (*url_accept)(URLContext *s, URLContext **c);
    int (*url_handshake)(URLContext *c);
    /**
     * Read data from the protocol.
     * If data is immediately available (even less than size), EOF is
     * reached or an error occurs (including EINTR), return immediately.
     * Otherwise:
     * In non-blocking mode, return AVERROR(EAGAIN) immediately.
     * In blocking mode, wait for data/EOF/error with a short timeout (0.1s),
     * and return AVERROR(EAGAIN) on timeout.
     * Checking interrupt_callback, looping on EINTR and EAGAIN and until
     * enough data has been read is left to the calling function; see
     * retry_transfer_wrapper in avio.c.
     */
    int (*url_read)( URLContext *h, unsigned char *buf, int size);
    int (*url_write)(URLContext *h, const unsigned char *buf, int size);
    int64_t (*url_seek)( URLContext *h, int64_t pos, int whence);
    int (*url_close)(URLContext *h);
    int (*url_read_pause)(URLContext *h, int pause);
    int64_t (*url_read_seek)(URLContext *h, int stream_index,
                             int64_t timestamp, int flags);
    int (*url_get_file_handle)(URLContext *h);
    int (*url_get_multi_file_handle)(URLContext *h, int **handles,
                                     int *numhandles);
    int (*url_shutdown)(URLContext *h, int flags);
    int priv_data_size;
    const AVClass *priv_data_class;
    int flags;
    int (*url_check)(URLContext *h, int mask);
    int (*url_open_dir)(URLContext *h);
    int (*url_read_dir)(URLContext *h, AVIODirEntry **next);
    int (*url_close_dir)(URLContext *h);
    int (*url_delete)(URLContext *h);
    int (*url_move)(URLContext *h_src, URLContext *h_dst);
    const char *default_whitelist;
} URLProtocol;
- AVPacket
A data packet produced by demuxing or consumed by muxing. It usually contains one video frame or a chunk of audio data.
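To make the AVIOContext/URLProtocol entry in the list more concrete, here is a minimal stand-alone sketch of how a table of protocols can be selected by the scheme at the front of a URL. It is not FFmpeg code: ToyProtocol, toy_find_protocol, and the two fake read functions are hypothetical names invented for illustration, the callbacks are reduced to a single url_read without a context argument, and falling back to "file" when the URL has no scheme is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for URLProtocol: a name plus one callback. */
typedef struct ToyProtocol {
    const char *name;                               /* scheme, e.g. "file" */
    int (*url_read)(unsigned char *buf, int size);  /* fills buf, returns byte count */
} ToyProtocol;

static int toy_file_read(unsigned char *buf, int size)
{
    /* Pretend a local file always delivers the full request. */
    memset(buf, 'f', (size_t)size);
    return size;
}

static int toy_http_read(unsigned char *buf, int size)
{
    /* Pretend a network read returns fewer bytes than requested. */
    int n = size > 4 ? 4 : size;
    memset(buf, 'h', (size_t)n);
    return n;
}

static const ToyProtocol toy_protocols[] = {
    { "file", toy_file_read },
    { "http", toy_http_read },
};

/* Pick a protocol from the scheme before ':'.
 * Assumption of this sketch: a URL without any scheme is treated as "file". */
static const ToyProtocol *toy_find_protocol(const char *url)
{
    const char *colon = strchr(url, ':');
    size_t len = colon ? (size_t)(colon - url) : 0;
    size_t count = sizeof(toy_protocols) / sizeof(toy_protocols[0]);

    for (size_t i = 0; i < count; i++) {
        if (len && strlen(toy_protocols[i].name) == len &&
            strncmp(toy_protocols[i].name, url, len) == 0)
            return &toy_protocols[i];
    }
    return colon ? NULL : &toy_protocols[0];
}
```

With this table, toy_find_protocol("http://example.com/a.mp4") selects the "http" entry, a bare path such as "movie.mp4" selects "file", and an unregistered scheme yields NULL. The short read in toy_http_read mirrors the url_read comment above: a protocol may return less than size, and looping until enough data has arrived is left to the caller.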
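Personally, I mostly use the demuxers. FFmpeg's usual flow is: first call the demuxer's read_probe function to determine the container type behind the URL, then call read_header to read the media header and complete the basic initialization. After that, packets are read with read_packet, with read_timestamp available to fetch the next timestamp of a stream. Finally, when reading is finished, read_close is called to tear everything down. In addition, read_seek implements seeking within the media file (fast forward / rewind).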
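The probe, header, packet, close sequence can be sketched with a miniature callback table. This is a toy model rather than libavformat itself: the ToyDemuxer and ToyInput types, the "TOY1" container layout (a 4-byte magic followed by packets of one length byte plus payload), and the score of 100 are all made up for illustration. In a real application these callbacks are reached indirectly through the public API (avformat_open_input, av_read_frame, avformat_close_input).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory input: the whole "file" plus a read cursor. */
typedef struct ToyInput {
    const unsigned char *data;
    size_t size;
    size_t pos;
    int opened;   /* set by read_header, cleared by read_close */
} ToyInput;

/* Hypothetical miniature of a demuxer's callback table. */
typedef struct ToyDemuxer {
    const char *name;
    int (*read_probe)(const unsigned char *buf, size_t size);  /* score, 0 = no match */
    int (*read_header)(ToyInput *in);
    int (*read_packet)(ToyInput *in, const unsigned char **pkt, size_t *pkt_size);
    int (*read_close)(ToyInput *in);
} ToyDemuxer;

/* Made-up container: magic "TOY1", then packets of [1-byte length][payload]. */
static int toy_probe(const unsigned char *buf, size_t size)
{
    return (size >= 4 && memcmp(buf, "TOY1", 4) == 0) ? 100 : 0;
}

static int toy_header(ToyInput *in)
{
    if (in->size < 4)
        return -1;
    in->pos = 4;      /* skip the magic */
    in->opened = 1;
    return 0;
}

static int toy_packet(ToyInput *in, const unsigned char **pkt, size_t *pkt_size)
{
    if (in->pos >= in->size)
        return -1;                          /* end of input */
    size_t len = in->data[in->pos];
    if (in->pos + 1 + len > in->size)
        return -1;                          /* truncated packet */
    *pkt = in->data + in->pos + 1;
    *pkt_size = len;
    in->pos += 1 + len;
    return 0;
}

static int toy_close(ToyInput *in)
{
    in->opened = 0;
    return 0;
}

static const ToyDemuxer toy_demuxer = {
    "toy", toy_probe, toy_header, toy_packet, toy_close
};

/* Run probe -> header -> packet loop -> close.
 * Returns the number of packets read, or -1 if the input is not recognized. */
static int toy_demux_all(const ToyDemuxer *d, ToyInput *in)
{
    const unsigned char *pkt;
    size_t pkt_size;
    int count = 0;

    if (d->read_probe(in->data, in->size) <= 0)
        return -1;
    if (d->read_header(in) < 0)
        return -1;
    while (d->read_packet(in, &pkt, &pkt_size) == 0)
        count++;
    d->read_close(in);
    return count;
}
```

A real format detector would run read_probe for every registered demuxer and keep the best score; the sketch checks a single demuxer to keep the lifecycle visible.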
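Main mechanisms of libavcodec
libavcodec works together with libavformat to implement decoding, encoding, and audio/video parsing (splitting a specific format into packets, or packing it). Its main structures are: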
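- AVCodecContext
AVCodecContext is the core structure of libavcodec. It holds the encoder or decoder type, an AVCodec, an AVHWAccel, and other codec parameters.
- AVCodec
This structure wraps an encoder or decoder. Its interface includes: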
void(* init_static_data )(struct AVCodec *codec)
Initialize codec static data, called from avcodec_register().
int(* init )(AVCodecContext *)
int(* encode_sub )(AVCodecContext *, uint8_t *buf, int buf_size, const struct AVSubtitle *sub)
int(* encode2 )(AVCodecContext *avctx, AVPacket *avpkt, const AVFrame *frame, int *got_packet_ptr)
Encode data to an AVPacket.
int(* decode )(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt)
int(* close )(AVCodecContext *)
int(* send_frame )(AVCodecContext *avctx, const AVFrame *frame)
Decode/encode API with decoupled packet/frame dataflow.
int(* send_packet )(AVCodecContext *avctx, const AVPacket *avpkt)
int(* receive_frame )(AVCodecContext *avctx, AVFrame *frame)
int(* receive_packet )(AVCodecContext *avctx, AVPacket *avpkt)
void(* flush )(AVCodecContext *)
Flush buffers.
Frame-level threading support functions
int(* init_thread_copy )(AVCodecContext *)
If defined, called on thread contexts when they are created.
int(* update_thread_context )(AVCodecContext *dst, const AVCodecContext *src)
Copy necessary context variables from a previous thread context to the current one.
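- AVHWAccel
The structure for hardware-accelerated decoding. It only works when the specific hardware is available. Its common interface is as follows: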
int(* alloc_frame )(AVCodecContext *avctx, AVFrame *frame)
Allocate a custom buffer.
int(* start_frame )(AVCodecContext *avctx, const uint8_t *buf, uint32_t buf_size)
Called at the beginning of each frame or field picture.
int(* decode_slice )(AVCodecContext *avctx, const uint8_t *buf, uint32_t buf_size)
Callback for each slice.
int(* end_frame )(AVCodecContext *avctx)
Called at the end of each frame or field picture.
void(* decode_mb )(struct MpegEncContext *s)
Called for every Macroblock in a slice.
int(* init )(AVCodecContext *avctx)
Initialize the hwaccel private data.
int(* uninit )(AVCodecContext *avctx)
Uninitialize the hwaccel private data.
- AVCodecParserContext and AVCodecParser
Parsers for specific audio or video formats, such as h264 or aac. Their common interface is as follows:
int(* parser_init )(AVCodecParserContext *s)
int(* parser_parse )(AVCodecParserContext *s, AVCodecContext *avctx, const uint8_t **poutbuf, int *poutbuf_size, const uint8_t *buf, int buf_size)
void(* parser_close )(AVCodecParserContext *s)
int(* split )(AVCodecContext *avctx, const uint8_t *buf, int buf_size)
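- AVFrame
Stores the raw audio or video data produced by a decoder, or the input data for an encoder. How the stored data is laid out depends on AVFrame::format (an AVSampleFormat for audio, an AVPixelFormat for video).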
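As a small illustration of why the format field matters for AVFrame, the sketch below computes how many bytes each plane of a planar 8-bit YUV picture occupies. The ToyPixFmt enum and toy_plane_size are hypothetical helpers, not FFmpeg API, and the sketch assumes tightly packed rows; in a real AVFrame each row may be padded, so the stride should be taken from the frame's linesize rather than from the width.

```c
#include <stddef.h>

/* Hypothetical subset of pixel formats; real code would look at AVFrame::format. */
typedef enum ToyPixFmt {
    TOY_FMT_YUV420P,  /* chroma halved horizontally and vertically */
    TOY_FMT_YUV444P   /* chroma at full resolution */
} ToyPixFmt;

/* Size in bytes of one plane (0 = Y, 1 = U, 2 = V), 8 bits per sample,
 * assuming no row padding. Returns 0 for invalid arguments. */
static size_t toy_plane_size(ToyPixFmt fmt, int width, int height, int plane)
{
    if (width <= 0 || height <= 0 || plane < 0 || plane > 2)
        return 0;

    /* Only the chroma planes of 4:2:0 are subsampled. */
    int shift = (fmt == TOY_FMT_YUV420P && plane > 0) ? 1 : 0;

    /* Round up so odd dimensions still cover every luma sample. */
    size_t w = ((size_t)width + (size_t)shift) >> shift;
    size_t h = ((size_t)height + (size_t)shift) >> shift;
    return w * h;
}
```

For a 1920x1080 picture in the 4:2:0 layout, the luma plane is 1920 * 1080 bytes and each chroma plane is 960 * 540 bytes, so the same width and height describe very different buffers depending on the format.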
From libavcodec I mainly use the decoders and parsers. To get at the raw audio and video data, I also need a working understanding of the AVFrame structure.
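A decoder is typically driven as follows: AVCodec's init initializes the decoder, decode decodes the data, and close tears the decoder down. When needed (for example at the end of decoding, or when switching streams), flush clears the data buffered inside the decoder.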
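The init, decode, flush, close sequence, and the reason flush exists, can be shown with a toy decoder that holds one packet back before producing output, loosely imitating the internal delay of real decoders. Everything here is hypothetical: ToyDecoder and the toydec_* functions are invented names, packets and frames are plain integers, and "decoding" is just doubling the value.

```c
/* Hypothetical decoder state with a one-packet internal delay. */
typedef struct ToyDecoder {
    int opened;
    int has_pending;  /* 1 if a packet is buffered inside the decoder */
    int pending;      /* the buffered packet */
} ToyDecoder;

static int toydec_init(ToyDecoder *d)
{
    d->opened = 1;
    d->has_pending = 0;
    d->pending = 0;
    return 0;
}

/* Feed one packet. If an earlier packet was buffered, it is "decoded"
 * (doubled) into *frame. Returns 1 if *frame was written, 0 if the decoder
 * needs more input, -1 if the decoder is not open. */
static int toydec_decode(ToyDecoder *d, int packet, int *frame)
{
    if (!d->opened)
        return -1;
    int got_frame = d->has_pending;
    if (got_frame)
        *frame = d->pending * 2;
    d->pending = packet;
    d->has_pending = 1;
    return got_frame;
}

/* Drop buffered data, e.g. after a seek or a stream switch. */
static void toydec_flush(ToyDecoder *d)
{
    d->has_pending = 0;
}

static int toydec_close(ToyDecoder *d)
{
    d->opened = 0;
    d->has_pending = 0;
    return 0;
}
```

Feeding packets 1 and 2 yields one frame (from packet 1) on the second call. Calling toydec_flush before feeding packets from a new position discards the buffered packet 2, so no stale output from the old position appears after the switch.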
Many audio and video bitstreams must go through a parser before they can be fed to the decoder. This is done by calling parser_parse.
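To give a feel for what a parser does with a raw byte stream, the sketch below splits a buffer at 00 00 01 start codes, the delimiter used by the H.264 Annex B byte stream format. It is a simplified stand-alone example, not the logic of FFmpeg's h264 parser, which also has to decide where whole frames begin and end. The function names toy_find_start_code and toy_split_units are hypothetical. In application code, parsing is normally reached through the public av_parser_parse2 call rather than by invoking parser_parse directly.

```c
#include <stddef.h>

/* Offset of the next 00 00 01 start code at or after `from`, or `size` if none. */
static size_t toy_find_start_code(const unsigned char *buf, size_t size, size_t from)
{
    for (size_t i = from; i + 3 <= size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    }
    return size;
}

/* Split buf into units delimited by start codes. For each unit, the payload
 * offset (just after the start code) and payload size are stored.
 * Returns the number of units found, at most max_units. */
static size_t toy_split_units(const unsigned char *buf, size_t size,
                              size_t *offsets, size_t *sizes, size_t max_units)
{
    size_t count = 0;
    size_t pos = toy_find_start_code(buf, size, 0);

    while (pos < size && count < max_units) {
        size_t start = pos + 3;
        size_t next = toy_find_start_code(buf, size, start);
        size_t end = next;

        /* A 4-byte start code (00 00 00 01) leaves its leading zero at the
         * end of the previous unit; trim such zeros off the payload. */
        while (next < size && end > start && buf[end - 1] == 0)
            end--;

        offsets[count] = start;
        sizes[count] = end - start;
        count++;
        pos = next;
    }
    return count;
}
```

Each (offset, size) pair describes one unit that could then be handed on for decoding; bytes before the first start code are ignored.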
Hardware decoding involves many more details. If you are interested, see Hardware acceleration introduction with FFmpeg.