Summary of problems when using scale_npp through the FFmpeg API

Reposted article, 2021-06-05 17:44, FFmpeg

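Background

With CUDA decoding, FFmpeg outputs frames in pixel format 119 (AV_PIX_FMT_CUDA). av_hwframe_transfer_data() can copy such a frame to system memory, and the format of the transferred frame can be set to NV12.

The pixel format ultimately required is BGR24. FFmpeg's sws_scale() can convert NV12 or YUV420P to BGR24, but it cannot take format 119 as input.

Current measurements show CPU usage of 13.2% when converting NV12 to BGR24 and 3.5% when converting YUVJ420P to BGR24, so the NV12 path is the slower one. This is related to how NV12 organizes its data.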

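Reading the sws_scale source, NV12 and YUVJ420P are handled differently:

1. For NV12, initialization sets the chrToYV12 function pointer; for YUVJ420P this pointer is not set.

2. nv12ToUV_c walks one row of data and puts the interleaved U and V samples into two separate arrays.

3. chr_convert calls nv12ToUV_c once for every row, and one level up, swscale calls chr_convert inside the loop for (; dstY < dstH; dstY++).

The extra CPU cost of NV12-to-RGB compared with YUV420-to-RGB is therefore most likely due to nv12ToUV_c.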
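The per-row work described above can be pictured with a small standalone sketch. This is not the libswscale implementation; the function name split_nv12_chroma_row is made up for illustration, and it only assumes the NV12 layout in which U and V bytes alternate within the chroma plane.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only (hypothetical name, not libswscale code).
 * Split one row of interleaved NV12 chroma (U0 V0 U1 V1 ...) into
 * separate U and V arrays. width is the number of chroma samples per
 * output array, so src must hold 2 * width bytes. */
static void split_nv12_chroma_row(uint8_t *dst_u, uint8_t *dst_v,
                                  const uint8_t *src, size_t width)
{
    for (size_t i = 0; i < width; i++) {
        dst_u[i] = src[2 * i];      /* even bytes carry U */
        dst_v[i] = src[2 * i + 1];  /* odd bytes carry V  */
    }
}
```

A planar YUV420P frame already stores U and V separately, so it needs no such pass before the RGB conversion, which is consistent with the lower CPU usage measured for it.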

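To work around the slow NV12-to-BGR24 conversion, we tried converting NV12 to YUV420P on the GPU through the scale_npp filter. The equivalent command line, with NPP doing the pixel format conversion, is: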

ffmpeg -vsync 0 -hwaccel_device 2 -hwaccel cuda -hwaccel_output_format cuda -i ~/vedio/drone1.flv -vf "scale_npp=format=yuv420p,hwdownload,format=yuv420p" ff22cuda2.yuv

The FFmpeg source also confirms that scale_npp supports NV12-to-YUV420P conversion:

vf_scale_npp.c
static const enum AVPixelFormat supported_formats[] = {
  AV_PIX_FMT_YUV420P,
  AV_PIX_FMT_NV12,
  AV_PIX_FMT_YUV444P,
};

Problems encountered

Problem 1: Impossible to convert between the formats supported by the filter 'in' and the filter 'auto_scaler_0'

Cause: pix_fmt was set incorrectly; it must be AV_PIX_FMT_CUDA.

avfilter_graph_create_filter() creates a filter instance and adds it to an existing graph. As in the init_filters function of doc/examples/filtering_video.c, args must be supplied when calling avfilter_graph_create_filter().

The program now passes "video_size=1920x1080:pix_fmt=119:time_base=1/1000:pixel_aspect=1/1"

The pix_fmt read from AVCodecContext *dec_ctx is 0 (AV_PIX_FMT_YUV420P) before decoding and has to be changed to 119, because scale_npp takes its input from GPU memory and the CUDA decoder outputs format 119.

On the command line, scale_npp likewise only works when -hwaccel_output_format cuda is set.
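As a rough illustration of the fix, the args string can be built with the pixel format passed in explicitly rather than taken from dec_ctx before decoding. The helper name build_buffersrc_args is hypothetical; the option names are the ones in the string shown above, and in real code the pix_fmt argument would be AV_PIX_FMT_CUDA rather than a literal number.

```c
#include <stdio.h>

/* Hypothetical helper: format the option string for the "buffer"
 * source filter. pix_fmt is the integer value of an AVPixelFormat;
 * when the graph starts with scale_npp, pass AV_PIX_FMT_CUDA here.
 * Returns the snprintf() result: the number of characters needed
 * (excluding the terminating NUL), or a negative value on error. */
static int build_buffersrc_args(char *buf, size_t size,
                                int width, int height, int pix_fmt,
                                int tb_num, int tb_den,
                                int sar_num, int sar_den)
{
    return snprintf(buf, size,
                    "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
                    width, height, pix_fmt, tb_num, tb_den, sar_num, sar_den);
}
```

The resulting buffer is what would be handed to avfilter_graph_create_filter() as args. Using the enum constant instead of 119 avoids depending on the numeric value in one particular FFmpeg version.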

Problem 2: No hw context provided on input

Cause: the input filter needs hw_frames_ctx to be set.

1. The error is raised in init_processing_chain() in libavfilter/vf_scale_npp.c.

2. Comparing with how the ffmpeg command-line tool invokes scale_npp shows that configure_input_video_filter() in fftools/ffmpeg_filter.c sets hw_frames_ctx after creating the filter:

if ((ret = avfilter_graph_create_filter(&ifilter->filter, buffer_filt, name,
                                        args.str, NULL, fg->graph)) < 0)
    goto fail;
par->hw_frames_ctx = ifilter->hw_frames_ctx;
ret = av_buffersrc_parameters_set(ifilter->filter, par);
if (ret < 0)
    goto fail;
av_freep(&par);

So in the demo, setting hw_frames_ctx through AVBufferSrcParameters resolves this problem.

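FFmpeg filter source code notes

The filter implementations all live in libavfilter; the scale_npp code is in libavfilter/vf_scale_npp.c.

The code behind the ffmpeg command-line tool is under the fftools directory, covering argument parsing, encoding and decoding, scaling and so on. It also calls into libavcodec, libavfilter and the other libraries.

Everything filter-related there is handled in fftools/ffmpeg_filter.c.

Important call stacks: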

(gdb) bt
#0  filter_query_formats (ctx=0x3f65f40) at libavfilter/avfiltergraph.c:333
#1  0x00000000004dc154 in query_formats (graph=graph@entry=0x3f53c80, log_ctx=log_ctx@entry=0x0) at libavfilter/avfiltergraph.c:456
#2  0x00000000004dcfaf in graph_config_formats (log_ctx=<optimized out>, graph=<optimized out>) at libavfilter/avfiltergraph.c:1166
#3  avfilter_graph_config (graphctx=0x3f53c80, log_ctx=log_ctx@entry=0x0) at libavfilter/avfiltergraph.c:1277
#4  0x00000000004a3b0d in configure_filtergraph (fg=fg@entry=0x23f1940) at fftools/ffmpeg_filter.c:1107 //initializes the filter graph
#5  0x00000000004b432d in ifilter_send_frame (frame=0x336d880, ifilter=0x284aec0) at fftools/ffmpeg.c:2166
#6  send_frame_to_filters (ist=ist@entry=0x23f4e40, decoded_frame=decoded_frame@entry=0x336d880) at fftools/ffmpeg.c:2247
#7  0x00000000004b4b81 in decode_video (ist=ist@entry=0x23f4e40, pkt=pkt@entry=0x7fffffffda00, got_output=got_output@entry=0x7fffffffd710, duration_pts=duration_pts@entry=0x7fffffffd718,
    eof=eof@entry=0, decode_failed=decode_failed@entry=0x7fffffffd714) at fftools/ffmpeg.c:2446
#8  0x00000000004b6cc2 in process_input_packet (no_eof=0, pkt=0x7fffffffd9a0, ist=0x23f4e40) at fftools/ffmpeg.c:2600
#9  process_input (file_index=<optimized out>) at fftools/ffmpeg.c:4491
#10 0x00000000004b9c53 in transcode_step () at fftools/ffmpeg.c:4611
#11 transcode () at fftools/ffmpeg.c:4665

(gdb) bt
#0  nppscale_deinterleave (ctx=ctx@entry=0x3f51a80, stage=stage@entry=0x3e836c8, out=0x3f64cc0, in=0x3f67c00) at libavfilter/vf_scale_npp.c:389
#1  0x00000000005c7840 in nppscale_scale (in=0x3f67c00, out=0x3f67e80, ctx=0x3f51a80) at libavfilter/vf_scale_npp.c:477   //performs the nppscale conversion
#2  nppscale_filter_frame (link=link@entry=0x3f66b40, in=0x3f67c00) at libavfilter/vf_scale_npp.c:526
#3  0x00000000004db273 in ff_filter_frame_framed (frame=0x3f67c00, link=0x3f66b40) at libavfilter/avfilter.c:1066
#4  ff_filter_frame_to_filter (link=0x3f66b40) at libavfilter/avfilter.c:1214
#5  ff_filter_activate_default (filter=<optimized out>) at libavfilter/avfilter.c:1263
#6  ff_filter_activate (filter=<optimized out>) at libavfilter/avfilter.c:1424
#7  0x00000000004de97c in ff_filter_graph_run_once (graph=graph@entry=0x3f53c80) at libavfilter/avfiltergraph.c:1456
#8  0x00000000004df918 in push_frame (graph=0x3f53c80) at libavfilter/buffersrc.c:184
#9  av_buffersrc_add_frame_internal (ctx=ctx@entry=0x3f65f40, frame=frame@entry=0x336d880, flags=flags@entry=4) at libavfilter/buffersrc.c:247
#10 0x00000000004dff5d in av_buffersrc_add_frame_flags (ctx=0x3f65f40, frame=frame@entry=0x336d880, flags=flags@entry=4) at libavfilter/buffersrc.c:167
#11 0x00000000004b4349 in ifilter_send_frame (frame=0x336d880, ifilter=0x284aec0) at fftools/ffmpeg.c:2173
#12 send_frame_to_filters (ist=ist@entry=0x23f4e40, decoded_frame=decoded_frame@entry=0x336d880) at fftools/ffmpeg.c:2247
#13 0x00000000004b4b81 in decode_video (ist=ist@entry=0x23f4e40, pkt=pkt@entry=0x7fffffffda00, got_output=got_output@entry=0x7fffffffd710, duration_pts=duration_pts@entry=0x7fffffffd718,
    eof=eof@entry=0, decode_failed=decode_failed@entry=0x7fffffffd714) at fftools/ffmpeg.c:2446
#14 0x00000000004b6cc2 in process_input_packet (no_eof=0, pkt=0x7fffffffd9a0, ist=0x23f4e40) at fftools/ffmpeg.c:2600
#15 process_input (file_index=<optimized out>) at fftools/ffmpeg.c:4491
#16 0x00000000004b9c53 in transcode_step () at fftools/ffmpeg.c:4611
#17 transcode () at fftools/ffmpeg.c:4665

(gdb) bt
#0  hwdownload_filter_frame (link=link@entry=0x3f65700, input=0x3f67e80) at libavfilter/vf_hwdownload.c:152
#1  0x00000000004db273 in ff_filter_frame_framed (frame=0x3f67e80, link=0x3f65700) at libavfilter/avfilter.c:1066
#2  ff_filter_frame_to_filter (link=0x3f65700) at libavfilter/avfilter.c:1214
#3  ff_filter_activate_default (filter=<optimized out>) at libavfilter/avfilter.c:1263
#4  ff_filter_activate (filter=<optimized out>) at libavfilter/avfilter.c:1424
#5  0x00000000004de97c in ff_filter_graph_run_once (graph=graph@entry=0x3f53c80) at libavfilter/avfiltergraph.c:1456
#6  0x00000000004df918 in push_frame (graph=0x3f53c80) at libavfilter/buffersrc.c:184
#7  av_buffersrc_add_frame_internal (ctx=ctx@entry=0x3f65f40, frame=frame@entry=0x336d880, flags=flags@entry=4) at libavfilter/buffersrc.c:247
#8  0x00000000004dff5d in av_buffersrc_add_frame_flags (ctx=0x3f65f40, frame=frame@entry=0x336d880, flags=flags@entry=4) at libavfilter/buffersrc.c:167
#9  0x00000000004b4349 in ifilter_send_frame (frame=0x336d880, ifilter=0x284aec0) at fftools/ffmpeg.c:2173
#10 send_frame_to_filters (ist=ist@entry=0x23f4e40, decoded_frame=decoded_frame@entry=0x336d880) at fftools/ffmpeg.c:2247
#11 0x00000000004b4b81 in decode_video (ist=ist@entry=0x23f4e40, pkt=pkt@entry=0x7fffffffda00, got_output=got_output@entry=0x7fffffffd710, duration_pts=duration_pts@entry=0x7fffffffd718,
    eof=eof@entry=0, decode_failed=decode_failed@entry=0x7fffffffd714) at fftools/ffmpeg.c:2446
#12 0x00000000004b6cc2 in process_input_packet (no_eof=0, pkt=0x7fffffffd9a0, ist=0x23f4e40) at fftools/ffmpeg.c:2600
#13 process_input (file_index=<optimized out>) at fftools/ffmpeg.c:4491
#14 0x00000000004b9c53 in transcode_step () at fftools/ffmpeg.c:4611
#15 transcode () at fftools/ffmpeg.c:4665

Functions worth attention:

av_buffersrc_add_frame_flags

av_buffersink_get_frame

avfilter_graph_create_filter

avfilter_graph_parse_ptr

avfilter_graph_config

sws_scale

Functions in fftools:

int configure_filtergraph(FilterGraph *fg)

static int configure_input_video_filter(FilterGraph *fg, InputFilter *ifilter, AVFilterInOut *in)

hw_device_setup_for_filter

static int hwaccel_retrieve_data(AVCodecContext *avctx, AVFrame *input)

static int transcode_step(void)

static int ifilter_send_frame(InputFilter *ifilter, AVFrame *frame)

Struct definitions worth attention:

AVFilterGraph

AVFilterContext

AVBufferSrcParameters

AVFilterInOut

AVFilter

AVCodecContext

AVFrame

References

https://www.ffmpeg.org/doxygen/3.2/vf__scale__npp_8c_source.html

https://ffmpeg.org/doxygen/3.1/swscale_8c_source.html

https://github.com/FFmpeg/FFmpeg/tree/n4.3.2

https://console.cloud.baidu-int.com/devops/icode/repos/baidu/third-party/ffmpeg/tree/ffmpeg_n4.2.3_GCC820_6U3_K3_GEN_PD_BL

https://stackoverflow.com/questions/47049312/how-can-i-convert-an-ffmpeg-avframe-with-pixel-format-av-pix-fmt-cuda-to-a-new-a
https://www.jianshu.com/p/ad05a94001b4

scale_npp: This is a scaling filter implemented in NVIDIA's Performance Primitives. Its primary dependency is the CUDA SDK, and it must be explicitly enabled by passing --enable-libnpp, --enable-cuda-nvcc and --enable-nonfree flags to ./configure at compile time when building FFmpeg from source. Use this filter in place of scale_cuda wherever possible.

Pixel format notes:

AV_PIX_FMT_YUV420P = 0, ///< planar YUV 4:2:0, 12bpp, (1 Cr & Cb sample per 2x2 Y samples)
AV_PIX_FMT_RGB24 = 2, ///< packed RGB 8:8:8, 24bpp, RGBRGB...
AV_PIX_FMT_BGR24 = 3, ///< packed RGB 8:8:8, 24bpp, BGRBGR...

AV_PIX_FMT_YUVJ420P = 12, ///< planar YUV 4:2:0, 12bpp, full scale (JPEG), deprecated in favor of AV_PIX_FMT_YUV420P and setting color_range

AV_PIX_FMT_NV12 = 23, ///< planar YUV 4:2:0, 12bpp, 1 plane for Y and 1 plane for the UV components, which are interleaved (first byte U and the following byte V)

AV_PIX_FMT_CUDA = 119, ///< HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames.
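To make the bpp figures in these comments concrete, here is a small sketch that computes the size of one tightly packed frame. The function names are hypothetical, and the result ignores the per-row padding (linesize) that real AVFrame buffers may carry, so it should be read as a lower bound rather than an allocation size.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; no padding is accounted for. */

/* Bytes in a tightly packed 4:2:0 frame at 12 bits per pixel.
 * YUV420P, YUVJ420P and NV12 all have this size; only the chroma
 * layout differs. Assumes even width and height. */
static size_t yuv420_frame_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Bytes in a tightly packed 24 bpp frame (RGB24 or BGR24). */
static size_t packed24_frame_bytes(size_t width, size_t height)
{
    return width * height * 3;
}
```

For the 1920x1080 stream used in this article, an NV12 or YUV420P frame is 3110400 bytes and a BGR24 frame is 6220800 bytes, so the final BGR24 output is twice the size of the frame downloaded from the GPU.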

configure command used when building FFmpeg (the CUDA libraries, nv-codec-headers, libx264, fdk_aac and the other libraries must be prepared first):

PKG_CONFIG_PATH="$HOME/local/lib/pkgconfig" ./configure --prefix="$HOME/local" --pkg-config-flags="--static" --extra-cflags="-I$HOME/local/include" --extra-ldflags="-L$HOME/local/lib" --extra-libs=-lpthread --extra-libs=-lm --bindir="$HOME/local/bin" --enable-gpl --enable-libfdk_aac --enable-libmp3lame --enable-libx264 --enable-nonfree --enable-gpl --enable-cuda --enable-cuvid --enable-nvdec --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64

Error caused by building FFmpeg against a mismatched nv-codec-headers version:

bugfix: [h264_nvenc @ 0x32c2080] Driver does not support the required nvenc API version. Required: 10.0 Found: 9.0
https://blog.csdn.net/qq_23282479/article/details/107579032
https://forums.developer.nvidia.com/t/ffmpeg-nvenc-issue-driver-does-not-support-the-required-nvenc-api-version-required-9-1-found-9-0/109348
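The README in nv-codec-headers records the minimum required driver version (see it on GitHub: https://github.com/FFmpeg/nv-codec-headers).
If the installed CUDA version is older, run git checkout sdk/9.0 in the nv-codec-headers directory to switch back to the older version, reinstall the headers (make install), then run make clean and rebuild FFmpeg.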
