1. Main Steps in Real-Time Audio/Video Development
2. Overview
In the previous two articles we covered two ways of capturing video: capturing from the camera and capturing the screen. Once we have the data, if we only need to display it locally, we can simply render it as we did before; but if we want to store or transmit it, we usually need to encode and compress it first.
The bulk of WebRTC's video codec implementation lives under modules/video_coding. The encoder interface we first encounter is in api/video_codecs/video_encoder.h, which defines `webrtc::VideoEncoder`. This is an abstract class: it mostly defines the interface, providing default implementations for only a few methods.
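Its core methods look roughly like the abridged sketch below (our own summary, heavily trimmed; exact signatures can vary between WebRTC revisions). Concrete encoders implement these:

```cpp
// Abridged sketch of webrtc::VideoEncoder from api/video_codecs/video_encoder.h.
class VideoEncoder {
 public:
  // Configure the encoder (resolution, bitrates, codec-specific settings).
  virtual int32_t InitEncode(const VideoCodec* codec_settings,
                             const Settings& settings);

  // Register the callback that receives every encoded frame.
  virtual int32_t RegisterEncodeCompleteCallback(
      EncodedImageCallback* callback) = 0;

  virtual int32_t Release() = 0;

  // Encode one raw frame; frame_types can be used to request a keyframe.
  virtual int32_t Encode(const VideoFrame& frame,
                         const std::vector<VideoFrameType>* frame_types) = 0;

  // Push updated target bitrate / framerate into the encoder.
  virtual void SetRates(const RateControlParameters& parameters) = 0;

  // ...
};
```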
3. Creating an Encoder
media/engine/internal_encoder_factory.h defines a general-purpose factory for creating encoders:
```cpp
// media/engine/internal_encoder_factory.h
namespace webrtc {

class RTC_EXPORT InternalEncoderFactory : public VideoEncoderFactory {
 public:
  static std::vector<SdpVideoFormat> SupportedFormats();

  std::vector<SdpVideoFormat> GetSupportedFormats() const override;
  CodecInfo QueryVideoEncoder(const SdpVideoFormat& format) const override;
  std::unique_ptr<VideoEncoder> CreateVideoEncoder(
      const SdpVideoFormat& format) override;
};

}  // namespace webrtc
```
- `InternalEncoderFactory::SupportedFormats` returns the list of supported encoder formats;
- `InternalEncoderFactory::QueryVideoEncoder` reports basic information about the encoder for a given format (whether it is hardware-accelerated, etc.):

```cpp
// media/engine/internal_encoder_factory.cc
VideoEncoderFactory::CodecInfo InternalEncoderFactory::QueryVideoEncoder(
    const SdpVideoFormat& format) const {
  CodecInfo info;
  info.is_hardware_accelerated = false;
  info.has_internal_source = false;
  return info;
}
```

- `InternalEncoderFactory::CreateVideoEncoder` creates a `webrtc::VideoEncoder` instance for a given format:

```cpp
// media/engine/internal_encoder_factory.cc
std::unique_ptr<VideoEncoder> InternalEncoderFactory::CreateVideoEncoder(
    const SdpVideoFormat& format) {
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp8CodecName))
    return VP8Encoder::Create();
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp9CodecName))
    return VP9Encoder::Create(cricket::VideoCodec(format));
  if (absl::EqualsIgnoreCase(format.name, cricket::kH264CodecName))
    return H264Encoder::Create(cricket::VideoCodec(format));
  if (kIsLibaomAv1EncoderSupported &&
      absl::EqualsIgnoreCase(format.name, cricket::kAv1CodecName))
    return CreateLibaomAv1Encoder();
  RTC_LOG(LS_ERROR) << "Trying to created encoder of unsupported format "
                    << format.name;
  return nullptr;
}
```
From the implementation of `InternalEncoderFactory::CreateVideoEncoder` we can see that WebRTC supports four kinds of video encoder:

- VP8Encoder
- VP9Encoder
- H264Encoder
- LibaomAv1Encoder
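As a minimal usage sketch (our own illustration, not code from the WebRTC tree), creating one of these through the factory looks like this:

```cpp
#include <memory>

#include "api/video_codecs/sdp_video_format.h"
#include "api/video_codecs/video_encoder.h"
#include "media/engine/internal_encoder_factory.h"

std::unique_ptr<webrtc::VideoEncoder> CreateVp8Encoder() {
  webrtc::InternalEncoderFactory factory;
  // "VP8" matches cricket::kVp8CodecName, so this dispatches to
  // VP8Encoder::Create() inside CreateVideoEncoder().
  return factory.CreateVideoEncoder(webrtc::SdpVideoFormat("VP8"));
}
```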
`H264` is not enabled by default; it has to be switched on at build time with the GN argument `rtc_use_h264=true`.
4. The Encoder Interface
The video encoder interface is essentially the `webrtc::VideoEncoder` class defined in api/video_codecs/video_encoder.h:
- `VideoEncoder::RegisterEncodeCompleteCallback` registers an `EncodedImageCallback` object that receives the encoded data:

```cpp
// api/video_codecs/video_encoder.h
class EncodedImageCallback {
 public:
  virtual ~EncodedImageCallback() {}

  struct Result {
    enum Error {
      OK,

      // Failed to send the packet.
      ERROR_SEND_FAILED,
    };

    explicit Result(Error error) : error(error) {}
    Result(Error error, uint32_t frame_id) : error(error), frame_id(frame_id) {}

    Error error;

    // Frame ID assigned to the frame. The frame ID should be the same as the ID
    // seen by the receiver for this frame. RTP timestamp of the frame is used
    // as frame ID when RTP is used to send video. Must be used only when
    // error=OK.
    uint32_t frame_id = 0;

    // Tells the encoder that the next frame is should be dropped.
    bool drop_next_frame = false;
  };

  // Used to signal the encoder about reason a frame is dropped.
  // kDroppedByMediaOptimizations - dropped by MediaOptimizations (for rate
  // limiting purposes).
  // kDroppedByEncoder - dropped by encoder's internal rate limiter.
  enum class DropReason : uint8_t {
    kDroppedByMediaOptimizations,
    kDroppedByEncoder
  };

  // Callback function which is called when an image has been encoded.
  virtual Result OnEncodedImage(
      const EncodedImage& encoded_image,
      const CodecSpecificInfo* codec_specific_info,
      const RTPFragmentationHeader* fragmentation) = 0;

  virtual void OnDroppedFrame(DropReason reason) {}
};
```
- `VideoEncoder::Encode` encodes a single `VideoFrame`;
- other methods pass additional parameters (such as target bitrate and framerate) into the encoder; see the sketch below.
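Putting those pieces together, driving an encoder by hand looks roughly like this sketch (our own illustration; the bitrate and framerate numbers are arbitrary assumptions):

```cpp
#include <vector>

#include "api/video/video_bitrate_allocation.h"
#include "api/video/video_frame.h"
#include "api/video_codecs/video_encoder.h"

void EncodeOneKeyFrame(webrtc::VideoEncoder* encoder,
                       const webrtc::VideoFrame& frame) {
  // Feed the rate controller a target of 1 Mbps at 30 fps.
  webrtc::VideoBitrateAllocation allocation;
  allocation.SetBitrate(/*spatial_index=*/0, /*temporal_index=*/0,
                        /*bitrate_bps=*/1000000);
  encoder->SetRates(webrtc::VideoEncoder::RateControlParameters(
      allocation, /*framerate_fps=*/30.0));

  // kVideoFrameKey forces a keyframe; passing nullptr instead (as the demo
  // below does) lets the encoder pick the frame type itself.
  std::vector<webrtc::VideoFrameType> frame_types = {
      webrtc::VideoFrameType::kVideoFrameKey};
  encoder->Encode(frame, &frame_types);
}
```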
5. Hands-On
Based on the analysis above, if we want to create a video encoder ourselves, the simplest and most direct way is to call `InternalEncoderFactory::CreateVideoEncoder`. Although the name `InternalEncoderFactory` suggests it is not a public API, the official example examples/unityplugin/simple_peer_connection.cc uses it directly:
```cpp
bool SimplePeerConnection::InitializePeerConnection(const char** turn_urls,
                                                    const int no_of_urls,
                                                    const char* username,
                                                    const char* credential,
                                                    bool is_receiver) {
  RTC_DCHECK(peer_connection_.get() == nullptr);

  if (g_peer_connection_factory == nullptr) {
    g_worker_thread = rtc::Thread::Create();
    g_worker_thread->Start();
    g_signaling_thread = rtc::Thread::Create();
    g_signaling_thread->Start();

    g_peer_connection_factory = webrtc::CreatePeerConnectionFactory(
        g_worker_thread.get(), g_worker_thread.get(), g_signaling_thread.get(),
        nullptr, webrtc::CreateBuiltinAudioEncoderFactory(),
        webrtc::CreateBuiltinAudioDecoderFactory(),
        std::unique_ptr<webrtc::VideoEncoderFactory>(
            new webrtc::MultiplexEncoderFactory(
                std::make_unique<webrtc::InternalEncoderFactory>())),
        std::unique_ptr<webrtc::VideoDecoderFactory>(
            new webrtc::MultiplexDecoderFactory(
                std::make_unique<webrtc::InternalDecoderFactory>())),
        nullptr, nullptr);
  }

  if (!g_peer_connection_factory.get()) {
    DeletePeerConnection();
    return false;
  }

  g_peer_count++;
  if (!CreatePeerConnection(turn_urls, no_of_urls, username, credential)) {
    DeletePeerConnection();
    return false;
  }

  mandatory_receive_ = is_receiver;
  return peer_connection_.get() != nullptr;
}
```
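The Multiplex wrapper is specific to that example; in our own code a plain `InternalEncoderFactory` can be handed straight to `CreatePeerConnectionFactory` (a sketch assuming the same globals as the example above):

```cpp
// Sketch: same call as above, minus the Multiplex wrappers.
g_peer_connection_factory = webrtc::CreatePeerConnectionFactory(
    g_worker_thread.get(), g_worker_thread.get(), g_signaling_thread.get(),
    nullptr /*default_adm*/, webrtc::CreateBuiltinAudioEncoderFactory(),
    webrtc::CreateBuiltinAudioDecoderFactory(),
    std::make_unique<webrtc::InternalEncoderFactory>(),
    std::make_unique<webrtc::InternalDecoderFactory>(),
    nullptr /*audio_mixer*/, nullptr /*audio_processing*/);
```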
So it is fine to use this factory directly in our own development as well. Here we keep building on the earlier screen-capture example and add a `VideoEncodeHandler` class, declared in video_encode_handler.h, that performs the encoding:
```cpp
// video_encode_handler.h
#ifndef EXAMPLES_DESKTOP_CAPTURE_VIDEO_ENCODE_HANDLER_H_
#define EXAMPLES_DESKTOP_CAPTURE_VIDEO_ENCODE_HANDLER_H_

#include <memory>
#include <string>

#include "api/video/encoded_image.h"
#include "api/video/video_frame.h"
#include "api/video/video_sink_interface.h"
#include "api/video_codecs/video_encoder.h"
#include "modules/include/module_common_types.h"
#include "modules/video_coding/include/video_codec_interface.h"

namespace webrtc_demo {

class VideoEncodeHandler : public rtc::VideoSinkInterface<webrtc::VideoFrame>,
                           public webrtc::EncodedImageCallback {
 public:
  enum class VideoEncodeType {
    VP8,
    VP9,
    H264,
    UNSUPPORT_TYPE,
  };

  ~VideoEncodeHandler();

  static std::unique_ptr<VideoEncodeHandler> Create(const VideoEncodeType type);

 private:
  VideoEncodeHandler(const VideoEncodeType type);

  // rtc::VideoSinkInterface override.
  void OnFrame(const webrtc::VideoFrame& frame) override;

  // webrtc::EncodedImageCallback overrides.
  webrtc::EncodedImageCallback::Result OnEncodedImage(
      const webrtc::EncodedImage& encoded_image,
      const webrtc::CodecSpecificInfo* codec_specific_info,
      const webrtc::RTPFragmentationHeader* fragmentation) override;
  void OnDroppedFrame(webrtc::EncodedImageCallback::DropReason reason) override;

  webrtc::VideoCodec DefaultCodecSettings(size_t width,
                                          size_t height,
                                          size_t key_frame_interval);
  void ReInitEncoder();

  std::unique_ptr<webrtc::VideoEncoder> video_encoder_;
  std::string encode_type_name_;
  size_t frame_width_ = 0;
  size_t frame_height_ = 0;
};

}  // namespace webrtc_demo

#endif  // EXAMPLES_DESKTOP_CAPTURE_VIDEO_ENCODE_HANDLER_H_
```
First, as a `Sink` it can be registered with the screen capturer so that it receives the captured frames; then it registers itself as the `EncodedImageCallback` with the `VideoEncoder`:
```cpp
// video_encode_handler.cc
#include "examples/desktop_capture/video_encode_handler.h"

#include "api/video/video_codec_type.h"
#include "api/video_codecs/sdp_video_format.h"
#include "api/video_codecs/video_codec.h"
#include "media/engine/internal_encoder_factory.h"
#include "rtc_base/logging.h"
#include "test/video_codec_settings.h"

namespace webrtc_demo {

constexpr size_t kWidth = 1920;
constexpr size_t kHeight = 1080;
constexpr size_t kBaseKeyFrameInterval = 30;

const webrtc::VideoEncoder::Capabilities kCapabilities(false);
const webrtc::VideoEncoder::Settings kSettings(kCapabilities,
                                               /*number_of_cores=*/1,
                                               /*max_payload_size=*/0);

VideoEncodeHandler::VideoEncodeHandler(const VideoEncodeType type)
    : video_encoder_(nullptr), frame_width_(kWidth), frame_height_(kHeight) {
  switch (type) {
    case VideoEncodeType::VP8:
      encode_type_name_ = "VP8";
      break;
    case VideoEncodeType::VP9:
      encode_type_name_ = "VP9";
      break;
    case VideoEncodeType::H264:
      encode_type_name_ = "H264";
      break;
    default:
      break;
  }

  auto support_formats = webrtc::InternalEncoderFactory::SupportedFormats();
  for (auto& format : support_formats) {
    RTC_LOG(LS_INFO) << "Support encode: " << format.ToString();
    if (!video_encoder_ && format.name == encode_type_name_) {
      RTC_LOG(LS_INFO) << "Find encode: " << format.name;
      std::unique_ptr<webrtc::InternalEncoderFactory> encode_factory =
          std::make_unique<webrtc::InternalEncoderFactory>();
      video_encoder_ = encode_factory->CreateVideoEncoder(format);
      video_encoder_->RegisterEncodeCompleteCallback(this);
      ReInitEncoder();
    }
  }
}

void VideoEncodeHandler::ReInitEncoder() {
  webrtc::VideoCodec codec_settings =
      DefaultCodecSettings(frame_width_, frame_height_, kBaseKeyFrameInterval);
  video_encoder_->InitEncode(&codec_settings, kSettings);
}

VideoEncodeHandler::~VideoEncodeHandler() {
  // Release encoder resources before destruction.
  if (video_encoder_) {
    video_encoder_->Release();
  }
}

std::unique_ptr<VideoEncodeHandler> VideoEncodeHandler::Create(
    const VideoEncodeType type) {
  if (type >= VideoEncodeType::UNSUPPORT_TYPE) {
    // enum class is not directly streamable, so cast for logging.
    RTC_LOG(LS_WARNING) << "Unsupported encode type: "
                        << static_cast<int>(type);
    return nullptr;
  }
  return std::unique_ptr<VideoEncodeHandler>(new VideoEncodeHandler(type));
}

void VideoEncodeHandler::OnFrame(const webrtc::VideoFrame& frame) {
  // RTC_LOG(LS_INFO) << "-----VideoEncodeHandler::OnFrame-----";
  if (!video_encoder_) {
    RTC_LOG(LS_ERROR) << "Encoder not valid";
    return;
  }

  // Re-initialize the encoder whenever the captured resolution changes.
  if (frame.width() != static_cast<int>(frame_width_) ||
      frame.height() != static_cast<int>(frame_height_)) {
    frame_width_ = frame.width();
    frame_height_ = frame.height();
    ReInitEncoder();
  }

  if (video_encoder_) {
    video_encoder_->Encode(frame, nullptr);
  }
}

webrtc::EncodedImageCallback::Result VideoEncodeHandler::OnEncodedImage(
    const webrtc::EncodedImage& encoded_image,
    const webrtc::CodecSpecificInfo* codec_specific_info,
    const webrtc::RTPFragmentationHeader* fragmentation) {
  RTC_LOG(LS_INFO) << "-----VideoEncodeHandler::OnEncodedImage-----"
                   << encoded_image.size() << ", " << encoded_image.Timestamp()
                   << "--" << encoded_image._completeFrame;
  return webrtc::EncodedImageCallback::Result(
      webrtc::EncodedImageCallback::Result::OK);
}

void VideoEncodeHandler::OnDroppedFrame(
    webrtc::EncodedImageCallback::DropReason reason) {
  RTC_LOG(LS_INFO) << "-----VideoEncodeHandler::OnDroppedFrame-----";
}

webrtc::VideoCodec VideoEncodeHandler::DefaultCodecSettings(
    size_t width,
    size_t height,
    size_t key_frame_interval) {
  webrtc::VideoCodec codec_settings;
  webrtc::VideoCodecType codec_type =
      webrtc::PayloadStringToCodecType(encode_type_name_);
  webrtc::test::CodecSettings(codec_type, &codec_settings);
  codec_settings.width = static_cast<uint16_t>(width);
  codec_settings.height = static_cast<uint16_t>(height);
  switch (codec_settings.codecType) {
    case webrtc::kVideoCodecVP8:
      codec_settings.VP8()->keyFrameInterval = key_frame_interval;
      codec_settings.VP8()->frameDroppingOn = true;
      codec_settings.VP8()->numberOfTemporalLayers = 1;
      break;
    case webrtc::kVideoCodecVP9:
      codec_settings.VP9()->keyFrameInterval = key_frame_interval;
      codec_settings.VP9()->frameDroppingOn = true;
      codec_settings.VP9()->numberOfTemporalLayers = 1;
      break;
    case webrtc::kVideoCodecAV1:
      codec_settings.qpMax = 63;
      break;
    case webrtc::kVideoCodecH264:
      codec_settings.H264()->keyFrameInterval = key_frame_interval;
      break;
    default:
      break;
  }
  return codec_settings;
}

}  // namespace webrtc_demo
```
- Use `webrtc::InternalEncoderFactory::SupportedFormats` to first confirm which encoding formats are supported, then create the encoder:

```cpp
std::unique_ptr<webrtc::InternalEncoderFactory> encode_factory =
    std::make_unique<webrtc::InternalEncoderFactory>();
video_encoder_ = encode_factory->CreateVideoEncoder(format);
video_encoder_->RegisterEncodeCompleteCallback(this);
```

- Before calling `Encode`, we first need to initialize the encoder:

```cpp
void VideoEncodeHandler::ReInitEncoder() {
  webrtc::VideoCodec codec_settings =
      DefaultCodecSettings(frame_width_, frame_height_, kBaseKeyFrameInterval);
  video_encoder_->InitEncode(&codec_settings, kSettings);
}
```

- The default encoder configuration comes from the `webrtc::test::CodecSettings` helper in test/video_codec_settings.h; you could just as well write your own along the same lines (see the sketch after this list).
- Encode frames with `VideoEncoder::Encode`.
- To receive the encoded output, it is enough to override `EncodedImageCallback::OnEncodedImage`.
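For instance, a hand-rolled, VP8-only stand-in for that helper might look like this (a sketch; the bitrate and QP numbers are illustrative assumptions, not tuned values):

```cpp
#include "api/video_codecs/video_codec.h"
#include "api/video_codecs/video_encoder.h"

webrtc::VideoCodec MakeVp8CodecSettings(uint16_t width, uint16_t height) {
  webrtc::VideoCodec settings;
  settings.codecType = webrtc::kVideoCodecVP8;
  settings.width = width;
  settings.height = height;
  settings.maxFramerate = 30;
  settings.startBitrate = 1000;  // kbps
  settings.minBitrate = 300;     // kbps
  settings.maxBitrate = 2500;    // kbps
  settings.qpMax = 56;
  // Start from WebRTC's default VP8-specific settings, then override.
  *settings.VP8() = webrtc::VideoEncoder::GetDefaultVp8Settings();
  settings.VP8()->keyFrameInterval = 30;
  return settings;
}
```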
Finally, add the following to the `main` function:
#include "examples/desktop_capture/desktop_capture.h"
#include "test/video_renderer.h"
#include "rtc_base/logging.h"
#include "examples/desktop_capture/video_encode_handler.h"
#include <thread>
int main() {
std::unique_ptr<webrtc_demo::DesktopCapture> capturer(webrtc_demo::DesktopCapture::Create(30,0));
capturer->StartCapture();
std::unique_ptr<webrtc::test::VideoRenderer> renderer(webrtc::test::VideoRenderer::Create(capturer->GetWindowTitle().c_str(), 720, 480));
capturer->AddOrUpdateSink(renderer.get(), rtc::VideoSinkWants());
std::unique_ptr<webrtc_demo::VideoEncodeHandler> video_encode(webrtc_demo::VideoEncodeHandler::Create(webrtc_demo::VideoEncodeHandler::VideoEncodeType::VP8));
capturer->AddOrUpdateSink(video_encode.get(), rtc::VideoSinkWants());
std::this_thread::sleep_for(std::chrono::seconds(30));
capturer->RemoveSink(renderer.get());
capturer->RemoveSink(video_encode.get());
RTC_LOG(WARNING) << "Demo exit";
return 0;
}
Branch with the video-encoding code: https://github.com/243286065/webrtc-cpp-demo/tree/76e12021e0469d5108930ed4f28df308d6916791

Commit: https://github.com/243286065/webrtc-cpp-demo/commit/76e12021e0469d5108930ed4f28df308d6916791
Internally, WebRTC itself mainly creates `VideoEncoder` instances through the `VideoStreamEncoder` object. The underlying principle is much the same, just with more layers of wrapping; as far as the encoder is concerned, it works essentially the same way as our `VideoEncodeHandler`.