UniMRCP voice activity detection


I have been studying UniMRCP for a while, and its voice activity detection algorithm is a frequent target of complaints. This post briefly walks through that simple, brute-force algorithm:

 

UniMRCP detects voice activity purely from signal energy, controlled by a handful of settings:

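struct mpf_activity_detector_t {
    /* silence detection threshold */
    apr_size_t level_threshold;

    /* how long the level must stay at or above the threshold before switching to the active state */
    apr_size_t speech_timeout;
    /* how long the level must stay below the threshold before switching to the inactive state */
    apr_size_t silence_timeout;
    /* how long to wait without any input before reporting no-input */
    apr_size_t noinput_timeout;

    /* current state */
    mpf_detector_state_e state;
    /* how long the current state has lasted */
    apr_size_t duration;
};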

 

Here are the initial values of these parameters. Based on real-world testing, we later modified them:

/** Create activity detector */
MPF_DECLARE(mpf_activity_detector_t*) mpf_activity_detector_create(apr_pool_t *pool)
{
    mpf_activity_detector_t *detector = apr_palloc(pool,sizeof(mpf_activity_detector_t));
    detector->level_threshold = 50; /* 0 .. 255 */
    detector->speech_timeout = 300; /* 0.3 s = 300 ms */
    detector->silence_timeout = 1000; /* 1 s = 1000 ms */
    detector->noinput_timeout = 5000; /* 5 s = 5000 ms */
    detector->duration = 0;
    detector->state = DETECTOR_STATE_INACTIVITY;
    return detector;
}
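
To make these timeouts concrete, here is a minimal sketch that converts a timeout in milliseconds into the number of frames the detector has to see. The processing function shown later in this article adds CODEC_FRAME_TIME_BASE to duration once per frame; the value of 10 ms used below is an assumption, and frames_to_reach is a hypothetical helper, not part of UniMRCP.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: CODEC_FRAME_TIME_BASE is 10 ms; check your UniMRCP headers. */
#define FRAME_MS 10

/* Hypothetical helper: number of per-frame increments of frame_ms needed
 * before the accumulated duration is >= timeout_ms (rounds up). */
static size_t frames_to_reach(size_t timeout_ms, size_t frame_ms)
{
    assert(frame_ms > 0);
    return (timeout_ms + frame_ms - 1) / frame_ms;
}
```

Under that assumption, frames_to_reach(300, FRAME_MS) is 30 frames for speech_timeout, 100 frames for a silence_timeout of 1000, and 500 frames for noinput_timeout.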

 

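Now for the important function, the energy calculation:

It crudely sums the absolute sample values of the frame and divides by the sample count. Any background noise whose average amplitude reaches the threshold therefore counts as speech, so the algorithm is essentially unusable in noisy conditions. A later post will show how to replace it with WebRTC's voice activity detection.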

static apr_size_t mpf_activity_detector_level_calculate(const mpf_frame_t *frame)
{
    apr_size_t sum = 0;
    /* number of 16-bit samples in the frame */
    apr_size_t count = frame->codec_frame.size/2;
    /* first sample */
    const apr_int16_t *cur = frame->codec_frame.buffer;
    /* one past the last sample */
    const apr_int16_t *end = cur + count;

    for(; cur < end; cur++) {
        if(*cur < 0) {
            sum -= *cur;
        }
        else {
            sum += *cur;
        }
    }

    /* plain average of the absolute values: simple and crude, and the reason for the complaints */
    return sum / count;
}
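
The noise problem is easy to reproduce outside UniMRCP. The following is a minimal standalone sketch of the same averaging idea on plain int16_t samples; mean_abs_level is a hypothetical name, and the guard for an empty frame is my addition, not something the library code above does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical standalone version of the level calculation:
 * mean absolute amplitude of 16-bit PCM samples.
 * Returns 0 for an empty frame instead of dividing by zero. */
static size_t mean_abs_level(const int16_t *samples, size_t count)
{
    size_t sum = 0;
    size_t i;

    if (count == 0) {
        return 0;
    }
    assert(samples != NULL);

    for (i = 0; i < count; i++) {
        int32_t s = samples[i];              /* widen so -INT16_MIN is representable */
        sum += (size_t)(s < 0 ? -s : s);
    }
    return sum / count;
}
```

With a threshold of 50, a steady hum such as {60, -60, 60, -60} averages to 60 and is treated exactly like speech, because the measure only looks at amplitude and knows nothing about the shape of the signal.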

 

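Finally, let's look at the state transitions. The mpf_activity_detector_process function below computes the average level of each frame and uses it to drive the state-switching logic.

Processing involves four states:

      ACTIVITY state (DETECTOR_STATE_ACTIVITY)

      INACTIVITY state (DETECTOR_STATE_INACTIVITY)

      TRANS_ACTIVITY state (DETECTOR_STATE_ACTIVITY_TRANSITION)

      TRANS_INACTIVITY state (DETECTOR_STATE_INACTIVITY_TRANSITION)

      The two TRANS states are intermediate. Before the detector switches to ACTIVITY or INACTIVITY, it must pass through the corresponding TRANS state and accumulate the configured duration there. Only when that duration is reached does the switch happen; otherwise the detector falls back to the state it came from.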

/** Process current frame */
MPF_DECLARE(mpf_detector_event_e) mpf_activity_detector_process(mpf_activity_detector_t *detector, const mpf_frame_t *frame)
{
    mpf_detector_event_e det_event = MPF_DETECTOR_EVENT_NONE;
    apr_size_t level = 0;
    if((frame->type & MEDIA_FRAME_TYPE_AUDIO) == MEDIA_FRAME_TYPE_AUDIO) {
        /* first, calculate current activity level of processed frame */
        level = mpf_activity_detector_level_calculate(frame);
#if 0
        apt_log(APT_LOG_MARK,APT_PRIO_INFO,"Activity Detector --------------------- [%"APR_SIZE_T_FMT"]",level);
#endif
    }

    /* INACTIVITY: if the level reaches the threshold, start moving towards
       the active state, but do not become active yet */
    if(detector->state == DETECTOR_STATE_INACTIVITY) {
        if(level >= detector->level_threshold) {
            /* start to detect activity */
            mpf_activity_detector_state_change(detector,DETECTOR_STATE_ACTIVITY_TRANSITION);
        }
        else {
            detector->duration += CODEC_FRAME_TIME_BASE;
            if(detector->duration >= detector->noinput_timeout) {
                /* detected noinput */
                det_event = MPF_DETECTOR_EVENT_NOINPUT;
            }
        }
    }
    /* in transition towards the active state */
    else if(detector->state == DETECTOR_STATE_ACTIVITY_TRANSITION) {
        if(level >= detector->level_threshold) {
            /* still at or above the threshold: accumulate duration */
            detector->duration += CODEC_FRAME_TIME_BASE;
            /* the speech timeout has been reached */
            if(detector->duration >= detector->speech_timeout) {
                /* finally detected activity: switch to the active state */
                det_event = MPF_DETECTOR_EVENT_ACTIVITY;
                mpf_activity_detector_state_change(detector,DETECTOR_STATE_ACTIVITY);
            }
        }
        else {
            /* fallback to inactivity */
            mpf_activity_detector_state_change(detector,DETECTOR_STATE_INACTIVITY);
        }
    }
    /* in the active state */
    else if(detector->state == DETECTOR_STATE_ACTIVITY) {
        if(level >= detector->level_threshold) {
            /* still at or above the threshold: accumulate duration */
            detector->duration += CODEC_FRAME_TIME_BASE;
        }
        else {
            /* start to detect inactivity */
            mpf_activity_detector_state_change(detector,DETECTOR_STATE_INACTIVITY_TRANSITION);
        }
    }
    /* in transition towards the inactive state */
    else if(detector->state == DETECTOR_STATE_INACTIVITY_TRANSITION) {
        if(level >= detector->level_threshold) {
            /* fallback to activity */
            mpf_activity_detector_state_change(detector,DETECTOR_STATE_ACTIVITY);
        }
        else {
            /* still below the threshold: accumulate duration and, once the
               silence timeout is reached, enter the inactive state */
            detector->duration += CODEC_FRAME_TIME_BASE;
            if(detector->duration >= detector->silence_timeout) {
                /* detected inactivity */
                det_event = MPF_DETECTOR_EVENT_INACTIVITY;
                mpf_activity_detector_state_change(detector,DETECTOR_STATE_INACTIVITY);
            }
        }
    }

    return det_event;
}
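
To experiment with the transitions without building UniMRCP, the logic above can be sketched as a small standalone state machine. All names here (vad_sketch, vad_step, and so on) are hypothetical. The sketch assumes that a state change resets duration to 0, which mpf_activity_detector_state_change is not shown doing in this article, and it takes the frame duration as a parameter instead of using CODEC_FRAME_TIME_BASE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; they mirror, but are not, the UniMRCP types. */
typedef enum { ST_INACTIVITY, ST_ACTIVITY_TRANSITION,
               ST_ACTIVITY, ST_INACTIVITY_TRANSITION } vad_state;
typedef enum { EV_NONE, EV_ACTIVITY, EV_INACTIVITY, EV_NOINPUT } vad_event;

typedef struct {
    size_t level_threshold;
    size_t speech_timeout;   /* ms */
    size_t silence_timeout;  /* ms */
    size_t noinput_timeout;  /* ms */
    size_t frame_ms;         /* assumed duration of one frame, ms */
    size_t duration;         /* ms spent in the current state */
    vad_state state;
} vad_sketch;

/* Assumption: changing state restarts the duration counter. */
static void vad_change(vad_sketch *d, vad_state s)
{
    d->duration = 0;
    d->state = s;
}

/* Feed the level of one frame and return the event it triggers, if any. */
static vad_event vad_step(vad_sketch *d, size_t level)
{
    int loud;

    assert(d != NULL);
    loud = level >= d->level_threshold;

    switch (d->state) {
    case ST_INACTIVITY:
        if (loud) {
            vad_change(d, ST_ACTIVITY_TRANSITION);
        } else {
            d->duration += d->frame_ms;
            if (d->duration >= d->noinput_timeout) return EV_NOINPUT;
        }
        break;
    case ST_ACTIVITY_TRANSITION:
        if (loud) {
            d->duration += d->frame_ms;
            if (d->duration >= d->speech_timeout) {
                vad_change(d, ST_ACTIVITY);
                return EV_ACTIVITY;
            }
        } else {
            vad_change(d, ST_INACTIVITY);      /* fall back */
        }
        break;
    case ST_ACTIVITY:
        if (loud) {
            d->duration += d->frame_ms;
        } else {
            vad_change(d, ST_INACTIVITY_TRANSITION);
        }
        break;
    case ST_INACTIVITY_TRANSITION:
        if (loud) {
            vad_change(d, ST_ACTIVITY);        /* fall back */
        } else {
            d->duration += d->frame_ms;
            if (d->duration >= d->silence_timeout) {
                vad_change(d, ST_INACTIVITY);
                return EV_INACTIVITY;
            }
        }
        break;
    }
    return EV_NONE;
}
```

One detail the sketch makes visible: the frame that triggers a transition state does not itself add to duration, so under these assumptions, with 10 ms frames and a speech timeout of 300, the activity event fires on the 31st consecutive loud frame rather than the 30th.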

 

 

 

 

 

 

 

