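AudioMixer is Android's audio mixer. It mixes the audio data of the individual tracks together and then outputs the result to the audio device.
Creating the AudioMixer
The AudioMixer is created in the MixerThread constructor: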
AudioFlinger::MixerThread::MixerThread(...)
{
    ...
    mAudioMixer = new AudioMixer(mNormalFrameCount, mSampleRate);
    ...
}
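This shows that each MixerThread owns exactly one AudioMixer.
MixerThread passes two arguments to the AudioMixer:
- mNormalFrameCount: the number of frames moved per pass. The AudioMixer writes this many frames of audio data from the source buffers into the destination buffer each time it runs.
- mSampleRate: the sample rate the AudioMixer uses for its output.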
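Configuring the AudioMixer parameters
The previous article, on MixerThread, mentioned that prepareTracks_l configures the AudioMixer parameters. This section looks at what each parameter does.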
mAudioMixer->setBufferProvider(name, track);
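Sets the source buffer for mixing. name is the track index passed in, and track is the Track taken from mActiveTracks.
The index name deserves a closer look. It is obtained as follows: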
int name = track->name();
    |
    +--> int name() const { return mName; }
           |
           +--> mName = thread->getTrackName_l(channelMask, sessionId);
                  |
                  +--> return mAudioMixer->getTrackName(channelMask, sessionId);
                         |
                         +--> uint32_t names = (~mTrackNames) & mConfiguredNames;
                         |
                         +--> int n = __builtin_ctz(names);
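names is the set of candidate indices: each bit of names stands for one index, and a bit set to 1 means that index is free to be taken. __builtin_ctz counts the trailing zero bits of names, which is the position of the lowest set bit, and that position is used as the index. For example: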
11111111111111111111000000000000
                   ^
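There are 12 trailing zeros, so n = 12 and bit 12 is chosen: 1<<12 is marked as used in mTrackNames. The name handed back to the caller is this index offset by TRACK0, which is why enable() and the setters subtract TRACK0 from name before indexing mState.tracks.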
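Two members determine names:
- mTrackNames: records the tracks currently in use. Its initial value is 0, and when a track is added the bit for that track is set to 1.
- mConfiguredNames: describes the maximum number of tracks this AudioMixer supports. With at most N tracks, mConfiguredNames = (1 << N) - 1, so its low N bits are 1 and its high 32-N bits are 0. The default value is -1 (all bits set), i.e. N = 32.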
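The allocation scheme described above can be sketched in isolation. This is a minimal illustration, not the AOSP class: the struct TrackNameAllocator, its acquire/release methods, and the value chosen for TRACK0 are assumptions made for the example. It relies on the GCC/Clang builtin __builtin_ctz.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the name bookkeeping; not the AOSP implementation.
struct TrackNameAllocator {
    static constexpr int TRACK0 = 0x1000;  // assumed base offset, illustration only
    uint32_t trackNames = 0;               // bit n set => slot n is in use
    uint32_t configuredNames;              // bit n set => slot n may be used

    explicit TrackNameAllocator(unsigned maxTracks)
        // Shifting a 32-bit value by 32 is undefined, so N >= 32 is handled separately.
        : configuredNames(maxTracks >= 32 ? 0xFFFFFFFFu : ((1u << maxTracks) - 1u)) {}

    // Returns TRACK0 + n for the lowest free slot n, or -1 if no slot is left.
    int acquire() {
        const uint32_t avail = ~trackNames & configuredNames;
        if (avail == 0) return -1;            // __builtin_ctz(0) is undefined
        const int n = __builtin_ctz(avail);   // position of the lowest set bit
        trackNames |= 1u << n;
        return TRACK0 + n;
    }

    // Marks the slot behind a previously acquired name as free again.
    void release(int name) {
        assert(name >= TRACK0 && name < TRACK0 + 32);
        trackNames &= ~(1u << (name - TRACK0));
    }
};
```

Because the lowest free bit is always taken first, a released slot is reused before any higher slot is handed out.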
mAudioMixer->enable(name);
enable() only sets the track's enabled flag to true and then calls invalidateState(1 << name) to signal that the refresh function must run.
void AudioMixer::enable(int name)
{
    name -= TRACK0;
    track_t& track = mState.tracks[name];

    if (!track.enabled) {
        track.enabled = true;
        invalidateState(1 << name);
    }
}
mAudioMixer->setParameter(name, param, AudioMixer::VOLUME0, (void *)vl);
mAudioMixer->setParameter(name, param, AudioMixer::VOLUME1, (void *)vr);
These two calls set the left and right channel volumes respectively, then call invalidateState(1 << name) to signal that the refresh function must run.
case VOLUME0:
case VOLUME1:
    if (track.volume[param-VOLUME0] != valueInt) {
        ALOGV("setParameter(VOLUME, VOLUME0/1: %04x)", valueInt);
        track.prevVolume[param-VOLUME0] = track.volume[param-VOLUME0] << 16;
        track.volume[param-VOLUME0] = valueInt;
        if (target == VOLUME) {
            track.prevVolume[param-VOLUME0] = valueInt << 16;
            track.volumeInc[param-VOLUME0] = 0;
        }
        ...
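The excerpt keeps prevVolume as the volume shifted left by 16 bits and tracks a volumeInc per channel, which suggests a 16.16 fixed-point ramp. The ramp branch itself is not part of the excerpt, so the sketch below is only one plausible reading: the struct VolumeRamp, its methods, and the choice to spread the change evenly over frameCount frames are assumptions made for illustration. Volumes are assumed non-negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-channel volume state in 16.16 fixed point (volumes >= 0).
struct VolumeRamp {
    int32_t prev = 0;  // current volume * 65536
    int32_t inc = 0;   // per-frame step * 65536; 0 means no ramp is running

    // Jump straight to v without ramping (compare the target == VOLUME branch).
    void setImmediate(int16_t v) {
        prev = static_cast<int32_t>(v) * 65536;
        inc = 0;
    }

    // Assumed ramp: reach newVolume after frameCount calls to step().
    void rampTo(int16_t newVolume, int32_t frameCount) {
        assert(frameCount > 0);
        const int32_t target = static_cast<int32_t>(newVolume) * 65536;
        inc = (target - prev) / frameCount;
        if (inc == 0) prev = target;  // change too small to ramp: snap to it
    }

    // Advances one frame and returns the integer volume to apply.
    int16_t step() {
        prev += inc;
        return static_cast<int16_t>(prev / 65536);
    }
};
```

Keeping the fractional 16 bits lets the per-frame increment be smaller than one volume unit without losing it to integer truncation.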
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::FORMAT, (void *)track->format());
Asserts that the incoming PCM data is 16-bit.
case FORMAT:
    ALOG_ASSERT(valueInt == AUDIO_FORMAT_PCM_16_BIT);
    break;
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::CHANNEL_MASK, (void *)track->channelMask());
Sets the channel configuration. The mask describes the channel layout: mono, stereo, and so on.
case CHANNEL_MASK: {
    audio_channel_mask_t mask = (audio_channel_mask_t) value;
    if (track.channelMask != mask) {
        uint32_t channelCount = popcount(mask);
        ALOG_ASSERT((channelCount <= MAX_NUM_CHANNELS_TO_DOWNMIX) && channelCount);
        track.channelMask = mask;          // store the mask
        track.channelCount = channelCount; // update the channel count
        // the mask has changed, does this track need a downmixer?
        initTrackDownmix(&mState.tracks[name], name, mask);
        ALOGV("setParameter(TRACK, CHANNEL_MASK, %x)", mask);
        invalidateState(1 << name);
    }
    ...
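The channel count is simply the number of set bits in the mask. A minimal sketch of that relationship follows; the three channel constants and both function names are made up for the example (real masks are audio_channel_mask_t values), and __builtin_popcount is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative channel bits only; real code uses audio_channel_mask_t values.
constexpr uint32_t kFrontLeft   = 0x1;
constexpr uint32_t kFrontRight  = 0x2;
constexpr uint32_t kFrontCenter = 0x4;

// One set bit per channel, so the channel count is the population count.
inline uint32_t channelCountFromMask(uint32_t mask) {
    return static_cast<uint32_t>(__builtin_popcount(mask));
}

// Mirrors the shape of the assertion in the excerpt: non-zero and within a limit.
inline bool isMixableChannelCount(uint32_t mask, uint32_t maxChannels) {
    const uint32_t count = channelCountFromMask(mask);
    return count != 0 && count <= maxChannels;
}
```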
mAudioMixer->setParameter(
name,
AudioMixer::RESAMPLE,
AudioMixer::SAMPLE_RATE,
(void *)reqSampleRate);
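Sets the current track's sample rate to reqSampleRate and asks the AudioMixer to resample the track to the mixer's own output rate, mSampleRate. It then calls invalidateState(1 << name) to signal that the refresh function must run. The call chain is: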
mAudioMixer->setParameter(
    name,
    AudioMixer::RESAMPLE,
    AudioMixer::SAMPLE_RATE,
    (void *)reqSampleRate);
    |
    +--> track.setResampler(uint32_t(valueInt), mSampleRate)
           |
           +--> if (sampleRate != value) {  // acts only when the requested rate differs from the rate stored on the track
                    if (resampler == NULL) {
                        quality = AudioResampler::VERY_HIGH_QUALITY;  // highest-quality resampling
                        resampler = AudioResampler::create(...);      // create the resampler
                    }
                }
                  |
                  +--> switch (quality) {
                  |    default:
                  |    case DEFAULT_QUALITY:
                  |    case LOW_QUALITY:
                  |        ALOGV("Create linear Resampler");
                  |        resampler = new AudioResamplerOrder1(bitDepth, inChannelCount, sampleRate);
                  |        break;
                  |    case MED_QUALITY:
                  |        ALOGV("Create cubic Resampler");
                  |        resampler = new AudioResamplerCubic(bitDepth, inChannelCount, sampleRate);
                  |        break;
                  |    case HIGH_QUALITY:
                  |        ALOGV("Create HIGH_QUALITY sinc Resampler");
                  |        resampler = new AudioResamplerSinc(bitDepth, inChannelCount, sampleRate);
                  |        break;
                  |    case VERY_HIGH_QUALITY:  // VERY_HIGH_QUALITY was selected above, so an AudioResamplerSinc is created
                  |        ALOGV("Create VERY_HIGH_QUALITY sinc Resampler = %d", quality);
                  |        resampler = new AudioResamplerSinc(bitDepth, inChannelCount, sampleRate, quality);
                  |        break;
                  |    }
                  |
                  +--> // initialize resampler
                       resampler->init();
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::MAIN_BUFFER, (void *)track->mainBuffer());
Sets the destination buffer, then calls invalidateState(1 << name) to signal that the refresh function must run.
Let's trace where the destination buffer is created:
track->mainBuffer()
    |
    +--> int16_t *mainBuffer() const { return mMainBuffer; }
mMainBuffer is assigned when the track is created:
sp<AudioFlinger::PlaybackThread::Track> AudioFlinger::PlaybackThread::createTrack_l(...)
    |
    +--> track = new Track(...)
           |
           +--> AudioFlinger::PlaybackThread::Track::Track(...)
                    : mMainBuffer(thread->mixBuffer())
                  |
                  +--> int16_t *mixBuffer() const { return mMixBuffer; };
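thread here is the MixerThread. MixerThread derives from PlaybackThread, so constructing a MixerThread also runs the PlaybackThread constructor. That constructor, through readOutputParameters, allocates a buffer and assigns its aligned address to mMixBuffer: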
AudioFlinger::MixerThread::MixerThread
    |
    +--> AudioFlinger::PlaybackThread::PlaybackThread
           |
           +--> void AudioFlinger::PlaybackThread::readOutputParameters()
                  |
                  +--> mAllocMixBuffer = new int8_t[mNormalFrameCount * mFrameSize + align - 1];
                  |
                  +--> mMixBuffer = (int16_t *) ((((size_t)mAllocMixBuffer + align - 1) / align) * align);
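The allocation above over-allocates by align - 1 bytes and then rounds the raw address up to the next multiple of align. The same arithmetic can be tried in isolation; alignUp and AlignedMixBuffer are hypothetical names introduced for this sketch, not AOSP types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds addr up to the next multiple of align (align must be non-zero).
inline uintptr_t alignUp(uintptr_t addr, size_t align) {
    assert(align != 0);
    return ((addr + align - 1) / align) * align;
}

// Hypothetical owner of an over-allocated block plus the aligned view into it.
struct AlignedMixBuffer {
    int8_t*  raw;      // the pointer that must be passed to delete[]
    int16_t* aligned;  // the address the mixer would actually write to

    AlignedMixBuffer(size_t bytes, size_t align)
        // align - 1 spare bytes guarantee that an aligned start fits in the block.
        : raw(new int8_t[bytes + align - 1]),
          aligned(reinterpret_cast<int16_t*>(
              alignUp(reinterpret_cast<uintptr_t>(raw), align))) {}

    ~AlignedMixBuffer() { delete[] raw; }
    AlignedMixBuffer(const AlignedMixBuffer&) = delete;
    AlignedMixBuffer& operator=(const AlignedMixBuffer&) = delete;
};
```

Two pointers are kept for the same reason the original keeps both mAllocMixBuffer and mMixBuffer: the aligned address is used for audio data, while only the raw address is valid for freeing the block.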
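This shows that each AudioMixer is paired with one mMixBuffer: the audio data that passes through a given AudioMixer is finally gathered into a single buffer for output.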
invalidateState
invalidateState has come up repeatedly above as the way to signal that the refresh function must run. Here is what it does.
void AudioMixer::invalidateState(uint32_t mask)
{
    if (mask) {
        mState.needsChanged |= mask;  // mask is 1 << name: marks that track as needing a refresh
        mState.hook = process__validate;
    }
}
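When the AudioMixer mixes, the process method is called, and process simply calls mState.hook. Calling invalidateState therefore makes the next process call run process__validate, which refreshes the parameters. process__validate is analyzed below: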
void AudioMixer::process__validate(state_t* state, int64_t pts)
{
    ALOGW_IF(!state->needsChanged,
        "in process__validate() but nothing's invalid");

    uint32_t changed = state->needsChanged;  // every track that needs revalidation
    state->needsChanged = 0; // clear the validation flag

    // recompute which tracks are enabled / disabled
    uint32_t enabled = 0;
    uint32_t disabled = 0;
    while (changed) {  // visit each track that needs revalidation
        const int i = 31 - __builtin_clz(changed);
        const uint32_t mask = 1<<i;
        changed &= ~mask;
        track_t& t = state->tracks[i];
        (t.enabled ? enabled : disabled) |= mask;  // t.enabled decides whether the track takes part in mixing
    }
    state->enabledTracks &= ~disabled;  // disabled mask
    state->enabledTracks |=  enabled;   // enabled mask

    // compute everything we need...
    int countActiveTracks = 0;
    bool all16BitsStereoNoResample = true;
    bool resampling = false;
    bool volumeRamp = false;
    uint32_t en = state->enabledTracks;
    while (en) {  // for every track that takes part in mixing
        const int i = 31 - __builtin_clz(en);  // index of the highest set bit
        en &= ~(1<<i);                         // clear that bit

        countActiveTracks++;
        track_t& t = state->tracks[i];  // fetch the track
        uint32_t n = 0;
        n |= NEEDS_CHANNEL_1 + t.channelCount - 1;  // encode the channel count (at least one channel)
        n |= NEEDS_FORMAT_16;                       // must be 16-bit PCM
        n |= t.doesResample() ? NEEDS_RESAMPLE_ENABLED : NEEDS_RESAMPLE_DISABLED;  // whether resampling is needed
        if (t.auxLevel != 0 && t.auxBuffer != NULL) {
            n |= NEEDS_AUX_ENABLED;
        }

        if (t.volumeInc[0]|t.volumeInc[1]) {
            volumeRamp = true;
        } else if (!t.doesResample() && t.volumeRL == 0) {
            n |= NEEDS_MUTE_ENABLED;
        }
        t.needs = n;  // update the track flags

        // choose the per-track mixing function
        if ((n & NEEDS_MUTE__MASK) == NEEDS_MUTE_ENABLED) {  // muted
            t.hook = track__nop;
        } else {
            if ((n & NEEDS_AUX__MASK) == NEEDS_AUX_ENABLED) {
                all16BitsStereoNoResample = false;
            }
            if ((n & NEEDS_RESAMPLE__MASK) == NEEDS_RESAMPLE_ENABLED) {  // resampling
                all16BitsStereoNoResample = false;
                resampling = true;
                t.hook = track__genericResample;
                ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                        "Track %d needs downmix + resample", i);
            } else {
                if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1){  // mono
                    t.hook = track__16BitsMono;
                    all16BitsStereoNoResample = false;
                }
                if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2){  // stereo or more channels
                    t.hook = track__16BitsStereo;
                    ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                            "Track %d needs downmix", i);
                }
            }
        }
    }

    // select the processing hooks
    // choose the overall mixing function; a process__xxx loops over the tracks and calls their track__xxx hooks
    state->hook = process__nop;
    if (countActiveTracks) {
        if (resampling) {  // resampling needs extra temporary buffers
            if (!state->outputTemp) {
                state->outputTemp = new int32_t[MAX_NUM_CHANNELS * state->frameCount];
            }
            if (!state->resampleTemp) {
                state->resampleTemp = new int32_t[MAX_NUM_CHANNELS * state->frameCount];
            }
            state->hook = process__genericResampling;
        } else {
            if (state->outputTemp) {
                delete [] state->outputTemp;
                state->outputTemp = NULL;
            }
            if (state->resampleTemp) {
                delete [] state->resampleTemp;
                state->resampleTemp = NULL;
            }
            state->hook = process__genericNoResampling;  // generic path without resampling
            if (all16BitsStereoNoResample && !volumeRamp) {
                if (countActiveTracks == 1) {
                    state->hook = process__OneTrack16BitsStereoNoResampling;  // fast path: a single 16-bit stereo track
                }
            }
        }
    }

    ALOGV("mixer configuration change: %d activeTracks (%08x) "
        "all16BitsStereoNoResample=%d, resampling=%d, volumeRamp=%d",
        countActiveTracks, state->enabledTracks,
        all16BitsStereoNoResample, resampling, volumeRamp);

    state->hook(state, pts);  // mix once right here; later calls come from MixerThread's threadLoop_mix

    // Now that the volume ramp has been done, set optimal state and
    // track hooks for subsequent mixer process
    if (countActiveTracks) {
        bool allMuted = true;
        uint32_t en = state->enabledTracks;
        while (en) {
            const int i = 31 - __builtin_clz(en);
            en &= ~(1<<i);
            track_t& t = state->tracks[i];
            if (!t.doesResample() && t.volumeRL == 0) {
                t.needs |= NEEDS_MUTE_ENABLED;
                t.hook = track__nop;
            } else {
                allMuted = false;
            }
        }
        if (allMuted) {
            state->hook = process__nop;
        } else if (all16BitsStereoNoResample) {
            if (countActiveTracks == 1) {
                state->hook = process__OneTrack16BitsStereoNoResampling;
            }
        }
    }
}
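The interplay of invalidateState, mState.hook and process__validate is a lazy revalidation pattern built on a function pointer. A stripped-down sketch of just that mechanism follows. Every name in it (MixState, processNop, processMix, processValidate, invalidate, process) is invented for the example, and the two counters exist only to make the behaviour observable.

```cpp
#include <cassert>
#include <cstdint>

struct MixState;
using ProcessHook = void (*)(MixState*);

inline void processNop(MixState*) {}  // used while nothing is configured

struct MixState {
    uint32_t needsChanged = 0;      // one bit per track whose settings changed
    ProcessHook hook = processNop;  // what the next process() call will run
    int validations = 0;            // demo counter: validation passes run
    int mixes = 0;                  // demo counter: mixing passes run
};

// Stands in for the real mixing functions (process__genericResampling etc.).
inline void processMix(MixState* s) { s->mixes++; }

// Runs once after a change: clear the flags, pick the real hook, then mix once.
inline void processValidate(MixState* s) {
    s->needsChanged = 0;
    s->validations++;
    s->hook = processMix;
    s->hook(s);
}

// Records which tracks changed and routes the next process() through validation.
inline void invalidate(MixState* s, uint32_t mask) {
    if (mask) {
        s->needsChanged |= mask;
        s->hook = processValidate;
    }
}

inline void process(MixState* s) { s->hook(s); }
```

Several invalidate calls between two process calls cost only one validation pass, and once validation has run, every later process call goes straight to the selected mixing function with no per-call check.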
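Mixing in the AudioMixer
So far we know this about mixing: tracks are the sources, mainBuffer is the destination, and frameCount is the length of one mixing pass. An AudioMixer can manage at most 32 tracks. Tracks may have different mainBuffers, although in the usual case they all share the same one.
The MixerThread analysis noted that mixing is triggered by calling the AudioMixer's process method. The actual work is done by one of the process__xxx methods inside AudioMixer, and these are all broadly similar. The one analyzed here is process__genericResampling.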
// generic code with resampling
void AudioMixer::process__genericResampling(state_t* state, int64_t pts)
{
    // this const just means that local variable outTemp doesn't change
    int32_t* const outTemp = state->outputTemp;  // temporary buffer that accumulates the mixed output
    const size_t size = sizeof(int32_t) * MAX_NUM_CHANNELS * state->frameCount;

    size_t numFrames = state->frameCount;

    uint32_t e0 = state->enabledTracks;
    while (e0) {
        // process by group of tracks with same output buffer
        // to optimize cache use
        uint32_t e1 = e0, e2 = e0;
        int j = 31 - __builtin_clz(e1);
        track_t& t1 = state->tracks[j];  // take the first track, t1
        e2 &= ~(1<<j);                   // e2 now holds every track except t1
        // Walk the remaining tracks as t2; if t2's destination buffer differs from t1's, drop t2 from e1.
        // This gathers the tracks that share a destination buffer so they are mixed together,
        // because tracks with different destination buffers must be mixed into different buffers.
        // In practice they usually share one, e.g. MixerThread sets mMixBuffer as each track's destination.
        // If an EQ (AudioEffect) is attached, tracks might end up with different destination buffers?
        while (e2) {
            j = 31 - __builtin_clz(e2);
            e2 &= ~(1<<j);
            track_t& t2 = state->tracks[j];
            if (CC_UNLIKELY(t2.mainBuffer != t1.mainBuffer)) {
                e1 &= ~(1<<j);
            }
        }
        e0 &= ~(e1);
        int32_t *out = t1.mainBuffer;
        memset(outTemp, 0, size);
        while (e1) {  // call t.hook to mix every track in e1
            const int i = 31 - __builtin_clz(e1);
            e1 &= ~(1<<i);
            track_t& t = state->tracks[i];
            int32_t *aux = NULL;
            if (CC_UNLIKELY((t.needs & NEEDS_AUX__MASK) == NEEDS_AUX_ENABLED)) {
                aux = t.auxBuffer;
            }

            // this is a little goofy, on the resampling case we don't
            // acquire/release the buffers because it's done by
            // the resampler.
            if ((t.needs & NEEDS_RESAMPLE__MASK) == NEEDS_RESAMPLE_ENABLED) {
                ALOGE("[%s:%d]", __FUNCTION__, __LINE__);
                t.resampler->setPTS(pts);
                t.hook(&t, outTemp, numFrames, state->resampleTemp, aux);  // the resampling case takes this branch and writes into outTemp
            } else {
                size_t outFrames = 0;
                ALOGE("[%s:%d]", __FUNCTION__, __LINE__);
                while (outFrames < numFrames) {
                    t.buffer.frameCount = numFrames - outFrames;
                    int64_t outputPTS = calculateOutputPTS(t, pts, outFrames);
                    t.bufferProvider->getNextBuffer(&t.buffer, outputPTS);
                    t.in = t.buffer.raw;
                    // t.in == NULL can happen if the track was flushed just after having
                    // been enabled for mixing.
                    if (t.in == NULL) break;

                    if (CC_UNLIKELY(aux != NULL)) {
                        aux += outFrames;
                    }
                    t.hook(&t, outTemp + outFrames*MAX_NUM_CHANNELS, t.buffer.frameCount,
                            state->resampleTemp, aux);
                    outFrames += t.buffer.frameCount;
                    t.bufferProvider->releaseBuffer(&t.buffer);
                }
            }
        }
        ditherAndClamp(out, outTemp, numFrames);  // write the contents of outTemp to out, the destination buffer
    }
}
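The core idea of the loop above is to accumulate every track of a group into a wider temporary buffer and only then narrow the sum into the destination. A minimal sketch of that accumulate-then-clamp step is shown below, assuming plain 16-bit samples with no volume applied. clamp16 and mixTracks are hypothetical helpers written for this example; any fixed-point scaling or dithering done by the real ditherAndClamp is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Narrows a 32-bit accumulated sample to the 16-bit PCM range by saturation.
inline int16_t clamp16(int32_t v) {
    return static_cast<int16_t>(
        std::max<int32_t>(-32768, std::min<int32_t>(32767, v)));
}

// Sums all sources into a 32-bit scratch buffer, then clamps into 16-bit output.
// A source shorter than sampleCount contributes silence for the missing samples.
inline std::vector<int16_t> mixTracks(
        const std::vector<std::vector<int16_t>>& sources, size_t sampleCount) {
    std::vector<int32_t> acc(sampleCount, 0);  // plays the role of outputTemp
    for (const auto& src : sources) {
        const size_t n = std::min(sampleCount, src.size());
        for (size_t i = 0; i < n; ++i) {
            acc[i] += src[i];  // 32-bit sum cannot overflow for a few 16-bit inputs
        }
    }
    std::vector<int16_t> out(sampleCount);
    std::transform(acc.begin(), acc.end(), out.begin(), clamp16);
    return out;
}
```

Summing in 32 bits first means that two loud tracks saturate cleanly at the 16-bit limits instead of wrapping around, which is the reason a separate temporary buffer is used at all.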
In process__validate, the track.hook for the resampling case was set to track__genericResample. Here is what that function does:
void AudioMixer::track__genericResample(track_t* t, int32_t* out, size_t outFrameCount,
        int32_t* temp, int32_t* aux)
{
    // set the input sample rate
    t->resampler->setSampleRate(t->sampleRate);

    // ramp gain - resample to temp buffer and scale/mix in 2nd step
    if (aux != NULL) {
        // always resample with unity gain when sending to auxiliary buffer to be able
        // to apply send level after resampling
        // TODO: modify each resampler to support aux channel?
        t->resampler->setVolume(UNITY_GAIN, UNITY_GAIN);
        memset(temp, 0, outFrameCount * MAX_NUM_CHANNELS * sizeof(int32_t));
        t->resampler->resample(temp, outFrameCount, t->bufferProvider);
        if (CC_UNLIKELY(t->volumeInc[0]|t->volumeInc[1]|t->auxInc)) {
            volumeRampStereo(t, out, outFrameCount, temp, aux);
        } else {
            volumeStereo(t, out, outFrameCount, temp, aux);
        }
    } else {
        if (CC_UNLIKELY(t->volumeInc[0]|t->volumeInc[1])) {
            t->resampler->setVolume(UNITY_GAIN, UNITY_GAIN);
            memset(temp, 0, outFrameCount * MAX_NUM_CHANNELS * sizeof(int32_t));
            t->resampler->resample(temp, outFrameCount, t->bufferProvider);
            volumeRampStereo(t, out, outFrameCount, temp, aux);
        }

        // constant gain
        else {
            // set the volume
            t->resampler->setVolume(t->volume[0], t->volume[1]);
            // resample directly into the output
            t->resampler->resample(out, outFrameCount, t->bufferProvider);
        }
    }
}
In every branch, the work ends up in the resampler's resample method, which performs the actual sample-rate conversion.
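To give a feel for what a resample call has to do, here is a minimal linear-interpolation resampler for mono 16-bit PCM. It is written from scratch for illustration and is not the AudioResamplerOrder1 or AudioResamplerSinc code: resampleLinear is a hypothetical function, it uses floating point for clarity, and it applies no anti-aliasing filter, so it is only a rough model of the lowest quality level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts mono 16-bit PCM from inRate to outRate by linear interpolation.
inline std::vector<int16_t> resampleLinear(const std::vector<int16_t>& in,
                                           uint32_t inRate, uint32_t outRate) {
    assert(inRate > 0 && outRate > 0);
    if (in.empty()) return {};

    // The output covers the same duration as the input.
    const size_t outCount = static_cast<size_t>(
        static_cast<uint64_t>(in.size()) * outRate / inRate);
    std::vector<int16_t> out(outCount);

    for (size_t i = 0; i < outCount; ++i) {
        // Position of output sample i measured in input samples.
        const double pos = static_cast<double>(i) * inRate / outRate;
        const size_t i0 = static_cast<size_t>(pos);
        const size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;  // hold the last sample at the end
        const double frac = pos - static_cast<double>(i0);
        // Weighted average of the two neighbouring input samples.
        out[i] = static_cast<int16_t>(in[i0] + (in[i1] - in[i0]) * frac);
    }
    return out;
}
```

When the two rates are equal the function returns a copy of its input, which matches the point made earlier that a track whose rate already equals the output rate does not need a resampler.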