AudioMixer is Android's audio mixer. It blends the audio data of the individual tracks together and then sends the result to the audio output device.
Creating the AudioMixer
The AudioMixer is created in the MixerThread constructor:
AudioFlinger::MixerThread::MixerThread(...)
{
...
mAudioMixer = new AudioMixer(mNormalFrameCount, mSampleRate);
...
}
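This shows that each MixerThread owns exactly one AudioMixer.
MixerThread passes two arguments to the AudioMixer:
- mNormalFrameCount: the number of frames the AudioMixer handles in one pass when it writes audio data from the source buffers into the destination buffer
- mSampleRate: the sample rate at which the AudioMixer outputs audio data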
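Configuring the AudioMixer parameters
As mentioned in the previous article on MixerThread, prepareTrack_l configures the AudioMixer parameters. This section looks at what each parameter does.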
mAudioMixer->setBufferProvider(name, track);
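Sets the source buffer for mixing. name is the track index passed in, and track is the Track taken from mActiveTracks.
The index name deserves a closer look. It is obtained as follows: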
int name = track->name();
+
+--> int name() const { return mName; }
+
+--> mName = thread->getTrackName_l(channelMask, sessionId);
+
+--> return mAudioMixer->getTrackName(channelMask, sessionId);
+
+--> uint32_t names = (~mTrackNames) & mConfiguredNames;
|
+--> int n = __builtin_ctz(names);
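names is a set of candidate indexes, one per bit: a bit set to 1 in names means that bit is free and can be handed out as an index. __builtin_ctz counts the trailing zero bits of names, which is the position of the lowest set bit, so the lowest free index is chosen. For example: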
11111111111111111111000000000000
^
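There are 12 trailing zeros, so bit 12 is taken: n = 12, the bit 1<<12 is marked as used in mTrackNames, and the name returned to the caller is TRACK0 + 12 (which is why enable() subtracts TRACK0 again before indexing the track array).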
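Two members determine names:
- mTrackNames: records the tracks currently in use; its initial value is 0. When a Track is added, the bit corresponding to that Track is set to 1.
- mConfiguredNames: indicates the maximum number of Tracks this AudioMixer supports. If at most N Tracks are supported, mConfiguredNames = (1 << N) - 1, so its low N bits are 1 and its high 32-N bits are 0. The default value is -1 (all 32 bits set), i.e. N = 32.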
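The allocation scheme described above can be sketched in a few lines. This is a minimal illustration, not the AOSP source: the class name TrackNameAllocator and the method releaseTrackName are hypothetical, and TRACK0 = 0x1000 is assumed here only so that the example is self-contained. __builtin_ctz is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the name bookkeeping inside AudioMixer.
class TrackNameAllocator {
public:
    // Assumed base value for track names; only used for illustration.
    static constexpr int TRACK0 = 0x1000;

    // maxNumTracks plays the role of N in mConfiguredNames = (1 << N) - 1.
    explicit TrackNameAllocator(uint32_t maxNumTracks = 32)
        : mTrackNames(0),
          mConfiguredNames(maxNumTracks >= 32 ? 0xFFFFFFFFu
                                              : (1u << maxNumTracks) - 1u) {}

    // Returns TRACK0 + n for the lowest free bit n, or -1 if none is free.
    int getTrackName() {
        const uint32_t names = (~mTrackNames) & mConfiguredNames;
        if (names == 0) {
            return -1;  // every configured slot is in use
        }
        const int n = __builtin_ctz(names);  // position of the lowest set bit
        mTrackNames |= 1u << n;              // mark slot n as used
        return TRACK0 + n;
    }

    // Frees a name previously returned by getTrackName().
    void releaseTrackName(int name) {
        const int n = name - TRACK0;
        assert(n >= 0 && n < 32);
        mTrackNames &= ~(1u << n);
    }

private:
    uint32_t mTrackNames;       // bit n set: slot n is in use
    uint32_t mConfiguredNames;  // bit n set: slot n exists
};
```

Because the lowest free bit always wins, a released name is reused by the next allocation before any higher slot is touched.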
mAudioMixer->enable(name);
enable simply sets the track's enabled flag to true and then calls invalidateState(1 << name) to signal that the refresh function needs to run.
void AudioMixer::enable(int name)
{
name -= TRACK0;
track_t& track = mState.tracks[name];
if (!track.enabled) {
track.enabled = true;
invalidateState(1 << name);
}
}
mAudioMixer->setParameter(name, param, AudioMixer::VOLUME0, (void *)vl);
mAudioMixer->setParameter(name, param, AudioMixer::VOLUME1, (void *)vr);
These two calls set the left and right channel volumes, then call invalidateState(1 << name) to signal that the refresh function needs to run.
case VOLUME0:
case VOLUME1:
if (track.volume[param-VOLUME0] != valueInt) {
ALOGV("setParameter(VOLUME, VOLUME0/1: %04x)", valueInt);
track.prevVolume[param-VOLUME0] = track.volume[param-VOLUME0] << 16;
track.volume[param-VOLUME0] = valueInt;
if (target == VOLUME) {
track.prevVolume[param-VOLUME0] = valueInt << 16;
track.volumeInc[param-VOLUME0] = 0;
}
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::FORMAT, (void *)track->format());
Checks, via an assertion, that the incoming PCM data is 16-bit.
case FORMAT:
ALOG_ASSERT(valueInt == AUDIO_FORMAT_PCM_16_BIT);
break;
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::CHANNEL_MASK, (void *)track->channelMask());
Sets the channel mask, from which the channel count is derived: mono, stereo, ...
case CHANNEL_MASK: {
audio_channel_mask_t mask = (audio_channel_mask_t) value;
if (track.channelMask != mask) {
uint32_t channelCount = popcount(mask);
ALOG_ASSERT((channelCount <= MAX_NUM_CHANNELS_TO_DOWNMIX) && channelCount);
track.channelMask = mask; // store the mask
track.channelCount = channelCount; // update the channel count
// the mask has changed, does this track need a downmixer?
initTrackDownmix(&mState.tracks[name], name, mask);
ALOGV("setParameter(TRACK, CHANNEL_MASK, %x)", mask);
invalidateState(1 << name);
}
mAudioMixer->setParameter(
name,
AudioMixer::RESAMPLE,
AudioMixer::SAMPLE_RATE,
(void *)reqSampleRate);
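Sets the sample rate of the current track to reqSampleRate and asks the AudioMixer to resample this track; the output rate is the AudioMixer's own output rate, mSampleRate. It then calls invalidateState(1 << name) to signal that the refresh function needs to run. The call chain is: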
mAudioMixer->setParameter(
+ name,
| AudioMixer::RESAMPLE,
| AudioMixer::SAMPLE_RATE,
| (void *)reqSampleRate);
|
+--> track.setResampler(uint32_t(valueInt), mSampleRate)
+
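+--> if (sampleRate != value) { // a resampler is only created when the requested rate differs from the track's current sample rate
+ if (resampler == NULL) {
| quality = AudioResampler::VERY_HIGH_QUALITY; // highest resampling quality
| resampler = AudioResampler::create(...); // create the resampler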
| }
|}
+--> switch (quality) {
| default:
| case DEFAULT_QUALITY:
| case LOW_QUALITY:
| ALOGV("Create linear Resampler");
| resampler = new AudioResamplerOrder1(bitDepth, inChannelCount, sampleRate);
| break;
| case MED_QUALITY:
| ALOGV("Create cubic Resampler");
| resampler = new AudioResamplerCubic(bitDepth, inChannelCount, sampleRate);
| break;
| case HIGH_QUALITY:
| ALOGV("Create HIGH_QUALITY sinc Resampler");
| resampler = new AudioResamplerSinc(bitDepth, inChannelCount, sampleRate);
| break;
| case VERY_HIGH_QUALITY: // VERY_HIGH_QUALITY was selected above, so the resampler created is an AudioResamplerSinc
| ALOGV("Create VERY_HIGH_QUALITY sinc Resampler = %d", quality);
| resampler = new AudioResamplerSinc(bitDepth, inChannelCount, sampleRate, quality);
| break;
| }
|
+--> // initialize resampler
resampler->init();
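To make concrete what a resampler does, here is a minimal sketch of first-order (linear interpolation) resampling, the idea behind the "linear" quality level. It is not the code of AudioResamplerOrder1: the function name resampleLinear is hypothetical, the input is assumed to be mono, and a floating-point position is used where a real implementation would likely use fixed-point phase arithmetic and a buffer provider.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: resample mono 16-bit PCM from inRate to outRate
// by linear interpolation between neighbouring input samples.
std::vector<int16_t> resampleLinear(const std::vector<int16_t>& in,
                                    uint32_t inRate, uint32_t outRate) {
    assert(inRate > 0 && outRate > 0);
    std::vector<int16_t> out;
    if (in.empty()) return out;
    // Number of output frames covering the same duration as the input.
    const size_t outFrames = static_cast<size_t>(
        (static_cast<uint64_t>(in.size()) * outRate) / inRate);
    out.reserve(outFrames);
    // How far the read position advances in the input per output frame.
    const double step = static_cast<double>(inRate) / outRate;
    for (size_t i = 0; i < outFrames; ++i) {
        const double pos = i * step;
        const size_t i0 = static_cast<size_t>(pos);
        // Repeat the last sample when there is no right-hand neighbour.
        const size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;
        const double frac = pos - static_cast<double>(i0);
        const double s = in[i0] + (in[i1] - in[i0]) * frac;
        out.push_back(static_cast<int16_t>(s));
    }
    return out;
}
```

The higher quality levels replace the two-point interpolation with longer filters (cubic, windowed sinc), trading CPU time for less aliasing.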
mAudioMixer->setParameter(
name,
AudioMixer::TRACK,
AudioMixer::MAIN_BUFFER, (void *)track->mainBuffer());
Sets the destination buffer, then calls invalidateState(1 << name) to signal that the refresh function needs to run.
Let's trace where the destination buffer is created:
track->mainBuffer()
+
+--> int16_t *mainBuffer() const { return mMainBuffer; }
mMainBuffer is assigned when the track is created:
sp<AudioFlinger::PlaybackThread::Track> AudioFlinger::PlaybackThread::createTrack_l(...)
+
+--> track = new Track(...)
+
+--> AudioFlinger::PlaybackThread::Track::Track(...)
+:mMainBuffer(thread->mixBuffer())
|
+--> int16_t *mixBuffer() const { return mMixBuffer; };
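thread here is the MixerThread. PlaybackThread is the base class of MixerThread, so its constructor runs whenever a MixerThread is created. The PlaybackThread constructor allocates a buffer and assigns it to mMixBuffer: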
AudioFlinger::MixerThread::MixerThread
+
+--> AudioFlinger::PlaybackThread::PlaybackThread
+
+--> void AudioFlinger::PlaybackThread::readOutputParameters()
+
+--> mAllocMixBuffer = new int8_t[mNormalFrameCount * mFrameSize + align - 1];
|
+--> mMixBuffer = (int16_t *) ((((size_t)mAllocMixBuffer + align - 1) / align) * align);
This shows that each AudioMixer corresponds to one mMixBuffer: the audio data that passes through a given AudioMixer ends up collected in a single buffer for output.
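The two allocation lines in readOutputParameters over-allocate by align - 1 bytes and then round the address up to the next multiple of align. A small sketch of that arithmetic follows; the names alignUp and AlignedMixBuffer are hypothetical, and the 16-byte alignment used in the usage note is only an example value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round addr up to the next multiple of align (align must be non-zero).
inline uintptr_t alignUp(uintptr_t addr, size_t align) {
    assert(align != 0);
    return ((addr + align - 1) / align) * align;
}

// Hypothetical wrapper that owns the raw allocation and exposes the
// aligned view, mirroring the mAllocMixBuffer / mMixBuffer pair.
struct AlignedMixBuffer {
    int8_t* raw;       // what was actually allocated (and must be freed)
    int16_t* aligned;  // aligned pointer handed to the mixer

    AlignedMixBuffer(size_t bytes, size_t align)
        : raw(new int8_t[bytes + align - 1]),
          aligned(reinterpret_cast<int16_t*>(
              alignUp(reinterpret_cast<uintptr_t>(raw), align))) {}
    ~AlignedMixBuffer() { delete[] raw; }
    AlignedMixBuffer(const AlignedMixBuffer&) = delete;
    AlignedMixBuffer& operator=(const AlignedMixBuffer&) = delete;
};
```

For example, AlignedMixBuffer buf(256, 16) yields a pointer buf.aligned that lies at most 15 bytes past buf.raw, so the 256 usable bytes always fit inside the allocation.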
invalidateState
invalidateState has come up repeatedly above as the way to signal that the refresh function needs to run. Here is how it works.
void AudioMixer::invalidateState(uint32_t mask)
{
if (mask) {
mState.needsChanged |= mask; // mask is 1 << name, marking that track as needing a refresh
mState.hook = process__validate;
}
}
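When the AudioMixer mixes, the process method is called, and process simply calls mState.hook. Calling invalidateState therefore makes the next process call run process__validate, which refreshes the parameters. process__validate is analyzed below: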
void AudioMixer::process__validate(state_t* state, int64_t pts)
{
ALOGW_IF(!state->needsChanged,
"in process__validate() but nothing's invalid");
uint32_t changed = state->needsChanged; // every track that needs revalidation is in this mask
state->needsChanged = 0; // clear the validation flag
// recompute which tracks are enabled / disabled
uint32_t enabled = 0;
uint32_t disabled = 0;
while (changed) { // pull out each track that needs revalidation
const int i = 31 - __builtin_clz(changed);
const uint32_t mask = 1<<i;
changed &= ~mask;
track_t& t = state->tracks[i];
(t.enabled ? enabled : disabled) |= mask; // t.enabled decides whether this track's bit goes into the enabled or the disabled mask
}
state->enabledTracks &= ~disabled; //disabled mask
state->enabledTracks |= enabled; //enabled mask
// compute everything we need...
int countActiveTracks = 0;
bool all16BitsStereoNoResample = true;
bool resampling = false;
bool volumeRamp = false;
uint32_t en = state->enabledTracks;
while (en) { // for every track that will be mixed
const int i = 31 - __builtin_clz(en); // index of the highest set bit
en &= ~(1<<i); // clear that bit
countActiveTracks++;
track_t& t = state->tracks[i]; // fetch the track
uint32_t n = 0;
n |= NEEDS_CHANNEL_1 + t.channelCount - 1; // encode the channel count
n |= NEEDS_FORMAT_16; // must be 16-bit PCM
n |= t.doesResample() ? NEEDS_RESAMPLE_ENABLED : NEEDS_RESAMPLE_DISABLED; // whether resampling is needed
if (t.auxLevel != 0 && t.auxBuffer != NULL) {
n |= NEEDS_AUX_ENABLED;
}
if (t.volumeInc[0]|t.volumeInc[1]) {
volumeRamp = true;
} else if (!t.doesResample() && t.volumeRL == 0) {
n |= NEEDS_MUTE_ENABLED;
}
t.needs = n; // update the track's needs flags
// the code below selects the per-track mixing function
if ((n & NEEDS_MUTE__MASK) == NEEDS_MUTE_ENABLED) { //mute
t.hook = track__nop;
} else {
if ((n & NEEDS_AUX__MASK) == NEEDS_AUX_ENABLED) {
all16BitsStereoNoResample = false;
}
if ((n & NEEDS_RESAMPLE__MASK) == NEEDS_RESAMPLE_ENABLED) { // resampling
all16BitsStereoNoResample = false;
resampling = true;
t.hook = track__genericResample;
ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
"Track %d needs downmix + resample", i);
} else {
if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1){ // mono
t.hook = track__16BitsMono;
all16BitsStereoNoResample = false;
}
if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2){ // stereo (or more channels, which need a downmix)
t.hook = track__16BitsStereo;
ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
"Track %d needs downmix", i);
}
}
}
}
// select the processing hooks // the code below selects the overall mixing function; each process__xxx loops over the tracks and calls their track__xxx hooks
state->hook = process__nop;
if (countActiveTracks) {
if (resampling) { // resampling needs extra temporary buffers
if (!state->outputTemp) {
state->outputTemp = new int32_t[MAX_NUM_CHANNELS * state->frameCount];
}
if (!state->resampleTemp) {
state->resampleTemp = new int32_t[MAX_NUM_CHANNELS * state->frameCount];
}
state->hook = process__genericResampling;
} else {
if (state->outputTemp) {
delete [] state->outputTemp;
state->outputTemp = NULL;
}
if (state->resampleTemp) {
delete [] state->resampleTemp;
state->resampleTemp = NULL;
}
state->hook = process__genericNoResampling; // generic path without resampling
if (all16BitsStereoNoResample && !volumeRamp) {
if (countActiveTracks == 1) {
state->hook = process__OneTrack16BitsStereoNoResampling; // fast path: a single 16-bit stereo track, no resampling
}
}
}
}
ALOGV("mixer configuration change: %d activeTracks (%08x) "
"all16BitsStereoNoResample=%d, resampling=%d, volumeRamp=%d",
countActiveTracks, state->enabledTracks,
all16BitsStereoNoResample, resampling, volumeRamp);
state->hook(state, pts); // mix once right here; later calls come from MixerThread's threadLoop_mix
// Now that the volume ramp has been done, set optimal state and
// track hooks for subsequent mixer process
if (countActiveTracks) {
bool allMuted = true;
uint32_t en = state->enabledTracks;
while (en) {
const int i = 31 - __builtin_clz(en);
en &= ~(1<<i);
track_t& t = state->tracks[i];
if (!t.doesResample() && t.volumeRL == 0)
{
t.needs |= NEEDS_MUTE_ENABLED;
t.hook = track__nop;
} else {
allMuted = false;
}
}
if (allMuted) {
state->hook = process__nop;
} else if (all16BitsStereoNoResample) {
if (countActiveTracks == 1) {
state->hook = process__OneTrack16BitsStereoNoResampling;
}
}
}
}
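The lazy revalidation described in this section boils down to swapping a function pointer. The sketch below is a heavily reduced model of that pattern, not the real state_t: MiniState, processMix, processNop and the two counters are hypothetical, and "every changed track becomes enabled" is a simplification made only to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of the mixer state: only the fields needed to
// show how the hook pointer is swapped.
struct MiniState {
    typedef void (*process_hook_t)(MiniState*);
    process_hook_t hook;
    uint32_t needsChanged;   // one bit per track that must be revalidated
    uint32_t enabledTracks;  // one bit per track that takes part in mixing
    int validateCalls;       // counters so the behaviour can be observed
    int mixCalls;
};

void processNop(MiniState*) {}

void processMix(MiniState* s) { s->mixCalls++; }

// Recompute the configuration, install the steady-state hook, run it once.
void processValidate(MiniState* s) {
    s->validateCalls++;
    s->enabledTracks |= s->needsChanged;  // simplification: changed == enabled
    s->needsChanged = 0;
    s->hook = s->enabledTracks ? processMix : processNop;
    s->hook(s);
}

void invalidateState(MiniState* s, uint32_t mask) {
    if (mask) {
        s->needsChanged |= mask;
        s->hook = processValidate;  // the next process() revalidates first
    }
}

void process(MiniState* s) {
    assert(s->hook != nullptr);
    s->hook(s);
}
```

The point of the design is that the cost of choosing a mixing routine is paid once per configuration change; every later process call goes straight to the routine that was chosen.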
Mixing in AudioMixer
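Here is what we know about mixing so far: each track is a source, mainBuffer is the destination, and frameCount is the length of one mixing pass. An AudioMixer can manage at most 32 tracks. Tracks may have different mainBuffers, although in most cases they all share the same one.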
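As a rough picture of "several tracks into one mainBuffer", the sketch below accumulates scaled samples in a 32-bit buffer and then saturates the result to 16 bits. It is a simplified model under stated assumptions: mono data, a 4.12 fixed-point volume with unity gain 0x1000, no dithering, and hypothetical names (SimpleTrack, mixTracks, clamp16, kUnityGain). It also assumes few enough tracks that the 32-bit accumulator cannot overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed 4.12 fixed-point unity gain (1.0 == 0x1000).
static const int32_t kUnityGain = 0x1000;

// Saturate a 32-bit value to the int16_t range.
inline int16_t clamp16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Hypothetical mono track: samples plus a 4.12 fixed-point volume.
struct SimpleTrack {
    std::vector<int16_t> samples;
    int32_t volume;
};

// Mix frameCount frames from every track into mainBuffer.
void mixTracks(const std::vector<SimpleTrack>& tracks,
               std::vector<int16_t>& mainBuffer, size_t frameCount) {
    std::vector<int32_t> acc(frameCount, 0);  // 32-bit accumulator
    for (const SimpleTrack& t : tracks) {
        assert(t.samples.size() >= frameCount);
        for (size_t i = 0; i < frameCount; ++i) {
            acc[i] += t.samples[i] * t.volume;  // sample scaled by 4.12 gain
        }
    }
    mainBuffer.resize(frameCount);
    for (size_t i = 0; i < frameCount; ++i) {
        mainBuffer[i] = clamp16(acc[i] >> 12);  // drop the 12 fraction bits
    }
}
```

Accumulating at 32 bits and clamping only once at the end means two loud tracks saturate cleanly instead of wrapping around.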

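As noted in the MixerThread analysis, mixing is done by calling the AudioMixer's process method, which in turn dispatches to one of the process__xxx methods inside AudioMixer. These methods are all broadly similar, so the following walks through process__genericResampling.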
// generic code with resampling
void AudioMixer::process__genericResampling(state_t* state, int64_t pts)
{
// this const just means that local variable outTemp doesn't change
int32_t* const outTemp = state->outputTemp; // 32-bit temporary buffer that the tracks are accumulated into
const size_t size = sizeof(int32_t) * MAX_NUM_CHANNELS * state->frameCount;
size_t numFrames = state->frameCount;
uint32_t e0 = state->enabledTracks;
while (e0) {
// process by group of tracks with same output buffer
// to optimize cache use
uint32_t e1 = e0, e2 = e0;
int j = 31 - __builtin_clz(e1);
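track_t& t1 = state->tracks[j]; // take the first track, t1
e2 &= ~(1<<j); // e2 now holds the indexes of every track except t1
// Loop over the remaining tracks as t2; if t2's destination buffer differs from t1's, remove t2 from the set e1
// This gathers the tracks that share a destination buffer so they are mixed together, since tracks with different destination buffers must be mixed into different buffers
// In practice the tracks usually share one destination buffer, e.g. MixerThread sets mMixBuffer as each track's destination buffer
// If an effect such as an EQ (AudioEffect) is attached, tracks might end up with different destination buffers (unconfirmed)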
while (e2) {
j = 31 - __builtin_clz(e2);
e2 &= ~(1<<j);
track_t& t2 = state->tracks[j];
if (CC_UNLIKELY(t2.mainBuffer != t1.mainBuffer)) {
e1 &= ~(1<<j);
}
}
e0 &= ~(e1);
int32_t *out = t1.mainBuffer;
memset(outTemp, 0, size);
while (e1) { // call t.hook to mix every track in e1
const int i = 31 - __builtin_clz(e1);
e1 &= ~(1<<i);
track_t& t = state->tracks[i];
int32_t *aux = NULL;
if (CC_UNLIKELY((t.needs & NEEDS_AUX__MASK) == NEEDS_AUX_ENABLED)) {
aux = t.auxBuffer;
}
// this is a little goofy, on the resampling case we don't
// acquire/release the buffers because it's done by
// the resampler.
if ((t.needs & NEEDS_RESAMPLE__MASK) == NEEDS_RESAMPLE_ENABLED) {
ALOGE("[%s:%d]", __FUNCTION__, __LINE__);
t.resampler->setPTS(pts);
t.hook(&t, outTemp, numFrames, state->resampleTemp, aux); // resampled tracks take this path and are accumulated into outTemp
} else {
size_t outFrames = 0;
ALOGE("[%s:%d]", __FUNCTION__, __LINE__);
while (outFrames < numFrames) {
t.buffer.frameCount = numFrames - outFrames;
int64_t outputPTS = calculateOutputPTS(t, pts, outFrames);
t.bufferProvider->getNextBuffer(&t.buffer, outputPTS);
t.in = t.buffer.raw;
// t.in == NULL can happen if the track was flushed just after having
// been enabled for mixing.
if (t.in == NULL) break;
if (CC_UNLIKELY(aux != NULL)) {
aux += outFrames;
}
t.hook(&t, outTemp + outFrames*MAX_NUM_CHANNELS, t.buffer.frameCount,
state->resampleTemp, aux);
outFrames += t.buffer.frameCount;
t.bufferProvider->releaseBuffer(&t.buffer);
}
}
}
ditherAndClamp(out, outTemp, numFrames); // clamp the 32-bit accumulated data in outTemp to 16 bits and write it to out, the destination buffer
}
}
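The grouping step at the top of the outer loop is easier to follow in isolation. The sketch below extracts just that part: it splits the enabled-track bitmask into groups of tracks that share a destination buffer. The function name groupByMainBuffer and the use of a plain vector of pointers in place of state->tracks are assumptions made for the example; __builtin_clz is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: split the enabled-track bitmask into groups whose
// tracks share the same destination buffer. mainBuffers[i] is the
// destination of track i; only entries whose bit is set in enabled are read.
std::vector<uint32_t> groupByMainBuffer(
        uint32_t enabled, const std::vector<const void*>& mainBuffers) {
    assert(mainBuffers.size() >= 32);
    std::vector<uint32_t> groups;
    uint32_t e0 = enabled;
    while (e0) {
        uint32_t e1 = e0, e2 = e0;
        int j = 31 - __builtin_clz(e1);  // highest enabled track leads the group
        const void* target = mainBuffers[j];
        e2 &= ~(1u << j);
        while (e2) {
            j = 31 - __builtin_clz(e2);
            e2 &= ~(1u << j);
            if (mainBuffers[j] != target) {
                e1 &= ~(1u << j);        // different destination: not in this group
            }
        }
        e0 &= ~e1;                       // these tracks are now handled
        groups.push_back(e1);
    }
    return groups;
}
```

With tracks 0 and 4 writing to one buffer and tracks 1 and 7 to another, the function returns two masks: first the group led by track 7, then the group led by track 4, because the scan always starts from the highest set bit.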
process__validate set track__genericResample as the track.hook function for tracks that need resampling. Here is what that function does:
void AudioMixer::track__genericResample(track_t* t, int32_t* out, size_t outFrameCount,
int32_t* temp, int32_t* aux)
{
// set the input sample rate
t->resampler->setSampleRate(t->sampleRate);
// ramp gain - resample to temp buffer and scale/mix in 2nd step
if (aux != NULL) {
// always resample with unity gain when sending to auxiliary buffer to be able
// to apply send level after resampling
// TODO: modify each resampler to support aux channel?
t->resampler->setVolume(UNITY_GAIN, UNITY_GAIN);
memset(temp, 0, outFrameCount * MAX_NUM_CHANNELS * sizeof(int32_t));
t->resampler->resample(temp, outFrameCount, t->bufferProvider);
if (CC_UNLIKELY(t->volumeInc[0]|t->volumeInc[1]|t->auxInc)) {
volumeRampStereo(t, out, outFrameCount, temp, aux);
} else {
volumeStereo(t, out, outFrameCount, temp, aux);
}
} else {
if (CC_UNLIKELY(t->volumeInc[0]|t->volumeInc[1])) {
t->resampler->setVolume(UNITY_GAIN, UNITY_GAIN);
memset(temp, 0, outFrameCount * MAX_NUM_CHANNELS * sizeof(int32_t));
t->resampler->resample(temp, outFrameCount, t->bufferProvider);
volumeRampStereo(t, out, outFrameCount, temp, aux);
}
// constant gain
else {
// set the volume
t->resampler->setVolume(t->volume[0], t->volume[1]);
// run the resampler
t->resampler->resample(out, outFrameCount, t->bufferProvider);
}
}
}
In the end, the resampler's resample method is called to perform the resampling.
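The "ramp gain" and "constant gain" branches differ only in whether the volume changes from frame to frame. The sketch below illustrates a ramp, loosely modelled on the prevVolume and volumeInc fields that appear in the VOLUME0/VOLUME1 snippet earlier. Everything else is an assumption: the names VolumeRamp, startRamp and applyRamp, the mono in-place processing, gains no larger than unity (0x1000 in 4.12 fixed point), and deriving the per-frame step by dividing the difference by the frame count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical ramp state for one channel. Volumes are 4.12 fixed point;
// prevVolume and volumeInc carry 16 extra fraction bits so that small
// per-frame steps are not lost.
struct VolumeRamp {
    int32_t prevVolume;  // current volume << 16
    int32_t volumeInc;   // per-frame step, same scale as prevVolume
    int16_t volume;      // target volume (4.12)
};

// Start a ramp from the current volume to target over frameCount frames.
void startRamp(VolumeRamp& r, int16_t target, size_t frameCount) {
    assert(frameCount > 0);
    const int32_t d = (static_cast<int32_t>(target) << 16) - r.prevVolume;
    r.volumeInc = d / static_cast<int32_t>(frameCount);
    r.volume = target;
}

// Scale samples in place, advancing the volume by volumeInc every frame.
void applyRamp(VolumeRamp& r, std::vector<int16_t>& samples) {
    int32_t v = r.prevVolume;
    for (int16_t& s : samples) {
        s = static_cast<int16_t>((s * (v >> 16)) >> 12);
        v += r.volumeInc;
    }
    r.prevVolume = v;
}
```

Ramping over one mixing pass avoids the audible click that an instantaneous jump in gain would cause; once the ramp has finished, the constant-gain path can be used again.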

