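The music file formats we listen to most often are mp3, flac, and wav. But have you ever wondered what the data looks like once an audio decoder has decoded one of these files? Without further ado, this post introduces PCM, the raw audio data format that playback devices can actually play.
1. PCM audio
PCM (Pulse Code Modulation) is a technique for converting an analog audio signal, represented as a waveform, into a digital audio signal represented by 1s and 0s. The data is not compressed, so nothing is lost beyond the rounding that digitization itself introduces. Below is the waveform of the 1~2 s interval of an audio clip, captured in the Audacity audio editor:
Zooming in further on this interval:
What is each of those matchstick-like marks? It is a single sample. That brings us to the key parameters of PCM audio: sample, sample rate, bit depth, and channel.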
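2. Sample
Sample: the numeric value obtained by quantizing and encoding the amplitude of the analog audio signal at one instant, as shown in the figure below: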
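To make the quantization step concrete, here is a minimal sketch that turns one normalized amplitude into a signed 16-bit sample. The function name `QuantizeToInt16`, the [-1.0, 1.0] input range, and the symmetric scale factor 32767 are assumptions chosen for illustration; real converters may scale and round differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: quantize a normalized amplitude to a signed 16-bit sample.
// Input outside [-1.0, 1.0] is clamped before scaling.
std::int16_t QuantizeToInt16(double amplitude)
{
    const double clamped = std::max(-1.0, std::min(1.0, amplitude));
    // Scale to the 16-bit range and round to the nearest integer level.
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0));
}
```

Every amplitude between two neighbouring levels collapses to the same integer, which is where the rounding error of digitization comes from.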
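3. Sample rate
Sample rate: the number of samples captured per second. In general, the higher the sample rate, the less distortion the conversion introduces. Common sample rates are 44100 Hz, 48000 Hz, and 96 kHz. The sample rates offered by Audacity are shown below: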
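4. Bit depth
Bit depth: defines how many distinct digital levels can be stored. The greater the bit depth, the finer the detail stored and the better the fidelity, as shown below: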
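The number of levels grows as a power of two with the bit depth. The sketch below computes it; `QuantizationLevels` is a hypothetical helper name, and the function is only meant for depths from 1 to 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: number of distinct levels at a given bit depth (2^bits).
// Intended for 1 <= bits <= 32 in this sketch.
constexpr std::uint64_t QuantizationLevels(unsigned bits)
{
    return std::uint64_t{1} << bits;
}
```

So 8 bits give 256 levels, 16 bits give 65536, and 24 bits give 16777216.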
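5. Channels
Channel: the number of audio capture sources. Mono has a single source; two-channel (stereo) has two sources, left and right; multichannel (surround sound) has more. The more channels, the more spatial the playback sounds, but the more data must be stored.
The channel options offered by Audacity are listed below:
With a bit depth of 8 bits, sample data is laid out as follows:
Mono:
Stereo:
Three channels:
For multichannel PCM audio (more than one channel), the samples of all channels at one instant are in practice handled as a single unit, just like a mono sample, and are collectively called one sample (often also called a frame). For the mono, stereo, and three-channel layouts above, one such sample is 8 bits, 16 bits, and 24 bits respectively. A multichannel stream can be split into separate mono streams.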
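Splitting a multichannel stream into mono streams can be sketched as below, assuming the usual interleaved layout in which each frame stores channel 0, channel 1, and so on in order. The function name `Deinterleave` and the choice of 8-bit samples (matching the layouts above) are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: split interleaved 8-bit samples into one vector per channel.
// Assumes frame-by-frame order (ch0, ch1, ..., chN-1); a trailing partial frame is ignored.
std::vector<std::vector<std::uint8_t>> Deinterleave(
    const std::vector<std::uint8_t>& interleaved, std::size_t channels)
{
    std::vector<std::vector<std::uint8_t>> out(channels);
    if (channels == 0) {
        return out;
    }
    const std::size_t frames = interleaved.size() / channels;
    for (auto& channel : out) {
        channel.reserve(frames);
    }
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            out[c].push_back(interleaved[f * channels + c]);
        }
    }
    return out;
}
```

Calling `Deinterleave(data, 2)` on a stereo buffer would return the left samples in element 0 and the right samples in element 1.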
6. Relationships between the parameters
From the above we can derive the following relationships:
- sample_bits = depth_bits
- channels_sample_bits = sample_bits * channel
- samples_bits_per_second = sample_rate * depth_bits * channel = sample_rate * sample_bits * channel
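As a worked example of these relationships, the sketch below computes the size of one multichannel sample, the bit rate, and the raw data size for a given duration. The `PcmFormat` struct and the function names are hypothetical, introduced only for this illustration.

```cpp
#include <cstdint>

// Hypothetical description of a PCM stream.
struct PcmFormat {
    std::uint32_t sample_rate;  // samples per second per channel, e.g. 44100
    std::uint32_t depth_bits;   // bits per single-channel sample
    std::uint32_t channels;     // number of channels
};

// channels_sample_bits = sample_bits * channel
constexpr std::uint64_t FrameBits(const PcmFormat& f)
{
    return static_cast<std::uint64_t>(f.depth_bits) * f.channels;
}

// samples_bits_per_second = sample_rate * depth_bits * channel
constexpr std::uint64_t BitsPerSecond(const PcmFormat& f)
{
    return FrameBits(f) * f.sample_rate;
}

// Raw PCM size in bytes for a whole number of seconds.
constexpr std::uint64_t BytesForDuration(const PcmFormat& f, std::uint64_t seconds)
{
    return BitsPerSecond(f) * seconds / 8;
}
```

For 44100 Hz, 16 bits, and 2 channels this gives 32 bits per multichannel sample, 1411200 bits per second, and 10584000 bytes for one minute of audio.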
7. Hands-on practice
Parameters of the PCM sample file: 44100 Hz + 16 bits + 2 channels.
Split the stereo stream into mono streams:
    #include <fstream>
    #include <iostream>

    // Assumed values: the original listing uses these constants without defining them.
    constexpr int kSampleBitsPer = 32;     // one stereo sample: 2 channels x 16 bits
    constexpr int kDstSampleBitsPer = 16;  // one single-channel sample

    void SeparateChannel()
    {
        const auto originalFilename = "44100_16_2.pcm";
        std::ifstream inFStream(originalFilename, std::ios_base::binary);
        if (!inFStream.is_open()) {
            std::cout << "read <" << originalFilename << "> failed.\n";
            return;
        }

        std::ofstream outFStream1("44100_16_1_1.pcm", std::ios_base::binary);
        std::ofstream outFStream2("44100_16_1_2.pcm", std::ios_base::binary);

        constexpr int frameBytes = kSampleBitsPer / 8;
        constexpr int sampleBytes = kDstSampleBitsPer / 8;
        char buf[frameBytes];

        // Read one interleaved stereo sample (left, right) at a time.
        // The loop ends at end of file; a trailing partial sample is dropped.
        while (inFStream.read(buf, frameBytes)) {
            outFStream1.write(buf, sampleBytes);                // left channel
            outFStream2.write(buf + sampleBytes, sampleBytes);  // right channel
        }
        // The streams flush and close in their destructors.
    }
The waveforms of the generated audio files are shown below:
Convert 16 bits to 8 bits:
    #include <cstdint>
    #include <fstream>
    #include <iostream>

    void SeparateBitsDeepth()
    {
        const auto originalFilename = "44100_16_1_1.pcm";
        std::ifstream inFStream(originalFilename, std::ios_base::binary);
        if (!inFStream.is_open()) {
            std::cout << "read <" << originalFilename << "> failed.\n";
            return;
        }

        std::ofstream outFStream("44100_8_1.pcm", std::ios_base::binary);
        std::int16_t sample16 = 0;

        // Assumes the file stores signed 16-bit samples in the host's byte order
        // (little-endian on typical PCs), as the original listing does.
        while (inFStream.read(reinterpret_cast<char*>(&sample16), sizeof(sample16))) {
            // Key step: shift [-32768, 32767] up to [0, 65535], then keep the high byte.
            // The result is an unsigned 8-bit sample with silence at 128.
            const auto sample8 = static_cast<std::uint8_t>((sample16 + 32768) >> 8);
            outFStream.put(static_cast<char>(sample8));
        }
    }
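The conversion step is easier to check in isolation. The sketch below pulls it out into a pure function; `Int16ToUint8` is a hypothetical name, and the mapping assumes the common convention that 8-bit PCM is unsigned with silence at 128 while 16-bit PCM is signed.

```cpp
#include <cstdint>

// Hypothetical helper: map a signed 16-bit sample to an unsigned 8-bit sample.
// Adding 32768 moves the range to [0, 65535]; shifting right by 8 keeps the high byte.
inline std::uint8_t Int16ToUint8(std::int16_t sample)
{
    return static_cast<std::uint8_t>((static_cast<int>(sample) + 32768) >> 8);
}
```

The lowest input maps to 0, silence (0) maps to 128, and the highest input maps to 255; the low 8 bits of each sample are simply discarded.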
Halve the volume:
    #include <cstdint>
    #include <fstream>
    #include <iostream>

    void Half()
    {
        const auto originalFilename = "44100_16_1_2.pcm";
        std::ifstream inFStream(originalFilename, std::ios_base::binary);
        if (!inFStream.is_open()) {
            std::cout << "read <" << originalFilename << "> failed.\n";
            return;
        }

        std::ofstream outFStream("44100_16_1_2_half.pcm", std::ios_base::binary);
        std::int16_t sample = 0;

        // Assumes signed 16-bit samples stored in the host's byte order.
        while (inFStream.read(reinterpret_cast<char*>(&sample), sizeof(sample))) {
            sample /= 2;  // halve the sample value
            outFStream.write(reinterpret_cast<const char*>(&sample), sizeof(sample));
        }
    }
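Halving never overflows, but the same idea with a factor above 1 can push a sample outside the 16-bit range. Here is a minimal sketch of a general gain step that clamps the result; `ApplyGain` is a hypothetical helper, not part of the original code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale a signed 16-bit sample by a gain factor.
// The result is rounded, then clamped to [-32768, 32767] to avoid wrap-around.
std::int16_t ApplyGain(std::int16_t sample, double gain)
{
    const double scaled = std::round(sample * gain);
    const double clamped = std::max(-32768.0, std::min(32767.0, scaled));
    return static_cast<std::int16_t>(clamped);
}
```

With a gain of 0.5 this matches the halving above for even sample values; with a gain of 2.0, loud samples saturate at the limits instead of wrapping to the opposite sign.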
Double the playback speed:
    #include <cstdint>
    #include <fstream>
    #include <iostream>

    void DoubleSpeed()
    {
        const auto originalFilename = "44100_16_1_1.pcm";
        std::ifstream inFStream(originalFilename, std::ios_base::binary);
        if (!inFStream.is_open()) {
            std::cout << "read <" << originalFilename << "> failed.\n";
            return;
        }

        std::ofstream outFStream("44100_16_1_1_double_speed.pcm", std::ios_base::binary);
        std::int16_t sample = 0;
        int cnt = 0;

        while (inFStream.read(reinterpret_cast<char*>(&sample), sizeof(sample))) {
            // Keep only the odd-numbered samples (1st, 3rd, 5th, ...).
            // Played back at the same sample rate, the result runs twice as fast.
            if (++cnt % 2 != 0) {
                outFStream.write(reinterpret_cast<const char*>(&sample), sizeof(sample));
            }
        }
    }
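The sample-dropping step generalizes to other speed-up factors. The sketch below works on samples already loaded into memory; `KeepEveryNth` is a hypothetical helper name. Note that dropping samples without low-pass filtering first can introduce aliasing, so this is only a rough illustration of the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: keep samples 0, n, 2n, ... and drop the rest.
// With n == 2 this matches the odd-numbered-sample rule used above.
std::vector<std::int16_t> KeepEveryNth(const std::vector<std::int16_t>& samples,
                                       std::size_t n)
{
    std::vector<std::int16_t> out;
    if (n == 0) {
        return out;  // no meaningful stride, return nothing
    }
    out.reserve((samples.size() + n - 1) / n);
    for (std::size_t i = 0; i < samples.size(); i += n) {
        out.push_back(samples[i]);
    }
    return out;
}
```

Played back at the original sample rate, the output of `KeepEveryNth(samples, 2)` lasts half as long as the input.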
References:
- https://samplerateconverter.com/educational/pcm-audio#what-pcm
- Introduction to audio and video data processing: processing PCM audio sample data, Lei Xiaohua (leixiaohua1020), CSDN blog