This post is still being updated.
Related Python audio processing post: [librosa] and its applications in audio processing.
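Introduction
aubio is a Python library for annotating music and sounds, with its source code written in C. It can read any media file, extract features, and detect events. (aubio is a collection of tools for music and audio analysis.)
It works with both Python 2 and Python 3; the code in this post targets Python 3.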
read audio from any media file, including videos and remote streams
high quality phase vocoder, spectral filterbanks, and linear filters
Mel-Frequency Cepstrum Coefficients and standard spectral descriptors
detection of note attacks (onset)
pitch tracking (fundamental frequency estimation)
beat detection and tempo tracking
Reading audio
class aubio.source(path, samplerate=0, hop_size=512, channels=0)
src = aubio.source('test01.wav')
src.uri, src.samplerate, src.channels, src.duration #('test01.wav', 16000, 2, 86833)
snk = aubio.sink('out.wav')  # create a new sink at 44100 Hz, mono
snk = aubio.sink('out.wav', samplerate=16000, channels=3)  # create a new sink at 16000 Hz with 3 channels
snk(aubio.fvec(100), 100)  # write 100 samples into it
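The hop_size parameter means that audio is consumed in fixed-size blocks: each call to a source yields up to hop_size new samples together with the number of samples actually read, and a count smaller than hop_size marks the end of the file. As a rough illustration of that block-wise pattern, here is a sketch that uses only Python's standard wave module rather than aubio itself. The helpers make_wav_bytes and read_blocks are hypothetical names made up for this example, and the sketch assumes mono 16-bit PCM.

```python
import io
import math
import struct
import wave


def make_wav_bytes(n_frames=1200, samplerate=16000, freq=440.0):
    """Build a mono 16-bit WAV file in memory containing a sine tone."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(samplerate)
        frames = b"".join(
            struct.pack(
                "<h",
                int(32767 * 0.5 * math.sin(2 * math.pi * freq * n / samplerate)),
            )
            for n in range(n_frames)
        )
        w.writeframes(frames)
    return buf.getvalue()


def read_blocks(wav_bytes, hop_size=512):
    """Yield (samples, read) per block; read < hop_size marks the last block.

    Assumption: mono 16-bit PCM input. Unlike this sketch, a real reader
    may pad the final block with zeros up to hop_size.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        frame_bytes = w.getsampwidth() * w.getnchannels()
        while True:
            raw = w.readframes(hop_size)
            read = len(raw) // frame_bytes
            # Scale 16-bit integers to floats in [-1, 1).
            samples = [s / 32768.0 for (s,) in struct.iter_unpack("<h", raw)]
            yield samples, read
            if read < hop_size:
                break


if __name__ == "__main__":
    total = 0
    for samples, read in read_blocks(make_wav_bytes(), hop_size=512):
        total += read
    print("frames read:", total)
```

With aubio, the equivalent loop would call the source object repeatedly and stop once the returned count drops below hop_size.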
pitch
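Pitch is tied to the fundamental frequency (F0) of a sound and reflects how high or low the sound is, i.e. its tone. Algorithms that compute F0 are also known as pitch detection algorithms (PDA).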
References for the theory and computation:
test-pitch C source code
aubiopitch C source code
# Supported methods: yinfft, yin, yinfast, fcomb, mcomb, schmitt, specacf, default (yinfft).
class aubio.pitch(method="default", buf_size=1024, hop_size=512, samplerate=44100)
The default method, yinfft, is an improved version of the yin method. The yin method itself is described in the paper:
De Cheveigné, A., Kawahara, H. (2002) "YIN, a fundamental frequency estimator for speech and music", J. Acoust. Soc. Am. 111, 1917-1930.
Yinfft algorithm was derived from the YIN algorithm. In this implementation, a Fourier transform is used to compute a tapered square difference function, which allows spectral weighting. Because the difference function is tapered, the selection of the period is simplified.
For details of the method, see:
Paul Brossier, Automatic annotation of musical audio for interactive systems, Chapter 3, Pitch Analysis, PhD thesis, Centre for Digital Music, Queen Mary University of London, London, UK, 2006.
For more information on the pitch detection methods, see the pitch.h File Reference.
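To make the idea behind the yin family more concrete, here is a minimal pure-Python sketch of the time-domain steps described by de Cheveigné and Kawahara: a squared difference function, its cumulative mean normalization, and an absolute threshold. It is not aubio's implementation. It leaves out the spectral weighting of yinfft and any interpolation of the period, and the function name yin_pitch and the threshold of 0.15 are assumptions chosen for illustration.

```python
import math


def yin_pitch(frame, samplerate, threshold=0.15):
    """Estimate F0 in Hz for one frame with a simplified YIN; 0.0 if none found."""
    max_tau = len(frame) // 2

    # Step 1: squared difference function d(tau) over half the frame.
    diff = [0.0] * max_tau
    for tau in range(1, max_tau):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(max_tau))

    # Step 2: cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * max_tau
    running = 0.0
    for tau in range(1, max_tau):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Step 3: take the first dip below the threshold and follow it
    # down to its local minimum; that lag is the estimated period.
    tau = 2
    while tau < max_tau:
        if cmnd[tau] < threshold:
            while tau + 1 < max_tau and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return samplerate / tau
        tau += 1
    return 0.0  # no periodicity found (e.g. silence)


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2 * math.pi * 200.0 * n / sr) for n in range(512)]
    print(yin_pitch(tone, sr))
```

Because the period is restricted to an integer number of samples here, the estimate is only exact when the true period falls on a whole sample, which is one reason practical implementations refine the selected lag.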
MFCC
mfcc creates a callable which takes a cvec as input. cvec is a container holding spectral data.
class aubio.cvec(size)
class aubio.mfcc(buf_size=1024, n_filters=40, n_coeffs=13, samplerate=44100)
If n_filters = 40, the filterbank will be initialized with filterbank.set_mel_coeffs_slaney(). Otherwise, if n_filters is greater than 0, it will be initialized with filterbank.set_mel_coeffs() using fmin = 0, fmax = samplerate.
buf_size = 2048; n_filters = 128; n_coeffs = 13; samplerate = 44100
mf = aubio.mfcc(buf_size, n_filters, n_coeffs, samplerate)
fftgrain = aubio.cvec(buf_size)
mf(fftgrain).shape #(13,)
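The n_filters and n_coeffs parameters correspond to two stages of a typical MFCC computation: energies are collected in bands spaced evenly on the mel scale, and a discrete cosine transform of their logarithms is truncated to the first few coefficients. The pure-Python sketch below illustrates both stages. It assumes the common HTK-style formula mel = 2595 * log10(1 + f / 700) and an unnormalized DCT-II, which may differ from what aubio uses internally; the helper names are hypothetical.

```python
import math


def hz_to_mel(f):
    """Hz to mel, HTK-style formula (an assumption, not necessarily aubio's)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centers(n_filters, fmin, fmax):
    """Center frequencies (Hz) of n_filters bands equally spaced on the mel scale."""
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def dct_coeffs(log_energies, n_coeffs):
    """Unnormalized DCT-II of the log band energies, keeping n_coeffs terms."""
    n = len(log_energies)
    return [
        sum(
            e * math.cos(math.pi * k * (i + 0.5) / n)
            for i, e in enumerate(log_energies)
        )
        for k in range(n_coeffs)
    ]


if __name__ == "__main__":
    centers = mel_centers(40, 0.0, 22050.0)
    print("first and last band centers (Hz):", centers[0], centers[-1])
    # Made-up log energies, only to show the shape of the result.
    fake_log_energies = [math.log(1.0 + i) for i in range(40)]
    print("number of coefficients:", len(dct_coeffs(fake_log_energies, 13)))
```

The band centers crowd together at low frequencies and spread out at high ones, which is the point of the mel scale; keeping only 13 of the 40 DCT terms retains the smooth shape of the spectrum envelope and discards fine detail.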
