PyAudio Chinese Manual: Silence Removal / Sound Activity Detection

Reposted article, 2020-04-10 00:27

Silence Removal / Sound Activity Detection

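The silenceRemoval() function in audioSegmentation.py splits a continuous recording into separate event segments, that is, the silent parts of the signal are removed. It works in a semi-supervised way. First, an SVM model is trained to distinguish high-energy from low-energy short-term frames, using the 10% highest-energy frames and the 10% lowest-energy frames of the recording as the training set. The trained SVM is then applied to the whole recording, and a dynamic threshold on its output is used to detect the active segments.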
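
The training-set selection described above can be sketched as follows. This is a minimal illustration in plain Python, not the library's code: the helper names `short_term_energy` and `select_training_frames` are hypothetical, the energy measure (mean squared amplitude) is an assumption, and the SVM training step itself is left out.

```python
def short_term_energy(x, fs, st_win, st_step):
    """Return the mean squared amplitude of each short-term frame."""
    win = int(round(st_win * fs))    # window length in samples
    step = int(round(st_step * fs))  # hop length in samples
    energies = []
    for start in range(0, len(x) - win + 1, step):
        frame = x[start:start + win]
        energies.append(sum(s * s for s in frame) / win)
    return energies


def select_training_frames(energies, fraction=0.10):
    """Pick the lowest- and highest-energy frames as labelled training data.

    Returns (low_idx, high_idx): indices of frames to label as
    'silence' and 'activity' respectively (hypothetical labels).
    """
    n = max(1, int(len(energies) * fraction))
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    return sorted(order[:n]), sorted(order[-n:])


# Toy signal: 1 s near-silence, 1 s loud square wave, 1 s near-silence.
fs = 100
x = [0.01] * fs + [1.0 if i % 2 else -1.0 for i in range(fs)] + [0.01] * fs
energies = short_term_energy(x, fs, st_win=0.1, st_step=0.1)
low_idx, high_idx = select_training_frames(energies)
print(low_idx, high_idx)
```

Because the labels come from the recording itself rather than from manual annotation, a classifier trained on these two groups adapts to the loudness of each individual file.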

Core function: silenceRemoval(x, fs, st_win, st_step, smoothWindow=0.5, weight=0.5, plot=False)

 
    def silenceRemoval(x, fs, st_win, st_step, smoothWindow=0.5, weight=0.5, plot=False):
        '''
        Event Detection (silence removal)
        ARGUMENTS:
             - x:                the input audio signal
             - fs:               sampling freq
             - st_win, st_step:  window size and step in seconds
             - smoothWindow:     (optional) smooth window (in seconds)
             - weight:           (optional) weight factor (0 < weight < 1)
                                 the higher, the more strict
             - plot:             (optional) True if results are to be plotted
        RETURNS:
             - seg_limits:       list of segment limits in seconds
                                 (e.g. [[0.1, 0.9], [1.4, 3.0]] means that the
                                 resulting segments are (0.1 - 0.9) seconds
                                 and (1.4 - 3.0) seconds)
        '''
         

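Parameters:

- x: the input audio signal (digitized samples)
- fs: the sampling frequency
- st_win, st_step: the short-term window size and step, in seconds
- smoothWindow=0.5: the smoothing window, in seconds
- weight=0.5: the decision threshold weight, between 0 and 1 (the higher, the stricter)
- plot=False: whether to plot the results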

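The return value seg_limits is a list of [start, end] pairs giving the start and end times, in seconds, of the non-silent (sound-bearing) segments.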
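
As a rough illustration of how such a result might be consumed, the sketch below converts each [start, end] pair from seconds to sample indices and slices the signal. The helper `extract_segments` is hypothetical and not part of the library; the seg_limits values are the ones from the docstring example.

```python
def extract_segments(x, fs, seg_limits):
    """Return the slices of x that correspond to each [start, end] pair."""
    segments = []
    for start_s, end_s in seg_limits:
        # Convert seconds to sample indices and clamp to the signal length.
        start = max(0, int(round(start_s * fs)))
        end = min(len(x), int(round(end_s * fs)))
        segments.append(x[start:end])
    return segments


fs = 10                      # toy sampling rate, 10 samples per second
x = list(range(40))          # 4 seconds of dummy samples
seg_limits = [[0.1, 0.9], [1.4, 3.0]]

segments = extract_segments(x, fs, seg_limits)
voiced_seconds = sum(end - start for start, end in seg_limits)
print([len(s) for s in segments], round(voiced_seconds, 2))
```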

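The smoothing window and the decision threshold weight should be chosen according to the nature of the recording. In the sample recording referred to above, the sound events are very sparse. For continuous speech, use a shorter smoothing window (a smaller smoothWindow value) and a stricter threshold (a larger weight value).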
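
To make the effect of these two parameters concrete, here is a minimal sketch of a smoothing and thresholding stage. It assumes the classifier has already produced one "activity" probability per short-term frame. The moving-average smoothing and the threshold formula (a weighted mix of the mean of the lowest 10% and the highest 10% of the smoothed values) are assumptions made for illustration, and all function names are hypothetical; the library's exact computation may differ.

```python
def moving_average(values, width):
    """Smooth values with a centred moving average of `width` frames."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def dynamic_threshold(probs, weight):
    """Weighted mix of the means of the lowest and highest 10% of probs."""
    ordered = sorted(probs)
    n = max(1, len(ordered) // 10)
    low_mean = sum(ordered[:n]) / n
    high_mean = sum(ordered[-n:]) / n
    return (1 - weight) * low_mean + weight * high_mean


def detect_segments(probs, st_step, smooth_window=0.5, weight=0.5):
    """Return [start, end] limits (seconds) of runs above the threshold."""
    width = max(1, int(round(smooth_window / st_step)))
    smoothed = moving_average(probs, width)
    thr = dynamic_threshold(smoothed, weight)
    limits, start = [], None
    for i, p in enumerate(smoothed):
        if p > thr and start is None:
            start = i                                   # segment opens
        elif p <= thr and start is not None:
            limits.append([start * st_step, i * st_step])  # segment closes
            start = None
    if start is not None:                               # still open at the end
        limits.append([start * st_step, len(smoothed) * st_step])
    return limits


# Toy per-frame probabilities: 1 s silence, 1 s activity, 1 s silence.
probs = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
default = detect_segments(probs, st_step=0.1, smooth_window=0.5, weight=0.5)
strict = detect_segments(probs, st_step=0.1, smooth_window=0.5, weight=0.9)
print(default, strict)
```

In this sketch a larger weight pulls the threshold toward the high-probability frames, so the detected segment shrinks, while a wider smoothing window blurs the segment edges and tends to merge events separated by short pauses.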

Full manual: https://www.cnblogs.com/liuzhongrong/p/12269181.html
