Deep Learning for Speech Enhancement


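Author: YeBobr
Link: https://www.zhihu.com/question/273665262/answer/388296862
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.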

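The current frontier of deep learning for speech enhancement is probably the GAN: the generator serves as the enhancement network, and the discriminator is trained to tell clean speech from enhanced speech. The two main papers are:

1. SEGAN: Speech Enhancement Generative Adversarial Network

2. Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification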
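
The generator/discriminator split described above can be sketched as a pair of loss functions. This is a minimal sketch, not the loss from either paper as stated in this answer: it assumes a least-squares adversarial objective plus an L1 reconstruction term, and the function names and the `l1_weight` default are hypothetical choices for illustration. The inputs are plain arrays of discriminator scores and waveforms, so no network is actually trained here.

```python
import numpy as np

def discriminator_loss(d_clean, d_enhanced):
    """Assumed least-squares objective: score clean speech as 1, enhanced speech as 0."""
    return 0.5 * np.mean((d_clean - 1.0) ** 2) + 0.5 * np.mean(d_enhanced ** 2)

def generator_loss(d_enhanced, enhanced, clean, l1_weight=100.0):
    """The generator (enhancement network) wants its output scored as clean.

    The L1 term and l1_weight are illustrative assumptions that keep the
    enhanced waveform close to the clean reference.
    """
    adversarial = 0.5 * np.mean((d_enhanced - 1.0) ** 2)
    reconstruction = np.mean(np.abs(enhanced - clean))
    return adversarial + l1_weight * reconstruction

# Toy usage with made-up discriminator scores and waveforms.
clean = np.zeros(16000)
enhanced = clean + 0.01
print(discriminator_loss(np.array([0.9]), np.array([0.2])))
print(generator_loss(np.array([0.2]), enhanced, clean))
```

In a real system both losses would be minimized alternately with a deep learning framework; the arrays here only stand in for the outputs of the two networks.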

 

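On the convolutional neural network side, some approaches are fully convolutional and some are based on redundant convolutions, and they process speech either in the time domain or in the frequency domain. The papers are:

1. Single channel speech enhancement using convolutional neural network

2. A Fully Convolutional Neural Network for Speech Enhancement

3. Raw Waveform-based Speech Enhancement by Fully Convolutional Networks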
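
To make the time-domain, fully convolutional idea concrete, here is a minimal sketch in NumPy. It is not the architecture of any paper listed above: the layer sizes, the names `conv1d_same` and `fcn_enhance`, and the random untrained weights are all assumptions for illustration. The point it shows is that a network built only from "same"-padded 1-D convolutions maps a raw waveform of any length to an output waveform of the same length.

```python
import numpy as np

def conv1d_same(x, kernels, bias):
    """1-D convolution layer with zero padding so the length is preserved.

    x: (in_ch, T), kernels: (out_ch, in_ch, K) with odd K, bias: (out_ch,).
    """
    out_ch, in_ch, k = kernels.shape
    pad = k // 2
    padded = np.pad(x, ((0, 0), (pad, pad)))
    out = np.zeros((out_ch, x.shape[1]))
    for o in range(out_ch):
        for i in range(in_ch):
            # 'valid' correlation slides the kernel over the padded signal
            out[o] += np.correlate(padded[i], kernels[o, i], mode="valid")
        out[o] += bias[o]
    return out

def fcn_enhance(waveform, layers):
    """Apply a stack of conv layers to a raw waveform; ReLU on hidden layers only."""
    h = waveform[np.newaxis, :]
    for idx, (kernels, bias) in enumerate(layers):
        h = conv1d_same(h, kernels, bias)
        if idx < len(layers) - 1:
            h = np.maximum(h, 0.0)
    return h[0]  # the last layer is assumed to have a single output channel

# Untrained, randomly initialized layers (hypothetical sizes).
rng = np.random.default_rng(0)
layers = [
    (0.1 * rng.standard_normal((8, 1, 11)), np.zeros(8)),
    (0.1 * rng.standard_normal((1, 8, 11)), np.zeros(1)),
]
noisy = rng.standard_normal(1234)
print(fcn_enhance(noisy, layers).shape)
```

Because there is no fully connected layer, the same weights work for any input length, which is what makes this style of network convenient for raw waveforms.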

 

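On the DNN side, speech is mainly processed in the frequency domain: the short-time Fourier transform gives the short-time spectrum, that spectrum is processed, and the enhanced speech is reconstructed using the phase of the noisy speech. There are also approaches that combine DNNs with traditional speech enhancement methods, replacing the features used in the traditional method with a DNN; that is basically the pattern. The papers are:

1. Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks

2. NMF-based Speech Enhancement Incorporating Deep Neural Network

3. A Novel Single Channel Speech Enhancement Based on Joint Deep Neural Network and Wiener Filter

4. An Experimental Study on Speech Enhancement Based on Deep Neural Networks

5. A Regression Approach to Speech Enhancement Based on Deep Neural Networks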
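
The frequency-domain pipeline described for the DNN methods (short-time Fourier transform, process the spectrum, reconstruct with the noisy phase) can be sketched as follows. This is a minimal sketch under stated assumptions: the frame length, hop size, and Hann window are illustrative choices, and `toy_magnitude_estimator` is a hypothetical stand-in for the trained DNN (it just subtracts an average of the first few frames), not a method from the papers above.

```python
import numpy as np

def stft(x, frame_len=512, hop=128):
    """Short-time Fourier transform with a periodic Hann window."""
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, frame_len=512, hop=128):
    """Inverse STFT by windowed overlap-add, normalized by the summed window energy."""
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    for i in range(n_frames):
        out[i * hop : i * hop + frame_len] += frames[i] * window
        norm[i * hop : i * hop + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

def toy_magnitude_estimator(magnitude, n_noise_frames=5):
    """Hypothetical stand-in for the DNN: subtract an average of the first frames."""
    noise = magnitude[:n_noise_frames].mean(axis=0)
    return np.maximum(magnitude - noise, 0.0)

def enhance(noisy, estimate_magnitude, frame_len=512, hop=128):
    """Enhance the magnitude spectrum and reuse the phase of the noisy speech."""
    spec = stft(noisy, frame_len, hop)
    magnitude, noisy_phase = np.abs(spec), np.angle(spec)
    enhanced_magnitude = estimate_magnitude(magnitude)
    return istft(enhanced_magnitude * np.exp(1j * noisy_phase), frame_len, hop)

# Usage on a synthetic signal (random noise stands in for noisy speech).
noisy = np.random.default_rng(0).standard_normal(4096)
enhanced = enhance(noisy, toy_magnitude_estimator)
print(enhanced.shape)
```

Only the magnitude is modified; the phase is copied from the noisy input, which is the reconstruction step the paragraph describes. A trained network would replace `toy_magnitude_estimator`, typically operating on some transformed version of the spectrum.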

