Paper: THE SPEECHTRANSFORMER FOR LARGE-SCALE MANDARIN CHINESE SPEECH RECOGNITION. Idea: three improvements on top of SpeechTransformer: 1) lower the frame rate, shortening the acoustic feature ...
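Paper: SPEECH TRANSFORMER: A NO RECURRENCE SEQUENCE TO SEQUENCE MODEL FOR SPEECH RECOGNITION. Approach: the overall design is a seq2seq encoder-decoder architecture. The transformer is used to learn positional information about the text. Compared with an RNN, the transformer can be trained in parallel, which speeds up training. The paper proposes a 2D attention ...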
2020-09-16 22:14 Recommended:
Paper: Deep-FSMN for Large Vocabulary Continuous Speech Recognition. Idea: large-vocabulary speech recognition often needs a deeper network, but a very deep FSMN[1] or cFSMN[2] is prone to vanishing and exploding gradients ...
LAS: Listen, Attend and Spell (Google). Idea: a sequence-to-sequence approach; the model is split into an encoder and a dec ...
Paper: EESEN: END-TO-END SPEECH RECOGNITION USING DEEP RNN MODELS AND WFST-BASED DECODING ...
Paper: CTC: Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. Idea: speech recognition generally involves speech ...
Paper: TRANSFORMER-TRANSDUCER: END-TO-END SPEECH RECOGNITION WITH SELF-ATTENTION. Idea: 1) building on the strengths of RNN-T in speech recognition, replace the RNN structure in RNN-T with a transformer to achieve ...
Paper: A time delay neural network architecture for efficient modeling of long temporal contexts ...
Paper: TRANSFORMER TRANSDUCER: A STREAMABLE SPEECH RECOGNITION MODEL WITH TRANSFORMER ENCODERS A ...