Implementing Speech Recognition with Microsoft Cognitive Services


  I've wanted to build speech recognition for a long time and tried many times, but kept failing for all sorts of reasons: the recognition accuracy was poor, I couldn't pull it off technically, and so on, and I sank a great deal of time into it. Of the options I've tried, iFlytek, Baidu Speech, and the Microsoft stack, in the end I prefer the simplicity and efficiency of the Microsoft stack. (No flames please, purely personal taste.)

  My original idea was simple: I say a sentence (prototyping in a console app for now), the console program recognizes what I said, displays it, and performs some action based on it. (A beautiful idea; reality was grim.) As a newcomer to speech recognition I hit error after error, dithered over which company's API to pick, and searched Baidu for every speech recognition demo I could find to study from, yet very few of them actually ran on the .NET platform. Maybe I was searching in the wrong direction. I fell into pit after pit without a single success, got thoroughly discouraged, and nearly gave up several times, but in the end I couldn't resist playing with speech recognition.

  Here are the speech demo projects in my VS solution (screenshot of the solution omitted):

  The first one is today's protagonist; more on it later.

  The second and third are the Microsoft stack: the System.Speech.dll that ships with the OS, and Microsoft.Speech.dll, which I tried after reading an article on a Microsoft blog. The article was well written, but my attempt failed, and I ran into a quirk: the English-language Microsoft speech recognition was ineffective (Microsoft.Speech.Recognition), while the Chinese-language speech synthesis was ineffective (Microsoft.Speech.Synthesis). So I had to mix the two DLLs to get the effect I wanted. It did work in the end, but only for very simple cases: as soon as the vocabulary grew, the recognition rate dropped sharply. I kept suspecting the sampling rate, but I could never find a sampling-rate property or field; if anyone knows, please send me a hint and let me fly too, haha.

  The fourth is a Baidu speech recognition demo. The code is much cleaner and not hard to implement, but there are many small details to watch out for and plenty of landmines, with very little documentation to guide you out of the minefield. I stepped on them, and it was frustrating.

 

  First, let's look at the mainstream designs for speech recognition on the market today:

  1. Offline speech recognition

  Offline recognition is easy to understand: the recognition vocabulary lives locally or on the LAN, and no remote connection is needed. This was my original plan: build my own recognition vocabulary and map its entries to the actions I wanted, using the recognition and synthesis features in Microsoft's System.Speech.dll. I got simple Chinese recognition working, but as I grew the vocabulary the recognition rate kept dropping; whether my laptop microphone was to blame or something else, I don't know. Eventually I gave up, demoralized. While studying Baidu Speech I noticed it also offers an offline recognition library, but the official docs gave no concrete workflow or design guidance and I didn't dig in; I'd like to look into it properly when I have time.

using System;
//using Microsoft.Speech.Synthesis;   // the Chinese-language TTS here produces no sound
using Microsoft.Speech.Recognition;
using System.Speech.Synthesis;
//using System.Speech.Recognition;

namespace SAssassin.SpeechDemo
{
    /// <summary>
    /// Microsoft speech recognition, Chinese edition; the results seem a bit better.
    /// </summary>
    class Program
    {
        static SpeechSynthesizer sy = new SpeechSynthesizer();

        static void Main(string[] args)
        {
            // Create a Chinese (zh-CN) recognizer.
            using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("zh-CN")))
            {
                foreach (var config in SpeechRecognitionEngine.InstalledRecognizers())
                {
                    Console.WriteLine(config.Id);
                }

                // Initialize the command words.
                Choices commonds = new Choices();
                string[] commond1 = new string[] { "", "", "", "", "", "", "", "", "" }; // (entries lost in the original post)
                string[] commond2 = new string[] { "很高興見到你", "識別率", "assassin", "長沙", "湖南", "實習" };
                string[] commond3 = new string[] { "開燈", "關燈", "播放音樂", "關閉音樂", "澆水", "停止澆水", "打開背景燈", "關閉背景燈" };

                // Add the command words.
                commonds.Add(commond1);
                commonds.Add(commond2);
                commonds.Add(commond3);

                // Build a grammar around the command words.
                GrammarBuilder gBuilder = new GrammarBuilder();
                gBuilder.Append(commonds);
                Grammar grammar = new Grammar(gBuilder);

                // Load the grammar (recognition is more precise with explicit command words).
                recognizer.LoadGrammarAsync(grammar);
                // Attach a handler for the speech-recognized event.
                recognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(Recognizer_SpeechRecognized);
                // Route the default audio device into the recognizer.
                recognizer.SetInputToDefaultAudioDevice();
                // Start asynchronous, continuous recognition.
                recognizer.RecognizeAsync(RecognizeMode.Multiple);
                // Keep the console window open.
                Console.WriteLine("你好");
                sy.Speak("你好");
                Console.ReadLine();
            }
        }

        // SpeechRecognized event handler.
        static void Recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            Console.WriteLine("識別結果:" + e.Result.Text + " " + e.Result.Confidence + " " + DateTime.Now);
            sy.Speak(e.Result.Text);
        }
    }
}
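  As an aside on the sampling-rate question raised earlier: I never found a sample-rate property on the engine itself, but both System.Speech and Microsoft.Speech do let you describe the input format explicitly when you feed the recognizer a stream instead of the default audio device. A minimal sketch under that assumption; the helper name is mine, and the stream is assumed to be raw headerless PCM (for a .wav file with a header, SetInputToWaveFile reads the format itself):

using System.IO;
using Microsoft.Speech.AudioFormat;
using Microsoft.Speech.Recognition;

static class RecognizerInput
{
    // Replaces the SetInputToDefaultAudioDevice() call above with a raw PCM
    // stream whose layout is declared up front: 16 kHz, 16-bit, mono.
    public static void UseRawPcmInput(SpeechRecognitionEngine recognizer, Stream rawPcm)
    {
        var format = new SpeechAudioFormatInfo(16000, AudioBitsPerSample.Sixteen, AudioChannel.Mono);
        recognizer.SetInputToAudioStream(rawPcm, format);
    }
}

  Whether feeding the engine a known-good 16 kHz stream this way is the fix for the accuracy drop, I can't say; it is at least the one place the API exposes the format.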

  2. Online speech recognition

  Online recognition means our program sends the audio to a remote service, which does the matching and returns the result. It is generally exposed in a RESTful style, with JSON carrying the recognition results back and forth (a concrete sketch follows a couple of paragraphs below).

  I started with iFlytek. At first I knew nothing; friends recommended it and my own Baidu searching said iFlytek was excellent, so I went in with an open mind, but on Windows the only official demo is C++ and I work in C#. Languages aren't that far apart, but I didn't want the extra hassle, so I found a demo online, supposedly the most complete C# iFlytek demo there is. When I saw its tangled source my heart sank: it invokes the C++ functions directly. I ran the demo and it worked, simple record-then-recognize, but some parts were broken with no solution in sight, so I had to give up.

  Later, Baidu Speech caught my attention, and in July I picked up the Baidu demo again. The official demo is fairly simple. First you create an application on the Baidu speech open platform to get an App Key and a Secret Key, then download the demo and put the two keys into the constructor, a field, or a config file; the program uses them to issue requests. As I said at the start, this is online recognition in the REST style: upload the audio file to Baidu's recognition service and get the response data back into your program. At the beginning my skills weren't much and the configuration kept going wrong; the landmines started going off, and a few had to blow up, but in the end I got the demo's test file recognized, a small step for me. (If you happen to hit the same mines, feel free to discuss them with me; I may not know the answer, but I at least understand the ones I've stepped on, haha.)

  

   Next comes the design question. Recognition works and synthesis works too, but note that recognition and synthesis must be enabled separately, and each has its own App Key and Secret Key; even though the values are the same, be careful, or synthesis will misbehave. The next problem to consider: Baidu's design recognizes from a file, but what we mostly want is to speak into the microphone and have it recognized directly. That was my goal too, and the approach is: save the microphone input to a file as it comes in, and once input ends, call the recognition method on that file. That does work, but here the minefield starts again. Recording is implemented with NAudio.dll, which you can get from NuGet.

using NAudio.Wave;
using System;

namespace SAssassin.VOC
{
    /// <summary>
    /// Records microphone input to a wave file.
    /// </summary>
    public class RecordWaveToFile
    {
        private WaveFileWriter waveFileWriter = null;
        private WaveInEvent myWaveIn = null;   // WaveInEvent, not WaveIn: see the note below

        public void StartRecord()
        {
            ConfigWave();
            myWaveIn.StartRecording();
        }

        public void StopRecord()
        {
            myWaveIn.StopRecording();
        }

        private void ConfigWave()
        {
            string filePath = AppDomain.CurrentDomain.BaseDirectory + "Temp.wav";
            myWaveIn = new WaveInEvent()
            {
                WaveFormat = new WaveFormat(16000, 16, 1)   // 16 kHz, 16-bit, mono
                //WaveFormat = new WaveFormat()             // device default; clearer audio
            };
            myWaveIn.DataAvailable += new EventHandler<WaveInEventArgs>(WaveIn_DataAvailable);
            myWaveIn.RecordingStopped += new EventHandler<StoppedEventArgs>(WaveIn_RecordingStopped);
            waveFileWriter = new WaveFileWriter(filePath, myWaveIn.WaveFormat);
        }

        private void WaveIn_DataAvailable(object sender, WaveInEventArgs e)
        {
            if (waveFileWriter != null)
            {
                waveFileWriter.Write(e.Buffer, 0, e.BytesRecorded);
                waveFileWriter.Flush();
            }
        }

        private void WaveIn_RecordingStopped(object sender, StoppedEventArgs e)
        {
            // Finalize the wave file once recording has stopped.
            waveFileWriter?.Dispose();
            waveFileWriter = null;
            myWaveIn.Dispose();
        }
    }
}

  Using WaveInEvent in this console program raises no error, but just before this I had used the WaveIn class, and it threw immediately:

  "System.InvalidOperationException: 'Use WaveInEvent to record on a background thread'"

  I found the fix on StackOverflow: just replace WaveIn with WaveInEvent. Looking inside the classes, they implement the same interface and their structure is practically identical; one is meant for the GUI thread and the other for a background thread. With everything in place, recording worked, but when I played back my recording it was full of noise, the quality was muddy, sometimes outright distorted and useless, and sending it to Baidu failed to recognize anything. Raising the sample rate to 44k produced a very good recording, but then the problem returns: Baidu's recognizer only accepts low-rate 16-bit PCM (8k/16k). Annoying. I thought about switching to another, compressed format, but as soon as the sample rate comes down the quality falls apart again and recognition becomes a problem. This one I'll have to chip away at slowly.
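  One way I can imagine tackling it (a sketch, untested here, and the paths are made up): capture at the high rate that sounds clean, then let NAudio's Media Foundation resampler bring the file down to 16 kHz mono before uploading, instead of recording at the low rate directly.

using NAudio.MediaFoundation;
using NAudio.Wave;

public static class WaveResampler
{
    // Downsample a wave file to 16 kHz, 16-bit, mono for upload.
    public static void To16k(string inputPath, string outputPath)
    {
        MediaFoundationApi.Startup();   // initialize Media Foundation first
        using (var reader = new WaveFileReader(inputPath))
        using (var resampler = new MediaFoundationResampler(reader, new WaveFormat(16000, 16, 1)))
        {
            resampler.ResamplerQuality = 60;   // 1..60; higher means better quality
            WaveFileWriter.CreateWaveFile(outputPath, resampler);
        }
    }
}

  So the flow would be: record Temp.wav with WaveFormat(44100, 16, 1), then WaveResampler.To16k("Temp.wav", "Temp16k.wav") before sending it off. Whether the resampled file survives recognition better than a native low-rate capture is exactly the thing to test.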

  Now for today's main event (and my first post on this blog), so let's get to the point: Microsoft Cognitive Services. In mid-July I came across the Bing speech recognition API on Microsoft's Bing site, and the recognition quality made me gasp; the accuracy was just that high. So I went looking for the API, only to find the documentation was all in English. After reading it through, the usage model looked very good, also a remote-call style, but I searched the site forever and found only docs; I never spotted the product or trial pages at the time, so I could look but not touch. Exhausting. Then just these past few days, rereading the Bing speech docs, I finally ran into the term "Microsoft Cognitive Services". Forgive my shallow experience for never having heard of it; one Baidu search later, it turns out to be genuinely impressive. It bundles many APIs, and speech recognition is only a small dish among them: face recognition, semantic understanding, and other serious capabilities. Find the API, find the free trial, sign in to get the app's secret key, and you can start using it. I downloaded a demo, entered the secret key, ran a test, and wow, the recognition quality was simply stunning. Searching Baidu I also found very few people using the Cognitive Services speech API, which is partly why I wanted to write something up.

  I pulled large parts of the demo out into a console program; the source is below.

using System;
using System.ComponentModel;
using System.Configuration;
using System.IO;
using System.IO.IsolatedStorage;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.SpeechRecognition;   // from the Microsoft.ProjectOxford.SpeechRecognition NuGet package

public class SpeechConfig : INotifyPropertyChanged
{
    #region Fields
    /// <summary>
    /// The isolated storage subscription key file name.
    /// </summary>
    private const string IsolatedStorageSubscriptionKeyFileName = "Subscription.txt";

    /// <summary>
    /// The default subscription key prompt message.
    /// </summary>
    private const string DefaultSubscriptionKeyPromptMessage = "Secret key";

    /// <summary>
    /// You can also put the primary key in app.config, instead of using UI.
    /// </summary>
    private string subscriptionKey = ConfigurationManager.AppSettings["primaryKey"];

    /// <summary>
    /// Gets or sets the subscription key.
    /// </summary>
    public string SubscriptionKey
    {
        get
        {
            return this.subscriptionKey;
        }

        set
        {
            this.subscriptionKey = value;
            this.OnPropertyChanged<string>();
        }
    }

    /// <summary>
    /// The data recognition client.
    /// </summary>
    private DataRecognitionClient dataClient;

    /// <summary>
    /// The microphone client.
    /// </summary>
    private MicrophoneRecognitionClient micClient;

    #endregion Fields

    #region Event
    /// <summary>
    /// Implements the INotifyPropertyChanged interface.
    /// </summary>
    public event PropertyChangedEventHandler PropertyChanged;

    /// <summary>
    /// Helper function for the INotifyPropertyChanged interface.
    /// </summary>
    /// <typeparam name="T">Property type</typeparam>
    /// <param name="caller">Property name</param>
    private void OnPropertyChanged<T>([CallerMemberName]string caller = null)
    {
        this.PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(caller));
    }
    #endregion Event

    #region Properties
    /// <summary>
    /// Gets the current speech recognition mode.
    /// </summary>
    private SpeechRecognitionMode Mode
    {
        get
        {
            if (this.IsMicrophoneClientDictation ||
                this.IsDataClientDictation)
            {
                return SpeechRecognitionMode.LongDictation;
            }

            return SpeechRecognitionMode.ShortPhrase;
        }
    }

    /// <summary>
    /// Gets the default locale.
    /// </summary>
    private string DefaultLocale
    {
        //get { return "en-US"; }
        get { return "zh-CN"; }
    }

    /// <summary>
    /// Gets the Cognitive Service Authentication Uri.
    /// Empty if the global default is to be used.
    /// </summary>
    private string AuthenticationUri
    {
        get
        {
            return ConfigurationManager.AppSettings["AuthenticationUri"];
        }
    }

    /// <summary>
    /// Gets a value indicating whether or not to use the microphone.
    /// </summary>
    private bool UseMicrophone
    {
        get
        {
            return this.IsMicrophoneClientWithIntent ||
                this.IsMicrophoneClientDictation ||
                this.IsMicrophoneClientShortPhrase;
        }
    }

    /// <summary>
    /// Gets the short wave file path.
    /// </summary>
    private string ShortWaveFile
    {
        get
        {
            return ConfigurationManager.AppSettings["ShortWaveFile"];
        }
    }

    /// <summary>
    /// Gets the long wave file path.
    /// </summary>
    private string LongWaveFile
    {
        get
        {
            return ConfigurationManager.AppSettings["LongWaveFile"];
        }
    }
    #endregion Properties

    #region Mode selection
    /// <summary>
    /// Gets or sets a value indicating whether this instance is microphone client short phrase.
    /// </summary>
    public bool IsMicrophoneClientShortPhrase { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether this instance is microphone client dictation.
    /// </summary>
    public bool IsMicrophoneClientDictation { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether this instance is microphone client with intent.
    /// </summary>
    public bool IsMicrophoneClientWithIntent { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether this instance is data client short phrase.
    /// </summary>
    public bool IsDataClientShortPhrase { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether this instance is data client with intent.
    /// </summary>
    public bool IsDataClientWithIntent { get; set; }

    /// <summary>
    /// Gets or sets a value indicating whether this instance is data client dictation.
    /// </summary>
    public bool IsDataClientDictation { get; set; }

    #endregion

    #region Event handlers
    /// <summary>
    /// Called when the microphone status has changed.
    /// </summary>
    private void OnMicrophoneStatus(object sender, MicrophoneEventArgs e)
    {
        Task task = new Task(() =>
        {
            Console.WriteLine("--- Microphone status change received by OnMicrophoneStatus() ---");
            Console.WriteLine("********* Microphone status: {0} *********", e.Recording);
            if (e.Recording)
            {
                Console.WriteLine("Please start speaking.");
            }

            Console.WriteLine();
        });
        task.Start();
    }

    /// <summary>
    /// Called when a partial response is received.
    /// </summary>
    private void OnPartialResponseReceivedHandler(object sender, PartialSpeechResponseEventArgs e)
    {
        Console.WriteLine("--- Partial result received by OnPartialResponseReceivedHandler() ---");
        Console.WriteLine("{0}", e.PartialResult);
        Console.WriteLine();
    }

    /// <summary>
    /// Called when an error is received.
    /// </summary>
    private void OnConversationErrorHandler(object sender, SpeechErrorEventArgs e)
    {
        Console.WriteLine("--- Error received by OnConversationErrorHandler() ---");
        Console.WriteLine("Error code: {0}", e.SpeechErrorCode.ToString());
        Console.WriteLine("Error text: {0}", e.SpeechErrorText);
        Console.WriteLine();
    }

    /// <summary>
    /// Called when a final response is received (microphone, short phrase).
    /// </summary>
    private void OnMicShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        Task task = new Task(() =>
        {
            Console.WriteLine("--- OnMicShortPhraseResponseReceivedHandler ---");

            // We got the final result, so we can end the mic reco. No need to do this
            // for dataReco, since we already called endAudio() on it as soon as we were
            // done sending all the data.
            this.micClient.EndMicAndRecognition();

            this.WriteResponseResult(e);
        });
        task.Start();
    }

    /// <summary>
    /// Called when a final response is received (data, short phrase).
    /// </summary>
    private void OnDataShortPhraseResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        Task task = new Task(() =>
        {
            Console.WriteLine("--- OnDataShortPhraseResponseReceivedHandler ---");
            this.WriteResponseResult(e);
        });
        task.Start();
    }

    /// <summary>
    /// Called when a final response is received (microphone, dictation).
    /// </summary>
    private void OnMicDictationResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        Console.WriteLine("--- OnMicDictationResponseReceivedHandler ---");
        if (e.PhraseResponse.RecognitionStatus == RecognitionStatus.EndOfDictation ||
            e.PhraseResponse.RecognitionStatus == RecognitionStatus.DictationEndSilenceTimeout)
        {
            Task task = new Task(() =>
            {
                // We got the final result, so we can end the mic reco.
                this.micClient.EndMicAndRecognition();
            });
            task.Start();
        }

        this.WriteResponseResult(e);
    }

    /// <summary>
    /// Called when a final response is received (data, dictation).
    /// </summary>
    private void OnDataDictationResponseReceivedHandler(object sender, SpeechResponseEventArgs e)
    {
        Console.WriteLine("--- OnDataDictationResponseReceivedHandler ---");
        // Nothing to shut down here: endAudio() was already called on the data client
        // as soon as we finished sending all the data.
        this.WriteResponseResult(e);
    }

    /// <summary>
    /// Sends the audio from a wave file to the service.
    /// </summary>
    /// <param name="wavFileName">Name of the wav file.</param>
    private void SendAudioHelper(string wavFileName)
    {
        using (FileStream fileStream = new FileStream(wavFileName, FileMode.Open, FileAccess.Read))
        {
            // Note for wave files, we can just send data from the file right to the server.
            // If your audio is not in wave format and you just have raw data (for example,
            // audio coming over Bluetooth), then before sending up any audio data you must
            // first send a SpeechAudioFormat descriptor to describe its layout and format
            // via DataRecognitionClient's sendAudioFormat() method.
            int bytesRead = 0;
            byte[] buffer = new byte[1024];

            try
            {
                do
                {
                    // Get more audio data to send into the byte buffer.
                    bytesRead = fileStream.Read(buffer, 0, buffer.Length);

                    // Send the audio data to the service.
                    this.dataClient.SendAudio(buffer, bytesRead);
                }
                while (bytesRead > 0);
            }
            finally
            {
                // We are done sending audio. Final recognition results will arrive in the OnResponseReceived event.
                this.dataClient.EndAudio();
            }
        }
    }
    #endregion Event handlers

    #region Helper methods
    /// <summary>
    /// Gets the subscription key from isolated storage.
    /// </summary>
    /// <returns>The subscription key.</returns>
    private string GetSubscriptionKeyFromIsolatedStorage()
    {
        string subscriptionKey = null;

        using (IsolatedStorageFile isoStore = IsolatedStorageFile.GetStore(IsolatedStorageScope.User | IsolatedStorageScope.Assembly, null, null))
        {
            try
            {
                using (var iStream = new IsolatedStorageFileStream(IsolatedStorageSubscriptionKeyFileName, FileMode.Open, isoStore))
                {
                    using (var reader = new StreamReader(iStream))
                    {
                        subscriptionKey = reader.ReadLine();
                    }
                }
            }
            catch (FileNotFoundException)
            {
                subscriptionKey = null;
            }
        }

        if (string.IsNullOrEmpty(subscriptionKey))
        {
            subscriptionKey = DefaultSubscriptionKeyPromptMessage;
        }

        return subscriptionKey;
    }

    /// <summary>
    /// Creates a new microphone reco client without LUIS intent support.
    /// </summary>
    private void CreateMicrophoneRecoClient()
    {
        this.micClient = SpeechRecognitionServiceFactory.CreateMicrophoneClient(
            this.Mode, this.DefaultLocale, this.SubscriptionKey);

        this.micClient.AuthenticationUri = this.AuthenticationUri;

        // Event handlers for speech recognition results.
        this.micClient.OnMicrophoneStatus += this.OnMicrophoneStatus;
        this.micClient.OnPartialResponseReceived += this.OnPartialResponseReceivedHandler;
        if (this.Mode == SpeechRecognitionMode.ShortPhrase)
        {
            this.micClient.OnResponseReceived += this.OnMicShortPhraseResponseReceivedHandler;
        }
        else if (this.Mode == SpeechRecognitionMode.LongDictation)
        {
            this.micClient.OnResponseReceived += this.OnMicDictationResponseReceivedHandler;
        }

        this.micClient.OnConversationError += this.OnConversationErrorHandler;
    }

    /// <summary>
    /// Creates a data client without LUIS intent support.
    /// Speech recognition with data (for example from a file or audio source).
    /// The data is broken up into buffers and each buffer is sent to the Speech Recognition Service.
    /// No modification is done to the buffers, so the user can apply their
    /// own silence detection if desired.
    /// </summary>
    private void CreateDataRecoClient()
    {
        this.dataClient = SpeechRecognitionServiceFactory.CreateDataClient(
            this.Mode,
            this.DefaultLocale,
            this.SubscriptionKey);
        this.dataClient.AuthenticationUri = this.AuthenticationUri;

        // Event handlers for speech recognition results.
        if (this.Mode == SpeechRecognitionMode.ShortPhrase)
        {
            this.dataClient.OnResponseReceived += this.OnDataShortPhraseResponseReceivedHandler;
        }
        else
        {
            this.dataClient.OnResponseReceived += this.OnDataDictationResponseReceivedHandler;
        }

        this.dataClient.OnPartialResponseReceived += this.OnPartialResponseReceivedHandler;
        this.dataClient.OnConversationError += this.OnConversationErrorHandler;
    }

    /// <summary>
    /// Writes the response result.
    /// </summary>
    /// <param name="e">The <see cref="SpeechResponseEventArgs"/> instance containing the event data.</param>
    private void WriteResponseResult(SpeechResponseEventArgs e)
    {
        if (e.PhraseResponse.Results.Length == 0)
        {
            Console.WriteLine("No phrase response is available.");
        }
        else
        {
            Console.WriteLine("********* Final n-BEST Results *********");
            for (int i = 0; i < e.PhraseResponse.Results.Length; i++)
            {
                Console.WriteLine(
                    "[{0}] Confidence={1}, Text=\"{2}\"",
                    i,
                    e.PhraseResponse.Results[i].Confidence,
                    e.PhraseResponse.Results[i].DisplayText);
                if (e.PhraseResponse.Results[i].DisplayText == "關閉。")
                {
                    Console.WriteLine("收到命令,馬上關閉");
                }
            }

            Console.WriteLine();
        }
    }
    #endregion Helper methods

    #region Init
    public SpeechConfig()
    {
        this.IsMicrophoneClientShortPhrase = true;
        this.IsMicrophoneClientWithIntent = false;
        this.IsMicrophoneClientDictation = false;
        this.IsDataClientShortPhrase = false;
        this.IsDataClientWithIntent = false;
        this.IsDataClientDictation = false;

        this.SubscriptionKey = this.GetSubscriptionKeyFromIsolatedStorage();
    }

    /// <summary>
    /// Starts speech recognition.
    /// </summary>
    public void SpeechRecognize()
    {
        if (this.UseMicrophone)
        {
            if (this.micClient == null)
            {
                this.CreateMicrophoneRecoClient();
            }

            this.micClient.StartMicAndRecognition();
        }
        else
        {
            if (null == this.dataClient)
            {
                this.CreateDataRecoClient();
            }

            this.SendAudioHelper((this.Mode == SpeechRecognitionMode.ShortPhrase) ? this.ShortWaveFile : this.LongWaveFile);
        }
    }
    #endregion Init
}

   The referenced assemblies can all be pulled in through NuGet with little trouble. One thing to note: when downloading the Microsoft speech packages, you must install both of them, or you'll get errors, and the version must be 4.5 or above.

  Just replace the default key with your own and the program runs; the results are seriously good.
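  For completeness, the entry point that drives the class above is only a few lines; a minimal sketch (this Program class is mine, not part of the demo):

using System;

class Program
{
    static void Main(string[] args)
    {
        // The constructor selects microphone + ShortPhrase mode and loads
        // the subscription key from isolated storage / app.config.
        var speech = new SpeechConfig();
        speech.SpeechRecognize();

        Console.WriteLine("Speak into the microphone; press Enter to quit.");
        Console.ReadLine();
    }
}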

  

   The recognition accuracy is really, really good, and I'm very satisfied. The only catch is that Microsoft's free trial lasts just one month, so I'll have to make it bloom and bear fruit within the month, haha.

 

   2017-08-20. May I come back and see these footsteps once I've made something of myself technically.

