(3) HoloLens Unity Development: Voice Recognition


**Based on the official documentation "Voice input in Unity"**

Parts of these notes are direct translations of the official documentation; passages that are open to differing interpretation, and some that are fairly self-explanatory, are kept in the original English.


HoloLens has three main input systems: gaze, gesture, and voice. This article covers voice input. (In my testing, Chinese voice input is not supported.)

I. Overview

The HoloToolKit Unity package provides three ways to handle voice input:

  • Phrase Recognition
    * KeywordRecognizer: recognizes single keywords
    * GrammarRecognizer: recognizes phrases defined by an SRGS grammar

  • Dictation Recognition
    * DictationRecognizer: converts recognized speech into text

Note: All three recognizers must be started explicitly by calling their Start() method. Once started, a KeywordRecognizer or GrammarRecognizer listens continuously but fires its OnPhraseRecognized callback only when one of the registered phrases is heard, while a DictationRecognizer transcribes everything the user says, raising hypothesis and result events as speech is recognized.

II. Enabling the Microphone Capability in Unity

The official documentation explains how to enable the Microphone capability as follows:

The Microphone capability must be declared for an app to leverage Voice input.

  1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player"
  2. Click on the "Windows Store" tab
  3. In the "Publishing Settings > Capabilities" section, check the Microphone capability

III. Phrase Recognition

To enable your app to listen for specific phrases spoken by the user then take some action, you need to:

  1. Specify which phrases to listen for using a KeywordRecognizer or GrammarRecognizer
  2. Handle the OnPhraseRecognized event and take action corresponding to the phrase recognized


1. Keyword Recognition (demo code)

using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceInputDemo : MonoBehaviour {

    public Material yellow;
    public Material red;
    public Material blue;
    public Material green;

    /// <summary>
    /// The keyword recognizer.
    /// </summary>
    private KeywordRecognizer keywordRecognizer;

    /// <summary>
    /// Dictionary mapping each keyword to the action to run when it is heard.
    /// </summary>
    private Dictionary<string, System.Action> keywords = new Dictionary<string, System.Action>();

    void Start () {

        // Add keywords to the dictionary; the key is the keyword,
        // the value is an anonymous action to run when it is recognized.
        keywords.Add("yellow", () =>
        {
            Debug.Log("Heard: yellow");
            GetComponent<MeshRenderer>().material = yellow;
        });

        keywords.Add("red", () =>
        {
            Debug.Log("Heard: red");
            GetComponent<MeshRenderer>().material = red;
        });

        keywords.Add("green", () =>
        {
            Debug.Log("Heard: green");
            GetComponent<MeshRenderer>().material = green;
        });

        keywords.Add("blue", () =>
        {
            Debug.Log("Heard: blue");
            GetComponent<MeshRenderer>().material = blue;
        });

        // Initialize the recognizer with all registered keywords.
        keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());

        // Subscribe to the recognition event.
        keywordRecognizer.OnPhraseRecognized += KeywordRecognizer_OnPhraseRecognized;

        // Note: this call is required -- it starts the recognizer listening.
        keywordRecognizer.Start();
    }

    private void KeywordRecognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // If the recognized keyword is in our dictionary, invoke its action.
        System.Action keywordAction;
        if (keywords.TryGetValue(args.text, out keywordAction))
        {
            Debug.Log("Recognized keyword: " + args.text);
            keywordAction.Invoke();
        }
    }
}

2. Grammar Recognition (GrammarRecognizer)

According to the official documentation, you create an SRGS XML file and place it in the StreamingAssets folder. I didn't need English grammar input myself, so I haven't tried this; if you're interested, see https://msdn.microsoft.com/en-us/library/hh378349(v=office.14).aspx for the official explanation of SRGS.

Below is the relevant code from the official documentation.
Once you have your SRGS grammar, and it is in your project in a StreamingAssets folder:

<PROJECT_ROOT>/Assets/StreamingAssets/SRGS/myGrammar.xml

Create a GrammarRecognizer and pass it the path to your SRGS file:

private GrammarRecognizer grammarRecognizer;
grammarRecognizer = new GrammarRecognizer(Application.streamingAssetsPath + "/SRGS/myGrammar.xml");

Now register for the OnPhraseRecognized event

grammarRecognizer.OnPhraseRecognized += Grammar_OnPhraseRecognized;

You will get a callback containing information specified in your SRGS grammar which you can handle appropriately. Most of the important information will be provided in the semanticMeanings array.

private void Grammar_OnPhraseRecognized(PhraseRecognizedEventArgs args)
{
    SemanticMeaning[] meanings = args.semanticMeanings;
    // do something
}

Finally, start recognizing!

grammarRecognizer.Start();
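For reference, here is a minimal sketch of what such an SRGS grammar file might contain, written to match the hypothetical `myGrammar.xml` path above. The rule names and phrases are illustrative, not from the original article; it would accept commands like "change color to red":

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Minimal SRGS grammar: recognizes "change color to <color>". -->
<grammar version="1.0" xml:lang="en-US" root="colorCommand"
         xmlns="http://www.w3.org/2001/06/grammar" tag-format="semantics/1.0">
  <rule id="colorCommand" scope="public">
    <item>change color to</item>
    <ruleref uri="#color"/>
    <!-- Expose the matched color in the semanticMeanings array. -->
    <tag>out.color = rules.latest();</tag>
  </rule>
  <rule id="color">
    <one-of>
      <item>red</item>
      <item>green</item>
      <item>blue</item>
      <item>yellow</item>
    </one-of>
  </rule>
</grammar>
```

The `tag` element is what populates the semanticMeanings array mentioned above, so the callback can react to the matched color rather than the raw text.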

IV. Dictation

1. Overview

DictationRecognizer converts speech input into text. Using it involves three steps:

  1. Create a DictationRecognizer
  2. Register for the Dictation events
  3. Start recognizing dictation

2. Enabling the Internet Client Capability

The "Internet Client" capability, in addition to the "Microphone" capability mentioned above, must be declared for an app to leverage dictation.

  1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player" page
  2. Click on the "Windows Store" tab
  3. In the "Publishing Settings > Capabilities" section, check the InternetClient capability
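As a rough illustration (abbreviated, not the full manifest), the two capabilities above end up in the UWP Package.appxmanifest that Unity generates looking something like this:

```xml
<Capabilities>
  <!-- Required for dictation: speech is processed by a cloud service. -->
  <Capability Name="internetClient" />
  <!-- Required for any voice input. -->
  <DeviceCapability Name="microphone" />
</Capabilities>
```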


3. Demo Code

using UnityEngine;
using UnityEngine.Windows.Speech;

public class VoiceDictationDemo : MonoBehaviour
{

    private DictationRecognizer dictationRecognizer;

    void Start()
    {

        // Create the dictation recognizer.
        dictationRecognizer = new DictationRecognizer();

        // Fired when a phrase has been recognized as final text.
        dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;
        // Fired when the recognizer session ends (completion, timeout, etc.).
        dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;
        // Fired when an error occurs.
        dictationRecognizer.DictationError += DictationRecognizer_DictationError;
        // Fired with a tentative guess while the user is still speaking.
        dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;

        dictationRecognizer.Start();
    }

    private void DictationRecognizer_DictationHypothesis(string text)
    {
        Debug.Log("Hypothesis callback: " + text);
    }

    private void DictationRecognizer_DictationError(string error, int hresult)
    {
        Debug.Log("Error callback: " + error + " HResult: " + hresult);
    }

    private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
    {

        Debug.Log("Complete callback: " + cause);

        // Dictation stops automatically after a silence timeout; restart it
        // if the session ended for any reason other than normal completion.
        if (cause != DictationCompletionCause.Complete)
        {
            dictationRecognizer.Start();
        }
    }

    private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
    {
        Debug.Log("Result callback: " + text + " Confidence: " + confidence);
    }

    void OnDestroy()
    {
        // Unsubscribe from all events and release the recognizer.
        dictationRecognizer.DictationResult -= DictationRecognizer_DictationResult;
        dictationRecognizer.DictationComplete -= DictationRecognizer_DictationComplete;
        dictationRecognizer.DictationHypothesis -= DictationRecognizer_DictationHypothesis;
        dictationRecognizer.DictationError -= DictationRecognizer_DictationError;
        dictationRecognizer.Dispose();
    }

}

I tested it against short English videos from Youdao and got a recognition rate of roughly 98% or better. Microsoft has done an impressive job here.

V. Using Phrase Recognition and Dictation Together (from the official docs)

If you want to use both phrase recognition and dictation in your app, you'll need to fully shut one down before you can start the other. If you have multiple KeywordRecognizers running, you can shut them all down at once with:

PhraseRecognitionSystem.Shutdown();

In order to restore all recognizers to their previous state, after the DictationRecognizer has stopped, you can call:

PhraseRecognitionSystem.Restart();

You could also just start a KeywordRecognizer, which will restart the PhraseRecognitionSystem as well.
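Putting those pieces together, a minimal sketch of switching from keyword recognition to dictation and back might look like the following. The class and method names are illustrative, not from the article; only the PhraseRecognitionSystem and recognizer calls come from the documentation above:

```csharp
using UnityEngine.Windows.Speech;

public class SpeechModeSwitcher
{
    private KeywordRecognizer keywordRecognizer;
    private DictationRecognizer dictationRecognizer;

    public SpeechModeSwitcher(string[] keywords)
    {
        keywordRecognizer = new KeywordRecognizer(keywords);
        dictationRecognizer = new DictationRecognizer();
    }

    // Switch to dictation: the phrase recognition system must be fully
    // shut down before the DictationRecognizer can start.
    public void StartDictation()
    {
        PhraseRecognitionSystem.Shutdown();
        dictationRecognizer.Start();
    }

    // Switch back: stop dictation first, then restore phrase recognition.
    public void StopDictationAndResumeKeywords()
    {
        dictationRecognizer.Stop();
        PhraseRecognitionSystem.Restart();
        // Alternatively, starting any KeywordRecognizer also restarts
        // the PhraseRecognitionSystem:
        // keywordRecognizer.Start();
    }
}
```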

