Azure Cognitive Services: Recognizing Handwritten Chinese Characters with Ink Recognizer


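Previously we used Azure Face for face recognition and Azure Form Recognizer to extract data from tables. This time we will try the Azure Ink Recognizer API to recognize handwriting.

Ink Recognizer

The Ink Recognizer Cognitive Service provides a cloud-based REST API to analyze and recognize digital ink content. Unlike services that use Optical Character Recognition (OCR), the API requires digital ink stroke data as input. Digital ink strokes are time-ordered sets of 2D points (X,Y coordinates) that represent the motion of a digital pen or a finger. Ink Recognizer then recognizes the shapes and handwritten content in the input and returns a JSON response containing all recognized entities.

Quoted from the Microsoft documentation

In other words, it does not run OCR on an image; it recognizes ink data. Ink data mainly comes from handwriting input devices such as tablets and graphics tablets.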

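Creating an Ink Recognizer resource

As in the earlier posts, find Ink Recognizer in the portal, click Create, and give the instance a name. Ink Recognizer also has a free tier: choose the F0 pricing plan, which allows 5 calls per minute and 20,000 transactions per month.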
d8uQJI.png

Getting the key and endpoint

Calling the Ink Recognizer API requires a key and an endpoint. Click the "Keys and Endpoint" menu item to view them.
d8ulWt.png

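Creating a new WPF project

This time we will again build a small WPF application. The window contains an InkCanvas for handwriting, a text box that shows the recognized text, and a button that triggers recognition.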
d31qhj.png

MainWindow.xaml

Change MainWindow.xaml to the following:

<Window x:Class="InkRec2.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d"
        xmlns:local="clr-namespace:NoteTaker"
        xmlns:controls="clr-namespace:Microsoft.Toolkit.Wpf.UI.Controls;assembly=Microsoft.Toolkit.Wpf.UI.Controls"
        Title="MainWindow">

    <Grid >
        <Grid.RowDefinitions>
            <RowDefinition Height="4*" />
            <RowDefinition Height="1*" />
            <RowDefinition Height="50" />

        </Grid.RowDefinitions>
        <Border Grid.Row ="0" BorderBrush="Black" BorderThickness="1">
            <controls:InkCanvas x:Name="inkCanvas" Loaded="inkCanvas_Loaded"/>
        </Border>
        <Border Grid.Row ="1" BorderBrush="Black" BorderThickness="1">
            <ScrollViewer>
                <TextBox x:Name="output" FontSize="18" TextWrapping="Wrap"/>
            </ScrollViewer>
        </Border>
        <StackPanel Grid.Row="2" Orientation="Horizontal">
            <Button Click="Button_InkRec">Recognize</Button>
        </StackPanel>
    </Grid>
</Window>

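Note: the InkCanvas control must be the one from the Microsoft.Toolkit.Wpf.UI.Controls package. If the package is not installed locally, install it with NuGet.

Collecting ink

Set the input device types in the InkCanvas Loaded event handler: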

        private void inkCanvas_Loaded(object sender, RoutedEventArgs e)
        {
            inkCanvas.InkPresenter.InputDeviceTypes = CoreInputDeviceTypes.Mouse | CoreInputDeviceTypes.Pen | CoreInputDeviceTypes.Touch;
        }

First define a few models to hold the ink data:


    public class InkStroke
    {
        public int id { get; set; }

        public string points { get; set; }
    }

    public class InkData
    {
        public string language { get; set; }

        public List<InkStroke> strokes { get; set; }
    }
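To make the shape of the request body concrete, here is a minimal sketch that serializes these two models with Newtonsoft.Json (the same JsonConvert used later in this post). The class name PayloadDemo and the stroke coordinates are invented for illustration; real coordinates come from the InkCanvas.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class InkStroke
{
    public int id { get; set; }
    public string points { get; set; }
}

public class InkData
{
    public string language { get; set; }
    public List<InkStroke> strokes { get; set; }
}

// Hypothetical helper, not part of the original program.
public static class PayloadDemo
{
    // Builds a tiny request body containing one made-up stroke.
    public static string BuildSamplePayload()
    {
        var data = new InkData
        {
            language = "zh-CN",
            strokes = new List<InkStroke>
            {
                // points is a flat "x1,y1,x2,y2,..." list; these values are invented.
                new InkStroke { id = 0, points = "10.5,20.25,11,21.5" }
            }
        };
        return JsonConvert.SerializeObject(data);
    }
}
```

Calling PayloadDemo.BuildSamplePayload() should give a single JSON object with a language field and a strokes array, where each stroke carries its id and its points string.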

Read the ink data from the InkCanvas and assemble it into an InkData object:

        private InkData GetInkData()
        {
            var data = new InkData();
            data.language = "zh-CN";
            data.strokes = new List<InkStroke>();

            int id = 0;
            foreach (var stroke in this.inkCanvas.InkPresenter.StrokeContainer.GetStrokes())
            {
                var points = stroke.GetInkPoints();

                var convertPoints = ConvertPixelsToMillimeters(points);

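                var inkStroke = new InkStroke();
                inkStroke.id = id++;

                // Flatten the points into "x1,y1,x2,y2,...". Format with the invariant culture
                // (needs using System.Globalization) so the decimal separator is always '.',
                // because ',' already separates the coordinates.
                var sb = new StringBuilder();
                foreach (var point in convertPoints)
                {
                    sb.Append(point.X.ToString(CultureInfo.InvariantCulture));
                    sb.Append(",");
                    sb.Append(point.Y.ToString(CultureInfo.InvariantCulture));
                    sb.Append(",");
                }
                inkStroke.points = sb.ToString().TrimEnd(',');

                data.strokes.Add(inkStroke);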
            }

            return data;
        }

        private List<System.Windows.Point> ConvertPixelsToMillimeters(IReadOnlyList<InkPoint> pointsInPixels)
        {
            float dpiX = 96.0f;
            float dpiY = 96.0f;
            var transformedInkPoints = new List<System.Windows.Point>();
            const float inchToMillimeterFactor = 25.4f;


            foreach (var point in pointsInPixels)
            {
                var transformedX = (point.Position.X / dpiX) * inchToMillimeterFactor;
                var transformedY = (point.Position.Y / dpiY) * inchToMillimeterFactor;

                transformedInkPoints.Add(new System.Windows.Point(transformedX, transformedY));
            }

            return transformedInkPoints;
        }
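The conversion above is just pixels / DPI * 25.4. As a quick worked example, here is a small standalone sketch with no WPF dependency. The names InkUnits, PixelsToMillimeters, and FormatPoint are made up for this illustration, and the 96 DPI default mirrors the hard-coded value in the code above; a real display may report a different DPI.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original program.
public static class InkUnits
{
    private const double MillimetersPerInch = 25.4;

    // Converts a length in pixels to millimeters for the given DPI.
    public static double PixelsToMillimeters(double pixels, double dpi = 96.0)
    {
        return pixels / dpi * MillimetersPerInch;
    }

    // Formats one point as "x,y" using '.' as the decimal separator,
    // whatever the culture of the machine is.
    public static string FormatPoint(double xMm, double yMm)
    {
        return xMm.ToString(CultureInfo.InvariantCulture) + ","
             + yMm.ToString(CultureInfo.InvariantCulture);
    }
}
```

At 96 DPI, 96 pixels is exactly one inch, so PixelsToMillimeters(96) gives 25.4 and PixelsToMillimeters(48) gives 12.7. Formatting with the invariant culture matters because the coordinates are themselves separated by commas.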

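Calling the Ink Recognizer API

This step needs the key and endpoint address copied earlier. Recognition itself is simple: convert the ink data to JSON and send it to the server in a PUT request. If recognition succeeds, the server returns the result as a JSON string.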

        private async Task<string> InkRec(InkData data)
        {
            string inkRecognitionUrl = "/inkrecognizer/v1.0-preview/recognize";
            // Placeholders: replace "x" with the endpoint and key copied from the portal.
            string endPoint = "x";
            string subscriptionKey = "x";

            using (HttpClient client = new HttpClient { BaseAddress = new Uri(endPoint) })
            {
                System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
                client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
                client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
                var jsonData = JsonConvert.SerializeObject(data);
                var content = new StringContent(jsonData, Encoding.UTF8, "application/json");
                var res = await client.PutAsync(inkRecognitionUrl, content);
                if (res.IsSuccessStatusCode)
                {
                    var result = await res.Content.ReadAsStringAsync();

                    return result;
                }
                else
                {
                    var err = $"ErrorCode: {res.StatusCode}";

                    return err;
                }
            }
        }

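Parsing the recognition result

On success, the result comes back as a JSON string. It contains a recognitionUnits array that holds the recognition result for each piece of handwriting, along with the final overall result.
Sample result: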

{"recognitionUnits":[{"alternates":[{"category":"inkWord","recognizedString":"乖"},{"category":"inkWord","recognizedString":"黍"},{"category":"inkWord","recognizedString":"秉"},{"category":"inkWord","recognizedString":"乗"},{"category":"inkWord","recognizedString":"埀"}],"boundingRectangle":{"height":48.159999847412109,"topX":7.190000057220459,"topY":22.010000228881836,"width":35.639999389648438},"category":"inkWord","class":"leaf","id":4,"parentId":3,"recognizedText":"乘","rotatedBoundingRectangle":[{"x":41.490001678466797,"y":21.25},{"x":43.209999084472656,"y":69.239997863769531},{"x":7.8299999237060547,"y":70.5},{"x":6.1100001335144043,"y":22.520000457763672}],"strokeIds":[0,1,2,3,4,5,6,7,8,9]},{"alternates":[{"category":"inkWord","recognizedString":"風"},{"category":"inkWord","recognizedString":"夙"},{"category":"inkWord","recognizedString":"鳳"},{"category":"inkWord","recognizedString":"凡"},{"category":"inkWord","recognizedString":"㶡"}],"boundingRectangle":{"height":32."class":"leaf","id":8,"parent
...

So all we need to do is deserialize it and pull out the result we want.

    public class InkRecResponse
    {
        public List<InkRecResponseUnit> recognitionUnits { get; set; }
    }

    public class InkRecResponseUnit
    {
        public string category { get; set; }

        public string recognizedText { get; set; }
    }
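
        private async void Button_InkRec(object sender, RoutedEventArgs e)
        {
            var inkData = GetInkData();
            var response = await InkRec(inkData);

            // InkRec returns "ErrorCode: ..." instead of JSON when the request fails,
            // so show that message as-is rather than trying to deserialize it.
            if (response.StartsWith("ErrorCode:"))
            {
                this.output.Text = response;
                return;
            }

            var jsonObj = JsonConvert.DeserializeObject<InkRecResponse>(response);

            // The unit with category "line" carries the text of the whole written line.
            var line = jsonObj.recognitionUnits.FirstOrDefault(o => o.category == "line");

            this.output.Text = line?.recognizedText ?? string.Empty;
        }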
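The sample response also carries an alternates list for each inkWord unit, which the models above ignore. As a rough sketch of how those candidates could be read, the classes below add that field. The class names and the DescribeWords helper are invented for this example; only the JSON property names (alternates, recognizedString, category, recognizedText) are taken from the sample response.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

public class InkAlternate
{
    public string category { get; set; }
    public string recognizedString { get; set; }
}

public class InkUnitWithAlternates
{
    public string category { get; set; }
    public string recognizedText { get; set; }
    public List<InkAlternate> alternates { get; set; }
}

public class InkResponseWithAlternates
{
    public List<InkUnitWithAlternates> recognitionUnits { get; set; }
}

// Hypothetical helper, not part of the original program.
public static class AlternatesDemo
{
    // For each recognized word, returns "best guess: alt1, alt2, ...".
    public static List<string> DescribeWords(string json)
    {
        var response = JsonConvert.DeserializeObject<InkResponseWithAlternates>(json);
        var units = response?.recognitionUnits ?? new List<InkUnitWithAlternates>();

        return units
            .Where(u => u.category == "inkWord")
            .Select(u => u.recognizedText + ": " + string.Join(", ",
                (u.alternates ?? new List<InkAlternate>()).Select(a => a.recognizedString)))
            .ToList();
    }
}
```

For the first unit in the sample response above, this would produce the entry "乘: 乖, 黍, 秉, 乗, 埀". A candidate list like this could be offered to the user when the top result is wrong.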

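Running it

The program is finished, so let's run it. Write a few Chinese characters on the canvas and click the recognize button. The handwriting is a bit ugly, but the result is still perfect.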
d31bNQ.png

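Summary

Azure Ink Recognizer makes it easy to recognize handwriting captured by handwriting input devices. It is not simple OCR: it works from the individual strokes and offers candidate results. There is a fair amount of code above, but most of it deals with collecting the ink data; the recognition itself is just one HTTP PUT request, which is very simple. This API opens the door to many ideas. For example, with a small improvement the code above could recognize handwriting continuously as you write, and packaged into a tablet it would become a handwriting pad with real-time recognition.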
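One possible way to approach continuous recognition is to debounce: wait until the user has paused writing for a moment, then send one request. The sketch below is a generic debouncer and is not tied to WPF. The Debouncer class is made up for this illustration, and which InkCanvas event you would call it from (whenever a stroke is finished) is an assumption to check against the control you use.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs an action only after calls have stopped for a quiet period.
public sealed class Debouncer
{
    private readonly TimeSpan _delay;
    private readonly object _gate = new object();
    private CancellationTokenSource _cts;

    public Debouncer(TimeSpan delay)
    {
        _delay = delay;
    }

    // Call this every time a stroke is finished. Only the last call
    // within the quiet period actually runs the action.
    public Task Trigger(Func<Task> action)
    {
        CancellationTokenSource cts;
        lock (_gate)
        {
            _cts?.Cancel(); // a newer stroke supersedes the pending run
            _cts = cts = new CancellationTokenSource();
        }
        return RunAsync(action, cts.Token);
    }

    private async Task RunAsync(Func<Task> action, CancellationToken token)
    {
        try
        {
            await Task.Delay(_delay, token);
        }
        catch (OperationCanceledException)
        {
            return; // superseded by a later Trigger call
        }
        await action();
    }
}
```

In the program above, the action would be the body of Button_InkRec. Since the F0 tier allows only 5 calls per minute, a fairly long quiet period (or an extra rate limit) would likely be needed.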
