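In earlier posts we used Azure Face for face recognition and the Azure Form Recognizer to extract data from tables. This time we will try the Azure Ink Recognizer API to recognize handwriting.
Ink Recognizer
The Ink Recognizer Cognitive Service provides a cloud-based REST API to analyze and recognize digital ink content. Unlike services that use Optical Character Recognition (OCR), the API requires digital ink stroke data as input. Digital ink strokes are time-ordered sets of 2D points (X,Y coordinates) that represent the motion of a digital pen or a finger. Ink Recognizer then recognizes the shapes and handwritten content in the input and returns a JSON response containing all recognized entities.
Quoted from the Microsoft documentation
In other words, it does not run OCR on an image; it recognizes ink data. Ink data comes mainly from handwriting input devices such as tablets and handwriting pads.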
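Create an Ink Recognizer resource
As in the previous posts, find Ink Recognizer in the Azure portal, click Create, and give the instance a name. Ink Recognizer also offers a free tier: choose the F0 pricing plan, which allows 5 calls per minute and 20,000 transactions per month.
Get the key and endpoint
Calling the Ink Recognizer API requires a key and an endpoint. Open the "Keys and Endpoint" menu to view them.
Create a new WPF project
As before, we will build a small WPF application. The window contains an InkCanvas for handwriting, a text box that shows the recognized text, and a button that triggers recognition.
MainWindow.xaml
Change MainWindow.xaml to the following: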
<Window x:Class="InkRec2.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d"
        xmlns:local="clr-namespace:InkRec2"
        xmlns:controls="clr-namespace:Microsoft.Toolkit.Wpf.UI.Controls;assembly=Microsoft.Toolkit.Wpf.UI.Controls"
        Title="MainWindow">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="4*" />
            <RowDefinition Height="1*" />
            <RowDefinition Height="50" />
        </Grid.RowDefinitions>
        <Border Grid.Row="0" BorderBrush="Black" BorderThickness="1">
            <controls:InkCanvas x:Name="inkCanvas" Loaded="inkCanvas_Loaded"/>
        </Border>
        <Border Grid.Row="1" BorderBrush="Black" BorderThickness="1">
            <ScrollViewer>
                <TextBox x:Name="output" FontSize="18" TextWrapping="Wrap"/>
            </ScrollViewer>
        </Border>
        <StackPanel Grid.Row="2" Orientation="Horizontal">
            <Button Click="Button_InkRec">Recognize</Button>
        </StackPanel>
    </Grid>
</Window>
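Note: the InkCanvas control used here is the one from the Microsoft.Toolkit.Wpf.UI.Controls package. If it is not installed locally, install it with NuGet.
Collect the ink
In the InkCanvas Loaded event handler, set the input device types that are accepted: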
private void inkCanvas_Loaded(object sender, RoutedEventArgs e)
{
    // Accept ink from the mouse, a pen, and touch input.
    inkCanvas.InkPresenter.InputDeviceTypes = CoreInputDeviceTypes.Mouse | CoreInputDeviceTypes.Pen | CoreInputDeviceTypes.Touch;
}
First, define a few models to hold the ink data:
public class InkStroke
{
    public int id { get; set; }
    // All points of the stroke as one "x1,y1,x2,y2,..." string.
    public string points { get; set; }
}
public class InkData
{
    public string language { get; set; }
    public List<InkStroke> strokes { get; set; }
}
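To make the shape of the request body concrete, here is a minimal sketch that serializes the two models with two made-up strokes. It uses System.Text.Json only to stay free of external packages (the article itself uses Json.NET), and the coordinates and the PayloadDemo class are purely illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class InkStroke
{
    public int id { get; set; }
    public string points { get; set; }
}

public class InkData
{
    public string language { get; set; }
    public List<InkStroke> strokes { get; set; }
}

public static class PayloadDemo
{
    // Builds a request body from two hypothetical strokes.
    // The point values are invented; real ones come from the InkCanvas.
    public static string BuildSampleJson()
    {
        var data = new InkData
        {
            language = "zh-CN",
            strokes = new List<InkStroke>
            {
                new InkStroke { id = 0, points = "10.5,20.25,11,21.5" },
                new InkStroke { id = 1, points = "30,40,31.5,42" }
            }
        };
        return JsonSerializer.Serialize(data);
    }

    public static void Main()
    {
        // Prints a JSON object with a "language" field and a "strokes" array.
        Console.WriteLine(BuildSampleJson());
    }
}
```

Because the property names are lowercase in the models, they appear unchanged in the JSON, which is why no naming policy or attributes are needed here.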
Read the strokes from the InkCanvas and assemble them into an InkData object:
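// Requires: using System.Globalization; using System.Text;
private InkData GetInkData()
{
    var data = new InkData();
    data.language = "zh-CN";
    data.strokes = new List<InkStroke>();
    int id = 0;
    foreach (var stroke in this.inkCanvas.InkPresenter.StrokeContainer.GetStrokes())
    {
        var points = stroke.GetInkPoints();
        var convertPoints = ConvertPixelsToMillimeters(points);
        var inkStroke = new InkStroke();
        inkStroke.id = id++;
        // Flatten the points into the "x1,y1,x2,y2,..." string stored in InkStroke.points.
        var sb = new StringBuilder();
        foreach (var point in convertPoints)
        {
            // Use the invariant culture so the decimal separator is always ".",
            // otherwise a comma-decimal locale would corrupt the comma-separated list.
            sb.Append(point.X.ToString(CultureInfo.InvariantCulture));
            sb.Append(",");
            sb.Append(point.Y.ToString(CultureInfo.InvariantCulture));
            sb.Append(",");
        }
        inkStroke.points = sb.ToString().TrimEnd(',');
        data.strokes.Add(inkStroke);
    }
    return data;
}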
private List<System.Windows.Point> ConvertPixelsToMillimeters(IReadOnlyList<InkPoint> pointsInPixels)
{
    // Assumes a 96 DPI display in both directions.
    float dpiX = 96.0f;
    float dpiY = 96.0f;
    var transformedInkPoints = new List<System.Windows.Point>();
    const float inchToMillimeterFactor = 25.4f;
    foreach (var point in pointsInPixels)
    {
        // pixels / DPI = inches; inches * 25.4 = millimeters
        var transformedX = (point.Position.X / dpiX) * inchToMillimeterFactor;
        var transformedY = (point.Position.Y / dpiY) * inchToMillimeterFactor;
        transformedInkPoints.Add(new System.Windows.Point(transformedX, transformedY));
    }
    return transformedInkPoints;
}
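The conversion above is simply millimeters = pixels / DPI × 25.4, so 96 pixels at 96 DPI is exactly one inch, or 25.4 mm. The following standalone sketch isolates that arithmetic so it can be checked without an InkCanvas; the UnitConversion class and its dpi parameter are illustrative names, and 96 DPI is the same assumption the article's code makes, not a measured value.

```csharp
using System;

public static class UnitConversion
{
    private const double MillimetersPerInch = 25.4;

    // Converts a length in pixels to millimeters.
    // dpi defaults to 96, the value hard-coded in the article's helper.
    public static double PixelsToMillimeters(double pixels, double dpi = 96.0)
    {
        return pixels / dpi * MillimetersPerInch;
    }

    public static void Main()
    {
        // 96 px at 96 DPI is one inch.
        Console.WriteLine(PixelsToMillimeters(96));
        // Half an inch at 96 DPI.
        Console.WriteLine(PixelsToMillimeters(48));
        // The same 96 px on a hypothetical 192 DPI display is only half an inch.
        Console.WriteLine(PixelsToMillimeters(96, 192));
    }
}
```

If the application runs on a display with a different DPI, the hard-coded 96 would make the reported stroke sizes too large or too small, so passing the real DPI is worth considering.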
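Call the Ink Recognizer API
This step needs the key and endpoint copied earlier. Recognition itself is simple: serialize the ink data to JSON and send it to the service in a PUT request. On success, the service returns the result as a JSON string.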
private async Task<string> InkRec(InkData data)
{
    string inkRecognitionUrl = "/inkrecognizer/v1.0-preview/recognize";
    string endPoint = "x";          // replace with your endpoint
    string subscriptionKey = "x";   // replace with your key
    using (HttpClient client = new HttpClient { BaseAddress = new Uri(endPoint) })
    {
        // Allow TLS 1.0 to 1.2 for the HTTPS connection.
        System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        // The key is passed in the Ocp-Apim-Subscription-Key header.
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        var jsonData = JsonConvert.SerializeObject(data);
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");
        var res = await client.PutAsync(inkRecognitionUrl, content);
        if (res.IsSuccessStatusCode)
        {
            // Success: return the raw JSON response.
            var result = await res.Content.ReadAsStringAsync();
            return result;
        }
        else
        {
            // Failure: return a plain (non-JSON) error string.
            var err = $"ErrorCode: {res.StatusCode}";
            return err;
        }
    }
}
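The method above hard-codes the endpoint and key as placeholders. One possible variation, sketched below, reads them from environment variables and builds the PUT request as a separate step, which can be inspected without touching the network. The variable names INK_RECOGNIZER_ENDPOINT and INK_RECOGNIZER_KEY, the InkRequestFactory class, and the example.invalid fallback are all assumptions made for illustration; only the path and the header name come from the code above.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class InkRequestFactory
{
    // Same relative path as in the article's InkRec method.
    public const string RecognizePath = "/inkrecognizer/v1.0-preview/recognize";

    // Builds the PUT request without sending it.
    public static HttpRequestMessage Build(string endPoint, string subscriptionKey, string jsonBody)
    {
        if (string.IsNullOrWhiteSpace(endPoint))
            throw new ArgumentException("Endpoint is missing.", nameof(endPoint));
        if (string.IsNullOrWhiteSpace(subscriptionKey))
            throw new ArgumentException("Subscription key is missing.", nameof(subscriptionKey));

        var uri = new Uri(new Uri(endPoint), RecognizePath);
        var request = new HttpRequestMessage(HttpMethod.Put, uri);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        return request;
    }

    public static void Main()
    {
        // Hypothetical variable names; the fallbacks only keep the demo runnable.
        string endPoint = Environment.GetEnvironmentVariable("INK_RECOGNIZER_ENDPOINT")
                          ?? "https://example.invalid";
        string key = Environment.GetEnvironmentVariable("INK_RECOGNIZER_KEY")
                     ?? "placeholder-key";

        using (HttpRequestMessage request = Build(endPoint, key, "{\"language\":\"zh-CN\",\"strokes\":[]}"))
        {
            Console.WriteLine(request.Method + " " + request.RequestUri);
        }
    }
}
```

A request built this way could then be sent with HttpClient.SendAsync instead of PutAsync; keeping the key out of the source file also makes it harder to commit it by accident.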
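Parse the recognition result
On success, the result is returned as a JSON string. It contains a recognitionUnits array that holds the recognition result for each piece of ink, including alternate candidates, as well as the final combined result.
Sample result (truncated):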
{
  "recognitionUnits": [
    {
      "alternates": [
        {"category": "inkWord", "recognizedString": "乖"},
        {"category": "inkWord", "recognizedString": "黍"},
        {"category": "inkWord", "recognizedString": "秉"},
        {"category": "inkWord", "recognizedString": "乗"},
        {"category": "inkWord", "recognizedString": "埀"}
      ],
      "boundingRectangle": {"height": 48.159999847412109, "topX": 7.190000057220459, "topY": 22.010000228881836, "width": 35.639999389648438},
      "category": "inkWord",
      "class": "leaf",
      "id": 4,
      "parentId": 3,
      "recognizedText": "乘",
      "rotatedBoundingRectangle": [
        {"x": 41.490001678466797, "y": 21.25},
        {"x": 43.209999084472656, "y": 69.239997863769531},
        {"x": 7.8299999237060547, "y": 70.5},
        {"x": 6.1100001335144043, "y": 22.520000457763672}
      ],
      "strokeIds": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    },
    {
      "alternates": [
        {"category": "inkWord", "recognizedString": "風"},
        {"category": "inkWord", "recognizedString": "夙"},
        {"category": "inkWord", "recognizedString": "鳳"},
        {"category": "inkWord", "recognizedString": "凡"},
        {"category": "inkWord", "recognizedString": "㶡"}
      ],
      "boundingRectangle": {"height": 32. ... "class": "leaf", "id": 8, "parent
...
So all we need to do is deserialize the response and pick out the recognition result we want.
// Only the fields we need are mapped; Json.NET ignores the rest by default.
public class InkRecResponse
{
    public List<InkRecResponseUnit> recognitionUnits { get; set; }
}
public class InkRecResponseUnit
{
    public string category { get; set; }
    public string recognizedText { get; set; }
}
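// Requires: using System.Linq;
private async void Button_InkRec(object sender, RoutedEventArgs e)
{
    var inkData = GetInkData();
    var response = await InkRec(inkData);
    // On failure InkRec returns "ErrorCode: ...", which is not JSON, so show it as-is.
    if (response.StartsWith("ErrorCode:", StringComparison.Ordinal))
    {
        this.output.Text = response;
        return;
    }
    var jsonObj = JsonConvert.DeserializeObject<InkRecResponse>(response);
    // Take the first unit whose category is "line"; it may be missing if nothing was recognized.
    var line = jsonObj?.recognitionUnits?.FirstOrDefault(o => o.category == "line");
    this.output.Text = line?.recognizedText ?? string.Empty;
}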
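The handler above shows only the first unit whose category is "line". If the user writes several lines, it may be useful to collect all of them. The sketch below does that with System.Text.Json so it runs without extra packages (the article uses Json.NET); the InkResultParser class is an illustrative name, and the sample JSON in Main is invented for the demo rather than taken from a real response.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class InkResultParser
{
    // Returns the recognizedText of every unit whose category is "line".
    // Returns an empty list if the expected structure is not present.
    public static List<string> ExtractLines(string json)
    {
        var lines = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object ||
                !root.TryGetProperty("recognitionUnits", out JsonElement units) ||
                units.ValueKind != JsonValueKind.Array)
            {
                return lines;
            }

            foreach (JsonElement unit in units.EnumerateArray())
            {
                if (unit.ValueKind == JsonValueKind.Object &&
                    unit.TryGetProperty("category", out JsonElement category) &&
                    category.ValueKind == JsonValueKind.String &&
                    category.GetString() == "line" &&
                    unit.TryGetProperty("recognizedText", out JsonElement text) &&
                    text.ValueKind == JsonValueKind.String)
                {
                    lines.Add(text.GetString());
                }
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Made-up sample with one word unit and two line units.
        string sample =
            "{\"recognitionUnits\":[" +
            "{\"category\":\"inkWord\",\"recognizedText\":\"A\"}," +
            "{\"category\":\"line\",\"recognizedText\":\"first line\"}," +
            "{\"category\":\"line\",\"recognizedText\":\"second line\"}]}";

        foreach (string line in ExtractLines(sample))
        {
            Console.WriteLine(line);
        }
    }
}
```

In the WPF handler, the returned list could be joined with Environment.NewLine before assigning it to the text box.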
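Run it
The program is finished, so let's run it. Write a few Chinese characters on the canvas and click the recognize button. The handwriting is a bit ugly, but the result is still perfect.
Summary
Azure Ink Recognizer makes it easy to recognize handwriting from handwriting input devices. It is not simple OCR: it works on stroke data and provides alternate candidates for what it recognizes. Although there is a fair amount of code above, most of it deals with collecting the ink data; the recognition itself is just a single HTTP PUT request, which is very simple. With this API many ideas become possible. For example, with small changes to the code above you could implement continuous recognition that keeps recognizing as you write. Packaged into a tablet, that becomes a handwriting pad with real-time recognition.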
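One way to approach the continuous recognition idea is to debounce: restart a short timer every time a stroke is finished and call the API only once the user pauses, which also helps stay under the free tier's rate limit. The sketch below shows just that timing logic in a console program. The Debouncer class is a hypothetical helper, and how it would be wired to the InkCanvas (for example through a stroke-completed event on the control) is an assumption that has not been verified here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class Debouncer
{
    private readonly TimeSpan _delay;
    private readonly object _gate = new object();
    private CancellationTokenSource _cts;

    public Debouncer(TimeSpan delay)
    {
        _delay = delay;
    }

    // Schedules the action to run after the delay.
    // A newer call cancels any call that is still waiting.
    public Task Trigger(Func<Task> action)
    {
        CancellationTokenSource cts;
        lock (_gate)
        {
            if (_cts != null)
            {
                _cts.Cancel();
            }
            _cts = cts = new CancellationTokenSource();
        }
        return RunAsync(action, cts.Token);
    }

    private async Task RunAsync(Func<Task> action, CancellationToken token)
    {
        try
        {
            await Task.Delay(_delay, token);
        }
        catch (OperationCanceledException)
        {
            // Superseded by a newer trigger: skip the action.
            return;
        }
        await action();
    }
}

public static class DebounceDemo
{
    public static void Main()
    {
        int runs = 0;
        var debouncer = new Debouncer(TimeSpan.FromMilliseconds(200));

        // Simulate three strokes finishing in quick succession.
        Func<Task> recognize = () =>
        {
            runs++;
            Console.WriteLine("recognize called");
            return Task.CompletedTask;
        };
        Task first = debouncer.Trigger(recognize);
        Task second = debouncer.Trigger(recognize);
        Task third = debouncer.Trigger(recognize);

        Task.WhenAll(first, second, third).Wait();
        Console.WriteLine("runs: " + runs);
    }
}
```

In the WPF application, the action passed to Trigger would be the same sequence the button handler runs: collect the ink data, call the API, and update the text box.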