Gesture Recognition for Smart Homes with Nothing but Baidu AI

Reposted article, 2019-07-19 17:04. Tags: AI programs / agile development / artificial intelligence / technical documentation / gesture recognition

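Last time I tried adding a glasses effect to static images; see https://ai.baidu.com/forum/topic/show/942890 for that article.

This time I add the glasses effect to video and combine it with gesture recognition, so that different gestures put on different glasses. Below I introduce the gesture recognition API and show how to integrate it.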

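Gesture recognition API

API description
Recognizes the gesture type in an image and returns the gesture name, the gesture bounding box, and a probability score. It recognizes 24 common gestures and suits scenarios such as gesture effects and gesture interaction for smart homes.

The 24 supported gesture classes: fist, OK, prayer, fist-and-palm salute, farewell, single-hand heart, thumbs up, Diss, I love you, palm up, two-hand heart (3 variants), numbers (9 variants), Rock, and middle finger.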

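Notes:

Gestures outside the 24 classes above are assigned to the other class.
Besides gestures, if a face is detected in the image, the face bounding box is returned as well.

Body analysis requests are made differently from face recognition requests; see the documentation at https://ai.baidu.com/docs#/Body-API/27495b11 for details.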

Request format
Call the API with POST. The request URL is https://aip.baidubce.com/rest/2.0/image-classify/v1/gesture , the Content-Type is application/x-www-form-urlencoded, and the request body is formatted with urlencode.
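
As a minimal sketch of the request format above, the helper below prepares the URL, headers, and form-encoded body without sending anything. The name `build_gesture_request` is my own illustration; only the endpoint, the Content-Type, and the `image` form field come from this article.

```python
import base64
from urllib.parse import urlencode

GESTURE_URL = "https://aip.baidubce.com/rest/2.0/image-classify/v1/gesture"


def build_gesture_request(image_bytes, access_token):
    """Return (url, headers, body) for a gesture-recognition POST.

    Hypothetical helper: it only prepares the request, it does not send it.
    """
    # The image travels as base64 text in a form field named "image".
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    # urlencode percent-escapes '+', '/' and '=' from the base64 alphabet.
    body = urlencode({"image": image_b64})
    url = GESTURE_URL + "?" + urlencode({"access_token": access_token})
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, headers, body
```

The returned triple can be handed to any HTTP client, for example `requests.post(url, data=body, headers=headers)`.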

Request parameters

Response description

Response example

{
    "log_id": 4466502370458351471,
    "result_num": 2,
    "result": [{
        "probability": 0.9844077229499817,
        "top": 20,
        "height": 156,
        "classname": "Face",
        "width": 116,
        "left": 173
    },
    {
        "probability": 0.4679304957389832,
        "top": 157,
        "height": 106,
        "classname": "Heart_2",
        "width": 177,
        "left": 183
    }]
}
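
Since the response mixes the face box with the gesture boxes, it can help to separate them and drop low-confidence gestures before acting on them. The following is a sketch under my own assumptions: `split_detections` and the `min_probability` threshold are illustrative and do not come from the API documentation.

```python
def split_detections(data, min_probability=0.5):
    """Separate face boxes from gesture boxes in a gesture API response.

    min_probability is an illustrative threshold, not a value from the API docs.
    """
    faces, gestures = [], []
    for item in data.get("result", []):
        box = (item["left"], item["top"], item["width"], item["height"])
        if item["classname"] == "Face":
            faces.append(box)
        elif item["probability"] >= min_probability:
            gestures.append((item["classname"], item["probability"], box))
    return faces, gestures


# The response example above, reduced to the fields the function reads.
sample = {
    "result_num": 2,
    "result": [
        {"probability": 0.9844077229499817, "top": 20, "height": 156,
         "classname": "Face", "width": 116, "left": 173},
        {"probability": 0.4679304957389832, "top": 157, "height": 106,
         "classname": "Heart_2", "width": 177, "left": 183},
    ],
}

faces, gestures = split_detections(sample, min_probability=0.4)
```

With the default threshold of 0.5, the `Heart_2` detection in this sample (probability about 0.47) would be discarded, so the threshold needs tuning against real footage.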

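Example

1. Create an application
The glasses effect uses the face recognition API, while gesture recognition belongs to the body analysis API. To use gesture recognition in the glasses effect, tick body analysis's gesture recognition when creating the face recognition application.

First, go to "Face Recognition" in the "Console" and choose "Create Application".

Then fill in the "Application Name" and "Application Description", and tick "Gesture Recognition" under "Body Analysis" in the list of interfaces.

Next, click "Create Now". Once the application is created, you can read off its "API Key" and "Secret Key", which are used later to obtain the "token key".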

2. Obtain the token key
The access_token is obtained with the API Key and Secret Key. For more on obtaining an access_token, see http://ai.baidu.com/docs#/Auth/top .

The following Python 3 code obtains the access_token:

import requests


def get_token_key():
    # client_id is the AK and client_secret is the SK shown in the Baidu console
    client_id = '<AK of your Baidu Cloud application>'  # API key
    client_secret = '<SK of your Baidu Cloud application>'  # Secret key
    url = 'https://aip.baidubce.com/oauth/2.0/token'
    # Let requests build and escape the query string
    params = {'grant_type': 'client_credentials',
              'client_id': client_id,
              'client_secret': client_secret}
    headers = {'Content-Type': 'application/json; charset=UTF-8'}
    res = requests.post(url, params=params, headers=headers)
    token_content = res.json()
    # A failed request carries "error" and "error_description" instead of a token
    if 'error' in token_content:
        raise RuntimeError(token_content['error_description'])
    return token_content['access_token']
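
Requesting a new token for every frame would be wasteful, so the token is worth caching. The sketch below is a hypothetical `TokenCache`, assuming the token response carries an OAuth-style `expires_in` field in seconds (worth confirming against the Auth documentation linked above) and that `fetch` returns the whole parsed JSON rather than only the token string.

```python
import time


class TokenCache:
    """Cache an access token until shortly before it expires (illustrative)."""

    def __init__(self, fetch, clock=time.time, margin=60):
        self._fetch = fetch      # callable returning the parsed token JSON
        self._clock = clock      # injectable so the class can be tested
        self._margin = margin    # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            content = self._fetch()
            self._token = content["access_token"]
            # Assumption: "expires_in" is the lifetime in seconds. If the field
            # is missing, the token is refetched on every call.
            self._expires_at = now + content.get("expires_in", 0) - self._margin
        return self._token
```

In the video loop, `cache.get()` would then replace a hard-coded token string.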

3. Call the gesture recognition API
A Python 3 implementation of the call looks like this:

def get_hand_info(image_base64, token_key):
    # image_base64: the base64-encoded image as bytes; token_key: the access_token
    request_url = "https://aip.baidubce.com/rest/2.0/image-classify/v1/gesture"
    params_d = {'image': str(image_base64, encoding='utf-8')}
    # requests form-encodes the dict passed as data
    res = requests.post(url=request_url,
                        params={'access_token': token_key},
                        data=params_d,
                        headers={'Content-Type': 'application/x-www-form-urlencoded'})
    data = res.json()
    # Error responses carry "error_code" and "error_msg"
    if 'error_code' in data:
        raise RuntimeError(f'Error: {data["error_msg"]}')
    return data
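
Because the API is rate limited, a call made from a video loop can occasionally be rejected. Here is a hedged sketch of a retry wrapper. It assumes that error code 18 means the QPS limit was reached (please check this against Baidu's error code table) and that `call` returns the parsed JSON instead of failing on an error response. `call_with_retry` is a hypothetical helper, not part of any Baidu SDK.

```python
import time

# Assumed code for "QPS limit reached"; verify against Baidu's error code table.
QPS_LIMIT_ERROR = 18


def call_with_retry(call, retries=3, delay=0.5, sleep=time.sleep):
    """Call `call()` and retry while the response reports the QPS-limit error."""
    data = call()
    for _ in range(retries):
        if data.get("error_code") != QPS_LIMIT_ERROR:
            break
        sleep(delay)      # back off before trying again
        data = call()
    return data
```

Any other error is returned to the caller unchanged, so it still has to be checked after the wrapper returns.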

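Once the API call succeeds, we can extract the information we need, for example the number of detections, the class name of each detection, and its bounding box.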

def get_hand_num(data):
    # Number of detections; this count also includes any detected face
    return data['result_num']


def get_hand_cls_and_bbox(data):
    result = []    # one [classname, left, top, width, height] entry per detection
    cls_list = []  # class names only, e.g. ['Face', 'Heart_2']
    for res_dict in data['result']:
        cls = res_dict['classname']
        cls_list.append(cls)
        bbox = [res_dict['left'], res_dict['top'], res_dict['width'], res_dict['height']]
        result.append([cls] + bbox)
    return result, cls_list
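
The example code further down calls `util.compare_hand`, whose source is not shown in this article. A plausible minimal version, written here purely as an assumption, is a membership test on the class-name list; `gesture_names` is an extra illustrative helper that drops the `Face` entry.

```python
def compare_hand(cls_list, target):
    """Return True if `target` appears among the detected class names.

    Hypothetical stand-in for util.compare_hand, whose source is not shown.
    """
    return target in cls_list


def gesture_names(cls_list):
    """Drop 'Face' entries so that only gesture classes remain."""
    return [name for name in cls_list if name != "Face"]
```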

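Example code and explanation
The core code of the whole example follows. (Because the face recognition API is limited to 2 QPS, the display step uses cv2.waitKey(500), so the application does not look very smooth.)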

import os
import random

import cv2

import face_util
import gesture_util
import util
from util import get_face_info, get_face_location, frame2base64, get_hand_info

token_key = '<your access token>'
# Sort the file names so that the indices used below are deterministic
glasses_img = sorted('images/glasses/' + img for img in os.listdir('images/glasses'))

# Default glasses until a gesture selects another pair
glasses = cv2.imread('images/glasses/glasses6.png', cv2.IMREAD_UNCHANGED)


cap = cv2.VideoCapture(0)
while True:
    ok, image = cap.read()
    if not ok:  # camera unavailable or stream ended
        break
    detect_img = image.copy()
    image_base64 = frame2base64(image)
    face_data = get_face_info(image_base64, token_key)
    hand_data = get_hand_info(image_base64, token_key)
    _, cls_list = util.get_hand_cls_and_bbox(hand_data)
    if face_data:
        location = get_face_location(face_data)
        face_num = util.get_face_num(face_data)
        landmark4 = util.get_landmark4(face_data)
        if util.compare_hand(cls_list, 'Heart_single'):
            detect_img = gesture_util.draw_heart_single(detect_img)
        if util.compare_hand(cls_list, 'Ok'):
            detect_img = gesture_util.draw_firework(detect_img)
        if util.compare_hand(cls_list, 'One'):
            glasses = cv2.imread(glasses_img[1], cv2.IMREAD_UNCHANGED)
            detect_img = gesture_util.draw_one(detect_img)
        if util.compare_hand(cls_list, 'Two'):
            glasses = cv2.imread(glasses_img[2], cv2.IMREAD_UNCHANGED)
            detect_img = gesture_util.draw_two(detect_img)
        if util.compare_hand(cls_list, 'Three'):
            glasses = cv2.imread(glasses_img[3], cv2.IMREAD_UNCHANGED)
            detect_img = gesture_util.draw_three(detect_img)
        if util.compare_hand(cls_list, 'Four'):
            glasses = cv2.imread(glasses_img[4], cv2.IMREAD_UNCHANGED)
            detect_img = gesture_util.draw_four(detect_img)
        if util.compare_hand(cls_list, 'Five'):
            glasses = cv2.imread(glasses_img[5], cv2.IMREAD_UNCHANGED)
            detect_img = gesture_util.draw_five(detect_img)
        if util.compare_hand(cls_list, 'Fist'):
            glasses = cv2.imread(glasses_img[random.randint(0, len(glasses_img)-1)], cv2.IMREAD_UNCHANGED)
        if util.compare_hand(cls_list, 'ILY'):
            detect_img = gesture_util.draw_love(detect_img)

        detect_img = face_util.wear_glasses(detect_img, glasses, face_num, landmark4)
        detect_img = cv2.flip(detect_img, 1)
    else:
        detect_img = cv2.flip(detect_img, 1)
    # for i, cls in enumerate(cls_list):
    #     if cls != 'Face':
    #         cv2.putText(detect_img, cls, (50, 50 + 100 * i), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 255), 2)
    cv2.imshow('pic', detect_img)
    key = cv2.waitKey(500) & 0xFF
    if key == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
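
The fixed `cv2.waitKey(500)` delay is added on top of the time the requests themselves take, so the loop runs slower than the 2 QPS limit actually requires. One alternative, sketched here under my own assumptions, is to throttle only the API calls. `Throttle` is a hypothetical helper; the clock and sleep functions are injectable so it can be tested.

```python
import time


class Throttle:
    """Block so that consecutive wait() calls are at least min_interval apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous wait(), None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval not yet elapsed.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

With a 2 QPS limit, `throttle = Throttle(0.5)` plus a `throttle.wait()` before each face recognition request would allow a short `cv2.waitKey(1)` for display.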

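The main loop recognizes the digits 1-5, the single-hand heart, OK, the one-handed "I love you", and the fist. Digits 1-5 each select a different pair of glasses, a fist swaps in a random pair, the heart draws a ❤ on screen, OK shows some fireworks, and "I love you" shows a love emoticon.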
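
The glasses-selection part of that mapping can also be written as a lookup table instead of a chain of `if` statements. The sketch below mirrors the order of the checks in the loop (digits first, then the fist). `GLASSES_BY_GESTURE` and `pick_glasses_index` are names I made up for illustration, and the function returns an index rather than loading the image.

```python
import random

# Gesture class name -> index into the glasses image list, as in the main loop.
GLASSES_BY_GESTURE = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def pick_glasses_index(cls_list, current_index, glasses_count, rng=random):
    """Return the glasses index after applying the detected gestures."""
    index = current_index
    # Later entries win when several digits are detected, as in the if-chain.
    for name, idx in GLASSES_BY_GESTURE.items():
        if name in cls_list:
            index = idx
    # A fist overrides any digit with a random pair.
    if "Fist" in cls_list:
        index = rng.randint(0, glasses_count - 1)
    return index
```

Adding a gesture then means adding one dictionary entry instead of another branch.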

Here are some screenshots:

One: