Study Notes TF058: Face Recognition


Face recognition is a biometric technology that identifies a person from facial feature information. A camera captures face images or a video stream, automatically detects and tracks the faces in the images, and then performs face-related processing: face detection, facial landmark detection, face verification, and so on. In MIT Technology Review's 2017 list of ten breakthrough technologies, Alipay's "Paying with Your Face" made the cut.

Advantages of face recognition: it is non-mandatory (capture is hard for the subject to notice, and images of the person to be identified can be acquired proactively), contactless (the user never touches the device), and concurrent (multiple faces can be detected, tracked, and recognized at once). Before deep learning, face recognition took two steps: extracting high-dimensional hand-crafted features, then reducing their dimensionality. Traditional face recognition relied on visible-light images. Deep learning plus big data (massive labeled face datasets) is now the mainstream approach: a neural network trains a recognition model on a large number of sample images, learns features automatically during training instead of requiring manual feature selection, and can reach 99% recognition accuracy.

The face recognition pipeline.

Face image capture and detection. Capture: a camera collects face images, as static images or video, from different positions and with different expressions. When the user is within the capture device's shooting range, the device automatically finds and photographs the face. Face detection is a form of object detection: statistically characterize the target objects to derive features of the objects to be detected and build a detection model, then match the model against the input image and output the matching regions. Face detection is the preprocessing step for face recognition: it accurately marks the position and size of each face in the image. Face images have rich pattern features, such as histogram features, color features, template features, structural features, and Haar-like features. Face detection picks out the useful information and uses such features to detect faces. Common face detection algorithms are the template-matching model and the Adaboost model; Adaboost offers the best overall trade-off of speed and accuracy, slow to train but fast to detect, fast enough for real-time detection on video streams.
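As a concrete illustration, the sketch below runs OpenCV's bundled Haar cascade, an Adaboost-trained detector, over an image; the input file name is a placeholder and the parameter values are illustrative, not tuned.

import cv2

# Haar-cascade face detection; the cascade XML ships with OpenCV.
img = cv2.imread('photo.jpg')                                  # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                   # the detector works on grayscale
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# scaleFactor and minNeighbors trade recall against false positives
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:                                     # draw each detected face box
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)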

Face image preprocessing. Based on the detection result, process the image to serve feature extraction. Because the captured face image is subject to various conditions and random interference, it usually needs preprocessing: scaling, rotation, stretching, light compensation, gray-scale transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and so on.
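A minimal preprocessing sketch, assuming OpenCV and a face crop already produced by the detection step; 160x160 is an arbitrary target size for the example.

import cv2

face = cv2.imread('face_crop.jpg')                 # placeholder: a detected face crop
gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)      # gray-scale transformation
equalized = cv2.equalizeHist(gray)                 # histogram equalization for lighting
resized = cv2.resize(equalized, (160, 160))        # scale to a fixed input size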

Face image feature extraction. Digitize the face image information: turn the face image into a string of numbers, a feature vector. For example, from the positions of the eye corners, lip edges, nose, and chin, extract feature components such as Euclidean distances, curvatures, and angles between feature points, then concatenate the related features into one long feature vector.
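A toy sketch of the idea: pairwise Euclidean distances between a few landmarks, concatenated into one vector. The coordinates are invented for illustration.

import numpy as np

landmarks = np.array([[30., 40.],    # left eye corner (made-up coordinates)
                      [70., 40.],    # right eye corner
                      [50., 60.],    # nose tip
                      [50., 85.]])   # chin
components = []
for i in range(len(landmarks)):
    for j in range(i + 1, len(landmarks)):
        components.append(np.linalg.norm(landmarks[i] - landmarks[j]))  # pairwise distance
feature_vector = np.array(components)  # one slice of the long feature vector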

Face image matching and recognition. Search the extracted feature data against the face feature templates stored in a database and judge identity by similarity: set a threshold, and output the match when the similarity exceeds it. Verification is a one-to-one (1:1) image comparison that proves "you are you", used for identity checks in finance and information security. Identification is one-to-many (1:N) image matching, "finding you among N people"; on a video stream, recognition completes as soon as a person walks into range, used in security.
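A sketch of both modes on embedding vectors, assuming smaller Euclidean distance means more similar; the threshold value is illustrative only.

import numpy as np

THRESHOLD = 1.1  # illustrative; in practice tuned on a validation set

def verify(emb_a, emb_b):
    """1:1 verification -- are these two embeddings the same person?"""
    return np.linalg.norm(emb_a - emb_b) < THRESHOLD

def identify(emb, database):
    """1:N identification -- best match for emb in a {name: embedding} dict."""
    name, dist = min(((n, np.linalg.norm(emb - e)) for n, e in database.items()),
                     key=lambda t: t[1])
    return name if dist < THRESHOLD else None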

Categories of face recognition tasks.

Face detection. Detect and locate the faces in an image and return high-precision face bounding-box coordinates. It is the first step of any face analysis or processing. Sliding window: take a rectangular region of the image as a sliding window, extract features from the window to describe that image region, and use the feature description to judge whether the window contains a face, traversing every window that needs to be examined.
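A bare-bones sliding-window loop over a NumPy image array; is_face() stands in for whatever feature-based classifier is used and is not implemented here.

def sliding_windows(image, win=24, stride=8):
    """Yield (top, left, crop) for every window position over a 2-D image array."""
    h, w = image.shape[:2]
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            yield top, left, image[top:top + win, left:left + win]

# candidates = [(t, l) for t, l, crop in sliding_windows(img) if is_face(crop)]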

Facial landmark detection. Locate and return the coordinates of facial keypoints: the face contour and the outlines of the eyes, eyebrows, lips, and nose. Face++ provides up to 106 keypoints. A common landmark localization technique is cascaded shape regression (CSR). Face recognition here is based on the DeepID network, a structure similar to a convolutional neural network except that the second-to-last layer, the DeepID layer, is fully connected to both convolutional layer 4 and max-pooling layer 3; since higher convolutional layers have larger receptive fields, the network captures both local and global features. Input layer 31x39x1, convolutional layer 1 28x36x20 (4x4x1 kernels), max-pooling layer 1 14x18x20 (2x2 filter), convolutional layer 2 12x16x40 (3x3x20 kernels), max-pooling layer 2 6x8x40 (2x2 filter), convolutional layer 3 4x6x60 (3x3x40 kernels), max-pooling layer 3 2x3x60 (2x2 filter), convolutional layer 4 2x2x80 (2x2x60 kernels), DeepID layer 1x160, fully connected Softmax layer. 《Deep Learning Face Representation from Predicting 10000 Classes》 http://mmlab.ie.cuhk.edu.hk/pdf/YiSun_CVPR14.pdf 。
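A rough sketch of that layer stack in TF 1.x layers, following the dimensions above; the 10000-way softmax matches the paper's identity-classification setup, and any layer arguments beyond the table (activations, initialization) are assumptions for the example.

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 31, 39, 1])            # input layer
c1 = tf.layers.conv2d(x, 20, 4, activation=tf.nn.relu)       # 28x36x20
p1 = tf.layers.max_pooling2d(c1, 2, 2)                       # 14x18x20
c2 = tf.layers.conv2d(p1, 40, 3, activation=tf.nn.relu)      # 12x16x40
p2 = tf.layers.max_pooling2d(c2, 2, 2)                       # 6x8x40
c3 = tf.layers.conv2d(p2, 60, 3, activation=tf.nn.relu)      # 4x6x60
p3 = tf.layers.max_pooling2d(c3, 2, 2)                       # 2x3x60
c4 = tf.layers.conv2d(p3, 80, 2, activation=tf.nn.relu)      # conv4
deepid = tf.layers.dense(                                    # 160-d DeepID layer,
    tf.concat([tf.layers.flatten(p3), tf.layers.flatten(c4)], axis=1),
    160, activation=tf.nn.relu)                              # fed by pool3 and conv4
logits = tf.layers.dense(deepid, 10000)                      # softmax over identities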

Face verification. Analyze how likely two faces belong to the same person: given two face images, produce a classification confidence, compare it with the corresponding threshold, and evaluate the similarity.

Face attribute detection. Facial attribute recognition and facial emotion analysis. https://www.betaface.com/wpa/ offers an online face recognition demo. It reports a person's age, whether they have a beard, emotion (happy, neutral, angry, furious), gender, whether they wear glasses, and skin tone.

Face recognition applications: beautification in Meitu's apps, the dating site 世紀佳緣 checking the "physiognomy" similarity of potential partners, "pay with your face" in payments, and "face authentication" in security. Face++ and SenseTime both provide face recognition SDKs.

Face recognition with FaceNet. https://github.com/davidsandberg/facenet 。

The paper by Florian Schroff, Dmitry Kalenichenko, and James Philbin: 《FaceNet: A Unified Embedding for Face Recognition and Clustering》 https://arxiv.org/abs/1503.03832 . Validation guide: https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw 。

The LFW (Labeled Faces in the Wild) dataset. http://vis-www.cs.umass.edu/lfw/ . Compiled by the computer vision lab at the University of Massachusetts Amherst. 13,233 images of 5,749 people; 4,069 people have only one image, and 1,680 have more than one. Every image is 250x250. The face images sit under a folder named after each person.

Data preprocessing. Alignment code: https://github.com/davidsandberg/facenet/blob/master/src/align/align_dataset_mtcnn.py 。
Align the dataset used for evaluation to the same image size as the dataset used by the pre-trained model.
Set the environment variable

export PYTHONPATH=[...]/facenet/src

Alignment command

for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done

Pre-trained model: 20170216-091149.zip https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk 。
Training set: the MS-Celeb-1M dataset https://www.microsoft.com/en-us/research/project/ms-celeb-1m-challenge-recognizing-one-million-celebrities-real-world/ . Microsoft's face recognition database: the top one million names from a celebrity list, with about 100 face images per celebrity collected via search engines. The pre-trained model reaches an accuracy of 0.993+-0.004.

Evaluation. python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
The benchmark comparison uses facenet/data/pairs.txt, officially provided randomly generated pairs: matched and mismatched person names with image numbers.

10-fold cross validation, a method for testing accuracy. Split the dataset into 10 parts; in turn use 9 parts as the training set and 1 part as the test set, and take the mean of the 10 results as the estimate of the algorithm's accuracy. In practice, several rounds of 10-fold cross validation are usually averaged.
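As a quick illustration of the procedure, here is a generic 10-fold cross validation sketch with scikit-learn's KFold; the evaluate callback is a placeholder for any train-and-score routine (lfw.evaluate in the script below applies the same idea to the LFW pairs).

import numpy as np
from sklearn.model_selection import KFold

def cross_validate(n_samples, evaluate, n_folds=10):
    """Average an accuracy-returning evaluate(train_idx, test_idx) over the folds."""
    accuracies = []
    for train_idx, test_idx in KFold(n_splits=n_folds).split(np.arange(n_samples)):
        accuracies.append(evaluate(train_idx, test_idx))  # 9 parts train, 1 part test
    return np.mean(accuracies), np.std(accuracies)

The full evaluation script, validate_on_lfw.py: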

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:

            # Read the file containing the pairs used for testing
            # 1. Read the pairs.txt file
            # After reading, entries look like [['Abel_Pacheco','1','4']]
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths for the corresponding images
            # and the matched/mismatched flag for each pair
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)
            # Load the model
            # 2. Load the model
            facenet.load_model(args.model)

            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")

            #image_size = images_placeholder.get_shape()[1] # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]

            # Run forward pass to calculate embeddings
            # 3. Validate with a forward pass
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0*nrof_images / batch_size)) # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i*batch_size
                end_index = min((i+1)*batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = { images_placeholder:images, phase_train_placeholder:False }
                emb_array[start_index:end_index,:] = sess.run(embeddings, feed_dict=feed_dict)

            # 4. Compute accuracy and validation rate with 10-fold cross validation
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)
            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
            # Area under the ROC curve (AUC)
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate (EER)
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()

    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

Gender and age recognition. https://github.com/dpressel/rude-carnie 。

The Adience dataset. http://www.openu.ac.il/home/hassner/Adience/data.html#agegender . 26,580 images of 2,284 subjects, with ages grouped into 8 ranges (0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60+) and with noise, pose, and lighting variation. aligned holds the cropped and aligned data; faces holds the raw data. fold_0_data.txt through fold_4_data.txt label all of the data; fold_frontal_0_data.txt through fold_frontal_4_data.txt label only faces in near-frontal pose. Record fields: user_id (the Flickr account ID), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (fiducial yaw angle), fiducial_score (fiducial score). https://www.flickr.com/
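A sketch of reading one fold file and pulling out the fields above, assuming (as in the published folds) a tab-separated file with a header row; the path is a placeholder.

import csv

with open('fold_0_data.txt') as f:                       # placeholder path
    for row in csv.DictReader(f, delimiter='\t'):        # header row gives the field names
        box = (int(row['x']), int(row['y']), int(row['dx']), int(row['dy']))  # face box
        print(row['user_id'], row['original_image'], row['age'], row['gender'], box)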

Data preprocessing. A script converts the data into TFRecords format: https://github.com/dpressel/rude-carnie/blob/master/preproc.py . The https://github.com/GilLevi/AgeGenderDeepLearning/tree/master/Folds folder already contains the train/test split and labels. The image lists gender_train.txt and gender_val.txt drive the conversion of the Adience data into TFRecords files. Images are processed into 256x256 JPEG-encoded RGB images and written to the TFRecords file output_file with tf.python_io.TFRecordWriter.
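A minimal sketch of writing one image to a TFRecords file with the TF 1.x writer named above; the feature keys and label value are illustrative (the real preproc.py stores its own set of fields).

import tensorflow as tf

def _bytes(v): return tf.train.Feature(bytes_list=tf.train.BytesList(value=[v]))
def _int64(v): return tf.train.Feature(int64_list=tf.train.Int64List(value=[v]))

writer = tf.python_io.TFRecordWriter('output_file')   # the output_file from the text
with open('face_256x256.jpg', 'rb') as f:             # a 256x256 JPEG-encoded RGB image
    encoded = f.read()
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes(encoded),                 # illustrative feature keys
    'image/class/label': _int64(1),                   # e.g. a gender class index
}))
writer.write(example.SerializeToString())
writer.close()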

Building the model. The age and gender training models follow Gil Levi and Tal Hassner's paper 《Age and Gender Classification Using Convolutional Neural Networks》 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.722.9654&rank=1 . Model code: https://github.com/dpressel/rude-carnie/blob/master/model.py . Built on tensorflow.contrib.slim.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base
TOWER_NAME = 'tower'

def select_model(name):
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if not os.path.exists(model_checkpoint_path):
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step

    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore checkpoint as described in top of this program
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")

    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7,7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME', biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layer.flatten
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')

    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7,7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1], biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')

    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

Training the model. https://github.com/dpressel/rude-carnie/blob/master/train.py 。

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re
LAMBDA = 0.01
MOM = 0.9
tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum',
                           'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227,
                            'Image size')
tf.app.flags.DEFINE_float('eta', 0.01,
                          'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0.,
                          'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000,
                            'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000,
                            'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1,
                          'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1,
                            'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128,
                            'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('pre_model',
                           '',  # './inception_v3.ckpt'
                           'checkpoint file')
FLAGS = tf.app.flags.FLAGS

# Decay the learning rate by decay_rate every at_step steps
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step,
                                          at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
        lr_decay_fn = None
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz, clip_gradients=4., learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file (md.json, generated during preprocessing)
        # and figure out nlabels and the size of an epoch
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)

        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size, FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1-FLAGS.pdrop, True)
        total_loss = loss(logits, labels)
        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss, FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)

        # This is total hackland, it only works to fine-tune iv3
        # A pre-trained Inception V3 checkpoint can be supplied here for fine-tuning
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)

        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))

        # Store the ckpt files in a run-(pid) directory
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)

        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)

        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time
            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
            # Print training progress every 10 steps
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)

                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f ' 'sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))
            # Write a summary every 100 steps
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            # Save a checkpoint every 1000 steps and at the final step
            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()

Validating the model (inference). https://github.com/dpressel/rude-carnie/blob/master/guess.py 。

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv
RESIZE_FINAL = 227
GENDER_LIST =['M','F']
AGE_LIST = ['(0, 2)','(4, 6)','(8, 12)','(15, 20)','(25, 32)','(38, 43)','(48, 53)','(60, 100)']
MAX_BATCH_SZ = 128
tf.app.flags.DEFINE_string('model_dir', '',
                           'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age',
                           'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0',
                           'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '',
                           'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '',
                           'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')
FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname): return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    try:
        num_batches = math.ceil(len(image_files) / MAX_BATCH_SZ)
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))

            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images:image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images:image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]

        for i in range(1, batch_sz):
            output = output + batch_results[i]

        output /= batch_sz
        best = np.argmax(output)  # the most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)

        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))

        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s ' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)

        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []

    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files

    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)
        with tf.device(FLAGS.device_id):

            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()

            requested_step = FLAGS.requested_step if FLAGS.requested_step else None

            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)

            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)

            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()

            # Support a batch mode if no face detection model
            if len(files) == 0:
                if (os.path.isdir(FLAGS.filename)):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)

                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)

            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))

            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)

            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()

Microsoft's face-photo site http://how-old.net/ recognizes gender and age from a picture, and can also search for pictures by query.

References:
《TensorFlow技術解析與實戰》

Recommendations for machine learning jobs in Shanghai are welcome; my WeChat: qingxingfengzi

