tensorflow 1.0 学习:模型的保存与恢复(Saver)
将训练好的模型参数保存起来,以便以后进行验证或测试,这是我们经常要做的事情。tf里面提供模型保存的是tf.train.Saver()模块。
模型保存,先要创建一个Saver对象:如
saver=tf.train.Saver()
在创建这个Saver对象的时候,有一个参数我们经常会用到,就是 max_to_keep 参数,这个是用来设置保存模型的个数,默认为5,即 max_to_keep=5,保存最近的5个模型。如果你想每训练一代(epoch)就想保存一次模型,则可以将 max_to_keep设置为None或者0,如:
saver=tf.train.Saver(max_to_keep=0)
但是这样做除了多占用硬盘,并没有实际多大的用处,因此不推荐。
当然,如果你只想保存最后一代的模型,则只需要将max_to_keep设置为1即可,即
saver=tf.train.Saver(max_to_keep=1)
创建完saver对象后,就可以保存训练好的模型了,如:
saver.save(sess,'ckpt/mnist.ckpt',global_step=step)
第一个参数sess,这个就不用说了。第二个参数设定保存的路径和名字,第三个参数将训练的次数作为后缀加入到模型名字中。
saver.save(sess, 'my-model', global_step=0) ==> filename: 'my-model-0'
...
saver.save(sess, 'my-model', global_step=1000) ==> filename: 'my-model-1000'
看一个mnist实例:
# -*- coding: utf-8 -*- """ Created on Sun Jun 4 10:29:48 2017 @author: Administrator """ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets("MNIST_data/", one_hot=False) x = tf.placeholder(tf.float32, [None, 784]) y_=tf.placeholder(tf.int32,[None,]) dense1 = tf.layers.dense(inputs=x, units=1024, activation=tf.nn.relu, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) dense2= tf.layers.dense(inputs=dense1, units=512, activation=tf.nn.relu, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) logits= tf.layers.dense(inputs=dense2, units=10, activation=None, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) loss=tf.losses.sparse_softmax_cross_entropy(labels=y_,logits=logits) train_op=tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss) correct_prediction = tf.equal(tf.cast(tf.argmax(logits,1),tf.int32), y_) acc= tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) sess=tf.InteractiveSession() sess.run(tf.global_variables_initializer()) saver=tf.train.Saver(max_to_keep=1) for i in range(100): batch_xs, batch_ys = mnist.train.next_batch(100) sess.run(train_op, feed_dict={x: batch_xs, y_: batch_ys}) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('epoch:%d, val_loss:%f, val_acc:%f'%(i,val_loss,val_acc)) saver.save(sess,'ckpt/mnist.ckpt',global_step=i+1) sess.close()
代码中红色部分就是保存模型的代码,虽然我在每训练完一代的时候,都进行了保存,但后一次保存的模型会覆盖前一次的,最终只会保存最后一次。因此我们可以节省时间,将保存代码放到循环之外(仅适用max_to_keep=1,否则还是需要放在循环内).
在实验中,最后一代可能并不是验证精度最高的一代,因此我们并不想默认保存最后一代,而是想保存验证精度最高的一代,则加个中间变量和判断语句就可以了。
saver=tf.train.Saver(max_to_keep=1) max_acc=0 for i in range(100): batch_xs, batch_ys = mnist.train.next_batch(100) sess.run(train_op, feed_dict={x: batch_xs, y_: batch_ys}) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('epoch:%d, val_loss:%f, val_acc:%f'%(i,val_loss,val_acc)) if val_acc>max_acc: max_acc=val_acc saver.save(sess,'ckpt/mnist.ckpt',global_step=i+1) sess.close()
如果我们想保存验证精度最高的三代,且把每次的验证精度也随之保存下来,则我们可以生成一个txt文件用于保存。
saver=tf.train.Saver(max_to_keep=3) max_acc=0 f=open('ckpt/acc.txt','w') for i in range(100): batch_xs, batch_ys = mnist.train.next_batch(100) sess.run(train_op, feed_dict={x: batch_xs, y_: batch_ys}) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('epoch:%d, val_loss:%f, val_acc:%f'%(i,val_loss,val_acc)) f.write(str(i+1)+', val_acc: '+str(val_acc)+'\n') if val_acc>max_acc: max_acc=val_acc saver.save(sess,'ckpt/mnist.ckpt',global_step=i+1) f.close() sess.close()
模型的恢复用的是restore()函数,它需要两个参数restore(sess, save_path),save_path指的是保存的模型路径。我们可以使用tf.train.latest_checkpoint()来自动获取最后一次保存的模型。如:
model_file=tf.train.latest_checkpoint('ckpt/') saver.restore(sess,model_file)
则程序后半段代码我们可以改为:
sess=tf.InteractiveSession() sess.run(tf.global_variables_initializer()) is_train=False saver=tf.train.Saver(max_to_keep=3) #训练阶段 if is_train: max_acc=0 f=open('ckpt/acc.txt','w') for i in range(100): batch_xs, batch_ys = mnist.train.next_batch(100) sess.run(train_op, feed_dict={x: batch_xs, y_: batch_ys}) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('epoch:%d, val_loss:%f, val_acc:%f'%(i,val_loss,val_acc)) f.write(str(i+1)+', val_acc: '+str(val_acc)+'\n') if val_acc>max_acc: max_acc=val_acc saver.save(sess,'ckpt/mnist.ckpt',global_step=i+1) f.close() #验证阶段 else: model_file=tf.train.latest_checkpoint('ckpt/') saver.restore(sess,model_file) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('val_loss:%f, val_acc:%f'%(val_loss,val_acc)) sess.close()
标红的地方,就是与保存、恢复模型相关的代码。用一个bool型变量is_train来控制训练和验证两个阶段。
整个源程序:

# -*- coding: utf-8 -*- """ Created on Sun Jun 4 10:29:48 2017 @author: Administrator """ import tensorflow as tf from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets("MNIST_data/", one_hot=False) x = tf.placeholder(tf.float32, [None, 784]) y_=tf.placeholder(tf.int32,[None,]) dense1 = tf.layers.dense(inputs=x, units=1024, activation=tf.nn.relu, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) dense2= tf.layers.dense(inputs=dense1, units=512, activation=tf.nn.relu, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) logits= tf.layers.dense(inputs=dense2, units=10, activation=None, kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), kernel_regularizer=tf.nn.l2_loss) loss=tf.losses.sparse_softmax_cross_entropy(labels=y_,logits=logits) train_op=tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss) correct_prediction = tf.equal(tf.cast(tf.argmax(logits,1),tf.int32), y_) acc= tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) sess=tf.InteractiveSession() sess.run(tf.global_variables_initializer()) is_train=True saver=tf.train.Saver(max_to_keep=3) #训练阶段 if is_train: max_acc=0 f=open('ckpt/acc.txt','w') for i in range(100): batch_xs, batch_ys = mnist.train.next_batch(100) sess.run(train_op, feed_dict={x: batch_xs, y_: batch_ys}) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('epoch:%d, val_loss:%f, val_acc:%f'%(i,val_loss,val_acc)) f.write(str(i+1)+', val_acc: '+str(val_acc)+'\n') if val_acc>max_acc: max_acc=val_acc saver.save(sess,'ckpt/mnist.ckpt',global_step=i+1) f.close() #验证阶段 else: model_file=tf.train.latest_checkpoint('ckpt/') saver.restore(sess,model_file) val_loss,val_acc=sess.run([loss,acc], feed_dict={x: mnist.test.images, y_: mnist.test.labels}) print('val_loss:%f, val_acc:%f'%(val_loss,val_acc)) sess.close()
参考文章:http://blog.csdn.net/u011500062/article/details/51728830
tensorflow 1.0 学习:用别人训练好的模型来进行图像分类
谷歌在大型图像数据库ImageNet上训练好了一个Inception-v3模型,这个模型我们可以直接用来进来图像分类。
下载地址:https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip
下载完解压后,得到几个文件:
其中的classify_image_graph_def.pb 文件就是训练好的Inception-v3模型。
imagenet_synset_to_human_label_map.txt是类别文件。
随机找一张图片:如
对这张图片进行识别,看它属于什么类?
代码如下:先创建一个类NodeLookup来将softmax概率值映射到标签上。
然后创建一个函数create_graph()来读取模型。
最后读取图片进行分类识别:
# -*- coding: utf-8 -*- import tensorflow as tf import numpy as np import re import os model_dir='D:/tf/model/' image='d:/cat.jpg' #将类别ID转换为人类易读的标签 class NodeLookup(object): def __init__(self, label_lookup_path=None, uid_lookup_path=None): if not label_lookup_path: label_lookup_path = os.path.join( model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt') if not uid_lookup_path: uid_lookup_path = os.path.join( model_dir, 'imagenet_synset_to_human_label_map.txt') self.node_lookup = self.load(label_lookup_path, uid_lookup_path) def load(self, label_lookup_path, uid_lookup_path): if not tf.gfile.Exists(uid_lookup_path): tf.logging.fatal('File does not exist %s', uid_lookup_path) if not tf.gfile.Exists(label_lookup_path): tf.logging.fatal('File does not exist %s', label_lookup_path) # Loads mapping from string UID to human-readable string proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines() uid_to_human = {} p = re.compile(r'[n\d]*[ \S,]*') for line in proto_as_ascii_lines: parsed_items = p.findall(line) uid = parsed_items[0] human_string = parsed_items[2] uid_to_human[uid] = human_string # Loads mapping from string UID to integer node ID. node_id_to_uid = {} proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines() for line in proto_as_ascii: if line.startswith(' target_class:'): target_class = int(line.split(': ')[1]) if line.startswith(' target_class_string:'): target_class_string = line.split(': ')[1] node_id_to_uid[target_class] = target_class_string[1:-2] # Loads the final mapping of integer node ID to human-readable string node_id_to_name = {} for key, val in node_id_to_uid.items(): if val not in uid_to_human: tf.logging.fatal('Failed to locate: %s', val) name = uid_to_human[val] node_id_to_name[key] = name return node_id_to_name def id_to_string(self, node_id): if node_id not in self.node_lookup: return '' return self.node_lookup[node_id] #读取训练好的Inception-v3模型来创建graph def create_graph(): with tf.gfile.FastGFile(os.path.join( model_dir, 'classify_image_graph_def.pb'), 'rb') as f: graph_def = tf.GraphDef() graph_def.ParseFromString(f.read()) tf.import_graph_def(graph_def, name='') #读取图片 image_data = tf.gfile.FastGFile(image, 'rb').read() #创建graph create_graph() sess=tf.Session() #Inception-v3模型的最后一层softmax的输出 softmax_tensor= sess.graph.get_tensor_by_name('softmax:0') #输入图像数据,得到softmax概率值(一个shape=(1,1008)的向量) predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data}) #(1,1008)->(1008,) predictions = np.squeeze(predictions) # ID --> English string label. node_lookup = NodeLookup() #取出前5个概率最大的值(top-5) top_5 = predictions.argsort()[-5:][::-1] for node_id in top_5: human_string = node_lookup.id_to_string(node_id) score = predictions[node_id] print('%s (score = %.5f)' % (human_string, score)) sess.close()
最后输出:
tiger cat (score = 0.40316)
Egyptian cat (score = 0.21686)
tabby, tabby cat (score = 0.21348)
lynx, catamount (score = 0.01403)
Persian cat (score = 0.00394)
tensorflow中保存模型、加载模型做预测(不需要再定义网络结构)
旭旭_哥 2017-10-11 23:11:21 20282 收藏 11
展开
下面用一个线下回归模型来记载保存模型、加载模型做预测
参考文章:
http://blog.csdn.net/thriving_fcl/article/details/71423039
训练一个线下回归模型并保存看代码:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
money=np.array([[109],[82],[99], [72], [87], [78], [86], [84], [94], [57]]).astype(np.float32)
click=np.array([[11], [8], [8], [6],[ 7], [7], [7], [8], [9], [5]]).astype(np.float32)
x_test=money[0:5].reshape(-1,1)
y_test=click[0:5]
x_train=money[5:].reshape(-1,1)
y_train=click[5:]
x=tf.placeholder(tf.float32,[None,1],name='x') #保存要输入的格式
w=tf.Variable(tf.zeros([1,1]))
b=tf.Variable(tf.zeros([1]))
y=tf.matmul(x,w)+b
tf.add_to_collection('pred_network', y) #用于加载模型获取要预测的网络结构
y_=tf.placeholder(tf.float32,[None,1])
cost=tf.reduce_sum(tf.pow((y-y_),2))
train_step=tf.train.GradientDescentOptimizer(0.000001).minimize(cost)
init=tf.global_variables_initializer()
sess=tf.Session()
sess.run(init)
cost_history=[]
saver = tf.train.Saver()
for i in range(100):
feed={x:x_train,y_:y_train}
sess.run(train_step,feed_dict=feed)
cost_history.append(sess.run(cost,feed_dict=feed))
# 输出最终的W,b和cost值
print("109的预测值是:",sess.run(y, feed_dict={x: [[109]]}))
print("W_Value: %f" % sess.run(w), "b_Value: %f" % sess.run(b), "cost_Value: %f" % sess.run(cost, feed_dict=feed))
saver_path = saver.save(sess, "/Users/shuubiasahi/Desktop/tensorflow/modelsave/model.ckpt",global_step=100)
print("model saved in file: ", saver_path)
模型训练结果:
2017-10-11 23:05:12.606557: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
109的预测值是: [[ 9.84855175]]
保存模型文件分析:
一、 saver.restore()时填的文件名,因为在saver.save的时候,每个checkpoint会保存三个文件,如
model.ckpt-100.meta, model.ckpt-100.index, model.ckpt-100.data-00000-of-00001
在import_meta_graph时填的就是meta文件名,我们知道权值都保存在model.ckpt-100.data-00000-of-00001这个文件中,但是如果在restore方法中填这个文件名,就会报错,应该填的是前缀,这个前缀可以使用tf.train.latest_checkpoint(checkpoint_dir)这个方法获取。
二、模型的y中有用到placeholder,在sess.run()的时候肯定要feed对应的数据,因此还要根据具体placeholder的名字,从graph中使用get_operation_by_name方法获取。
加载训练好的模型:
import tensorflow as tf
with tf.Session() as sess:
new_saver=tf.train.import_meta_graph('/Users/shuubiasahi/Desktop/tensorflow/modelsave/model.ckpt-100.meta')
new_saver.restore(sess,"/Users/shuubiasahi/Desktop/tensorflow/modelsave/model.ckpt-100")
graph = tf.get_default_graph()
x=graph.get_operation_by_name('x').outputs[0]
y=tf.get_collection("pred_network")[0]
print("109的预测值是:",sess.run(y, feed_dict={x: [[109]]}))
加载模型后的预测结果:
2017-10-11 23:07:33.176523: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
109的预测值是: [[ 9.84855175]]
Process finished with exit code 0
————————————————
版权声明:本文为CSDN博主「旭旭_哥」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/luoyexuge/java/article/details/78209670