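While training an SSD model on my own dataset with TensorFlow, I worked through a lot of blog posts and managed to train a model of my own. However, when I tried to test how well the model performed, I ran into the following error.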
2019-10-27 14:47:12.862573: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
    [[{{node save/RestoreV2}}]]
    [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1276, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

Caused by op 'save/RestoreV2', defined at:
  File "ssd_notebook.py", line 53, in <module>
    saver = tf.train.Saver()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
    names_to_keys = object_graph_key_mapping(save_path)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1591, in object_graph_key_mapping
    checkpointable.OBJECT_GRAPH_PROTO_KEY)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 370, in get_tensor
    status)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ssd_notebook.py", line 54, in <module>
    saver.restore(isess, ckpt_filename)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1292, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]

Caused by op 'save/RestoreV2', defined at:
  File "ssd_notebook.py", line 53, in <module>
    saver = tf.train.Saver()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 832, in __init__
    self.build()
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 844, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 881, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
    restore_sequentially, reshape)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
    restore_sequentially)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
    name=name)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/opt/anaconda3/envs/dlipy3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key ssd_300_vgg/block3_box/L2Normalization/gamma not found in checkpoint
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
    [[node save/RestoreV2 (defined at ssd_notebook.py:53) ]]
Tracking this down took many detours, and a Baidu search turned up almost nothing about this exact error.
The code I started with was:
ckpt_filename = '../train_model/model.ckpt-1000'
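I tried many approaches. For example, the change below still failed to load the model (note that `tf.train.latest_checkpoint` is meant to receive the checkpoint directory rather than a checkpoint prefix, so this call is not a correct use of the API either):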
ckpt_filename = tf.train.latest_checkpoint('../train_model/model.ckpt-1000')
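Some posts suggested that the model had not been saved completely. After training many times over, I confirmed that the model really had been saved successfully.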
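One quick way to rule out an incomplete save is to check that the files belonging to the checkpoint prefix are all on disk. The sketch below assumes the usual V2 checkpoint layout written by `tf.train.Saver` (a `.index` file, one or more `.data-*` shards, and a `.meta` file); the helper name `checkpoint_files_present` is hypothetical and not part of TensorFlow.

```python
import glob
import os


def checkpoint_files_present(prefix):
    """Report which parts of a V2 checkpoint exist for the given prefix.

    `prefix` is the value passed to saver.restore(), for example
    '../train_model/model.ckpt-1000' (no file extension).
    """
    return {
        "index": os.path.isfile(prefix + ".index"),
        # Data shards are named like model.ckpt-1000.data-00000-of-00001
        "data": bool(glob.glob(glob.escape(prefix) + ".data-*")),
        "meta": os.path.isfile(prefix + ".meta"),
    }


if __name__ == "__main__":
    # Path taken from the post; adjust it to your own training directory.
    status = checkpoint_files_present("../train_model/model.ckpt-1000")
    for part, found in status.items():
        print(part, "found" if found else "MISSING")
```

If all three parts are reported as found, the save itself is probably not the problem and the mismatch lies between the graph and the checkpoint keys.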
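Others said to take the English error message literally: the key simply does not exist in the ckpt file. After some searching, I used the following code to list the keys stored in the ckpt file.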
import os
from tensorflow.python import pywrap_tensorflow
current_path = '****/SSD_small_object_detection/'
model_dir = os.path.join(current_path, 'train_model')
checkpoint_path = os.path.join(model_dir, 'model.ckpt-1000')  # name of the saved ckpt file; yours may differ
# Read data from checkpoint file
reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
var_to_shape_map = reader.get_variable_to_shape_map()
# Print the name of every tensor stored in the checkpoint
for key in var_to_shape_map:
    print("tensor_name: ", key)
    # print(reader.get_tensor(key))  # prints the variable's values; not needed here and only clutters the output
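Scanning a long list of printed names by eye is error-prone, so it can help to compare the names the graph wants to restore against the keys in the checkpoint. The following is a minimal sketch: `missing_and_unused_keys` and `names_from_tensorflow` are hypothetical helpers, the second one assumes TensorFlow 1.x and must be called after the model graph has been built, and the names in the demo are illustrative only.

```python
def missing_and_unused_keys(graph_var_names, checkpoint_keys):
    """Compare variable names in the graph with keys stored in a checkpoint."""
    graph_set = set(graph_var_names)
    ckpt_set = set(checkpoint_keys)
    missing = sorted(graph_set - ckpt_set)  # wanted by the graph, absent from the checkpoint
    unused = sorted(ckpt_set - graph_set)   # stored in the checkpoint, not used by the graph
    return missing, unused


def names_from_tensorflow(checkpoint_path):
    """Collect both name lists with TF 1.x (assumption: the graph is already built)."""
    # Imported lazily so the pure-Python helper above works without TensorFlow.
    import tensorflow as tf
    from tensorflow.python import pywrap_tensorflow

    reader = pywrap_tensorflow.NewCheckpointReader(checkpoint_path)
    checkpoint_keys = list(reader.get_variable_to_shape_map())
    # v.op.name is the variable name without the ':0' suffix, which is
    # the form a default tf.train.Saver() uses as the checkpoint key.
    graph_var_names = [v.op.name for v in tf.global_variables()]
    return graph_var_names, checkpoint_keys


if __name__ == "__main__":
    # Illustrative names only; real lists come from names_from_tensorflow().
    graph_names = [
        "ssd_300_vgg/block3_box/L2Normalization/gamma",
        "ssd_300_vgg/conv1/conv1_1/weights",
    ]
    ckpt_names = ["ssd_300_vgg/conv1/conv1_1/weights", "global_step"]
    missing, unused = missing_and_unused_keys(graph_names, ckpt_names)
    print("missing from checkpoint:", missing)
    print("unused checkpoint keys:", unused)
```

Any name in the first list is one that `saver.restore` would report as "not found in checkpoint", which usually points to a difference between the network definition used for training and the one used for testing.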
This did produce some output, as shown in the figure below:

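Even with this output in hand, the code was too complex and I could not really follow it, so I figured that if all else failed I would step through it in a debugger. Still, I was confident that the earlier steps were correct, and eventually I found a solution.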

In my own code I changed the definition of saver as follows:
saver = tf.train.import_meta_graph("../train_model/model.ckpt-1000.meta")
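With that change the error went away. One caveat: `tf.train.import_meta_graph` rebuilds the graph stored in the `.meta` file and returns a saver for that graph's variables, so it is worth confirming that the network the script actually runs is the one whose variables get restored.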
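For reference, here is a minimal sketch of how `tf.train.import_meta_graph` and `saver.restore` are typically combined in TensorFlow 1.x. It loads the saved graph into a fresh `tf.Graph` so that its variable names cannot collide with a network already built in the script. The helpers `meta_path_for` and `restore_from_meta` are hypothetical names, and this is not necessarily how the rest of `ssd_notebook.py` should be wired up.

```python
def meta_path_for(checkpoint_prefix):
    """Return the .meta file that accompanies a checkpoint prefix."""
    return checkpoint_prefix + ".meta"


def restore_from_meta(checkpoint_prefix):
    """Rebuild the saved graph and load its weights (assumes TF 1.x)."""
    # Imported lazily so meta_path_for() can be used without TensorFlow.
    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        # Recreates the graph stored at save time and returns a Saver
        # whose variable names match the checkpoint keys by construction.
        saver = tf.train.import_meta_graph(meta_path_for(checkpoint_prefix))
    sess = tf.Session(graph=graph)
    saver.restore(sess, checkpoint_prefix)
    return sess, graph


if __name__ == "__main__":
    # Path taken from the post; adjust it to your own training directory.
    print(meta_path_for("../train_model/model.ckpt-1000"))
```

After restoring this way, input and output tensors have to be looked up in the returned graph, for example with `graph.get_tensor_by_name(...)`, using the names that were assigned at training time.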
