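I am trying to train an object detection model by following the "Running Locally" option of the TensorFlow Object Detection API. After following the documentation: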

  • I was able to download the data and place pascal_train.record, pascal_eval.record and pascal_label_map.pbtxt in the ./object_detection/data/ directory.

  • I edited "faster_rcnn_resnet101.config" and placed it at ./model/model/faster_rcnn_resnet101.config.

  • I made the following changes in the .config file: fine_tune_checkpoint: model_path/model.ckpt, input_path: ./object_detection/data/pascal_train.record, label_map_path: ./object_detection/data/pascal_label_map.pbtxt, and similarly for eval_input_reader.

  • Now, if I train the model with the following command: python3 object_detection/train.py --pipeline_config_path=./object_detection/data/faster_rcnn_resnet101.config --train_dir=./model/train_logs, it throws this error:

INFO:tensorflow:Starting Session.

INFO:tensorflow:Saving checkpoint to path ./model/train_logs/model.ckpt

INFO:tensorflow:Starting Queues.

INFO:tensorflow:Error reported to Coordinator: , ~/Documents/Projects/models/object_detection/data/pascal_train.record

[[Node: parallel_read/ReaderReadV2_2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](parallel_read/TFRecordReaderV2_2, parallel_read/filenames)]]

INFO:tensorflow:global_step/sec: 0

2017-08-29 11:41:59.852783: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_5_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2component_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_INT32, DT_STRING, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT64, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

2017-08-29 11:41:59.852819: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_5_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2component_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_INT32, DT_STRING, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT64, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

2017-08-29 11:41:59.852797: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_5_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
...
[[Node: prefetch_queue_Dequeue = QueueDequeueV2component_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_INT32, DT_STRING, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT64, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

2017-08-29 11:42:00.353191: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_5_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2component_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_INT32, DT_STRING, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT64, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

2017-08-29 11:42:00.353105: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: FIFOQueue '_5_prefetch_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: prefetch_queue_Dequeue = QueueDequeueV2component_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_BOOL, DT_INT32, DT_BOOL, DT_INT32, DT_INT32, DT_STRING, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT64, DT_STRING, DT_INT32, DT_INT32, DT_INT32, DT_INT64, DT_INT32, DT_INT64, DT_STRING, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"]]

INFO:tensorflow:Caught OutOfRangeError. Stopping Training.

INFO:tensorflow:Finished training! Saving model to disk.

Traceback (most recent call last):
  File "object_detection/train.py", line 199, in <module>
    tf.app.run()
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 195, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/skulhare/Documents/Projects/models/object_detection/trainer.py", line 296, in train
    saver=saver)
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 767, in train
    sv.stop(threads, close_summary_writer=True)
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/skulhare/.local/lib/python3.5/site-packages/six.py", line 686, in reraise
    raise value
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
    enqueue_callable()
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1235, in _single_operation_run
    target_list_as_strings, status, None)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/skulhare/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: ~/Documents/Projects/models/object_detection/data/pascal_train.record
[[Node: parallel_read/ReaderReadV2_2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](parallel_read/TFRecordReaderV2_2, parallel_read/filenames)]]
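
One detail in the NotFoundError is that the reported path starts with a literal ~. As far as I understand, ~ is expanded by the shell and not by TensorFlow's file readers, so a path written that way inside a config would be looked up verbatim. A minimal sketch for checking this, where describe_path is a hypothetical helper of my own (not part of the Object Detection API) and the path is the one from the error message:

```python
import os


def describe_path(raw_path):
    """Return (expanded_path, exists_as_written, exists_after_expansion).

    Hypothetical helper: compares the path exactly as written with the
    same path after "~" expansion and conversion to an absolute path.
    """
    expanded = os.path.abspath(os.path.expanduser(raw_path))
    return expanded, os.path.exists(raw_path), os.path.exists(expanded)


if __name__ == "__main__":
    # Path copied from the NotFoundError in the log.
    raw = "~/Documents/Projects/models/object_detection/data/pascal_train.record"
    expanded, as_written, after_expansion = describe_path(raw)
    print("as written     :", raw, "->", as_written)
    print("after expansion:", expanded, "->", after_expansion)
```

If the file only exists after expansion, writing the expanded absolute path into the config would be the thing to try.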

I am using TensorFlow 1.3.0 on Ubuntu 16.04 with a GTX 1080 Ti. Here are the contents of the faster_rcnn_resnet101.config file.

Thanks in advance.
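
To make it easier to see which paths the config actually contains, here is a rough sketch that pulls out the three path fields mentioned in this question and labels each one. SAMPLE_CONFIG is a hypothetical excerpt assembled from the values quoted in this post, not the real file, and find_config_paths and classify are illustrative names of my own:

```python
import os
import re

# Path-like fields edited in the pipeline config, as listed in the question.
PATH_FIELDS = ("fine_tune_checkpoint", "input_path", "label_map_path")
_PATTERN = re.compile(r'\b(%s)\s*:\s*"([^"]*)"' % "|".join(PATH_FIELDS))

# Hypothetical excerpt; the real faster_rcnn_resnet101.config may differ.
SAMPLE_CONFIG = '''
train_config {
  fine_tune_checkpoint: "model_path/model.ckpt"
}
train_input_reader {
  tf_record_input_reader {
    input_path: "~/Documents/Projects/models/object_detection/data/pascal_train.record"
  }
  label_map_path: "./object_detection/data/pascal_label_map.pbtxt"
}
'''


def find_config_paths(config_text):
    """Return a list of (field, path) pairs found in the config text."""
    return _PATTERN.findall(config_text)


def classify(path):
    """Describe how a path from the config would be resolved."""
    if path.startswith("~"):
        return "starts with ~ (only a shell would expand this)"
    if not os.path.isabs(path):
        return "relative (resolved against the current working directory)"
    return "absolute"


if __name__ == "__main__":
    for field, path in find_config_paths(SAMPLE_CONFIG):
        print("%s: %s -> %s" % (field, path, classify(path)))
```

Relative paths are flagged as well because they depend on the directory train.py is launched from.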