
DuplicateFlagError when trying to train the TensorFlow Object Detection API on Google Colaboratory


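I am trying to train the TensorFlow Object Detection API on a dataset containing apples and peppers. To do that, I generated the required files (TFRecords and annotated images) and placed them in the models/research/object_detection directory. I then forked the Object Detection API on GitHub and pushed my files to the forked repo. Finally, I cloned that repo in Google Colaboratory and ran train.py, but I get a DuplicateFlagError for the flag 'master':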

---------------------------------------------------------------------------

DuplicateFlagError               Traceback (most recent call last)
/content/models/research/object_detection/train.py in <module>()
     56 
     57 flags = tf.app.flags
---> 58 flags.DEFINE_string('master', '', 'Name of the TensorFlow master to use.')
     59 flags.DEFINE_integer('task', 0, 'task id')
     60 flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy per worker.')

/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/flags.py in wrapper(*args, **kwargs)
     56           'Use of the keyword argument names (flag_name, default_value, '
     57           'docstring) is deprecated, please use (name, default, help) instead.')
---> 58     return original_function(*args, **kwargs)
     59 
     60   return tf_decorator.make_decorator(original_function, wrapper)

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE_string(name, default, help, flag_values, **args)
    239   parser = _argument_parser.ArgumentParser()
    240   serializer = _argument_parser.ArgumentSerializer()
--> 241   DEFINE(parser, name, default, help, flag_values, serializer, **args)
    242 
    243 

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE(parser, name, default, help, flag_values, serializer, module_name, **args)
     80   """
     81   DEFINE_flag(_flag.Flag(parser, serializer, name, default, help, **args),
---> 82               flag_values, module_name)
     83 
     84 

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE_flag(flag, flag_values, module_name)
    102   # Copying the reference to flag_values prevents pychecker warnings.
    103   fv = flag_values
--> 104   fv[flag.name] = flag
    105   # Tell flag_values who's defining the flag.
    106   if module_name:

/usr/local/lib/python3.6/dist-packages/absl/flags/_flagvalues.py in __setitem__(self, name, flag)
    425         # module is simply being imported a subsequent time.
    426         return
--> 427       raise _exceptions.DuplicateFlagError.from_flag(name, self)
    428     short_name = flag.short_name
    429     # If a new flag overrides an old one, we need to cleanup the old flag's

DuplicateFlagError: The flag 'master' is defined twice. First from object_detection/train.py, Second from object_detection/train.py.  Description from first occurrence: Name of the TensorFlow master to use.

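To work around this, I tried commenting out that line, but then I got a DuplicateFlagError for the next flag, i.e. on the next line. So I commented out every line in train.py that declares these flags, i.e. lines 58 to 82. After that, however, I got a NotFoundError: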

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
/content/models/research/object_detection/train.py in <module>()
    165 
    166 if __name__ == '__main__':
--> 167   tf.app.run()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py in run(main, argv)
    124   # Call the main function, passing through any arguments
    125   # to the final program.
--> 126   _sys.exit(main(argv))
    127 
    128 

/content/models/research/object_detection/train.py in main(_)
    105                            ('input.config', FLAGS.input_config_path)]:
    106         tf.gfile.Copy(config, os.path.join(FLAGS.train_dir, name),
--> 107                       overwrite=True)
    108 
    109   model_config = configs['model']

/usr/local/lib/python3.6/dist-packages/tensorflow/python/lib/io/file_io.py in copy(oldpath, newpath, overwrite)
    390   with errors.raise_exception_on_not_ok_status() as status:
    391     pywrap_tensorflow.CopyFile(
--> 392         compat.as_bytes(oldpath), compat.as_bytes(newpath), overwrite, status)
    393 
    394 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    514             None, None,
    515             compat.as_text(c_api.TF_Message(self.status.status)),
--> 516             c_api.TF_GetCode(self.status.status))
    517     # Delete the underlying status object from memory otherwise it stays alive
    518     # as there is a reference to status from this from the traceback due to

NotFoundError: ; No such file or directory

How can I fix this? Here is my Colab notebook: https://drive.google.com/file/d/1mZGOKX3JZXyG4XYkI6WHIXoNbRSpkE_F/view?usp=sharing

2 Answers

  • 9
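    # Delete all previously defined flags before train.py declares them again.
    import tensorflow as tf

    def del_all_flags(FLAGS):
        # FLAGS._flags() is a private absl accessor returning the name -> Flag dict.
        flags_dict = FLAGS._flags()
        # Copy the names first, because deleting mutates the dict being iterated.
        keys_list = [key for key in flags_dict]
        for key in keys_list:
            FLAGS.__delattr__(key)

    del_all_flags(tf.flags.FLAGS)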
    
  • 0

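    After going through your Colab notebook and your modified fork of the tensorflow/models GitHub repository, I was able to get it running on my local machine.

    I used the latest TensorFlow version, 1.6, which is the same as on Google Colab.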

    • The path you specified in ssd_mobilenet_v1_coco.config is data/object-detection.pbtxt, which is a relative path, so run train.py from the models/research/object_detection directory.

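    • train.py expects --pipeline_config_path as the argument, but you passed --pipeline_config. If you read through the train.py code, you will see that when --pipeline_config_path is not specified it does not use your pipeline config at all and falls back to separate config file paths (the traceback above shows it copying FLAGS.input_config_path to input.config). The empty file name in the error message suggests those paths were never set, which is why you get NotFoundError: ; No such file or directory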

    So the final command should look like this:

    ubuntu@Himanshu:~/Desktop/models/research/object_detection$ python train.py --logtostderr --train_dir=training --pipeline_config_path=training/ssd_mobilenet_v1_coco.config
    

    As the comment in the link above suggests, remove dct_method=dct_method at line 109 of object_detection/data_decoders/tf_example_decoder.py.

    Hope this helps.
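
The two fixes in the second answer (run from models/research/object_detection, and pass --pipeline_config_path) can also be wrapped in a small launcher for a notebook cell. The following is a minimal sketch, not part of the original answers: the helper names build_train_command and run_training are hypothetical, and it assumes train.py sits in the directory that is passed in. Launching train.py in a child process also means the flags are defined in a fresh interpreter each time, which should sidestep the DuplicateFlagError if the error came from re-running the script inside the same notebook kernel, as the IPython-style traceback suggests.

```python
import os
import subprocess
import sys


def build_train_command(train_dir, pipeline_config_path):
    """Return the argv list for train.py, using the flag names from the answer."""
    return [
        sys.executable,
        "train.py",
        "--logtostderr",
        "--train_dir=" + train_dir,
        "--pipeline_config_path=" + pipeline_config_path,
    ]


def run_training(object_detection_dir, train_dir, pipeline_config_path):
    """Run train.py in a fresh process started from object_detection_dir.

    pipeline_config_path is interpreted relative to object_detection_dir,
    matching the relative paths used in the answer's command.
    """
    config_file = os.path.join(object_detection_dir, pipeline_config_path)
    if not os.path.isfile(config_file):
        # Fail early with a readable message instead of TensorFlow's
        # "NotFoundError: ; No such file or directory".
        raise FileNotFoundError("pipeline config not found: " + config_file)
    command = build_train_command(train_dir, pipeline_config_path)
    # cwd matters: relative paths in the command and in the config file
    # (e.g. data/object-detection.pbtxt) resolve against this directory.
    completed = subprocess.run(command, cwd=object_detection_dir)
    return completed.returncode
```

In Colab one might call run_training("/content/models/research/object_detection", "training", "training/ssd_mobilenet_v1_coco.config"); the directory is taken from the traceback in the question, and the other two values from the final command in the answer. A non-zero return value means train.py itself exited with an error.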
