TensorFlow error when running the seq2seq model - Java 学习之路

While running the RNN tutorial example, I get the following error right after the "reading data line" messages:

reading data line 22500000

W tensorflow/core/common_runtime/executor.cc:1052] 0x3ef81ae60 Compute status: Not found: ./checkpoints_directory/translate.ckpt-200.tempstate15092134273276121938
         [[Node: save/save = SaveSlices[T=[DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT
_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOA
T, DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/save/tensor_names, save/save/shapes_and_slices, Variable, Variable_1, embedding_attention_seq2seq/RNN/EmbeddingWrappe
r/embedding, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2se
q/RNN/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Candidate/Line
ar/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/Mu
ltiRNNCell/Cell1/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/M
atrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/embedding_at
tention_decoder/attention_decoder/Attention_0/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/Linear/Matrix, embedding_attention_seq2seq/embedding_attenti
on_decoder/attention_decoder/AttnOutputProjection/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/Linear/Matrix, embedding_attention_seq2seq/embe
dding_attention_decoder/attention_decoder/AttnV_0, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnW_0, embedding_attention_seq2seq/embedding_attention_decoder/attention_decod
er/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell0/GRUCell
/Candidate/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell0/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder
/attention_decoder/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Matrix, embedding_attentio
n_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell1/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell1/GRUCel
l/Candidate/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell1/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/at
tention_decoder/MultiRNNCell/Cell1/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/Bias, embedding_attenti
on_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell2/GRU
Cell/Gates/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/MultiRNNCell/Cell2/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/emb
edding, proj_b, proj_w)]]
global step 200 learning rate 0.5000 step-time 14.56 perplexity 2781.37
Traceback (most recent call last):
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/tran
slate/translate.py", line 264, in <module>
    tf.app.run()
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/platform
/default/_app.py", line 15, in run
    sys.exit(main(sys.argv))
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/tran
slate/translate.py", line 261, in main
    train()
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/tran
slate/translate.py", line 180, in train
    model.saver.save(sess, checkpoint_path, global_step=model.global_step)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/training
/saver.py", line 847, in save
    self._save_tensor_name, {self._filename_tensor_name: checkpoint_file})
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/client/s
ession.py", line 401, in run
    results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/client/s
ession.py", line 477, in _do_run 
    e.code)
tensorflow.python.framework.errors.NotFoundError: ./checkpoints_directory/translate.ckpt-200.tempstate15092134273276121938
         [[Node: save/save = SaveSlices[T=[DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/save/tensor_names, save/save/shapes_and_slices, Variable, Variable_1, embedding_attention_seq2seq/RNN/EmbeddingWrapper/embedding, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell0/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell1/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Candidate/Linear/Matrix, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Gates/Linear/Bias, embedding_attention_seq2seq/RNN/MultiRNNCell/Cell2/GRUCell/Gates/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/Linear/Bias, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/Attention_0/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/Linear/Bias, 
embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnOutputProjection/Linear/Matrix, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnV_0, embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/AttnW_0, embedding_attention_seq2seq/embedding_attention_decoder/attention_decod

/default/_app.py", line 15, in run
    sys.exit(main(sys.argv))
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/translate/translate.py", line 261, in main
    train()
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/translate/translate.py", line 130, in train
    model = create_model(sess, False)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/translate/translate.py", line 109, in create_model
    forward_only=forward_only)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/models/rnn/translate/seq2seq_model.py", line 153, in __init__
    self.saver = tf.train.Saver(tf.all_variables())
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/training/saver.py", line 693, in __init__
    restore_sequentially=restore_sequentially)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/training/saver.py", line 411, in build
    save_tensor = self._AddSaveOps(filename_tensor, vars_to_save)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/training/saver.py", line 114, in _AddSaveOps
    save = self.save_op(filename_tensor, vars_to_save)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/training/saver.py", line 68, in save_op
    tensor_slices=[vs.slice_spec for vs in vars_to_save])
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/ops/io_ops.py", line 149, in _save
    tensors, name=name)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/ops/gen_io_ops.py", line 343, in _save_slices
    name=name)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/ops/op_def_library.py", line 646, in apply_op
    op_def=op_def)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/framework/ops.py", line 1767, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/temp_user/.cache/bazel/_bazel_temp_user/7cf40d683d56020fae2d5abbde7f9f05/tensorflow/bazel-out/local_linux-opt/bin/tensorflow/models/rnn/translate/translate.runfiles/tensorflow/python/framework/ops.py", line 1008, in __init__
    self._traceback = _extract_stack()

ERROR: Non-zero return code '1' from command: Process exited with status 1.

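What is causing this? The other language model examples work, and the library has been built. Following the comments, I created the checkpoint directory, but the same error is still thrown: tensorflow/core/common_runtime/executor.cc:1052] 0x400d2bbe0 Compute status: Not found: ./checkpoint_directory/translate.ckpt-200.tempstate9246663217899500702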

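2 Answers

I think this is one of the problems that shows up when a previous checkpoint was not saved properly. You can correct it with the following steps.

1. You can delete all the checkpoint files and restart training: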

rm checkpoint
rm translate.ckpt-*

Now restart training.
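For anyone who would rather do this cleanup from Python, here is a minimal sketch of the same idea. The function name `clear_checkpoints` and the default prefix `translate.ckpt` are assumptions made for illustration (the prefix is taken from the file names in the error message); nothing here is part of TensorFlow.

```python
import glob
import os


def clear_checkpoints(train_dir, prefix="translate.ckpt"):
    """Delete the `checkpoint` index file and every file named `<prefix>-*`.

    `train_dir` and `prefix` are illustrative assumptions; adjust them to
    whatever directory and checkpoint name your own run uses.
    """
    removed = []
    candidates = [os.path.join(train_dir, "checkpoint")]
    candidates += sorted(glob.glob(os.path.join(train_dir, prefix + "-*")))
    for path in candidates:
        # Only touch regular files; other files in the directory are left alone.
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

The function returns the list of removed paths, so you can print it and confirm that only checkpoint files were deleted before restarting training.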

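Alternatively, you can delete the latest checkpoint and resume from the previous one.

1. Go to the directory and delete the latest checkpoint, which in this case is: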

rm translate.ckpt-200

2. Now edit the checkpoint file. You will probably see something like:

model_checkpoint_path: "data/translate.ckpt-200"
all_model_checkpoint_paths: "data/translate.ckpt-170"
all_model_checkpoint_paths: "data/translate.ckpt-180"
all_model_checkpoint_paths: "data/translate.ckpt-190"
all_model_checkpoint_paths: "data/translate.ckpt-200"

3. Delete the last line and point model_checkpoint_path at the previous checkpoint:

model_checkpoint_path: "data/translate.ckpt-190"
all_model_checkpoint_paths: "data/translate.ckpt-170"
all_model_checkpoint_paths: "data/translate.ckpt-180"
all_model_checkpoint_paths: "data/translate.ckpt-190"

4. Restart training.
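The manual edit of the checkpoint file described in these steps can also be scripted. The following is only a sketch, assuming the file uses the plain-text layout shown above (one `model_checkpoint_path` line followed by `all_model_checkpoint_paths` lines); `roll_back_checkpoint` is a hypothetical helper, not a TensorFlow function, and it does not delete the checkpoint data file itself.

```python
import os
import re


def roll_back_checkpoint(train_dir):
    """Drop the newest entry from the `checkpoint` index file in `train_dir`.

    Returns the path that becomes the new `model_checkpoint_path`.
    Assumes the plain-text layout shown in the answer above.
    """
    index_file = os.path.join(train_dir, "checkpoint")
    with open(index_file) as f:
        text = f.read()
    # Collect the quoted paths from the all_model_checkpoint_paths lines, in order.
    paths = re.findall(
        r'^all_model_checkpoint_paths:[ \t]*"([^"]*)"[ \t]*$', text, re.M)
    if len(paths) < 2:
        raise ValueError("no earlier checkpoint to fall back to")
    kept = paths[:-1]  # everything except the newest entry
    lines = ['model_checkpoint_path: "%s"' % kept[-1]]
    lines += ['all_model_checkpoint_paths: "%s"' % p for p in kept]
    with open(index_file, "w") as f:
        f.write("\n".join(lines) + "\n")
    return kept[-1]
```

With the example file above, the call would return "data/translate.ckpt-190" and leave the entries for 170, 180 and 190 in place. It is probably wise to copy the `checkpoint` file somewhere first, since the function overwrites it.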

Answered 2024-04-29T03:55:25+08:00


I ran into the same problem when running the sequence-to-sequence model. Creating the checkpoint directory before running the code solved it!
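One way to make sure the directory exists is to create it from the script itself, just before the first save. This is a minimal sketch, assuming Python 3 (for `exist_ok`); `ensure_checkpoint_dir` is a hypothetical helper, and `checkpoint_path` stands for the value passed to `model.saver.save(...)` in the traceback above.

```python
import os


def ensure_checkpoint_dir(checkpoint_path):
    """Create the parent directory of `checkpoint_path` if it is missing.

    Returns the absolute path of that directory. Hypothetical helper,
    not part of TensorFlow.
    """
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    # exist_ok=True makes the call safe when the directory is already there.
    os.makedirs(directory, exist_ok=True)
    return directory
```

Calling `ensure_checkpoint_dir(checkpoint_path)` before the save should keep the saver from failing on a missing directory. The directory has to be the one the script actually writes to: the two error messages in the question show `./checkpoints_directory` and `./checkpoint_directory`, so it may be worth checking that the directory created by hand matches the path given to the script.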

Answered 2024-04-29T03:55:25+08:00
