
Tensorflow MultiLSTMCell error


I am trying to build a recurrent neural network in TensorFlow 1.1.0, and I wrote a function that is supposed to return an LSTM.

import tensorflow as tf
from tensorflow.contrib.rnn import LSTMCell, DropoutWrapper, MultiRNNCell

def LSTM(x, num_units, num, num_layers=3):
  cells = []
  for i in range(num_layers):
    cell = LSTMCell(num_units=num_units, state_is_tuple=True)
    cell = DropoutWrapper(cell=cell, output_keep_prob=0.5)
    cells.append(cell)

  lstm = MultiRNNCell(cells=cells, state_is_tuple=True)
  val, state = tf.nn.dynamic_rnn(lstm, x, dtype=tf.float32)

  val = tf.transpose(val, [1, 0, 2])  # make the output time-major
  last = tf.gather(val, int(val.get_shape()[0]) - 1)  # take the output of the last time step

  return last

The function does work, but if I try to use it more than once I get the following error:

C:\ProgramData\Anaconda3\envs\obra\python.exe C:/Users/Simone/Desktop/Cobra/LSTM_Function_Filtro.py
Traceback (most recent call last):
  File "C:/Users/Simone/Desktop/Cobra/LSTM_Function_Filtro.py", line 81, in <module>
    Lstm2 = tf.nn.relu(tf.matmul(lyrs.LSTM(concat1, num_hidden, 1), W2) + B2)
  File "C:\Users\Simone\Desktop\Cobra\Layers_OK.py", line 62, in LSTM
    val, state = tf.nn.dynamic_rnn(lstm, x, dtype=tf.float32)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\rnn.py", line 553, in dynamic_rnn
    dtype=dtype)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\rnn.py", line 720, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2623, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2456, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2406, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\rnn.py", line 705, in _time_step
    (output, new_state) = call_cell()
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\python\ops\rnn.py", line 691, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 953, in __call__
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 713, in __call__
    output, new_state = self._cell(inputs, state, scope)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 398, in __call__
    reuse=self._reuse) as unit_scope:
  File "C:\ProgramData\Anaconda3\envs\obra\lib\contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "C:\ProgramData\Anaconda3\envs\obra\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 93, in _checked_scope
    "the argument reuse=True." % (scope_name, type(cell).__name__))
ValueError: Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnn/multi_rnn_cell/cell_0/lstm_cell'; and the cell was not constructed as LSTMCell(..., reuse=True).  To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

I also tried adding tf.get_variable_scope().reuse_variables() at the end of the for loop, but then I get the error

Variable rnn/multi_rnn_cell/cell_0/lstm_cell/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
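
For completeness, the graph construction below is a minimal sketch of what reproduces the first error: calling the function twice builds two dynamic_rnn ops in the same default variable scope, so the second call collides with the existing 'rnn/multi_rnn_cell/...' weights (the placeholder shape is made up just for illustration).

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[32, 50, 100])  # hypothetical [batch, time, features]

out1 = LSTM(x, 64, 1)  # first call creates the rnn/multi_rnn_cell/... variables
out2 = LSTM(x, 64, 2)  # second call raises the ValueError shown above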

1 Answer


    If I am not mistaken, you want to share parameters or create another RNN with the same code. If so, you can use tf.variable_scope as follows:

    def LSTM(x, num_units, num_layers=3, reuse=False, scope="MultiRNNCell"):
        with tf.variable_scope(name_or_scope=scope, reuse=reuse):
            cells = []
            for i in range(num_layers):
                cell = tf.nn.rnn_cell.LSTMCell(num_units=num_units, state_is_tuple=True)
                cell = tf.nn.rnn_cell.DropoutWrapper(cell=cell, output_keep_prob=0.5)
                cells.append(cell)
    
            lstm = tf.nn.rnn_cell.MultiRNNCell(cells=cells, state_is_tuple=True)
            val, state = tf.nn.dynamic_rnn(lstm, x, dtype=tf.float32)
    
            val = tf.transpose(val, [1, 0, 2])  # make the output time-major
            last = tf.gather(val, int(val.get_shape()[0]) - 1)  # take the output of the last time step
    
            return last
    

    On the first call you should pass reuse=False so that TensorFlow creates the variables. To share the parameters with another RNN it is then enough to pass reuse=True. If you want to create a new model, I suggest passing a new scope name together with reuse=False. The example run below should make it easier to follow; I created a dummy placeholder.

    import numpy as np
    import tensorflow as tf

    def list_parameters():
        num_param = 0
        for v in tf.global_variables():
            print(v.name)
            num_param += np.prod(v.get_shape().as_list())
    
        print("# of parameters: " + str(num_param))
    
    x = tf.placeholder(dtype=tf.float32,
                     shape=[32, 50, 100],
                     name='input_data')
    
    lstm1 = LSTM(x, 64, 3, reuse=False, scope="MultiRNNCell")
    list_parameters()
    

    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/bias:0
    # of parameters: 108288

    lstm2 = LSTM(x, 64, 3, reuse=True, scope="MultiRNNCell")
    list_parameters()
    

    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/bias:0
    # of parameters: 108288

    Note that lstm1 and lstm2 share their parameters.

    lstm3 = LSTM(x, 64, 3, reuse=False, scope="NewMultiRNNCell")
    list_parameters()
    

    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/kernel:0
    MultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/bias:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/kernel:0
    NewMultiRNNCell/rnn/multi_rnn_cell/cell_2/lstm_cell/bias:0
    # of parameters: 216576

    lstm3 creates a new set of parameters because its scope, and therefore its variable names, are different. Finally, this post explains variable naming clearly.
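
    As a quick sanity check, here is a small sketch (based on the variable listings above) that groups the graph's variables by their top-level scope: with the three calls above, "MultiRNNCell" should contain the 6 kernels/biases shared by lstm1 and lstm2, while "NewMultiRNNCell" holds the separate set created for lstm3.

    from collections import defaultdict

    per_scope = defaultdict(list)
    for v in tf.global_variables():
        # the part before the first '/' is the outermost variable scope
        per_scope[v.name.split('/')[0]].append(v.name)

    for scope_name, names in sorted(per_scope.items()):
        print(scope_name, len(names), "variables")
    # expected, per the listings above: MultiRNNCell 6 variables, NewMultiRNNCell 6 variables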
