
How can I combine FCNN and RNN in TensorFlow?


I want to build a neural network that is recurrent in some layers (e.g., an LSTM) and has ordinary fully connected (FC) layers in the others. I cannot find a way to do this in TensorFlow. It works if I only have FC layers, but I don't know how to add a recurrent layer properly.

I create the network in the following way:

with tf.variable_scope("autoencoder_variables", reuse=None) as scope:

  for i in xrange(self.__num_hidden_layers + 1):
    # Train weights
    name_w = self._weights_str.format(i + 1)
    w_shape = (self.__shape[i], self.__shape[i + 1])
    a = tf.multiply(4.0, tf.sqrt(6.0 / (w_shape[0] + w_shape[1])))
    w_init = tf.random_uniform(w_shape, -1 * a, a)
    self[name_w] = tf.Variable(w_init,
                               name=name_w,
                               trainable=True)
    # Train biases
    name_b = self._biases_str.format(i + 1)
    b_shape = (self.__shape[i + 1],)
    b_init = tf.zeros(b_shape)
    self[name_b] = tf.Variable(b_init, trainable=True, name=name_b)

    if i+1 == self.__recurrent_layer:
      # Create an LSTM cell
      lstm_size = self.__shape[self.__recurrent_layer]
      self['lstm'] = tf.contrib.rnn.BasicLSTMCell(lstm_size)

It is supposed to process batches sequentially. I have a function that processes a single time step; it will later be called by a function that processes the whole sequence:

def single_run(self, input_pl, state, just_middle = False):
  """Get the output of the autoencoder for a single batch

  Args:
    input_pl:     tf placeholder for ae input data of size [batch_size, DoF]
    state:        current state of LSTM memory units
    just_middle : will indicate if we want to extract only the middle layer of the network
  Returns:
    Tensor of output
  """

  last_output = input_pl

  # Pass through the network
  for i in xrange(self.num_hidden_layers+1):

    if i != self.__recurrent_layer:
      w = self._w(i + 1)
      b = self._b(i + 1)
      last_output = self._activate(last_output, w, b)

    else:
      last_output, state = self['lstm'](last_output, state)

  return last_output

The following function should take a sequence of batches as input and produce a sequence of batches as output:

def process_sequences(self, input_seq_pl, dropout, just_middle = False):
  """Get the output of the autoencoder

  Args:
    input_seq_pl:     input data of size [batch_size, sequence_length, DoF]
    dropout:          dropout rate
    just_middle :     indicate if we want to extract only the middle layer of the network
  Returns:
    Tensor of output
  """

  if not just_middle: # if not just the middle layer
    numb_layers = self.__num_hidden_layers+1
  else:
    numb_layers = FLAGS.middle_layer

  with tf.variable_scope("process_sequence", reuse=None) as scope:

    # Initial state of the LSTM memory.
    state = initial_state = self['lstm'].zero_state(FLAGS.batch_size, tf.float32)

    tf.get_variable_scope().reuse_variables() # THIS IS IMPORTANT LINE

    # First - Apply Dropout
    the_whole_sequences = tf.nn.dropout(input_seq_pl, dropout)

    # Take batches for every time step and run them through the network
    # Stack all their outputs
    with tf.control_dependencies([tf.convert_to_tensor(state, name='state')]): # do not let the loop be parallelized
      stacked_outputs = tf.stack(
          [self.single_run(the_whole_sequences[:, time_st, :], state, just_middle)
           for time_st in range(self.sequence_length)])

    # Transpose output from the shape [sequence_length, batch_size, DoF] into [batch_size, sequence_length, DoF]

    output = tf.transpose(stacked_outputs , perm=[1, 0, 2])

  return output

The problem is with the variable scope and its "reuse" property.

If I run this code as it is, I get the following error: 'Variable Train/process_sequence/basic_lstm_cell/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?'

If I comment out the line that tells it to reuse the variables (tf.get_variable_scope().reuse_variables()), I get the following error: 'Variable Train/process_sequence/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?'

It seems that we need reuse=None to initialize the weights of the LSTM cell, and reuse=True to call the LSTM cell afterwards.
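
For reference, this is the reuse behaviour I mean, as a minimal standalone sketch (the scope and variable names are only illustrative):

import tensorflow as tf

with tf.variable_scope("demo") as scope:
  # reuse is off, so this call creates the variable "demo/w"
  v1 = tf.get_variable("w", shape=[3], initializer=tf.zeros_initializer())
  scope.reuse_variables()  # switch the scope into reuse mode
  # reuse is on, so this call returns the existing variable instead of failing
  v2 = tf.get_variable("w", shape=[3])

assert v1 is v2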

Please help me figure out the right way to do this.

2 Answers

  • 0

    I think the problem is that you create your variables with tf.Variable. Please use tf.get_variable instead - does this solve your issue?
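
    A minimal sketch of what I mean, for an FC layer whose variables can then be shared across time steps (the layer and placeholder names are made up for the example):

    import tensorflow as tf

    def fc_layer(x, out_dim, name):
      # tf.get_variable registers the weights in the current variable scope,
      # so a second call under reuse=True returns the existing tensors
      in_dim = x.get_shape()[-1].value
      w = tf.get_variable(name + "_w", [in_dim, out_dim],
                          initializer=tf.random_uniform_initializer(-0.1, 0.1))
      b = tf.get_variable(name + "_b", [out_dim],
                          initializer=tf.zeros_initializer())
      return tf.nn.tanh(tf.matmul(x, w) + b)

    x_t0 = tf.placeholder(tf.float32, [None, 64])
    x_t1 = tf.placeholder(tf.float32, [None, 64])

    with tf.variable_scope("fc") as scope:
      out_t0 = fc_layer(x_t0, 128, "hidden1")  # creates fc/hidden1_w and fc/hidden1_b
      scope.reuse_variables()
      out_t1 = fc_layer(x_t1, 128, "hidden1")  # reuses the same variables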

  • 0

    It seems that I solved the issue using a hack from the official TensorFlow RNN example (https://www.tensorflow.org/tutorials/recurrent), with the following code:

    with tf.variable_scope("RNN"):
      for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        (cell_output, state) = cell(inputs[:, time_step, :], state)
        outputs.append(cell_output)
    

    The hack is that the first time we run the LSTM, tf.get_variable_scope().reuse is set to False, so that a new LSTM cell is created. On every following run we set tf.get_variable_scope().reuse to True, so that the LSTM cell that was already created gets reused. The sketch below shows the same idea applied to the loop in process_sequences.
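
    Roughly, that would look like this (a sketch only; it assumes single_run is changed to also return the updated LSTM state, so the state actually carries over between time steps):

    with tf.variable_scope("process_sequence") as scope:
      state = self['lstm'].zero_state(FLAGS.batch_size, tf.float32)
      the_whole_sequences = tf.nn.dropout(input_seq_pl, dropout)

      outputs = []
      for time_st in range(self.sequence_length):
        if time_st > 0:
          scope.reuse_variables()  # reuse the LSTM weights created at step 0
        # assumed signature: single_run returns (output, new_state)
        last_output, state = self.single_run(
            the_whole_sequences[:, time_st, :], state, just_middle)
        outputs.append(last_output)

      # [sequence_length, batch_size, DoF] -> [batch_size, sequence_length, DoF]
      output = tf.transpose(tf.stack(outputs), perm=[1, 0, 2])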
