
How can I generate sample sentences with an LSTM model in TensorFlow?


I am using an LSTM model in TensorFlow.
I have already trained and saved the model. Now I need to finish the last task: generating sentences. Here is my pseudocode:

# We already have the run_epoch(session, m, data, eval_op, verbose=False) function, with a feed_dict like this:
feed_dict = {m.input_data: x,
             m.targets: y,
             m.initial_state: state}
...
# train and save model
...
# load saved model for generating task
new_sentence = [START_TOKEN]
# Here I want to generate a sentence until END_TOKEN is generated.
while new_sentence[-1] != END_TOKEN:
    logits = get_logits(model, new_sentence)
    # get argmax(logits) or sample(logits)
    next_word = argmax(logits)
    new_sentence.append(next_word)
print(new_sentence)

My question is:
When training, validating, or testing the model, I have to feed both the inputs and their labels (the inputs shifted by one) to the model through the feed_dict dictionary. But in the generation task I have only one input: the sentence being generated, new_sentence.

How can I build the right get_logits function, or even a full generate function?
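
Concretely, since the logits do not depend on the targets placeholder, I imagine I can fetch them while feeding only the input and the state, something like this sketch (m.logits and m.final_state are hypothetical tensor names, not ones my model currently exposes):

import numpy as np

def get_logits(session, m, new_sentence, state):
    # Only the placeholders that the fetched tensors depend on have to
    # appear in feed_dict, so m.targets can be left out at generation time.
    x = np.asarray([new_sentence])  # shape [1, len(new_sentence)]
    logits, state = session.run([m.logits, m.final_state],
                                feed_dict={m.input_data: x,
                                           m.initial_state: state})
    return logits, state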

1 Answer

    When you train, you have the output of the neural network; from that output you compute the error, and from the error you create the optimizer that minimizes it.

    To generate a new sentence, you need to get the output of the neural network (the RNN).
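
    For example, turning the logits for the last position into the next word is just an argmax or a sample from the softmax. A minimal numpy sketch (the temperature parameter is my own addition, not something defined in the code below):

    import numpy as np

    def pick_next_word(logits, temperature=None):
        # logits: 1-D array of unnormalized scores over the vocabulary
        if temperature is None:
            return int(np.argmax(logits))        # greedy decoding
        # temperature sampling: soften or sharpen the distribution first
        scaled = logits / temperature
        probs = np.exp(scaled - np.max(scaled))  # numerically stable softmax
        probs /= probs.sum()
        return int(np.random.choice(len(probs), p=probs))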

    Edit:

    """
    Placeholders
    """
    
    x = tf.placeholder(tf.int32, [batch_size, num_steps], name='input_placeholder')
    y = tf.placeholder(tf.int32, [batch_size, num_steps], name='labels_placeholder')
    init_state = tf.zeros([batch_size, state_size])
    
    """
    RNN Inputs
    """
    
    # Turn our x placeholder into a list of one-hot tensors:
    # rnn_inputs is a list of num_steps tensors with shape [batch_size, num_classes]
    x_one_hot = tf.one_hot(x, num_classes)
    rnn_inputs = tf.unpack(x_one_hot, axis=1)
    
    """
    Definition of rnn_cell
    
    This is very similar to the __call__ method on Tensorflow's BasicRNNCell. See:
    https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/rnn_cell.py
    """
    with tf.variable_scope('rnn_cell'):
        W = tf.get_variable('W', [num_classes + state_size, state_size])
        b = tf.get_variable('b', [state_size], initializer=tf.constant_initializer(0.0))
    
    def rnn_cell(rnn_input, state):
        with tf.variable_scope('rnn_cell', reuse=True):
            W = tf.get_variable('W', [num_classes + state_size, state_size])
            b = tf.get_variable('b', [state_size], initializer=tf.constant_initializer(0.0))
        return tf.tanh(tf.matmul(tf.concat(1, [rnn_input, state]), W) + b)
    
    state = init_state
    rnn_outputs = []
    for rnn_input in rnn_inputs:
        state = rnn_cell(rnn_input, state)
        rnn_outputs.append(state)
    final_state = rnn_outputs[-1]
    
    #logits and predictions
    with tf.variable_scope('softmax'):
        W = tf.get_variable('W', [state_size, num_classes])
        b = tf.get_variable('b', [num_classes], initializer=tf.constant_initializer(0.0))
    logits = [tf.matmul(rnn_output, W) + b for rnn_output in rnn_outputs]
    predictions = [tf.nn.softmax(logit) for logit in logits]
    
    # Turn our y placeholder into a list labels
    y_as_list = [tf.squeeze(i, squeeze_dims=[1]) for i in tf.split(1, num_steps, y)]
    
    #losses and train_step
    losses = [tf.nn.sparse_softmax_cross_entropy_with_logits(logit,label) for \
              logit, label in zip(logits, y_as_list)]
    total_loss = tf.reduce_mean(losses)
    train_step = tf.train.AdagradOptimizer(learning_rate).minimize(total_loss)
    def train():
        with tf.Session() as sess:
            # load or initialize the model variables here
            training_losses = []
            for idx, epoch in enumerate(gen_epochs(num_epochs, num_steps)):
                training_loss = 0
                training_state = np.zeros((batch_size, state_size))
                if verbose:
                    print("\nEPOCH", idx)
                for step, (X, Y) in enumerate(epoch):
                    tr_losses, training_loss_, training_state, _ = \
                        sess.run([losses,
                                  total_loss,
                                  final_state,
                                  train_step],
                                 feed_dict={x: X, y: Y, init_state: training_state})
                    training_loss += training_loss_
                    if step % 100 == 0 and step > 0:
                        if verbose:
                            print("Average loss at step", step,
                                  "for last 100 steps:", training_loss / 100)
                        training_losses.append(training_loss / 100)
                        training_loss = 0
            # save the model here
    
    def generate_seq():
        with tf.Session() as sess:
            # load the saved model for the generation task
            new_sentence = [START_TOKEN]
            # generate until END_TOKEN is produced
            while new_sentence[-1] != END_TOKEN:
                # fetch the word distributions, not final_state:
                # final_state is only the hidden state of the RNN
                preds = sess.run(predictions, {x: np.asarray([new_sentence])})
                # take the distribution for the last time step and
                # either argmax it or sample from it
                next_word = int(np.argmax(preds[-1][0]))
                new_sentence.append(next_word)
            print(new_sentence)
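
    Note that x above was declared with a fixed [batch_size, num_steps] shape, so feeding a sentence that grows at every step will not match the placeholder. A common way around this (a sketch under the same variable names, assuming you rebuild the same graph with batch_size = 1 and num_steps = 1) is to feed one word at a time and carry the state forward yourself:

    def generate_seq_stepwise():
        # Sketch: assumes the graph above was rebuilt with
        # batch_size = 1 and num_steps = 1, so x takes a single word
        # and predictions is a one-element list holding a
        # [1, num_classes] distribution. START_TOKEN and END_TOKEN
        # come from the question, not from the code above.
        with tf.Session() as sess:
            # load the saved model here
            state = np.zeros((1, state_size))
            new_sentence = [START_TOKEN]
            while new_sentence[-1] != END_TOKEN:
                # feed only the newest word plus the carried-over state
                preds, state = sess.run(
                    [predictions, final_state],
                    feed_dict={x: [[new_sentence[-1]]], init_state: state})
                # greedy choice; sampling from preds[0][0] also works
                next_word = int(np.argmax(preds[0][0]))
                new_sentence.append(next_word)
            print(new_sentence)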
    
