Confused about weight and bias dependencies affecting learning

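I have a working LSTM model with a single weight/bias layer from its recurrent state to the output. I then also wrote the same system with two layers, meaning the LSTM, then a hidden layer, then the output. I wrote the lines that define this two-layer model but never actually used them. Yet with those layers merely defined and completely unused, the model does not learn! My weights and biases are defined as follows: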

weights = {
    # if going straight from PLSTM output to x,y prediction
    'out': tf.Variable(tf.random_normal([FLAGS.n_hidden, n_out], stddev=1/FLAGS.n_hidden, dtype=tf.float32)),

    # if fully connected feed-forward hidden layer between PLSTM output and x,y prediction
    'outHidden1': tf.Variable(tf.random_normal([FLAGS.n_hidden, FLAGS.n_middle], dtype=tf.float32)),
    'outHidden2': tf.Variable(tf.random_normal([FLAGS.n_middle, n_out], dtype=tf.float32))
}

biases = {
    # if going straight from PLSTM output to x,y prediction
    'out': tf.Variable(tf.random_normal([n_out], dtype=tf.float32)),

    # if fully connected feed-forward hidden layer between PLSTM output and x,y prediction
    'outHidden1': tf.Variable(tf.random_normal([FLAGS.n_middle], dtype=tf.float32)),
    'outHidden2': tf.Variable(tf.random_normal([n_out], dtype=tf.float32))
}

So the two-layer weights and biases are defined, but they are never used during training or testing.

The weights and biases enter the model in a single line:

return tf.matmul(relevant, weights['out']) + biases['out']

Here relevant is the LSTM output, so I only use the 'out' entries of the weights and biases dictionaries.

It learns nothing. Then, as soon as I comment out the two-layer variables, like this:

weights = {
    # if going straight from PLSTM output to x,y prediction
    'out': tf.Variable(tf.random_normal([FLAGS.n_hidden, n_out], stddev=1/FLAGS.n_hidden, dtype=tf.float32)),

    # if fully connected feed-forward hidden layer between PLSTM output and x,y prediction
    # 'outHidden1': tf.Variable(tf.random_normal([FLAGS.n_hidden, FLAGS.n_middle], dtype=tf.float32)),
    # 'outHidden2': tf.Variable(tf.random_normal([FLAGS.n_middle, n_out], dtype=tf.float32))
}

biases = {
    # if going straight from PLSTM output to x,y prediction
    'out': tf.Variable(tf.random_normal([n_out], dtype=tf.float32)),

    # if fully connected feed-forward hidden layer between PLSTM output and x,y prediction
    # 'outHidden1': tf.Variable(tf.random_normal([FLAGS.n_middle], dtype=tf.float32)),
    # 'outHidden2': tf.Variable(tf.random_normal([n_out], dtype=tf.float32))
}

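...it starts working again. How can the mere existence of these variables hinder learning? I initialize them, but no gradients should flow through them, and backprop should have nothing to do with those unused variables. Or am I misunderstanding something?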
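
To make that expectation concrete, here is a minimal plain-Python sketch (not TensorFlow, and not my actual model). The toy data, the scalar parameters, and the helper names loss and numeric_grad are all made up for illustration. It only shows the general principle: a parameter that never enters the loss gets a gradient of exactly zero, while the parameters that are used do not.

```python
# Toy check: parameters that never enter the loss get a zero gradient.
# Plain Python, no TensorFlow; the data and helper names are illustrative assumptions.

def loss(params, xs, ys):
    """Mean squared error of pred = x * out_w + out_b.

    The 'outHidden1' and 'outHidden2' entries are never read here.
    """
    total = 0.0
    for x, y in zip(xs, ys):
        pred = x * params['out_w'] + params['out_b']
        total += (pred - y) ** 2
    return total / len(xs)


def numeric_grad(params, name, xs, ys, eps=1e-6):
    """Central-difference estimate of d(loss)/d(params[name])."""
    plus = dict(params, **{name: params[name] + eps})
    minus = dict(params, **{name: params[name] - eps})
    return (loss(plus, xs, ys) - loss(minus, xs, ys)) / (2 * eps)


params = {
    'out_w': 0.5, 'out_b': 0.1,             # used by the loss
    'outHidden1': 1.7, 'outHidden2': -0.3,  # defined but never used
}
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

grads = {name: numeric_grad(params, name, xs, ys) for name in params}
for name, g in grads.items():
    print(name, g)
```

The two outHidden entries come out as exactly 0.0 because perturbing them does not change the loss at all. As far as I understand, the TensorFlow 1.x analogue is that tf.gradients returns None for variables that are not connected to the loss, which is why I would not expect the unused variables to be updated.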
