I am trying to train an encoder-decoder network that encodes features of one dimensionality and decodes them into features of a different dimensionality. More specifically, the input features have shape [500, 215, 7] (batch, frames, channels) and the output features have shape [500, 110, 7]. I have tried two approaches:

  • Linear autoencoder
def encoder(x):
    with tf.name_scope('Encoder'):
        # tf.reshape, not np.reshape: x is a tensor, not a numpy array
        x = tf.reshape(x, [-1, 215 * 7])
        net = tf.layers.dense(inputs=x, units=2000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=250, activation=tf.nn.relu)
        code = tf.layers.dense(inputs=net, units=125, activation=tf.nn.relu)
    return code

def decoder(code):
    with tf.name_scope('Decoder'):
        net = tf.layers.dense(inputs=code, units=125, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=2000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=770, activation=tf.nn.relu)
        # tf.reshape, not np.reshape; 770 = 110 * 7
        net = tf.reshape(net, [-1, 110, 7])
    return net
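The shape bookkeeping in the two functions above can be checked on its own, without TensorFlow. A minimal numpy sketch (random, untrained weights purely for tracing shapes; `dense` below is a stand-in I wrote for `tf.layers.dense`, not the real op):

```python
import numpy as np

def dense(x, units):
    """Random-weight stand-in for tf.layers.dense with ReLU; only shapes matter."""
    w = np.random.randn(x.shape[-1], units) * 0.01
    return np.maximum(x @ w, 0.0)

x = np.random.randn(500, 215, 7)           # (batch, frames, channels)
net = x.reshape(-1, 215 * 7)               # flatten to (500, 1505)
for units in (2000, 1000, 500, 250, 125):  # encoder stack
    net = dense(net, units)
code = net                                 # bottleneck: (500, 125)
for units in (125, 500, 1000, 2000, 770):  # decoder stack
    net = dense(net, units)
out = net.reshape(-1, 110, 7)              # target shape (500, 110, 7)
print(code.shape, out.shape)
```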

The training loss decreased from 20 to 12. After training, I fed in the [500, 215, 7] features and compared the autoencoder's output against the real [500, 110, 7] features. The network was able to generate 3 channels that are very close to the actual [500, 110, 7] features. Here is one data point:

Original  [ 0.35101    2.6753289  -0.84253965  0.971104   -0.34277865 -0.4877893   0.011089 ]
Generated [ 0.3437522  2.6829777   0.          0.9715183   0.          0.          0.       ]

But 4 of the channels are 0 in all data points.
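One detail worth noting when reading the sample above (an observation, not a full diagnosis): the decoder's output layer uses `activation=tf.nn.relu`, which clamps every negative value to 0, and three of the four zeroed channels are exactly the channels that are negative in the original vector. A minimal numpy illustration of that clamping:

```python
import numpy as np

original = np.array([0.35101, 2.6753289, -0.84253965, 0.971104,
                     -0.34277865, -0.4877893, 0.011089])

# ReLU maps negatives to exactly 0 -- the best a ReLU output layer can do
relu = np.maximum(original, 0.0)
print(relu)
print(np.where(original < 0)[0])   # channels a ReLU output can never reproduce
```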

  • Convolutional autoencoder
def encoder(x):
    with tf.name_scope('Encoder'):
        net = tf.reshape(x, [-1, 215, 7, 1])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.conv2d(inputs=net, filters=64, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.flatten(net)
        net = tf.layers.dense(inputs=net, units=1024, activation=tf.nn.relu)
        # note: tf.layers.dropout's `rate` is the fraction DROPPED, not kept
        net = tf.layers.dropout(inputs=net, rate=keep_prob)
    return net
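The spatial sizes inside this encoder can be traced with the standard 'valid' pooling formula, `out = floor((in - pool) / stride) + 1` (both `tf.layers.max_pooling2d` calls above use the default 'valid' padding, and `strides=2` applies to both dimensions). A quick check:

```python
def pooled_size(size, pool, stride):
    # 'valid' pooling output size: floor((size - pool) / stride) + 1
    return (size - pool) // stride + 1

h, w = 215, 7
h, w = pooled_size(h, 10, 2), pooled_size(w, 1, 2)   # after first max_pool
h, w = pooled_size(h, 10, 2), pooled_size(w, 1, 2)   # after second max_pool
print(h, w, h * w * 64)   # height, width, flattened length with 64 filters
```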

def decoder(code):
    with tf.name_scope('Decoder'):
        net = tf.layers.dense(inputs=code, units=6528, activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 51, 2, 64])   # 6528 = 51 * 2 * 64
        net = tf.image.resize_bilinear(net, size=[51, 4])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.image.resize_bilinear(net, size=[110, 7])
        net = tf.layers.conv2d(inputs=net, filters=1, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 110, 7])
    return net

The loss decreased from 5.8 to 2.7. Here, too, the network learns 3 channels, but the other channels are 0.

Original  [ 1.046344   2.77010455 -0.45842518  0.882744   -0.36058491 -0.37393818 -0.01158755 ]
Generated [ 1.0245576  2.7641041   0.          0.86774087  0.          0.          0.         ]

I trained for 5000 epochs using this loss function:

# Define loss
with tf.name_scope('Loss'):
    l2 = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(output, Y)), 1))
    cost = tf.reduce_mean(l2)

# Define optimizer
with tf.name_scope('Optimizer'):
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
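For reference, the loss above sums squared errors over the frame axis only (axis 1), so the square root produces one L2 norm per (sample, channel) pair before the final mean. A numpy equivalent of the same computation:

```python
import numpy as np

rng = np.random.default_rng(0)
output = rng.standard_normal((500, 110, 7))   # predicted features
Y = rng.standard_normal((500, 110, 7))        # target features

# sum of squares over the frame axis -> shape (500, 7), one norm per channel
l2 = np.sqrt(np.sum((output - Y) ** 2, axis=1))
cost = l2.mean()                              # averaged over batch and channels
print(l2.shape, float(cost))
```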

I have tried a few other networks as well, but they learned only 2 channels. Why can't the network learn all the channels? Can you suggest some network configurations I could try for a setup with different feature dimensions at the two ends?