I am trying to train an encoder-decoder network that encodes features of one dimension and decodes them into features of a different dimension. More specifically, the input features have shape [500, 215, 7] (batch, frames, channels) and the output features have shape [500, 110, 7]. I have tried two approaches:
- Linear autoencoder
def encoder(x):
    with tf.name_scope('Encoder'):
        # tf.reshape, not np.reshape: x is a Tensor, not a numpy array
        x = tf.reshape(x, [-1, 215 * 7])
        net = tf.layers.dense(inputs=x, units=2000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=250, activation=tf.nn.relu)
        code = tf.layers.dense(inputs=net, units=125, activation=tf.nn.relu)
        return code

def decoder(code):
    with tf.name_scope('Decoder'):
        net = tf.layers.dense(inputs=code, units=125, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=500, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=1000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=2000, activation=tf.nn.relu)
        net = tf.layers.dense(inputs=net, units=770, activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 110, 7])
        return net
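The flattening arithmetic in the two functions can be sanity-checked outside TensorFlow. A minimal numpy sketch (the shapes come from the post; the small batch of random data is made up):

```python
import numpy as np

batch = 4  # small stand-in for the real batch of 500

# encoder input: [batch, 215, 7] flattened to [batch, 1505]
x = np.random.randn(batch, 215, 7)
flat_in = x.reshape(-1, 215 * 7)
print(flat_in.shape)   # (4, 1505)

# decoder output: a 770-unit dense layer reshaped back to [batch, 110, 7]
flat_out = np.random.randn(batch, 770)
y = flat_out.reshape(-1, 110, 7)
print(y.shape)         # (4, 110, 7)
```

So 215 * 7 = 1505 and 110 * 7 = 770; the dense stack itself places no constraint between the two ends.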
The training loss decreased from 20 to 12. After training, I fed in the [500, 215, 7] features and compared the autoencoder's output against the actual [500, 110, 7] features. The network is able to generate 3 channels very close to the real [500, 110, 7] features. Here is one data point:
Original:  [ 0.35101     2.6753289  -0.84253965  0.971104   -0.34277865 -0.4877893   0.011089  ]
Generated: [ 0.3437522   2.6829777   0.          0.9715183   0.          0.          0.        ]
But 4 of the channels are 0 in every data point.
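One thing worth noting about this pattern: three of the four zeroed channels coincide with channels whose target values are negative, and the final dense layer uses `tf.nn.relu`, which can only emit values >= 0. A quick numpy check (the sample vector is the "Original" data point above):

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

target = np.array([0.35101, 2.6753289, -0.84253965, 0.971104,
                   -0.34277865, -0.4877893, 0.011089])

# The best a ReLU output layer could ever do is clamp negatives to zero:
best = relu(target)
print(best)  # channels 3, 5 and 6 (0-indexed 2, 4, 5) become exactly 0
```

This does not explain the fourth zeroed channel (its target, 0.011089, is positive), but it shows the output activation alone rules out recovering the negative ones.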
- Convolutional autoencoder
def encoder(x):
    with tf.name_scope('Encoder'):
        net = tf.reshape(x, [-1, 215, 7, 1])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.conv2d(inputs=net, filters=64, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.layers.max_pooling2d(inputs=net, pool_size=[10, 1], strides=2)
        net = tf.layers.flatten(net)
        net = tf.layers.dense(inputs=net, units=1024, activation=tf.nn.relu)
        # note: `rate` in tf.layers.dropout is the fraction DROPPED, not kept
        net = tf.layers.dropout(inputs=net, rate=keep_prob)
        return net
def decoder(code):
    with tf.name_scope('Decoder'):
        net = tf.layers.dense(inputs=code, units=6528, activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 51, 2, 64])
        net = tf.image.resize_bilinear(net, size=[51, 4])
        net = tf.layers.conv2d(inputs=net, filters=32, kernel_size=[1, 3], padding="same", activation=tf.nn.relu)
        net = tf.image.resize_bilinear(net, size=[110, 7])
        net = tf.layers.conv2d(inputs=net, filters=1, kernel_size=[20, 5], padding="same", activation=tf.nn.relu)
        net = tf.reshape(net, [-1, 110, 7 * 1])
        return net
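It may help to trace the encoder's feature-map sizes. With `tf.layers.max_pooling2d` (default 'valid' padding), the output length along each dimension is floor((in - pool) / stride) + 1, while the 'same'-padded convolutions keep the spatial size. A small sketch of that arithmetic (pure Python, using the shapes in the code above):

```python
def pool_out(size, pool, stride):
    # 'valid' max-pooling output length along one dimension
    return (size - pool) // stride + 1

h, w = 215, 7                       # input treated as a 215x7x1 image
# conv 'same' keeps h, w; then pool_size=[10, 1], strides=2
h, w = pool_out(h, 10, 2), pool_out(w, 1, 2)
print(h, w)                         # 103 4
h, w = pool_out(h, 10, 2), pool_out(w, 1, 2)
print(h, w)                         # 47 2
print(h * w * 64)                   # 6016 units after flatten
```

Note that the decoder's dense layer of 6528 units (51 * 2 * 64) is an independent choice; it is not tied to the encoder's 6016-unit flatten, since the 1024-unit bottleneck sits between them.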
The loss decreased from 5.8 to 2.7. Here too the network learns 3 channels, but the other channels are 0.
Original:  [ 1.046344    2.77010455 -0.45842518  0.882744   -0.36058491 -0.37393818 -0.01158755]
Generated: [ 1.0245576   2.7641041   0.          0.86774087  0.          0.          0.        ]
I trained for 5000 epochs with this loss function:
# Define loss
with tf.name_scope('Loss'):
    l2 = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(output, Y)), 1))
    cost = tf.reduce_mean(l2)

# Define optimizer
with tf.name_scope('Optimizer'):
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
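For reference, the same loss written in numpy: `tf.reduce_sum(..., 1)` sums the squared errors over the frame axis, so `l2` holds one Euclidean norm per (batch, channel) pair before the mean. A minimal sketch with made-up arrays standing in for the [500, 110, 7] tensors:

```python
import numpy as np

rng = np.random.default_rng(0)
output = rng.normal(size=(5, 110, 7))  # small stand-ins for the real tensors
Y = rng.normal(size=(5, 110, 7))

# per-(batch, channel) L2 norm over the 110 frames, then the overall mean
l2 = np.sqrt(np.sum((output - Y) ** 2, axis=1))  # shape (5, 7)
cost = l2.mean()
print(l2.shape, cost)
```

So the averaging treats each channel's error as one norm over the frames; it does not weight channels differently.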
I have tried several networks, but they only learn 2 or 3 of the channels. Why can't the network learn all the channels? Can you suggest some network configurations to try for a network whose input and output features differ?