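How to get an autoencoder to work on a small image dataset

I have a dataset of three images. When I train an autoencoder on these three images, the output is exactly the same for every input, and it looks like a blend of all three images.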

My results are as follows:

Input image 1:
[image]

Output image 1:
[image]

Input image 2:
[image]

Output image 2:
[image]

Input image 3:
[image]

Output image 3:
[image]

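As you can see, the output is exactly the same for every input. It is a rough match for each of them, but it is far from perfect.

This is a three-image dataset, so the reconstruction should be perfect (or at least different for each image).

I am focusing on this three-image dataset because when I train on a 500-image dataset, all I get back is a blank white image, since that is the best average of all the images.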
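To illustrate the "best average" remark: if a network ends up producing one fixed output regardless of its input, the output that minimizes mean squared error is the pixel-wise mean of the training images. The sketch below checks this numerically. The random `images` array is a synthetic stand-in for the real dataset, and `constant_output_mse` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for x_train: 3 grayscale 24x32 images with values in [0, 1].
images = rng.random((3, 24, 32, 1))


def constant_output_mse(candidate, data):
    """MSE obtained when the model returns `candidate` for every image in `data`."""
    return float(np.mean((data - candidate) ** 2))


mean_image = images.mean(axis=0)

mse_mean = constant_output_mse(mean_image, images)            # always output the mean image
mse_first = constant_output_mse(images[0], images)            # always output image 1
mse_shifted = constant_output_mse(mean_image + 0.05, images)  # mean image, slightly brighter

print("always the mean image:   ", mse_mean)
print("always the first image:  ", mse_first)
print("mean image shifted by .05:", mse_shifted)
```

The mean image gives the lowest error of the three candidates, which is consistent with a collapsed model drifting toward a blend of its inputs.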

I am using Keras, and the code is very simple.

from keras.models import Sequential
from keras.layers import Dense, Flatten, Reshape
import numpy as np

# returns a numpy array with shape (3, 24, 32, 1)
# there are 3 images that are each 24x32 and are black and white (1 color channel)
x_train = get_data()

# this is the size of our encoded representations
# encode down to two numbers (I have tested using 3; I still have the same issue)
encoding_dim = 2
# the shape without the batch amount
input_shape = x_train.shape[1:]
# how many output neurons we need to create an image
input_dim = np.prod(input_shape)

# simple feedforward network
# I've also tried convolutional layers; same issue
autoencoder = Sequential([
              Flatten(), # flatten
              Dense(encoding_dim), # encode
              Dense(input_dim), # decode
              Reshape(input_shape) # reshape decoding
])

# adadelta optimizer works better than adam, same issue with both
autoencoder.compile(optimizer='adadelta', loss='mse')

# train it to output the same thing it gets as input
# I've tried epochs up to 30000 with no improvement;
# still predicts the same image for all three inputs
autoencoder.fit(x_train, x_train,
            epochs=10,
            batch_size=1,
            verbose=1)

out = autoencoder.predict(x_train)

I then take the outputs (out[0], out[1], out[2]) and convert them back into images. Those are the output images shown above.
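Comparing output images by eye can be misleading, so it may help to measure how far apart the predictions are. The sketch below is one possible way to do that. `pairwise_max_abs_diff` and `to_uint8_image` are hypothetical helpers, and the arrays are synthetic stand-ins for `out`; the sketch also assumes pixel values scaled to [0, 1].

```python
import numpy as np


def pairwise_max_abs_diff(out):
    """Return an (n, n) matrix of the largest per-pixel difference between outputs."""
    n = out.shape[0]
    flat = out.reshape(n, -1)
    return np.abs(flat[:, None, :] - flat[None, :, :]).max(axis=2)


def to_uint8_image(arr):
    """Convert one (H, W, 1) array with values in [0, 1] to an (H, W) uint8 image."""
    return (np.clip(arr, 0.0, 1.0) * 255).round().astype(np.uint8).squeeze(-1)


rng = np.random.default_rng(0)
# Collapsed case: the same image repeated three times, as in the question.
collapsed = np.tile(rng.random((1, 24, 32, 1)), (3, 1, 1, 1))
# Healthy case: three different images.
distinct = rng.random((3, 24, 32, 1))

print(pairwise_max_abs_diff(collapsed))  # all zeros when the outputs are identical
print(pairwise_max_abs_diff(distinct))   # non-zero off the diagonal
print(to_uint8_image(distinct[0]).shape)
```

If every off-diagonal entry is at or near zero, the model is ignoring its input.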

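This worries me because it suggests the autoencoder is not retaining any information about the input image, which is not how an encoder should behave.

How can I get the encoder to produce outputs that differ depending on the input image?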

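Edit:

A colleague suggested dropping the autoencoder and using a one-layer feedforward neural network instead. I tried this and the same thing happened, until I set the batch size to 1 and trained for 1400 epochs, at which point it worked perfectly. This makes me think that more epochs would solve the problem, but I am not sure yet.

Edit:

Training for 10,000 epochs (batch size 3) made the second image look different from the first and third on the autoencoder, which is exactly what happened with the non-autoencoder version after about 400 epochs (also batch size 3). This is further evidence that training for more epochs may be the solution.

Next I will test with batch size 1 to see whether that helps more, and then train for many epochs to see whether that fully solves the problem.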
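One thing to keep in mind when comparing these runs is that epochs and batch size together determine how many weight updates the optimizer performs. The sketch below makes that arithmetic explicit. `gradient_updates` is a hypothetical helper, and it assumes all of the runs used the same three-image dataset.

```python
import math


def gradient_updates(num_samples, batch_size, epochs):
    """Number of optimizer steps: batches per epoch times number of epochs."""
    return math.ceil(num_samples / batch_size) * epochs


# 3 images, batch size 1, 1400 epochs: three updates per epoch.
print(gradient_updates(3, 1, 1400))   # 4200
# 3 images, batch size 3, 10,000 epochs: one update per epoch.
print(gradient_updates(3, 3, 10000))  # 10000
# 3 images, batch size 3, 400 epochs.
print(gradient_updates(3, 3, 400))    # 400
```

With only three images, batch size 1 gives three times as many updates per epoch as batch size 3, so epoch counts from runs with different batch sizes are not directly comparable.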

1 Answer

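My encoding dimension was too small. Compressing a 24x32 image down to 2 numbers (or 3 numbers) was too much to ask of the autoencoder.

Raising encoding_dim to 32 pretty much solved the problem. I was able to use the default learning rate of the Adadelta optimizer. My data did not even need to be standardized (dividing every pixel by 255 was enough).

The "binary_crossentropy" loss function seems to train faster and give better results than "mse", although "mse" (mean squared error) works fine too.

During the first few hundred epochs it does look like a blend of the images. The longer it trains, though, the more the outputs separate.

I also gave the encoding layer a relu output activation and the decoding layer a sigmoid activation. I tested this configuration.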
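Putting the changes described above together, a minimal sketch might look like the following. It is not the exact code from the answer. `build_autoencoder` is a hypothetical helper, the random array stands in for `get_data()`, and `epochs=50` is an arbitrary small value chosen so the example finishes quickly. The explicit `learning_rate=1.0` is also an assumption: to my knowledge older Keras releases used 1.0 as the Adadelta default, while newer ones default to a much smaller value, so the plain string `'adadelta'` may train far more slowly today.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, Dense, Flatten, Reshape
from keras.optimizers import Adadelta


def build_autoencoder(input_shape, encoding_dim=32):
    """Dense autoencoder: flatten -> relu bottleneck -> sigmoid reconstruction."""
    input_dim = int(np.prod(input_shape))
    model = Sequential([
        Input(shape=input_shape),
        Flatten(),
        Dense(encoding_dim, activation="relu"),   # encoder
        Dense(input_dim, activation="sigmoid"),   # decoder, outputs in [0, 1]
        Reshape(input_shape),
    ])
    # Assumption: learning_rate=1.0 mirrors the older Keras default for Adadelta.
    model.compile(optimizer=Adadelta(learning_rate=1.0),
                  loss="binary_crossentropy")
    return model


# Synthetic stand-in for get_data(): 3 grayscale 24x32 images with 0..255 pixels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(3, 24, 32, 1))
x_train = raw.astype("float32") / 255.0  # scale pixels into [0, 1]

autoencoder = build_autoencoder(x_train.shape[1:], encoding_dim=32)
autoencoder.fit(x_train, x_train, epochs=50, batch_size=1, verbose=0)

out = autoencoder.predict(x_train, verbose=0)
print(out.shape)
```

The sigmoid decoder keeps predictions in [0, 1], which matches the scaled pixel range and is what makes binary cross-entropy a sensible loss here. Real data will likely need far more than 50 epochs before the outputs separate.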

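This page helped me understand what I was doing wrong. I simply copied and pasted its code and found that it worked on my dataset, so all that was left was to figure out what I had been doing wrong.

Here are some images from their simple autoencoder architecture running on my dataset (this was my first sign of hope):

500 epochs:

2000 epochs:

Answered on 2024-05-05T07:13:34+08:00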
