I want to build an end-to-end trainable model with the following characteristics:
- A CNN extracts features from an image
- The features are reshaped into a matrix
- Each row of that matrix is then fed to LSTM1
- Each column of that matrix is then fed to LSTM2
- The outputs of LSTM1 and LSTM2 are concatenated to form the final output
(It is more or less like Figure 2 in this paper: https://arxiv.org/pdf/1611.07890.pdf)
My question now is: after the reshape, how do I feed the values of the feature matrix to the LSTMs using Keras or TensorFlow?
This is my code so far, using the VGG16 network (also a link to the Keras issues):
# VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Reshape, LSTM

model = Sequential()
# block 1
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 2
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 3
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 4
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 5
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
# block 6
model.add(Flatten())
model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
# reshape the features: 4096 = 64 * 64
model.add(Reshape((64, 64)))
# How do I feed each row of this matrix to an LSTM?
# This was my first attempt, but it doesn't look correct:
# model.add(LSTM(256, input_shape=(64, 1)))  # 256 hidden units, sequence length = 64, feature dim = 1
1 Answer
Consider building your CNN model with Conv2D and MaxPooling2D layers up until you reach the Flatten layer, because the vectorized output of the Flatten layer is what you will feed into the LSTM part of the structure.
So, build your CNN model like this:
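A minimal sketch of such a CNN front-end (the layer sizes and pooling factors here are illustrative, and the trailing Dense(4096) is an assumption added so the output matches the 64 * 64 reshape from the question):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model_cnn = Sequential()
model_cnn.add(Conv2D(64, (3, 3), activation='relu', padding='same',
                     input_shape=(224, 224, 3)))
model_cnn.add(MaxPooling2D((4, 4)))   # 224 -> 56
model_cnn.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model_cnn.add(MaxPooling2D((4, 4)))   # 56 -> 14
model_cnn.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model_cnn.add(MaxPooling2D((2, 2)))   # 14 -> 7
model_cnn.add(Flatten())              # 7 * 7 * 128 = 6272 features
model_cnn.add(Dense(4096, activation='relu'))  # 4096 = 64 * 64
```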
Now, an interesting point: the current version of Keras is incompatible with some TensorFlow structures, which does not allow you to stack all the layers in a single Sequential object.
So it's time to use the Keras Model object to complete the neural network with a trick:
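A sketch of the LSTM part with the functional (Model) API: Reshape recovers the 64 x 64 feature matrix, LSTM1 reads its rows as timesteps, and Permute transposes the matrix so LSTM2 reads its columns. The 10-class softmax head is a hypothetical placeholder for your actual output layer.

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Reshape, Permute, LSTM, Concatenate, Dense

lstm_input = Input(shape=(4096,))                # vectorized CNN output, 4096 = 64 * 64
feature_matrix = Reshape((64, 64))(lstm_input)   # back to a 64 x 64 matrix

# LSTM1 over the 64 rows; LSTM2 over the 64 columns (Permute swaps the axes).
row_encoding = LSTM(256)(feature_matrix)
col_encoding = LSTM(256)(Permute((2, 1))(feature_matrix))

merged = Concatenate()([row_encoding, col_encoding])
output = Dense(10, activation='softmax')(merged)  # hypothetical head

model_lstm = Model(lstm_input, output)
```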
Finally, it's time to combine our two separate models:
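A sketch of the glue step, assuming the CNN part is named model_cnn and the LSTM part model_lstm; small stand-in models are defined here so the snippet runs on its own (substitute your real models; all shapes and the class count are illustrative). Calling the LSTM model on the CNN's symbolic output chains the two into one end-to-end trainable model.

```python
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Flatten,
                                     Dense, Reshape, Permute, LSTM, Concatenate)

# Stand-in CNN part ending in a 4096-feature vector.
model_cnn = Sequential([
    Conv2D(8, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)),
    MaxPooling2D((4, 4)),
    Flatten(),
    Dense(4096, activation='relu'),
])

# Stand-in LSTM part reading rows and columns of the 64 x 64 matrix.
vec = Input(shape=(4096,))
mat = Reshape((64, 64))(vec)
merged = Concatenate()([LSTM(32)(mat), LSTM(32)(Permute((2, 1))(mat))])
model_lstm = Model(vec, Dense(10, activation='softmax')(merged))

# Glue: feed the CNN's output tensor through the LSTM model.
model = Model(model_cnn.input, model_lstm(model_cnn.output))
model.compile(optimizer='adam', loss='categorical_crossentropy')
```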
OBS: Note that you can substitute my model_cnn example with your VGG without the top layers, since the vectorized output of the VGG's Flatten layer will be the input of the LSTM model.