I am trying to train a 2D convolutional LSTM to make categorical predictions based on video data. However, I seem to be running into a problem with my output layer:
"ValueError: Error when checking target: expected dense_1 to have 5 dimensions, but got array with shape (1, 1939, 9)"
My current model is based on the ConvLSTM2D example provided by the Keras team. I believe the error above is the result of my misunderstanding that example and its underlying principles.
Data
I have an arbitrary number of videos, each containing an arbitrary number of frames. Each frame is 135x240x1 (color channel last). This gives an input shape of (None, None, 135, 240, 1), where the two "None" values are the batch size and the number of timesteps, in that order. If I train on a single video of 1052 frames, my input shape becomes (1, 1052, 135, 240, 1).
For each frame, the model should predict a value between 0 and 1 for each of 9 classes. This means my output shape is (None, None, 9). If I train on a single video of 1052 frames, this shape becomes (1, 1052, 9).
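As a concrete illustration of the shapes described above (dummy zero-filled arrays standing in for real video data), a single-video batch can be sketched in NumPy:

```python
import numpy as np

# One video of 1052 grayscale frames, 135x240, channels last:
# (batch, time, height, width, channels)
frames = 1052
x = np.zeros((1, frames, 135, 240, 1), dtype=np.float32)

# One 9-way target vector per frame, values in [0, 1]:
# (batch, time, classes)
y = np.zeros((1, frames, 9), dtype=np.float32)

print(x.shape)  # (1, 1052, 135, 240, 1)
print(y.shape)  # (1, 1052, 9)
```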
Model
Layer (type)                 Output Shape                 Param #
=================================================================
conv_lst_m2d_1 (ConvLSTM2D)  (None, None, 135, 240, 40)  59200
_________________________________________________________________
batch_normalization_1 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D)  (None, None, 135, 240, 40)  115360
_________________________________________________________________
batch_normalization_2 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_3 (ConvLSTM2D)  (None, None, 135, 240, 40)  115360
_________________________________________________________________
batch_normalization_3 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
dense_1 (Dense)              (None, None, 135, 240, 9)   369
=================================================================
Total params: 290,769
Trainable params: 290,529
Non-trainable params: 240
Source code
from keras.models import Sequential
from keras.layers import BatchNormalization, ConvLSTM2D, Dense

model = Sequential()
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    input_shape=(None, 135, 240, 1),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(Dense(
    units=classes,  # classes == 9
    activation='softmax'))
model.compile(
    loss='categorical_crossentropy',
    optimizer='adadelta')
model.fit_generator(generator=training_sequence)
Traceback
Epoch 1/1
Traceback (most recent call last):
File ".\lstm.py", line 128, in <module>
main()
File ".\lstm.py", line 108, in main
model.fit_generator(generator=training_sequence)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\models.py", line 1253, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 2244, in fit_generator
class_weight=class_weight)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 1884, in train_on_batch
class_weight=class_weight)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 1487, in _standardize_user_data
exception_prefix='target')
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_1 to have 5 dimensions, but got array with shape (1, 1939, 9)
Printing a sample input shape with the batch size set to 1 gives (1, 1389, 135, 240, 1). This shape matches the requirements I described above, so I believe my Keras Sequence subclass ("training_sequence" in the source code) is correct.
I suspect the problem is caused by going straight from BatchNormalization() to Dense(). After all, the traceback indicates that the problem occurs at dense_1 (the final layer). However, I don't want to lead anyone astray with my beginner-level knowledge, so please take my assessment with a grain of salt.
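For context on why Keras expects a 5-dimensional target here: a Dense layer only transforms the last axis, so after a 5D ConvLSTM2D output it still emits a 5D tensor. The effect can be sketched in NumPy (placeholder weights, short dummy time axis):

```python
import numpy as np

# 5D feature map as emitted by the last ConvLSTM2D/BatchNormalization pair:
# (batch, time, height, width, filters)
features = np.zeros((1, 4, 135, 240, 40), dtype=np.float32)

# Dense(9) applies a (40, 9) weight matrix to the last axis only.
w = np.zeros((40, 9), dtype=np.float32)
out = features @ w  # matmul broadcasts over the leading axes

# Still 5D -- one 9-way prediction per *pixel* per frame, which is why
# Keras demands a 5D target instead of the (1, T, 9) array supplied.
print(out.shape)  # (1, 4, 135, 240, 9)
```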
Edit 3/27/2018
After reading this thread, which involves a similar model, I changed the final ConvLSTM2D layer so that its return_sequences parameter is set to False instead of True. I also added a GlobalAveragePooling2D layer before the Dense layer. The updated model is below:
Layer (type)                 Output Shape                 Param #
=================================================================
conv_lst_m2d_1 (ConvLSTM2D)  (None, None, 135, 240, 40)  59200
_________________________________________________________________
batch_normalization_1 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D)  (None, None, 135, 240, 40)  115360
_________________________________________________________________
batch_normalization_2 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_3 (ConvLSTM2D)  (None, 135, 240, 40)        115360
_________________________________________________________________
batch_normalization_3 (BatchNormalization)  (None, 135, 240, 40)  160
_________________________________________________________________
global_average_pooling2d_1 (GlobalAveragePooling2D)  (None, 40)  0
_________________________________________________________________
dense_1 (Dense)              (None, 9)                   369
=================================================================
Total params: 290,769
Trainable params: 290,529
Non-trainable params: 240
Here is a fresh copy of the traceback:
Traceback (most recent call last):
File ".\lstm.py", line 131, in <module>
main()
File ".\lstm.py", line 111, in main
model.fit_generator(generator=training_sequence)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\models.py", line 1253, in fit_generator
initial_epoch=initial_epoch)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 2244, in fit_generator
class_weight=class_weight)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 1884, in train_on_batch
class_weight=class_weight)
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 1487, in _standardize_user_data
exception_prefix='target')
File "C:\Users\matth\Anaconda3\envs\capstone-gpu\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (1, 1034, 9)
I printed the x and y shapes during this run. x was (1, 1034, 135, 240, 1) and y was (1, 1034, 9). This may narrow down the problem. It looks like the issue is related to the y data rather than the x data. Specifically, the Dense layer doesn't like the time dimension. However, I'm not sure how to correct this.
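To see why dense_1 now expects only two dimensions: with return_sequences=False, the last ConvLSTM2D emits a single feature map per video, GlobalAveragePooling2D collapses its spatial axes, and Dense(9) yields one prediction per video rather than per frame. A NumPy sketch of those shape transitions (dummy activations, placeholder weights):

```python
import numpy as np

# With return_sequences=False the last ConvLSTM2D emits one feature map
# per video: (batch, height, width, filters)
last_state = np.zeros((1, 135, 240, 40), dtype=np.float32)

# GlobalAveragePooling2D averages over height and width: -> (1, 40)
pooled = last_state.mean(axis=(1, 2))

# Dense(9) then produces one prediction per *video*: (1, 9),
# which cannot match the per-frame target shape (1, 1034, 9).
out = pooled @ np.zeros((40, 9), dtype=np.float32)
print(out.shape)  # (1, 9)
```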
Edit 3/28/2018
Yu-Yang's solution worked. For anyone with a similar problem who wants to see what the final model looks like, here is the summary:
Layer (type)                 Output Shape                 Param #
=================================================================
conv_lst_m2d_1 (ConvLSTM2D)  (None, None, 135, 240, 40)  59200
_________________________________________________________________
batch_normalization_1 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_2 (ConvLSTM2D)  (None, None, 135, 240, 40)  115360
_________________________________________________________________
batch_normalization_2 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
conv_lst_m2d_3 (ConvLSTM2D)  (None, None, 135, 240, 40)  115360
_________________________________________________________________
batch_normalization_3 (BatchNormalization)  (None, None, 135, 240, 40)  160
_________________________________________________________________
average_pooling3d_1 (AveragePooling3D)  (None, None, 1, 1, 40)  0
_________________________________________________________________
reshape_1 (Reshape)          (None, None, 40)             0
_________________________________________________________________
dense_1 (Dense)              (None, None, 9)              369
=================================================================
Total params: 290,769
Trainable params: 290,529
Non-trainable params: 240
And here is the source code:
from keras.models import Sequential
from keras.layers import (AveragePooling3D, BatchNormalization,
                          ConvLSTM2D, Dense, Reshape)

model = Sequential()
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    input_shape=(None, 135, 240, 1),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
    filters=40,
    kernel_size=(3, 3),
    padding='same',
    return_sequences=True))
model.add(BatchNormalization())
model.add(AveragePooling3D((1, 135, 240)))
model.add(Reshape((-1, 40)))
model.add(Dense(
    units=9,
    activation='sigmoid'))
model.compile(
    loss='categorical_crossentropy',
    optimizer='adadelta')
1 Answer
If you want per-frame predictions, then you definitely need to keep return_sequences=True in your last ConvLSTM2D layer.
As for the ValueError about the target shape, replace the GlobalAveragePooling2D() layer with AveragePooling3D((1, 135, 240)) plus Reshape((-1, 40)) so that the output shape is compatible with your target array.
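The effect of the AveragePooling3D((1, 135, 240)) + Reshape((-1, 40)) combination can be sanity-checked in NumPy (dummy activations, a short dummy time axis, placeholder Dense weights):

```python
import numpy as np

# 5D ConvLSTM2D output with return_sequences=True:
# (batch, time, height, width, filters)
features = np.zeros((1, 6, 135, 240, 40), dtype=np.float32)

# AveragePooling3D((1, 135, 240)) averages over the full spatial extent
# of each frame while leaving the time axis alone: -> (1, 6, 1, 1, 40)
pooled = features.mean(axis=(2, 3), keepdims=True)

# Reshape((-1, 40)) drops the two singleton spatial axes: -> (1, 6, 40)
reshaped = pooled.reshape(pooled.shape[0], -1, 40)

# Dense(9) on the last axis now yields (1, 6, 9), matching a (1, T, 9) target.
out = reshaped @ np.zeros((40, 9), dtype=np.float32)
print(out.shape)  # (1, 6, 9)
```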