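I am trying to use a bidirectional LSTM to classify text data (sentences) into several classes; I am using 3 of them as an example. I followed the multilabel-classification-post advice, i.e. "Use sigmoid for activation of your output layer" and "Use binary_crossentropy for loss function". I use an embedding layer (word vectors of size 300). My sentences are padded and truncated so that each one has 100 tokens. Here is the code for my model: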
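from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

# embedding_matrix, X_train, y_train, X_val and y_val are prepared earlier in the script.
model = Sequential()
# Vocabulary of 6695 words, 300-dimensional word vectors, sentences of 100 tokens.
embedding_layer = Embedding(6695,
                            300,
                            weights=[embedding_matrix],
                            input_length=100,
                            trainable=True)
model.add(embedding_layer)
# return_sequences=False: only the last output is kept, giving shape (None, 64).
model.add(Bidirectional(LSTM(32,
                             return_sequences=False)))
# One sigmoid unit per label for multilabel classification.
model.add(Dense(3,
                activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
print("model fitting - Bidirectional LSTM")
model.summary()
x = model.fit(X_train, y_train,
              batch_size=256,
              epochs=6,
              validation_data=(X_val, y_val),
              shuffle=True,
              verbose=1)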
Here is the model summary, which is what I expected: [image: model summary]
However, I get this error:
Traceback (most recent call last):
  File "/Users/master/Documents/Deep Learning/Learning Keras/reveiw_classification.py", line 159, in <module>
    verbose = 1
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training.py", line 955, in fit
    batch_size=batch_size)
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training.py", line 792, in _standardize_user_data
    exception_prefix='target')
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training_utils.py", line 136, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected dense_1 to have shape (3,) but got array with shape (100,)
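I don't need the LSTM to return the whole sequence of hidden-state outputs; I only need the last output. I set return_sequences=False in the LSTM, so there should be a single output per sentence, and a bidirectional LSTM with 32 units should then have output shape (None, 64), as the model summary shows. So why does the error say that dense_1 was expected to have shape (3,) but got an array with shape (100,)? Can anyone help me?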
1 Answer
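It looks like your target y_train actually contains the sentences rather than label vectors such as [1, 0, 1]. The (100,) in the error message matches the padded sentence length, which points in the same direction. The error has nothing to do with the model; it comes from the data you are passing in. y_train should be a 2D array of shape (num_samples, 3), so that each sample (sentence) has a target vector of 3 labels. X_train, in turn, should have shape (num_samples, 100), holding the length-100 sentences as your input.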
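To make the expected shapes concrete, here is a minimal sketch of how the targets could be built and checked before calling model.fit. It assumes each sentence comes with a list of label indices in the range 0 to 2. The helper names to_multi_hot and check_shapes, and the toy data, are made up for illustration and are not part of Keras or of your script.

```python
import numpy as np

NUM_CLASSES = 3   # matches Dense(3, activation='sigmoid')
MAX_LEN = 100     # matches input_length=100


def to_multi_hot(label_lists, num_classes=NUM_CLASSES):
    """Turn per-sentence lists of label indices into a (num_samples, num_classes) 0/1 array."""
    y = np.zeros((len(label_lists), num_classes), dtype="float32")
    for row, labels in enumerate(label_lists):
        for label in labels:
            y[row, label] = 1.0
    return y


def check_shapes(X, y, max_len=MAX_LEN, num_classes=NUM_CLASSES):
    """Raise ValueError early if inputs or targets do not have the shapes the model expects."""
    if X.ndim != 2 or X.shape[1] != max_len:
        raise ValueError("X should have shape (num_samples, %d), got %s" % (max_len, X.shape))
    if y.shape != (X.shape[0], num_classes):
        raise ValueError("y should have shape (%d, %d), got %s" % (X.shape[0], num_classes, y.shape))


# Hypothetical data: 4 padded sentences (token ids) and their label indices.
X_train = np.zeros((4, MAX_LEN), dtype="int32")
y_train = to_multi_hot([[0, 2], [1], [], [0, 1, 2]])

check_shapes(X_train, y_train)
print(X_train.shape, y_train.shape)  # (4, 100) (4, 3)
```

If check_shapes raises for your real y_train, the problem is in how the targets are built, not in the LSTM settings.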