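I am trying to design an LSTM network in Keras that combines word embeddings with other features in a binary classification setting. My test set contains 250 samples per class.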

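When I run my model with only the word embedding branch (the "model" branch in the code), I get an average F1 of about 0.67. When I add a second branch of fixed-size additional features that I compute separately ("branch2") and merge it with the word embeddings using "concat", the predictions all revert to a single class (giving perfect recall for that class), and the average F1 drops to 0.33.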
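The 0.33 figure is what the arithmetic gives when every test sample is assigned to one class. The sketch below works this through for the 250-per-class test set described above; the helper `f1` is a hypothetical name introduced only for illustration, and an undefined precision is treated as 0.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

n_per_class = 250  # test samples per class, as stated in the question

# Every test sample is predicted as class 1.
# Class 1: all 250 true members are found, but only half of the 500 predictions are right.
precision_1 = n_per_class / (2 * n_per_class)
recall_1 = 1.0
# Class 0: never predicted, so precision is undefined (treated as 0) and recall is 0.
precision_0, recall_0 = 0.0, 0.0

f1_scores = [f1(precision_0, recall_0), f1(precision_1, recall_1)]
macro_f1 = sum(f1_scores) / len(f1_scores)
print(round(f1_scores[1], 2), round(macro_f1, 2))  # 0.67 0.33
```

So a macro F1 of 0.33 on a balanced two-class test set is the signature of a constant prediction rather than of a slightly worse model.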

Am I adding the features, or doing the training/testing, incorrectly?

def create_model(embedding_index, sequence_features, optimizer='rmsprop'):
    # Branch 1: word embeddings
    model = Sequential()
    embedding_layer = create_embedding_matrix(embedding_index, word_index)
    model.add(embedding_layer)
    model.add(Convolution1D(nb_filter=32, filter_length=3, border_mode='same', activation='tanh'))
    model.add(MaxPooling1D(pool_length=2))
    model.add(Bidirectional(LSTM(100)))
    model.add(Dropout(0.2))
    model.add(Dense(2, activation='sigmoid'))

    # Branch 2: other features
    branch2 = Sequential()
    dim = sequence_features.shape[1]
    branch2.add(Dense(15, input_dim=dim, init='normal', activation='tanh'))
    branch2.add(BatchNormalization())

    # Merging branches to create final model
    final_model = Sequential()
    final_model.add(Merge([model,branch2], mode='concat'))
    final_model.add(Dense(2, init='normal', activation='sigmoid'))
    final_model.compile(loss='categorical_crossentropy', optimizer=optimizer,
           metrics=['accuracy','precision','recall','fbeta_score','fmeasure'])
    return final_model

def run(input_train, input_dev, input_test, text_col, label_col, resfile, embedding_index):
    # Processing text and features
    data_train, labels_train, data_test, labels_test = vectorize_text(input_train, input_test, text_col,label_col)
    x_train, y_train = data_train, labels_train
    x_test, y_test = data_test, labels_test
    seq_train = get_sequence_features(input_train).as_matrix()
    seq_test = get_sequence_features(input_test).as_matrix()

    # Generating model
    filepath = lstm_config.WEIGHTS_PATH
    checkpoint = ModelCheckpoint(filepath, monitor='val_fmeasure', verbose=1, save_best_only=True, mode='max')
    callbacks_list = [checkpoint]
    model = create_model(embedding_index, seq_train)
    model.fit([x_train, seq_train], y_train, validation_split=0.33, nb_epoch=3, batch_size=100, callbacks=callbacks_list, verbose=1)

    # Evaluating
    scores = model.evaluate([x_test, seq_test], y_test, verbose=1)
    time.sleep(0.2)
    preds = model.predict_classes([x_test, seq_test])
    preds = to_categorical(preds)
    print(metrics.f1_score(y_true=y_test, y_pred=preds, average="micro"))
    print(metrics.f1_score(y_true=y_test, y_pred=preds, average="macro"))
    print(metrics.classification_report(y_test, preds))
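
A side note on the `to_categorical(preds)` step above: Keras's `to_categorical` appears to infer the number of classes from the largest label present when no class count is passed. If that holds, a run that predicts only class 0 would yield a single-column matrix that no longer lines up with the two-column `y_test`. The pure-Python sketch below illustrates the effect; `one_hot` is a hypothetical stand-in, not the Keras function.

```python
def one_hot(class_ids, num_classes):
    """Return one row per sample with a 1 in the column of its class id."""
    rows = []
    for cid in class_ids:
        row = [0] * num_classes
        row[cid] = 1
        rows.append(row)
    return rows

all_zero_preds = [0, 0, 0]  # hypothetical run where only class 0 is predicted

# Width inferred from the data: the largest label is 0, so only one column.
inferred = one_hot(all_zero_preds, max(all_zero_preds) + 1)
# Width fixed to the known number of classes.
fixed = one_hot(all_zero_preds, 2)

print(len(inferred[0]), len(fixed[0]))  # 1 2
```

In the run shown here every prediction is class 1, so the inferred width happens to be 2 and the metrics still compute; passing the class count explicitly would simply remove the dependence on what the model predicts.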

Output:

Using Theano backend.
Found 2999999 word vectors.
Processing text dataset
Found 7165 unique tokens.
Shape of data tensor: (1996, 50)
Shape of label tensor: (1996, 2)
1996 train 500 test
Train on 1337 samples, validate on 659 samples
Epoch 1/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6767 - acc: 0.6669 - precision: 0.5557 - recall: 0.6815 - fbeta_score: 0.6120 - fmeasure: 0.6120
Epoch 00000: val_fmeasure im
1337/1337 [==============================] - 10s - loss: 0.6772 - acc: 0.6672 - precision: 0.5551 - recall: 0.6806 - fbeta_score: 0.6113 - fmeasure: 0.6113 - val_loss: 0.7442 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
Epoch 2/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6634 - acc: 0.7269 - precision: 0.5819 - recall: 0.7292 - fbeta_score: 0.6462 - fmeasure: 0.6462
Epoch 00001: val_fmeasure di
1337/1337 [==============================] - 9s - loss: 0.6634 - acc: 0.7263 - precision: 0.5830 - recall: 0.7300 - fbeta_score: 0.6472 - fmeasure: 0.6472 - val_loss: 0.7616 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
Epoch 3/3
1300/1337 [============================>.] - ETA: 0s - loss: 0.6542 - acc: 0.7354 - precision: 0.5879 - recall: 0.7308 - fbeta_score: 0.6508 - fmeasure: 0.6508
Epoch 00002: val_fmeasure di
1337/1337 [==============================] - 8s - loss: 0.6545 - acc: 0.7337 - precision: 0.5866 - recall: 0.7307 - fbeta_score: 0.6500 - fmeasure: 0.6500 - val_loss: 0.7801 - val_acc: 0.0000e+00 - val_precision: 0.0000e+00 - val_recall: 0.0000e+00 - val_fbeta_score: 0.0000e+00 - val_fmeasure: 0.0000e+00
500/500 [==============================] - 0s
500/500 [==============================] - 1s
0.5
/usr/local/lib/python3.4/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
0.333333333333
/usr/local/lib/python3.4/dist-packages/sklearn/metrics/classification.py:1074: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
             precision    recall  f1-score   support

          0       0.00      0.00      0.00       250
          1       0.50      1.00      0.67       250

avg / total       0.25      0.50      0.33       500
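
One possible reading of the all-zero `val_*` metrics in the log: Keras documents `validation_split` as taking the last fraction of the samples, before any shuffling. If the training rows happened to be ordered by label (an assumption, not something the post confirms), the 659 validation samples could all belong to one class. The sketch below mimics such a tail split in plain Python; `tail_split` and the 998/998 label layout are hypothetical.

```python
def tail_split(labels, validation_split):
    """Split like an unshuffled tail split: the last fraction becomes validation."""
    split_at = int(len(labels) * (1.0 - validation_split))
    return labels[:split_at], labels[split_at:]

# Hypothetical label order: all class-1 rows first, then all class-0 rows.
labels = [1] * 998 + [0] * 998  # 1996 training rows, matching the log

train, val = tail_split(labels, 0.33)
print(len(train), len(val))  # 1337 659, the same sizes the log reports
print(sorted(set(val)))      # [0]: the validation part holds a single class
```

Counting the labels in the last 659 rows of `y_train`, or shuffling the rows (text, extra features, and labels together) before calling `fit`, would confirm or rule this out.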