首页 文章

当使用keras训练多级nn时,丢失的原因可能不会更进一步

提问于
浏览
1

我正在使用keras训练一个多类神经网络(后端是张量流) . 我将在最终位置给出我的设置和一些代码 .

描述是:当我进行10个文件夹交叉验证时,训练损失和验证损失在最初的10-15个时期下降,但是在15个时期之后不能再进一步下降并且保持在约(损失:1.0606 - acc:0.6301- val_loss:1.1577 - val_acc:0.5774) .

我为我的设置尝试了几处更改 . 例如,添加隐藏图层,添加 normalization.BatchNormalization() ,将优化器从 adam 更改为 sgdrmsprop ,将损失函数从 categorical_crossentropy 更改为其他 . 但没有效果 .

我想讨论一下这种事情的可能原因是什么 . 如果这里有摘要文件或介绍,我会很高兴 .

我的数据有10000行 . 并且功能具有0/1的507属性 . 标签是多类的,类num = 7.类之间的 balancer 几乎没问题,因为我从更大的数据集中选择了10000个数据 .

我的模型如下:

model = Sequential()
model.add(Dense(500, activation='relu', input_dim=self.feature_dim, 
kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(100, activation='relu'))
model.add(Dense(self.label_dim, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

一些日志如下:

Running Fold 1/10
Train on 9534 samples, validate on 1060 samples
Epoch 1/100
1000/9534 [==>...........................] - ETA: 7s - loss: 6.9644 - acc: 0.1150
2000/9534 [=====>........................] - ETA: 3s - loss: 6.8357 - acc: 0.1715
3000/9534 [========>.....................] - ETA: 2s - loss: 6.7147 - acc: 0.2243
4000/9534 [===========>..................] - ETA: 1s - loss: 6.5922 - acc: 0.2683
5000/9534 [==============>...............] - ETA: 1s - loss: 6.4779 - acc: 0.2908
6000/9534 [=================>............] - ETA: 0s - loss: 6.3618 - acc: 0.3097
7000/9534 [=====================>........] - ETA: 0s - loss: 6.2513 - acc: 0.3244
8000/9534 [========================>.....] - ETA: 0s - loss: 6.1465 - acc: 0.3340
9000/9534 [===========================>..] - ETA: 0s - loss: 6.0439 - acc: 0.3411
9534/9534 [==============================] - 1s - loss: 5.9900 - acc: 0.3442 - val_loss: 4.8716 - val_acc: 0.4377
Epoch 2/100
1000/9534 [==>...........................] - ETA: 0s - loss: 4.8370 - acc: 0.4340
2000/9534 [=====>........................] - ETA: 0s - loss: 4.7593 - acc: 0.4415
3000/9534 [========>.....................] - ETA: 0s - loss: 4.6923 - acc: 0.4423
4000/9534 [===========>..................] - ETA: 0s - loss: 4.6176 - acc: 0.4557
5000/9534 [==============>...............] - ETA: 0s - loss: 4.5517 - acc: 0.4642
6000/9534 [=================>............] - ETA: 0s - loss: 4.4809 - acc: 0.4703
7000/9534 [=====================>........] - ETA: 0s - loss: 4.4036 - acc: 0.4804
8000/9534 [========================>.....] - ETA: 0s - loss: 4.3364 - acc: 0.4821
9000/9534 [===========================>..] - ETA: 0s - loss: 4.2652 - acc: 0.4901
9534/9534 [==============================] - 1s - loss: 4.2316 - acc: 0.4928 - val_loss: 3.5151 - val_acc: 0.5179
Epoch 3/100
1000/9534 [==>...........................] - ETA: 1s - loss: 3.4892 - acc: 0.5370
2000/9534 [=====>........................] - ETA: 1s - loss: 3.4573 - acc: 0.5395
3000/9534 [========>.....................] - ETA: 0s - loss: 3.4006 - acc: 0.5450
4000/9534 [===========>..................] - ETA: 0s - loss: 3.3430 - acc: 0.5435
5000/9534 [==============>...............] - ETA: 0s - loss: 3.2929 - acc: 0.5448
6000/9534 [=================>............] - ETA: 0s - loss: 3.2414 - acc: 0.5448
7000/9534 [=====================>........] - ETA: 0s - loss: 3.1959 - acc: 0.5446
8000/9534 [========================>.....] - ETA: 0s - loss: 3.1489 - acc: 0.5485
9000/9534 [===========================>..] - ETA: 0s - loss: 3.1021 - acc: 0.5501
9534/9534 [==============================] - 1s - loss: 3.0832 - acc: 0.5481 - val_loss: 2.6184 - val_acc: 0.5349
Epoch 4/100
1000/9534 [==>...........................] - ETA: 1s - loss: 2.5950 - acc: 0.5640
2000/9534 [=====>........................] - ETA: 1s - loss: 2.5570 - acc: 0.5705
3000/9534 [========>.....................] - ETA: 0s - loss: 2.5197 - acc: 0.5743
4000/9534 [===========>..................] - ETA: 0s - loss: 2.4929 - acc: 0.5650
5000/9534 [==============>...............] - ETA: 0s - loss: 2.4703 - acc: 0.5646
6000/9534 [=================>............] - ETA: 0s - loss: 2.4388 - acc: 0.5648
7000/9534 [=====================>........] - ETA: 0s - loss: 2.4054 - acc: 0.5680
8000/9534 [========================>.....] - ETA: 0s - loss: 2.3798 - acc: 0.5649
9000/9534 [===========================>..] - ETA: 0s - loss: 2.3522 - acc: 0.5662
9534/9534 [==============================] - 1s - loss: 2.3342 - acc: 0.5685 - val_loss: 2.0442 - val_acc: 0.5491
Epoch 5/100
1000/9534 [==>...........................] - ETA: 0s - loss: 2.0090 - acc: 0.5830
2000/9534 [=====>........................] - ETA: 0s - loss: 1.9990 - acc: 0.5865
3000/9534 [========>.....................] - ETA: 0s - loss: 1.9812 - acc: 0.5833
4000/9534 [===========>..................] - ETA: 0s - loss: 1.9558 - acc: 0.5835
5000/9534 [==============>...............] - ETA: 0s - loss: 1.9377 - acc: 0.5832
6000/9534 [=================>............] - ETA: 0s - loss: 1.9173 - acc: 0.5832
7000/9534 [=====================>........] - ETA: 0s - loss: 1.8968 - acc: 0.5850
8000/9534 [========================>.....] - ETA: 0s - loss: 1.8759 - acc: 0.5851
9000/9534 [===========================>..] - ETA: 0s - loss: 1.8582 - acc: 0.5846
9534/9534 [==============================] - 1s - loss: 1.8501 - acc: 0.5834 - val_loss: 1.6868 - val_acc: 0.5500
Epoch 6/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.6716 - acc: 0.5790
2000/9534 [=====>........................] - ETA: 0s - loss: 1.6387 - acc: 0.5910
3000/9534 [========>.....................] - ETA: 0s - loss: 1.6163 - acc: 0.5910
4000/9534 [===========>..................] - ETA: 0s - loss: 1.6130 - acc: 0.5882
5000/9534 [==============>...............] - ETA: 0s - loss: 1.5982 - acc: 0.5890
6000/9534 [=================>............] - ETA: 0s - loss: 1.5861 - acc: 0.5892
7000/9534 [=====================>........] - ETA: 0s - loss: 1.5724 - acc: 0.5914
8000/9534 [========================>.....] - ETA: 0s - loss: 1.5578 - acc: 0.5922
9000/9534 [===========================>..] - ETA: 0s - loss: 1.5492 - acc: 0.5904
9534/9534 [==============================] - 0s - loss: 1.5468 - acc: 0.5893 - val_loss: 1.4677 - val_acc: 0.5585
Epoch 7/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.4380 - acc: 0.5790
2000/9534 [=====>........................] - ETA: 0s - loss: 1.4332 - acc: 0.5900
3000/9534 [========>.....................] - ETA: 0s - loss: 1.4208 - acc: 0.5957
4000/9534 [===========>..................] - ETA: 0s - loss: 1.4073 - acc: 0.5985
5000/9534 [==============>...............] - ETA: 0s - loss: 1.4027 - acc: 0.5960
6000/9534 [=================>............] - ETA: 0s - loss: 1.3922 - acc: 0.5950
7000/9534 [=====================>........] - ETA: 0s - loss: 1.3842 - acc: 0.5951
8000/9534 [========================>.....] - ETA: 0s - loss: 1.3729 - acc: 0.5988
9000/9534 [===========================>..] - ETA: 0s - loss: 1.3611 - acc: 0.6012
9534/9534 [==============================] - 1s - loss: 1.3588 - acc: 0.6015 - val_loss: 1.3387 - val_acc: 0.5717
Epoch 8/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.3429 - acc: 0.5750
2000/9534 [=====>........................] - ETA: 0s - loss: 1.3071 - acc: 0.5980
3000/9534 [========>.....................] - ETA: 0s - loss: 1.2915 - acc: 0.6007
4000/9534 [===========>..................] - ETA: 0s - loss: 1.2834 - acc: 0.5977
5000/9534 [==============>...............] - ETA: 0s - loss: 1.2791 - acc: 0.6008
6000/9534 [=================>............] - ETA: 0s - loss: 1.2636 - acc: 0.6043
7000/9534 [=====================>........] - ETA: 0s - loss: 1.2521 - acc: 0.6049
8000/9534 [========================>.....] - ETA: 0s - loss: 1.2495 - acc: 0.6041
9000/9534 [===========================>..] - ETA: 0s - loss: 1.2506 - acc: 0.6031
9534/9534 [==============================] - 1s - loss: 1.2491 - acc: 0.6022 - val_loss: 1.2617 - val_acc: 0.5698
Epoch 9/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.1627 - acc: 0.6240
2000/9534 [=====>........................] - ETA: 0s - loss: 1.1709 - acc: 0.6235
3000/9534 [========>.....................] - ETA: 0s - loss: 1.2001 - acc: 0.6127
4000/9534 [===========>..................] - ETA: 0s - loss: 1.2000 - acc: 0.6098
5000/9534 [==============>...............] - ETA: 0s - loss: 1.2002 - acc: 0.6096
6000/9534 [=================>............] - ETA: 0s - loss: 1.1969 - acc: 0.6085
7000/9534 [=====================>........] - ETA: 0s - loss: 1.1894 - acc: 0.6117
9534/9534 [==============================] - 1s - loss: 1.1793 - acc: 0.6094 - val_loss: 1.2151 - val_acc: 0.5679
Epoch 10/100
1000/9534 [==>...........................] - ETA: 1s - loss: 1.1436 - acc: 0.6190
2000/9534 [=====>........................] - ETA: 0s - loss: 1.1369 - acc: 0.6260
3000/9534 [========>.....................] - ETA: 0s - loss: 1.1366 - acc: 0.6207
4000/9534 [===========>..................] - ETA: 0s - loss: 1.1293 - acc: 0.6210
5000/9534 [==============>...............] - ETA: 0s - loss: 1.1276 - acc: 0.6232
6000/9534 [=================>............] - ETA: 0s - loss: 1.1289 - acc: 0.6217
7000/9534 [=====================>........] - ETA: 0s - loss: 1.1321 - acc: 0.6180
8000/9534 [========================>.....] - ETA: 0s - loss: 1.1352 - acc: 0.6150
9000/9534 [===========================>..] - ETA: 0s - loss: 1.1341 - acc: 0.6141
9534/9534 [==============================] - 0s - loss: 1.1349 - acc: 0.6129 - val_loss: 1.1946 - val_acc: 0.5632
Epoch 11/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.1684 - acc: 0.5930
2000/9534 [=====>........................] - ETA: 0s - loss: 1.1338 - acc: 0.6075
3000/9534 [========>.....................] - ETA: 0s - loss: 1.1177 - acc: 0.6140
4000/9534 [===========>..................] - ETA: 0s - loss: 1.1293 - acc: 0.6075
5000/9534 [==============>...............] - ETA: 0s - loss: 1.1235 - acc: 0.6154
6000/9534 [=================>............] - ETA: 0s - loss: 1.1188 - acc: 0.6173
7000/9534 [=====================>........] - ETA: 0s - loss: 1.1147 - acc: 0.6179
8000/9534 [========================>.....] - ETA: 0s - loss: 1.1068 - acc: 0.6196
9000/9534 [===========================>..] - ETA: 0s - loss: 1.1090 - acc: 0.6190
9534/9534 [==============================] - 0s - loss: 1.1092 - acc: 0.6177 - val_loss: 1.1788 - val_acc: 0.5689
Epoch 12/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0702 - acc: 0.6280
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0742 - acc: 0.6280
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0821 - acc: 0.6237
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0868 - acc: 0.6233
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0807 - acc: 0.6258
6000/9534 [=================>............] - ETA: 0s - loss: 1.0884 - acc: 0.6208
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0905 - acc: 0.6187
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0895 - acc: 0.6205
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0899 - acc: 0.6200
9534/9534 [==============================] - 1s - loss: 1.0900 - acc: 0.6205 - val_loss: 1.1598 - val_acc: 0.5830
Epoch 13/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0730 - acc: 0.6340
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0649 - acc: 0.6445
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0600 - acc: 0.6430
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0718 - acc: 0.6350
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0821 - acc: 0.6280
6000/9534 [=================>............] - ETA: 0s - loss: 1.0779 - acc: 0.6295
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0713 - acc: 0.6316
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0737 - acc: 0.6289
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0767 - acc: 0.6261
9534/9534 [==============================] - 1s - loss: 1.0752 - acc: 0.6259 - val_loss: 1.1589 - val_acc: 0.5642
Epoch 14/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0148 - acc: 0.6520
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0395 - acc: 0.6430
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0503 - acc: 0.6377
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0521 - acc: 0.6382
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0529 - acc: 0.6388
6000/9534 [=================>............] - ETA: 0s - loss: 1.0519 - acc: 0.6392
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0561 - acc: 0.6359
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0547 - acc: 0.6332
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0591 - acc: 0.6313
9534/9534 [==============================] - 0s - loss: 1.0606 - acc: 0.6301 - val_loss: 1.1577 - val_acc: 0.5774
Epoch 15/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0513 - acc: 0.6410
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0635 - acc: 0.6245
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0500 - acc: 0.6280
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0530 - acc: 0.6257
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0585 - acc: 0.6232
6000/9534 [=================>............] - ETA: 0s - loss: 1.0562 - acc: 0.6233
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0507 - acc: 0.6267
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0540 - acc: 0.6267
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0513 - acc: 0.6286
9534/9534 [==============================] - 0s - loss: 1.0492 - acc: 0.6290 - val_loss: 1.1608 - val_acc: 0.5802
Epoch 16/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0553 - acc: 0.6300
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0582 - acc: 0.6305
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0341 - acc: 0.6407
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0312 - acc: 0.6398
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0454 - acc: 0.6324
6000/9534 [=================>............] - ETA: 0s - loss: 1.0438 - acc: 0.6332
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0445 - acc: 0.6323
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0426 - acc: 0.6331
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0439 - acc: 0.6323
9534/9534 [==============================] - 0s - loss: 1.0427 - acc: 0.6323 - val_loss: 1.1544 - val_acc: 0.5764
Epoch 17/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0633 - acc: 0.6190
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0407 - acc: 0.6300
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0417 - acc: 0.6343
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0322 - acc: 0.6402
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0283 - acc: 0.6426
6000/9534 [=================>............] - ETA: 0s - loss: 1.0355 - acc: 0.6400
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0361 - acc: 0.6413
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0336 - acc: 0.6392
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0309 - acc: 0.6394
9534/9534 [==============================] - 0s - loss: 1.0342 - acc: 0.6382 - val_loss: 1.1575 - val_acc: 0.5755
Epoch 18/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0289 - acc: 0.6510
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0233 - acc: 0.6505
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0176 - acc: 0.6507
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0194 - acc: 0.6500
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0242 - acc: 0.6442
6000/9534 [=================>............] - ETA: 0s - loss: 1.0239 - acc: 0.6423
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0249 - acc: 0.6413
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0264 - acc: 0.6404
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0277 - acc: 0.6406
9534/9534 [==============================] - 0s - loss: 1.0299 - acc: 0.6389 - val_loss: 1.1597 - val_acc: 0.5708
Epoch 19/100
1000/9534 [==>...........................] - ETA: 0s - loss: 1.0271 - acc: 0.6420
2000/9534 [=====>........................] - ETA: 0s - loss: 1.0114 - acc: 0.6445
3000/9534 [========>.....................] - ETA: 0s - loss: 1.0046 - acc: 0.6510
4000/9534 [===========>..................] - ETA: 0s - loss: 1.0137 - acc: 0.6453
5000/9534 [==============>...............] - ETA: 0s - loss: 1.0074 - acc: 0.6492
6000/9534 [=================>............] - ETA: 0s - loss: 1.0112 - acc: 0.6490
7000/9534 [=====================>........] - ETA: 0s - loss: 1.0072 - acc: 0.6504
8000/9534 [========================>.....] - ETA: 0s - loss: 1.0093 - acc: 0.6496
9000/9534 [===========================>..] - ETA: 0s - loss: 1.0137 - acc: 0.6452
9534/9534 [==============================] - 0s - loss: 1.0159 - acc: 0.6451 - val_loss: 1.1603 - val_acc: 0.5651

3 回答

  • 0

    交叉验证码如下:

    skf = KFold(n_splits=cross_validation, shuffle=True)
    for train_index, test_index in skf.split(X):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = Y[train_index], Y[test_index]
        model = None
        model = self.__create_model()
        model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(X_test, y_test))
    

    X和Y是两个矩阵形状(10000,507)和(10000,7)

  • 0

    您可以使用任何模型实现多少精度,这在很大程度上取决于您的数据集和标签的准确性 .

    使用训练有素的模型进行预测并进行混淆矩阵 . 看看假阳性和假阴性预测的具体例子 . 这些预测实际上是假的还是模型预测比标签更准确?这在项目中多次发生在我身上 .

    我建议你首先尝试训练你的模型,直到它过度拟合 . 从我所看到的,你的模型仍在学习或处于过度拟合的边缘 . 添加更多时期,直到验证丢失和/或准确性再次恶化 . 对于后续尝试,如果需要,应用正则化 . 从辍学开始可能是.1或.2到最大.5 .

    设置Tensorboard,以便您可以跟踪准确度和验证准确度之间的差异 .

    有这么多的分类变量:你确定你避免使用虚拟变量陷阱吗?您需要确保虚拟变量的数量比您的类别的数量少一个:http://www.algosome.com/articles/dummy-variable-trap-regression.html

  • 0

    非常感谢petezurich回复你 . 我不能同意你的意见 . 你是个好人 .

    我正在尝试另一个多标签分类项目 .

    数据是503二进制特征和64个二进制标签 . 我在输出层上使用sigmoid,在第一个隐藏层上使用binary_crossentrophy对丢失函数和正则化l2 . 隐藏层结构为500 * 100 . 所以整个网络就像503 * 500 * 100 * 64 . 我在这里发布我的数据:https://ufile.io/f4tvf

    当我在10个文件夹交叉验证中每个10000个时期时,我得到了性能分数:

    coverage error: 10.485887, 
    ranking average precision: 0.766574, 
    ranking loss: 0.045134.
    

    如果我为每个标签设置0.5阈值,我会得到基于标签的指标

    zero one loss: 0.848790, 
    hamming loss: 0.037109, 
    macro precision: 0.426696, 
    micro precision: 0.672705, 
    macro recall: 0.371033, 
    micro recall: 0.636571, 
    macro f1: 0.383845, 
    micro f1: 0.654140.
    

    对结果有好处吗?有人有兴趣培训吗?

相关问题