
Keras with Theano: loss decreases but accuracy does not change


Here is my code. I tried to build a VGG-style 11-layer network, mixing ReLU and ELU activations with many kernel and activity regularizers. The result is really confusing: the run is at its 10th epoch, and my loss on both train and test has decreased from 2000 to 1.5, but my accuracy on both train and test has stayed the same at 50%. Can someone explain this to me?

# VGG 11
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, Activation
from keras.regularizers import l2
from keras.layers.advanced_activations import ELU
from keras.optimizers import Adam

model = Sequential()

model.add(Conv2D(64, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          input_shape=(1, 96, 96), activation='relu'))
model.add(Conv2D(64, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001),activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(Conv2D(128, (3, 3), kernel_initializer='he_normal',     
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(256, (3, 3), kernel_initializer='he_normal',     
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(Conv2D(256, (3, 3), kernel_initializer='he_normal',     
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001), 
          activation='relu'))
model.add(Conv2D(512, (3, 3), kernel_initializer='he_normal', 
          kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.0001),     
          activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# flatten the convolutional feature maps so they can be fed to fully connected layers
model.add(Flatten())

model.add(Dense(2048, kernel_initializer='he_normal',
               kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.01)))
model.add(ELU(alpha=1.0))
model.add(Dropout(0.5))

model.add(Dense(1024, kernel_initializer='he_normal',
               kernel_regularizer=l2(0.0001), activity_regularizer=l2(0.01)))
model.add(ELU(alpha=1.0))
model.add(Dropout(0.5))

model.add(Dense(2))
model.add(Activation('softmax'))

adammo = Adam(lr=0.0008, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer=adammo, metrics=['accuracy'])
hist = model.fit(X_train, y_train, batch_size=48, epochs=20, verbose=1, validation_data=(X_val, y_val))

1 Answer


    This is not a defect; in fact, it is entirely possible!

    Categorical cross-entropy loss does not require accuracy to increase as the loss decreases. This is not a bug in Keras or Theano, but a network or data problem.
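    A minimal numeric sketch of why this can happen (plain Python, with made-up probability values): the loss can fall sharply while the predicted class, and therefore the accuracy, never changes.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for a single sample: -sum(t * log(p))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

y_true = [1.0, 0.0]  # the true class is class 0

# Early in training: a very confident *wrong* prediction -> huge loss.
early = [0.001, 0.999]
# Later in training: still predicts class 1, just less confidently.
later = [0.45, 0.55]

print(categorical_crossentropy(y_true, early))  # ~6.91
print(categorical_crossentropy(y_true, later))  # ~0.80

# The loss dropped by a factor of ~8, yet accuracy is 0% in both
# cases: argmax(early) == argmax(later) == class 1, which is wrong.
```

    Scaled up to a whole dataset, this is exactly a loss curve that plunges while the accuracy stays flat at chance level.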

    This network structure is probably too complex for what you are trying to do. You should remove some of the regularization, use only ReLU, use fewer layers, use the standard Adam optimizer, a larger batch size, etc. First try one of Keras' default models, such as VGG16.

    If you want to see that implementation and edit it into a VGG11 structure, here it is:

    from keras.models import Sequential
    from keras.layers import ZeroPadding2D, Convolution2D, MaxPooling2D, Flatten, Dense, Dropout

    def VGG_16(weights_path=None):
        model = Sequential()
        model.add(ZeroPadding2D((1,1),input_shape=(3,224,224)))
        model.add(Convolution2D(64, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(64, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2,2), strides=(2,2)))
    
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(128, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(128, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2,2), strides=(2,2)))
    
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(256, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(256, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(256, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2,2), strides=(2,2)))
    
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2,2), strides=(2,2)))
    
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(512, 3, 3, activation='relu'))
        model.add(MaxPooling2D((2,2), strides=(2,2)))
    
        model.add(Flatten())
        model.add(Dense(4096, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(4096, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(1000, activation='softmax'))
    
        if weights_path:
            model.load_weights(weights_path)
    
        return model
    

    You can see it is much simpler. It only uses dropout, no other regularization, a different convolution structure, etc. Modify it to fit your needs!
