
TensorFlow returns 10% validation accuracy for a VGG model (regardless of the number of epochs)?


I am trying to train a neural network on CIFAR-10 using the Keras package in TensorFlow. The network in question is VGG-16, which I borrowed directly from the official Keras model. The definition is:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

def cnn_model(nb_classes=10):
    # VGG-16 official keras model
    # Block 1
    img_input = Input(shape=(32, 32, 3))
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(vgg_layer)

    # Block 2
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv1')(vgg_layer)
    vgg_layer = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv2')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(vgg_layer)

    # Block 3
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv1')(vgg_layer)
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv2')(vgg_layer)
    vgg_layer = Conv2D(128, (3, 3), activation='relu', padding='same', name='block3_conv3')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(vgg_layer)

    # Block 4
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv1')(vgg_layer)
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv2')(vgg_layer)
    vgg_layer = Conv2D(256, (3, 3), activation='relu', padding='same', name='block4_conv3')(vgg_layer)
    vgg_layer = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(vgg_layer)

    # Classification block
    vgg_layer = Flatten(name='flatten')(vgg_layer)
    vgg_layer = Dense(1024, activation='relu', name='fc1')(vgg_layer)
    vgg_layer = Dense(1024, activation='relu', name='fc2')(vgg_layer)
    vgg_layer = Dense(nb_classes, activation='softmax', name='predictions')(vgg_layer)

    return Model(inputs=img_input, outputs=vgg_layer)

But during training, I always get an accuracy of 0.1, i.e. 10%:

validation accuracy for adv. training of model for epoch 1=  0.1
validation accuracy for adv. training of model for epoch 2=  0.1
validation accuracy for adv. training of model for epoch 3=  0.1
validation accuracy for adv. training of model for epoch 4=  0.1
validation accuracy for adv. training of model for epoch 5=  0.1
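That flat 10% is exactly what a model that collapses to predicting a single class would score on balanced 10-class data. A minimal sketch of the arithmetic, using randomly generated labels as a stand-in for the CIFAR-10 validation set:

```python
import numpy as np

rng = np.random.default_rng(0)

# CIFAR-10-style labels: 10 balanced classes, 10000 validation samples.
labels = rng.integers(0, 10, size=10000)

# A collapsed model that always predicts class 0, no matter the input.
preds = np.zeros_like(labels)

accuracy = np.mean(preds == labels)
print(round(accuracy, 2))  # ~0.1, matching the 10% seen above
```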

As a debugging step, whenever I substitute any other model (e.g. any simple CNN model), it works perfectly well. This suggests that the rest of the script works fine.

For instance, the following CNN model works very well and achieves 75% accuracy after 30 epochs:

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Dropout, Flatten, Dense)

def cnn_model(nb_classes=10, num_hidden=1024, weight_decay=0.0001, cap_factor=4):
    model = Sequential()
    input_shape = (32, 32, 3)
    model.add(Conv2D(32*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation='relu', padding='same', input_shape=input_shape))
    model.add(Conv2D(32*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(BatchNormalization())
    model.add(Dropout(0.25))

    model.add(Conv2D(64*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(Conv2D(64*cap_factor, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=keras.regularizers.l2(weight_decay), kernel_initializer="he_normal", activation="relu", padding="same"))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(BatchNormalization())
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(num_hidden, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes, activation='softmax'))
    return model

As far as I can tell, both models are correctly defined. Yet one works perfectly while the other fails to learn at all. I also tried writing the VGG model as a Sequential model, i.e. in the same style as the second one, but it still gave me 10% accuracy.

Even if the model were not updating any weights at all, the "he_normal" initializer alone should easily achieve better accuracy than pure chance. It appears that somehow TensorFlow is computing the model's output logits in a way that leaves the accuracy at exactly chance level.

I would be very grateful if someone could point out my mistake.

1 Answer

  • 2

    Your 10% corresponds to nb_classes = 10. That makes me think that, regardless of the training, your model always gives the same answer for every input, which out of 10 classes yields exactly 10% accuracy.

    • Check the output of the untrained model to see whether it is always the same

    • If so, check the model's initial weights; perhaps they are wrongly initialized, the gradients are zero, and it can't converge
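Both checks can be scripted. The sketch below builds a small stand-in model in the same functional style as the question (not the full VGG, purely to keep the example short; its layer sizes are illustrative assumptions) and inspects the untrained predictions and the initial weight statistics:

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

# Small stand-in network in the same style as the question's cnn_model().
img_input = Input(shape=(32, 32, 3))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(img_input)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)
x = Dense(10, activation='softmax')(x)
model = Model(inputs=img_input, outputs=x)

# 1) Output of the untrained model on a random batch:
batch = np.random.rand(8, 32, 32, 3).astype('float32')
probs = model.predict(batch, verbose=0)
print(np.argmax(probs, axis=1))   # does every input get the same class?
print(probs.min(), probs.max())   # are the probabilities saturated near 0/1?

# 2) Initial weight statistics, layer by layer:
for layer in model.layers:
    for w in layer.get_weights():
        print(layer.name, w.shape, float(w.mean()), float(w.std()))
```

A healthy untrained softmax output has rows that sum to 1 with no value pinned at 0 or 1; constant argmax plus saturated probabilities would point to a bad initialization or exploding logits.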
