Adding dropout layers between pretrained dense layers in Keras

In keras.applications there is a VGG16 model pretrained on ImageNet.

from keras.applications import VGG16
model = VGG16(weights='imagenet')

The model has the following structure.

Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 3, 224, 224)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 64, 224, 224)  1792        input_1[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 64, 224, 224)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 64, 112, 112)  0           block1_conv2[0][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 128, 112, 112) 73856       block1_pool[0][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 128, 112, 112) 147584      block2_conv1[0][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 128, 56, 56)   0           block2_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 256, 56, 56)   295168      block2_pool[0][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 256, 56, 56)   590080      block3_conv1[0][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 256, 56, 56)   590080      block3_conv2[0][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 256, 28, 28)   0           block3_conv3[0][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 512, 28, 28)   1180160     block3_pool[0][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 512, 28, 28)   2359808     block4_conv1[0][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 512, 28, 28)   2359808     block4_conv2[0][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 512, 14, 14)   0           block4_conv3[0][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 512, 14, 14)   2359808     block4_pool[0][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 512, 14, 14)   2359808     block5_conv1[0][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 512, 14, 14)   2359808     block5_conv2[0][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 512, 7, 7)     0           block5_conv3[0][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
____________________________________________________________________________________________________
predictions (Dense)              (None, 1000)          4097000     fc2[0][0]                        
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________

I would like to fine-tune this model by adding dropout layers between the dense layers (fc1, fc2 and predictions), while keeping all of the model's pretrained weights intact. I know that each layer can be accessed individually with model.layers, but I have not found anywhere how to add new layers between existing layers.
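
For reference, that per-layer access looks like this (a minimal sketch; the layer names are the ones shown in the summary above):

from keras.applications import VGG16

model = VGG16(weights='imagenet')

# Print every layer with its name and type
for layer in model.layers:
    print(layer.name, type(layer).__name__)

# Individual layers can be fetched by index or by name
fc1 = model.layers[-3]   # the Dense layer named 'fc1'
print(fc1.output_shape)  # (None, 4096)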

What is the best practice for doing this?

1 Answer

    I found the answer by using the Keras functional API:

    from keras.applications import VGG16
    from keras.layers import Dropout
    from keras.models import Model

    model = VGG16(weights='imagenet')

    # Store the fully connected layers (the pretrained weights stay
    # attached to these layer objects)
    fc1 = model.layers[-3]
    fc2 = model.layers[-2]
    predictions = model.layers[-1]

    # Create the dropout layers
    dropout1 = Dropout(0.85)
    dropout2 = Dropout(0.85)

    # Reconnect the layers: calling the existing Dense layers on the new
    # tensors reuses their pretrained weights
    x = dropout1(fc1.output)
    x = fc2(x)
    x = dropout2(x)
    predictors = predictions(x)

    # Create a new model
    # (in Keras 2 the keyword arguments are inputs= and outputs=)
    model2 = Model(input=model.input, output=predictors)
    

    model2 now has the dropout layers I wanted:

    ____________________________________________________________________________________________________
    Layer (type)                     Output Shape          Param #     Connected to                     
    ====================================================================================================
    input_1 (InputLayer)             (None, 3, 224, 224)   0                                            
    ____________________________________________________________________________________________________
    block1_conv1 (Convolution2D)     (None, 64, 224, 224)  1792        input_1[0][0]                    
    ____________________________________________________________________________________________________
    block1_conv2 (Convolution2D)     (None, 64, 224, 224)  36928       block1_conv1[0][0]               
    ____________________________________________________________________________________________________
    block1_pool (MaxPooling2D)       (None, 64, 112, 112)  0           block1_conv2[0][0]               
    ____________________________________________________________________________________________________
    block2_conv1 (Convolution2D)     (None, 128, 112, 112) 73856       block1_pool[0][0]                
    ____________________________________________________________________________________________________
    block2_conv2 (Convolution2D)     (None, 128, 112, 112) 147584      block2_conv1[0][0]               
    ____________________________________________________________________________________________________
    block2_pool (MaxPooling2D)       (None, 128, 56, 56)   0           block2_conv2[0][0]               
    ____________________________________________________________________________________________________
    block3_conv1 (Convolution2D)     (None, 256, 56, 56)   295168      block2_pool[0][0]                
    ____________________________________________________________________________________________________
    block3_conv2 (Convolution2D)     (None, 256, 56, 56)   590080      block3_conv1[0][0]               
    ____________________________________________________________________________________________________
    block3_conv3 (Convolution2D)     (None, 256, 56, 56)   590080      block3_conv2[0][0]               
    ____________________________________________________________________________________________________
    block3_pool (MaxPooling2D)       (None, 256, 28, 28)   0           block3_conv3[0][0]               
    ____________________________________________________________________________________________________
    block4_conv1 (Convolution2D)     (None, 512, 28, 28)   1180160     block3_pool[0][0]                
    ____________________________________________________________________________________________________
    block4_conv2 (Convolution2D)     (None, 512, 28, 28)   2359808     block4_conv1[0][0]               
    ____________________________________________________________________________________________________
    block4_conv3 (Convolution2D)     (None, 512, 28, 28)   2359808     block4_conv2[0][0]               
    ____________________________________________________________________________________________________
    block4_pool (MaxPooling2D)       (None, 512, 14, 14)   0           block4_conv3[0][0]               
    ____________________________________________________________________________________________________
    block5_conv1 (Convolution2D)     (None, 512, 14, 14)   2359808     block4_pool[0][0]                
    ____________________________________________________________________________________________________
    block5_conv2 (Convolution2D)     (None, 512, 14, 14)   2359808     block5_conv1[0][0]               
    ____________________________________________________________________________________________________
    block5_conv3 (Convolution2D)     (None, 512, 14, 14)   2359808     block5_conv2[0][0]               
    ____________________________________________________________________________________________________
    block5_pool (MaxPooling2D)       (None, 512, 7, 7)     0           block5_conv3[0][0]               
    ____________________________________________________________________________________________________
    flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
    ____________________________________________________________________________________________________
    fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
    ____________________________________________________________________________________________________
    dropout_1 (Dropout)              (None, 4096)          0           fc1[0][0]                        
    ____________________________________________________________________________________________________
    fc2 (Dense)                      (None, 4096)          16781312    dropout_1[0][0]                  
    ____________________________________________________________________________________________________
    dropout_2 (Dropout)              (None, 4096)          0           fc2[1][0]                        
    ____________________________________________________________________________________________________
    predictions (Dense)              (None, 1000)          4097000     dropout_2[0][0]                  
    ====================================================================================================
    Total params: 138,357,544
    Trainable params: 138,357,544
    Non-trainable params: 0
    ____________________________________________________________________________________________________
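
    A possible follow-up, sketched under the assumption that only the dense layers should be updated during fine-tuning (the frozen layers, the optimizer and the learning rate below are illustrative choices, not prescribed by this answer): mark everything except fc1, fc2 and predictions as non-trainable, then compile model2 as usual.

    from keras.optimizers import SGD

    # Freeze the convolutional blocks; only the dense layers stay trainable
    # (an assumption made for this sketch)
    for layer in model2.layers:
        layer.trainable = layer.name in ('fc1', 'fc2', 'predictions')

    # A small learning rate is a common choice for fine-tuning (illustrative value)
    model2.compile(optimizer=SGD(lr=1e-4, momentum=0.9),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])

    # model2.fit(x_train, y_train, ...)  # x_train / y_train are placeholders

    Note that model2 reuses the layer objects from model, so the pretrained weights are shared rather than copied.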
    
