
Removing and then inserting a new middle layer in a Keras model


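Given a predefined Keras model, I am trying to first load the pre-trained weights, then remove one to three of the model's internal (not the last few) layers and replace them with another layer.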

I can't seem to find any documentation on keras.io about doing this, or about removing layers from a predefined model at all.

The model I am using is a good ole VGG-16 network, which is instantiated in a function as shown below:

def model(self, output_shape):

    # Prepare image for input to model
    img_input = Input(shape=self._input_shape)

    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    # Classification block
    x = Flatten(name='flatten')(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dropout(0.5)(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    x = Dropout(0.5)(x)
    x = Dense(output_shape, activation='softmax', name='predictions')(x)

    inputs = img_input

    # Create model.
    model = Model(inputs, x, name=self._name)

    return model

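As an example, I would like to take the two Conv layers in Block 1 and replace them with a single Conv layer, after loading the original weights into all of the other layers.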

Any ideas?

2 Answers

  • 1

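    Assume you have a model vgg_model, initialized either by your function above or by keras.applications.VGG16(weights='imagenet'). You now need to insert a new layer in the middle in such a way that the weights of the other layers are preserved.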

    The idea is to disassemble the whole network into separate layers and then assemble it back. Here is the code specifically for your task:

    from keras import applications
    from keras.layers import Conv2D
    from keras.models import Model

    vgg_model = applications.VGG16(include_top=True, weights='imagenet')

    # Disassemble layers
    layers = [l for l in vgg_model.layers]

    # Define the new convolutional layer.
    # Important: the number of filters must match the layers it replaces (64)!
    # Note: the receptive field of two 3x3 convolutions is 5x5.
    new_conv = Conv2D(filters=64,
                      kernel_size=(5, 5),
                      name='new_conv',
                      padding='same')(layers[0].output)

    # Now stack everything back, skipping layers[1] and layers[2]
    # (block1_conv1 and block1_conv2).
    # Note: if you are going to fine-tune the model, do not forget to
    #       mark the other layers as un-trainable

    x = new_conv
    for i in range(3, len(layers)):
        layers[i].trainable = False
        x = layers[i](x)

    # Final touch
    result_model = Model(inputs=layers[0].input, outputs=x)
    result_model.summary()
    

    The output of the above code is:

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_50 (InputLayer)        (None, 224, 224, 3)       0         
    _________________________________________________________________
    new_conv (Conv2D)            (None, 224, 224, 64)      1792      
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
    _________________________________________________________________
    flatten (Flatten)            (None, 25088)             0         
    _________________________________________________________________
    fc1 (Dense)                  (None, 4096)              102764544 
    _________________________________________________________________
    fc2 (Dense)                  (None, 4096)              16781312  
    _________________________________________________________________
    predictions (Dense)          (None, 1000)              4097000   
    =================================================================
    Total params: 138,320,616
    Trainable params: 1,792
    Non-trainable params: 138,318,824
    _________________________________________________________________
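The parameter counts in this summary can be checked by hand. The sketch below uses the standard Conv2D weight count (kernel height x kernel width x input channels x filters, plus one bias per filter); the helper names `conv2d_params` and `stacked_receptive_field` are made up here for illustration and are not part of Keras. Note that the 1,792 reported for new_conv is what a 3x3 kernel on a 3-channel input gives; a 5x5 kernel, as written in the code, would give 4,864 instead.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Number of trainable weights in a standard 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def stacked_receptive_field(kernel_sizes):
    """Receptive field (per axis) of stride-1 convolutions applied in sequence."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


# The two layers that were removed (block1_conv1 and block1_conv2 in VGG-16).
removed = conv2d_params(3, 3, 3, 64) + conv2d_params(3, 3, 64, 64)

# Candidate replacements operating directly on the 3-channel RGB input.
new_3x3 = conv2d_params(3, 3, 3, 64)
new_5x5 = conv2d_params(5, 5, 3, 64)

# Frozen weights, taken from the "Non-trainable params" line of the summary.
frozen = 138_318_824

print(removed)                          # 38720
print(new_3x3, frozen + new_3x3)        # 1792 138320616
print(new_5x5, frozen + new_5x5)        # 4864 138323688
print(stacked_receptive_field([3, 3]))  # 5
```

Either kernel size keeps the output at 64 channels, which is the only thing block1_pool and the frozen layers after it depend on.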
    
  • 6

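    Another way to do it is to build a Sequential model. See the following example, where I swap ReLU activations for PReLU. You simply leave out the layers you don't want and add the new ones.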

    def convert_model_relu(model):
        from keras.layers import PReLU
        from keras.activations import linear as linear_activation
        from keras.models import Sequential
        # Assumes `model` is a plain linear stack of layers (no branches).
        new_model = Sequential()
        # Go through all layers; if one has a ReLU activation, replace it with PReLU
        for layer in tuple(model.layers):
            layer_type = type(layer).__name__
            activation_name = getattr(getattr(layer, 'activation', None), '__name__', None)
            if activation_name == 'relu':
                # Set the layer's own activation to linear and add a PReLU after it
                prelu_name = layer.name + "_prelu"
                # For Conv2D, share alpha across the spatial axes (one alpha per channel)
                prelu = (PReLU(shared_axes=[1, 2], name=prelu_name)
                         if layer_type == "Conv2D" else PReLU(name=prelu_name))
                # Note: this mutates the layer object, which the original model still uses
                layer.activation = linear_activation
                new_model.add(layer)
                new_model.add(prelu)
            else:
                new_model.add(layer)
        return new_model
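To see why the function sets each layer's activation to linear and then appends a PReLU, it helps to write the activation out: PReLU passes positive inputs through unchanged and scales negative inputs by a learned slope alpha. The plain-Python sketch below is only an illustration of that formula, not Keras code; `relu` and `prelu` are hypothetical helpers. Keras' PReLU layer defaults to `alpha_initializer='zeros'`, so a freshly converted model should start out computing the same function as the ReLU original and only diverge once alpha is trained.

```python
def relu(x):
    """Standard rectified linear unit."""
    return max(0.0, x)


def prelu(x, alpha):
    """Parametric ReLU: identity for positive inputs, slope `alpha` for negative ones."""
    return max(0.0, x) + alpha * min(0.0, x)


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

# With alpha = 0 (the default initial value in Keras), PReLU matches ReLU.
at_init = [prelu(x, 0.0) for x in samples]

# With a non-zero alpha (0.25 is an arbitrary illustrative value),
# negative inputs leak through instead of being clipped to zero.
after_training = [prelu(x, 0.25) for x in samples]

print(at_init)         # [0.0, 0.0, 0.0, 0.5, 2.0]
print(after_training)  # [-0.5, -0.125, 0.0, 0.5, 2.0]
```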
    
