How can a Keras Dense layer use two different initializers?

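I have two separately designed CNNs for two different features (image and text) of the same data, and the output has two classes.

In the last layer:

For the image network (ResNet), I want to use "he_normal" as the initializer: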

flatten1 = Flatten()(image_maxpool)
dense = Dense(units=2, kernel_initializer="he_normal")(flatten1)

But for the text CNN, I want to use "glorot_normal":

flatten2 = Flatten()(text_maxpool)
output = Dense(units=2, kernel_initializer="glorot_normal")(flatten2)

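flatten1 and flatten2 have these shapes:

flatten_1 (Flatten) (None, 512)

flatten_2 (Flatten) (None, 192)

Is there any way to concatenate these two flattened layers into one long dense layer of size 512 + 192 = 704, where the 512 image features and the 192 text features each get their own kernel_initializer, and which produces a 2-class output?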

Something like this:

merged_tensor = merge([flatten1, flatten2], mode='concat', concat_axis=1)
output = Dense(output_dim=2, 
    kernel_initializer for [:512]='he_normal',
    kernel_initializer for [512:]='glorot_normal')(merged_tensor)

Edit: I think I got this working with the following code (thanks to @Aechlys):

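import tensorflow as tf
from tensorflow.keras import initializers
from tensorflow.keras.layers import Dense, Flatten, concatenate

def my_init(shape, shape1, shape2, dtype=None):
    # he_normal for the image rows, glorot_normal for the text rows
    x = initializers.he_normal()(shape1, dtype=dtype)
    y = initializers.glorot_normal()(shape2, dtype=dtype)
    return tf.concat([x, y], 0)

class_num = 2

# image_maxpool and text_maxpool are the last pooling outputs of the two CNNs
flatten1 = Flatten()(image_maxpool)
flatten2 = Flatten()(text_maxpool)

merged_tensor = concatenate([flatten1, flatten2], axis=-1)

# The lambda also accepts dtype, since Keras may pass it to the initializer
output = Dense(class_num,
               kernel_initializer=lambda shape, dtype=None: my_init(
                   shape, shape1=(512, class_num), shape2=(192, class_num), dtype=dtype),
               activation='softmax')(merged_tensor)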

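I had to hard-code the sizes 512 and 192 because I could not get the sizes of flatten1 and flatten2 from code. Calling

flatten1.get_shape().as_list()

gives me [None, None], although it should be [None, 512]. Other than that, it seems to work.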

1 Answer

Oh, I had fun playing with this one. You have to create your own kernel initializer:

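import tensorflow as tf
from tensorflow import keras

def my_init(shape, dtype=None, *, shape1, shape2):
    # One block per input, stacked along axis 0 (the kernel's input dimension)
    x = keras.initializers.he_normal()(shape1, dtype=dtype)
    y = keras.initializers.glorot_normal()(shape2, dtype=dtype)
    return tf.concat([x, y], 0)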
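As a side note, it may help to see what scale each block ends up with. The sketch below uses only the documented formulas (he_normal: stddev = sqrt(2 / fan_in); glorot_normal: stddev = sqrt(2 / (fan_in + fan_out))) and the sizes from the question. The helper names he_normal_std and glorot_normal_std are made up for illustration and are not Keras functions.

```python
import math


def he_normal_std(fan_in):
    # Documented he_normal scale: sqrt(2 / fan_in)
    return math.sqrt(2.0 / fan_in)


def glorot_normal_std(fan_in, fan_out):
    # Documented glorot_normal scale: sqrt(2 / (fan_in + fan_out))
    return math.sqrt(2.0 / (fan_in + fan_out))


units = 2  # two output classes, as in the question

# Block-wise initialization: each block only sees its own fan-in.
image_std = he_normal_std(512)
text_std = glorot_normal_std(192, units)

# For comparison: a single initializer over all 704 concatenated inputs.
full_he_std = he_normal_std(704)
full_glorot_std = glorot_normal_std(704, units)

print(f"block-wise: image {image_std:.4f}, text {text_std:.4f}")
print(f"full layer: he {full_he_std:.4f}, glorot {full_glorot_std:.4f}")
```

Because each block is initialized from its own shape, both blocks get a larger scale than a single initializer applied to the full 704-input kernel would give. Whether that is desirable depends on the model, so treat it as something to be aware of rather than a recommendation.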

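You then call my_init through a lambda passed to Dense.

Unfortunately, as you can see, I have not yet been able to infer the shapes programmatically. I may update this answer when I do. However, if you know the shapes beforehand, you can pass them in as constants: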

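import tensorflow as tf
from tensorflow.keras.layers import Input, Concatenate, Dense

DENSE_UNITS = 64

input_t = Input((1, 25))
input_i = Input((1, 35))
input_a = Concatenate(axis=-1)([input_t, input_i])

# Recent Keras versions call the initializer with dtype as well, so the lambda accepts it
dense = Dense(DENSE_UNITS, kernel_initializer=lambda shape, dtype=None: my_init(
    shape, dtype,
    shape1=(int(input_t.shape[-1]), DENSE_UNITS),
    shape2=(int(input_i.shape[-1]), DENSE_UNITS)))(input_a)
tf.keras.Model(inputs=[input_t, input_i], outputs=dense)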

Out: <tensorflow.python.keras._impl.keras.engine.training.Model at 0x19ff7baac88>
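One possible way to avoid hard-coding both block shapes, sketched under the assumption of TensorFlow 2.x: Keras hands the full kernel shape (for the question's sizes, (704, 2)) to the initializer, so only the split point has to be supplied and the size of the second block follows from it. SplitInitializer and its split argument are hypothetical names introduced here, not part of Keras, and the 512/192 inputs stand in for the flattened CNN outputs.

```python
import tensorflow as tf


class SplitInitializer(tf.keras.initializers.Initializer):
    """Hypothetical helper: he_normal for the first `split` kernel rows,
    glorot_normal for the remaining rows."""

    def __init__(self, split):
        self.split = split

    def __call__(self, shape, dtype=None, **kwargs):
        # shape is the full Dense kernel shape: (total_inputs, units)
        total_inputs, units = int(shape[0]), int(shape[1])
        top = tf.keras.initializers.HeNormal()((self.split, units), dtype=dtype)
        bottom = tf.keras.initializers.GlorotNormal()(
            (total_inputs - self.split, units), dtype=dtype)
        return tf.concat([top, bottom], axis=0)

    def get_config(self):
        # Lets Keras serialize the initializer together with the layer
        return {"split": self.split}


# Stand-ins for the flattened image (512) and text (192) features
image_in = tf.keras.Input((512,))
text_in = tf.keras.Input((192,))
merged = tf.keras.layers.Concatenate(axis=-1)([image_in, text_in])

output = tf.keras.layers.Dense(
    2, activation="softmax",
    kernel_initializer=SplitInitializer(split=512))(merged)
model = tf.keras.Model([image_in, text_in], output)
```

The split point still has to be known when the layer is built, so this does not help if the first branch's flattened size is itself undefined, as with the [None, None] shape reported in the question.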

Answered 2024-05-08T03:02:47+08:00
