According to the CS231n course notes (http://cs231n.github.io/convolutional-networks/#conv), if an input x of shape [W, W] (where W = width = height) goes through a convolutional layer with a filter of shape [F, F] and stride S, the layer returns an output of shape [(W-F)/S + 1, (W-F)/S + 1].
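That formula can be checked with a tiny helper (a sketch in plain Python; the function name is mine, not from the course):

```python
def conv_output_size(W, F, S):
    """Spatial output size of a convolution with no padding:
    (W - F) / S + 1, using integer division."""
    return (W - F) // S + 1

# A 28x28 image through a 5x5 filter with stride 1:
print(conv_output_size(28, 5, 1))  # -> 24
```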
However, when I try to follow the TensorFlow tutorial (https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html), the function tf.nn.conv2d(inputs, filter, stride) seems to behave differently: no matter how I change my filter size, conv2d keeps returning an output with the same shape as the input.

In my case I am using the MNIST dataset, where each image has size [28, 28] (ignoring channel_num = 1). But after defining the first conv1 layer and inspecting its output with conv1.get_shape(), I get [28, 28, num_of_filters].

Why is that? I thought the return value should follow the formula above.
Appendix: code snippet

#reshape x from 2d to 4d
x_image = tf.reshape(x, [-1, 28, 28, 1]) #[num_samples, width, height, channel_num]

## define the shapes of weights and bias
w_shape = [5, 5, 1, 32] #patch_w, patch_h, in_channel, output_num (out_channel)
b_shape = [32] #bias only needs to be consistent with output_num

## init weights of the conv1 layer
W_conv1 = weight_variable(w_shape)
b_conv1 = bias_variable(b_shape)

## first layer: x_image -> conv1/relu -> pool1
#Our convolutions use a stride of one
#and are zero padded
#so that the output is the same size as the input
h_conv1 = tf.nn.relu(
    conv2d(x_image, W_conv1) + b_conv1
)
print 'conv1.shape=', h_conv1.get_shape()
## conv1.shape= (?, 28, 28, 32)
## I thought conv1.shape should be (?, 24, 24, 32), since (28-5)/1+1 = 24

h_pool1 = max_pool_2x2(h_conv1) #pools each of the 32 feature maps
print 'pool1.shape=', h_pool1.get_shape() ## pool1.shape= (?, 14, 14, 32)
2 Answers
It depends on the padding argument. 'SAME' keeps the output at W x W (assuming stride = 1), while 'VALID' shrinks the output to (W-F+1) x (W-F+1).
conv2d has a parameter named padding (see here). If you set padding to 'VALID', it will match your formula. Here it defaults to 'SAME', which pads the image with zeros (like adding a border) so that the output keeps the same shape as the input.
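The two padding modes reduce to simple shape rules (output = ceil(W/S) for 'SAME', ceil((W-F+1)/S) for 'VALID'); a small sketch with made-up helper names, applied to the question's numbers:

```python
import math

def same_out(W, S):
    # 'SAME': zero-padded so that output = ceil(W / S)
    return math.ceil(W / S)

def valid_out(W, F, S):
    # 'VALID': no padding, output = ceil((W - F + 1) / S)
    return math.ceil((W - F + 1) / S)

# MNIST case from the question: W=28, F=5, S=1
print(same_out(28, 1))      # -> 28, matching conv1.shape = (?, 28, 28, 32)
print(valid_out(28, 5, 1))  # -> 24, matching the CS231n formula
```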