TensorFlow VGG16 (converted from Caffe) gives low evaluation accuracy

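I did not convert the weights myself; I used vgg16_weights.npz from www(dot)cs(dot)toronto(dot)edu/~frossard/post/vgg16/. That page says:

"We convert the Caffe weights publicly available in the author's GitHub profile (gist(dot)github(dot)com/ksimonyan/211839e770f7b538e2d8#file-readme-md) using a specialized tool (github(dot)com/ethereon/caffe-tensorflow)."

However, that page has no validation code, so I based mine on the TensorFlow MNIST and Inception code.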

How I create TFRecords of Imagenet

I used build_imagenet_data.py from Inception. I changed

label_index = 0 #originally label_index = 1

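because Inception uses label_index 0 as the background class (so there are 1001 classes in total). The Caffe format does not use it, since the number of outputs is 1000. I prefer the TFRecord format because I plan to modify the weights and retrain.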
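As a minimal sketch of the label convention described above: with a reserved background class, real classes occupy 1..1000 out of 1001 labels, while a 1000-way Caffe output expects 0..999. The helper name `to_caffe_label` is hypothetical and not part of build_imagenet_data.py.

```python
NUM_CAFFE_CLASSES = 1000  # VGG-16 output size, as stated above


def to_caffe_label(inception_label):
    """Map a 1..1000 label (0 = background) to the 0..999 range."""
    if not 1 <= inception_label <= NUM_CAFFE_CLASSES:
        raise ValueError("label outside 1..1000: %d" % inception_label)
    return inception_label - 1


print(to_caffe_label(1), to_caffe_label(1000))
```

This only aligns the index range. Whether the synset order used when writing the TFRecords matches the class order the Caffe model was trained with is a separate thing to verify.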

How I load the weights

I modified the inference function taken from MNIST's mnist.py so that the variables are taken from vgg16_weights.npz.

Loading the weights:

weights = np.load('/the_path/vgg16_weights.npz')

How I put the variables into conv1_1:

with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.constant(weights['conv1_1_W']), name='weights')
    conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(weights['conv1_1_b']), name='biases')

    out = tf.nn.bias_add(conv, biases)
    conv1_1 = tf.nn.relu(out, name=scope)
    sess.run(conv1_1)

How I read the TFRecords

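I used Inception's image_processing.py, dataset.py, and ImagenetData.py without any changes. Then I ran the evaluate function from Inception's inception_eval.py, with the inference code replaced and the restoration of moving-average variables from the checkpoint removed (since I already restore the weights manually at variable initialization). However, the accuracy does not match that of VGG-16 in Caffe. The top-5 accuracy is about 9%.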

Closing

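What is wrong with this approach? There are a few parts of the code I still do not understand:

After processing one batch of images, how does the TFReader move on to the next batch? The output of Inception's image_processing.py contains only one batch worth of images. For completeness, this is the output according to the documentation:

images: Images. 4D tensor of size [batch_size, FLAGS.image_size, image_size, 3].
labels: 1-D integer Tensor of [FLAGS.batch_size].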

Do I need to apply softmax to the logits before tf.in_top_k? (Well, I don't think it matters, since the ordering of the values is the same.)
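The ordering argument can be checked in plain Python: softmax is strictly increasing in each logit, so the top-k indices of the logits and of the probabilities coincide. The helper names and the logit values below are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(values, k):
    """Indices of the k largest values, largest first."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:k]


logits = [2.0, -1.0, 0.5, 3.2, 0.0, 1.1]
probs = softmax(logits)

# Same ranking before and after softmax.
assert top_k_indices(logits, 3) == top_k_indices(probs, 3)
```

So a top-k membership test gives the same answer on raw logits as on softmax outputs, apart from ties.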

Thanks for your help. Sorry if the links are messy; I can only include 2 links per post because of my reputation.

UPDATE

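I tried converting the Caffe weights myself. I reversed the input-channel dimension of conv1_1 (Caffe takes BGR input, so the weights are in BGR order rather than RGB as in TensorFlow), and the result was the same as with the website's weights: about 9% top-5.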
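A small NumPy sketch of the channel reversal described above, assuming the TensorFlow kernel layout [height, width, in_channels, out_channels]. The random array is only a stand-in for the real conv1_1 weights.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for conv1_1 weights whose input channels are in BGR order.
kernel_bgr = rng.randn(3, 3, 3, 64)
# Reverse the input-channel axis so the kernel expects RGB input.
kernel_rgb = kernel_bgr[:, :, ::-1, :]

# One 3x3 RGB patch and the same patch with channels swapped to BGR.
patch_rgb = rng.rand(3, 3, 3)
patch_bgr = patch_rgb[:, :, ::-1]

# A convolution at one position is a dot product over (h, w, in_channels).
out_bgr = np.tensordot(patch_bgr, kernel_bgr, axes=([0, 1, 2], [0, 1, 2]))
out_rgb = np.tensordot(patch_rgb, kernel_rgb, axes=([0, 1, 2], [0, 1, 2]))

assert np.allclose(out_bgr, out_rgb)
```

Only one of the two changes should be applied: either flip the first-layer kernel or feed BGR images. Doing both cancels out. Whether a given .npz file already has the flip applied is something to check separately; this sketch cannot tell.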

I found that there is no mean image subtraction in TensorFlow Inception's image_processing.py. I added mean subtraction with tf.reduce_mean (in the eval_image function) and the accuracy became 11%.
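One thing worth comparing against here: the VGG release subtracts a fixed per-channel training-set mean from pixel values on the 0-255 scale, which is not the same as subtracting each image's own mean. The NumPy sketch below assumes the pipeline delivers RGB floats in [0, 1]. The function name is hypothetical, and the mean values are the ones commonly quoted for VGG (listed in BGR order in the model's README), so they should be checked against that README.

```python
import numpy as np

# Per-channel ImageNet mean commonly quoted for VGG, RGB order, 0-255 scale.
# Assumption: verify against the model's README before relying on it.
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def vgg_preprocess(image_unit_range):
    """image_unit_range: float array [H, W, 3], RGB, values in [0, 1]."""
    image = image_unit_range.astype(np.float32) * 255.0  # back to 0-255
    return image - VGG_MEAN_RGB  # broadcasts over height and width


gray = np.full((224, 224, 3), 0.5, dtype=np.float32)
out = vgg_preprocess(gray)
print(out[0, 0])
```

If the images reaching the network are scaled differently (for example to [-1, 1]), the rescaling step has to change accordingly.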

Then I tried changing the eval_image function to

# source: https://github.com/ethereon/caffe-tensorflow/blob/master/examples/imagenet/dataset.py
img_shape = tf.to_float(tf.shape(image)[:2])
min_length = tf.minimum(img_shape[0], img_shape[1])
new_shape = tf.to_int32((256 / min_length) * img_shape) #isotropic case

# new_shape = tf.pack([256,256]) #non isotropic case

image = tf.image.resize_images(image, [new_shape[0], new_shape[1]])

offset = tf.to_int32((new_shape - 224) / 2)
image = tf.slice(image, begin=tf.pack([offset[0], offset[1], 0]), size=tf.pack([224, 224, -1]))

# mean_subs_image = tf.reduce_mean(image,axis=[0,1],keep_dims=True)

return image - mean_subs_image

I got 13%. That is an improvement, but still far short. Preprocessing seems to be one of the problems, but I am not sure what the others are.
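The geometry of the isotropic resize and central crop in the snippet above can be worked through with plain integers. This is a sketch with a hypothetical helper name. It uses exact integer arithmetic, whereas the TensorFlow snippet computes the scale in floating point and then truncates.

```python
def resize_and_crop_geometry(height, width, short_side=256, crop=224):
    """Return the resized shape and the top-left offset of the central crop."""
    shortest = min(height, width)
    # Scale so the shorter side becomes `short_side`, keeping aspect ratio.
    new_h = short_side * height // shortest
    new_w = short_side * width // shortest
    # Centre the crop window inside the resized image.
    off_h = (new_h - crop) // 2
    off_w = (new_w - crop) // 2
    return (new_h, new_w), (off_h, off_w)


# A 375 x 500 (height x width) image.
print(resize_and_crop_geometry(375, 500))  # ((256, 341), (16, 58))
```

For a 375 x 500 image the shorter side becomes 256, the longer side 341, and the 224 x 224 window starts at row 16, column 58.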

1 Answer

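In general, porting an entire model's weights across libraries is difficult. You have pointed out some differences from Caffe, but there may be others. It may be easier to retrain the model in TensorFlow.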

Answered on 2024-04-20T09:58:23+08:00
