Why do I get different results for the L2 distance between a pair of images with Keras's Theano and TensorFlow backends?

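I've attached a script that extracts features from Keras's pre-trained ResNet50 convnet. You can feed it any pair of images, and it prints the L2 distance between the feature embeddings taken from a specific layer of the network (I chose 'activation_43').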

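My problem is that I get different results when switching between the two backends Keras can use for numerical computation: Theano and TensorFlow. As far as I can tell, I've accounted for their respective image axis conventions, but I must be overlooking something.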
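For reference, the two axis conventions differ only in where the channel axis sits. The following is a minimal NumPy sketch, not part of the script: the small random batch and the variable names are made up for illustration, and only the BGR mean values are taken from the script below. It checks that zero-centering gives the same numbers in either layout.

```python
import numpy as np

# Made-up batch in channels-last layout: (batch, height, width, channels).
# Stacking cv2 images produces this layout, with channels in BGR order.
rng = np.random.default_rng(0)
batch_tf = rng.uniform(0, 255, size=(2, 4, 4, 3))

# Theano ('th') convention: move channels to axis 1,
# giving (batch, channels, height, width).
batch_th = np.transpose(batch_tf, (0, 3, 1, 2))

# Per-channel mean subtraction, written once per layout using broadcasting.
bgr_mean = np.array([103.939, 116.779, 123.68])
centered_tf = batch_tf - bgr_mean                 # broadcasts over the last axis
centered_th = batch_th - bgr_mean[:, None, None]  # broadcasts over axis 1

# Moving the channel axis back shows both layouts hold the same values.
same = np.allclose(np.transpose(centered_th, (0, 2, 3, 1)), centered_tf)
print(batch_tf.shape, batch_th.shape, same)
```

If a check like this passes on the real batch, the preprocessing is unlikely to be the source of a backend discrepancy.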

Here is the code that prints the L2 distance for a pair of images:

'''

Compare the L2 distance between features extracted from 2 images. Which specific images we use doesn't matter --
what we're interested in comparing is the L2 distance between an image pair in the THEANO backend vs the TENSORFLOW
backend.

I pasted my personal results at the bottom of the script in comments.

usage: $python this_script.py image1.jpg image2.jpg

'''

import cv2
import numpy as np
import keras.backend as K
from keras.applications import ResNet50
from keras.models import Model
from sklearn.preprocessing import normalize
import sys

def preprocess_cv2(images, dim_ordering='default'):
    '''
    :param images: rank 4 tensor of concatenated cv2_images
                    note: channels will be ordered BGR by default
    :param dim_ordering: keras backend - either 'tf' or 'th'
                    note: if 'th', images must be (batch, channels, height, width)
                          if 'tf', images must be (batch, height, width, channels)
    :return: preprocessed batch of images
    '''
    images = images.astype(np.float64)
    if dim_ordering == 'default':
        dim_ordering = K.image_dim_ordering()
    assert dim_ordering in {'tf', 'th'}
    if dim_ordering == 'th':
        # need to transpose axes to make (batch, channels, height, width)
        print('Image batch arrived with shape: {}'.format(str(images.shape)))
        images = np.transpose(images, (0, 3, 1, 2))
        print('Image batch axes were transposed to shape: {} for THEANO dim-ordering convention'.format(str(images.shape)))
        # cv2 already delivers BGR, so the 'RGB'->'BGR' flip stays disabled:
        # x = x[:, ::-1, :, :]
        # Zero-center by mean pixel
        images[:, 0, :, :] -= 103.939
        images[:, 1, :, :] -= 116.779
        images[:, 2, :, :] -= 123.68
    else:
        # cv2 already delivers BGR, so the 'RGB'->'BGR' flip stays disabled:
        # x = x[:, :, :, ::-1]
        # Zero-center by mean pixel
        images[:, :, :, 0] -= 103.939
        images[:, :, :, 1] -= 116.779
        images[:, :, :, 2] -= 123.68
    return images

def extract_resnet_features(x, layer_name):
    net = ResNet50(include_top=False, weights='imagenet')
    model = Model(input=net.input, output=net.get_layer(layer_name).output)
    return model.predict(x)

def l2_distance(A, B):
    return np.linalg.norm(A - B)

########################################################################################################################

print('Using backend {}'.format(K.image_dim_ordering()))

layer = 'activation_43'

# img_path_1 = '/home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png'
# img_path_2 = '/home/hal9000/Pictures/joe_camel2.png'
img_path_1 = sys.argv[1]
img_path_2 = sys.argv[2]
im_1 = cv2.imread(img_path_1)
im_2 = cv2.imread(img_path_2)

# resize both images so that they have the same size (and the extracted features have the same dimension)
h, w, c = 320, 320, 3
im_1 = cv2.resize(im_1, (w, h), interpolation=cv2.INTER_CUBIC)
im_2 = cv2.resize(im_2, (w, h), interpolation=cv2.INTER_CUBIC)

# construct a batch
batch = np.zeros(shape=(2, h, w, c))
batch[0] = im_1
batch[1] = im_2

# preprocess the batch
x = preprocess_cv2(batch)

# forward pass the network and extract features
print('Computing features...')
features = extract_resnet_features(x, layer)
features_shape = features.shape
print('Finished computing features!')
feature_1 = features[0]
feature_2 = features[1]

# save the features for later use
# with open('features_{}_1.nparray'.format(K.image_dim_ordering()), 'wb') as f:
#     np.save(f, feature_1)
# with open('features_{}_2.nparray'.format(K.image_dim_ordering()), 'wb') as f:
#     np.save(f, feature_2)

# l2 normalize the features
normalized_feature_1 = normalize(feature_1.flatten(), norm='l2', axis=0)
normalized_feature_2 = normalize(feature_2.flatten(), norm='l2', axis=0)

distance = l2_distance(normalized_feature_1, normalized_feature_2)
print('Distance between features: {}'.format(distance))

Here are the results I get for Theano and TensorFlow, respectively:

# /usr/bin/python2.7 /home/hal9000/tf_vs_th/comparison.py /home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png /home/hal9000/Pictures/joe_camel2.png
# Using Theano backend.
# WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available  (error: Unable to get the number of gpus available: CUDA driver version is insufficient for CUDA runtime version)
# Using backend th
# Image batch arrived with shape: (2, 320, 320, 3)
# Image batch axes were transposed to shape: (2, 3, 320, 320) for THEANO dim-ordering convention
# Computing features...
# Finished computing features!
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
#   DeprecationWarning)
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
#   DeprecationWarning)
# Distance between features: 446.848968506
# 
# Process finished with exit code 0

# /usr/bin/python2.7 /home/hal9000/tf_vs_th/comparison.py /home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png /home/hal9000/Pictures/joe_camel2.png
# Using TensorFlow backend.
# Using backend tf
# Computing features...
# Finished computing features!
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
#   DeprecationWarning)
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
#   DeprecationWarning)
# Distance between features: 261.067047119
# 
# Process finished with exit code 0

1 Answer

It turns out the problem was in how I computed the L2-normalized version of the output features.

Instead of feature-wise normalization:
```
normalized_feature_1 = normalize(feature_1.flatten(), norm='l2', axis=0)
```
I should have used sample-wise normalization, like this:
```
normalized_feature_1 = normalize(feature_1.flatten(), norm='l2', axis=1)
```
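With this change, the TF and TH models produce the same distance measurement (up to differences in how the two libraries handle rounding).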
Answered on 2024-05-05T22:57:05+08:00
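To make the difference between the two `axis` settings concrete, here is a minimal sketch under stated assumptions. `l2_normalize` is a hypothetical NumPy stand-in for calling `sklearn.preprocessing.normalize` with `norm='l2'` on a 2D array, and the two short feature vectors are made up. Each vector is reshaped with `reshape(1, -1)` into a single sample, which is what the deprecation warning in the logs above recommends, since it says 1D input will raise a `ValueError` from scikit-learn 0.19.

```python
import numpy as np

def l2_normalize(X, axis):
    """Hypothetical stand-in for sklearn's normalize(X, norm='l2', axis=axis)."""
    norms = np.linalg.norm(X, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows/columns unchanged
    return X / norms

# Two made-up feature vectors, each treated as one sample (one row).
f1 = np.array([0.0, 3.0, 4.0, 0.0]).reshape(1, -1)
f2 = np.array([0.5, 0.0, 8.0, 0.0]).reshape(1, -1)

# axis=0 normalizes each column. With a single row, every column holds one
# value, so each nonzero entry is divided by its own magnitude.
col_1, col_2 = l2_normalize(f1, axis=0), l2_normalize(f2, axis=0)

# axis=1 normalizes each row, so the whole feature vector gets unit length.
row_1, row_2 = l2_normalize(f1, axis=1), l2_normalize(f2, axis=1)

print(col_1, col_2)
print(row_1, row_2)
print(np.linalg.norm(col_1 - col_2))  # distance after column-wise scaling
print(np.linalg.norm(row_1 - row_2))  # distance between unit vectors
```

With `axis=0` on a single row, magnitudes are discarded and only the pattern of zero and nonzero entries survives, so the distance becomes the square root of the number of mismatched positions. That seems consistent with the figures reported in the question: both 446.85 and 261.07 square to almost exactly an integer (about 199674 and 68156). With `axis=1` both vectors have unit length, so their distance can never exceed 2, which makes values in the hundreds a quick sign that the normalization did not do what was intended.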
