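I'm attaching a script that extracts features from Keras's pretrained ResNet50 convnet. You can feed it any pair of images, and it will print the L2 distance between their feature embeddings from a specific layer of the network (I chose 'activation_43').
My problem is that I get different results when switching between the two numerical backends available to Keras: Theano and TensorFlow. As far as I can tell, I have handled the respective image axis conventions, but I must be overlooking something.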
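For reference, the two axis conventions differ only in where the channel axis sits. The following is a minimal NumPy sketch of converting between them; the helper names `to_channels_first` and `to_channels_last` are hypothetical and not part of Keras or the script:

```python
import numpy as np


def to_channels_first(batch_nhwc):
    # (batch, height, width, channels) -> (batch, channels, height, width)
    return np.transpose(batch_nhwc, (0, 3, 1, 2))


def to_channels_last(batch_nchw):
    # (batch, channels, height, width) -> (batch, height, width, channels)
    return np.transpose(batch_nchw, (0, 2, 3, 1))


batch = np.zeros((2, 320, 320, 3))       # TensorFlow-style layout
th_batch = to_channels_first(batch)      # Theano-style layout
print(th_batch.shape)                    # (2, 3, 320, 320)
print(to_channels_last(th_batch).shape)  # (2, 320, 320, 3)
```

The same transposition would be needed on the extracted features if they were ever compared element by element across backends, since the channel axis of the output moves as well.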
Here is the code that prints the L2 distance for a pair of images:
'''
Compare the L2 distance between features extracted from 2 images. Which specific images we use doesn't matter --
what we're interested in comparing is the L2 distance between an image pair in the THEANO backend vs the TENSORFLOW
backend.
I pasted my personal results at the bottom of the script in comments.
usage: $python this_script.py image1.jpg image2.jpg
'''
import cv2
import numpy as np
import keras.backend as K
from keras.applications import ResNet50
from keras.models import Model
from sklearn.preprocessing import normalize
import sys
def preprocess_cv2(images, dim_ordering='default'):
    '''
    :param images: rank 4 tensor of concatenated cv2_images
        note: channels will be ordered BGR by default
    :param dim_ordering: keras backend - either 'tf' or 'th'
        note: if 'th', images must be (batch, channels, height, width)
              if 'tf', images must be (batch, height, width, channels)
    :return: preprocessed batch of images
    '''
    images = images.astype(np.float64)
    if dim_ordering == 'default':
        dim_ordering = K.image_dim_ordering()
    assert dim_ordering in {'tf', 'th'}
    if dim_ordering == 'th':
        # need to transpose axes to make (batch, channels, height, width)
        print('Image batch arrived with shape: {}'.format(str(images.shape)))
        images = np.transpose(images, (0, 3, 1, 2))
        print('Image batch axes were transposed to shape: {} for THEANO dim-ordering convention'.format(str(images.shape)))
        # 'RGB'->'BGR' (disabled: cv2 already loads images as BGR)
        # x = x[:, ::-1, :, :]
        # Zero-center by mean pixel
        images[:, 0, :, :] -= 103.939
        images[:, 1, :, :] -= 116.779
        images[:, 2, :, :] -= 123.68
    else:
        # 'RGB'->'BGR' (disabled: cv2 already loads images as BGR)
        # x = x[:, :, :, ::-1]
        # Zero-center by mean pixel
        images[:, :, :, 0] -= 103.939
        images[:, :, :, 1] -= 116.779
        images[:, :, :, 2] -= 123.68
    return images
def extract_resnet_features(x, layer_name):
    net = ResNet50(include_top=False, weights='imagenet')
    model = Model(input=net.input, output=net.get_layer(layer_name).output)
    return model.predict(x)
def l2_distance(A, B):
    return np.linalg.norm(A - B)
########################################################################################################################
print('Using backend {}'.format(K.image_dim_ordering()))
layer = 'activation_43'
# img_path_1 = '/home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png'
# img_path_2 = '/home/hal9000/Pictures/joe_camel2.png'
img_path_1 = sys.argv[1]
img_path_2 = sys.argv[2]
im_1 = cv2.imread(img_path_1)
im_2 = cv2.imread(img_path_2)
# resize both images so that they have the same size (so that the extracted features have the same dimension)
h, w, c = 320, 320, 3
im_1 = cv2.resize(im_1, (w, h), interpolation=cv2.INTER_CUBIC)
im_2 = cv2.resize(im_2, (w, h), interpolation=cv2.INTER_CUBIC)
# construct a batch
batch = np.zeros(shape=(2, h, w, c))
batch[0] = im_1
batch[1] = im_2
# preprocess the batch
x = preprocess_cv2(batch)
# forward pass the network and extract features
print('Computing features...')
features = extract_resnet_features(x, layer)
features_shape = features.shape
print('Finished computing features!')
feature_1 = features[0]
feature_2 = features[1]
# save the features for later use
# with open('features_{}_1.nparray'.format(K.image_dim_ordering()), 'wb') as f:
#     np.save(f, feature_1)
# with open('features_{}_2.nparray'.format(K.image_dim_ordering()), 'wb') as f:
#     np.save(f, feature_2)
# l2 normalize the features
normalized_feature_1 = normalize(feature_1.flatten(), norm='l2', axis=0)
normalized_feature_2 = normalize(feature_2.flatten(), norm='l2', axis=0)
distance = l2_distance(normalized_feature_1, normalized_feature_2)
print('Distance between features: {}'.format(distance))
Here are the results I got for Theano and TensorFlow, respectively:
# /usr/bin/python2.7 /home/hal9000/tf_vs_th/comparison.py /home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png /home/hal9000/Pictures/joe_camel2.png
# Using Theano backend.
# WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: CUDA driver version is insufficient for CUDA runtime version)
# Using backend th
# Image batch arrived with shape: (2, 320, 320, 3)
# Image batch axes were transposed to shape: (2, 3, 320, 320) for THEANO dim-ordering convention
# Computing features...
# Finished computing features!
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
# DeprecationWarning)
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
# DeprecationWarning)
# Distance between features: 446.848968506
#
# Process finished with exit code 0
# /usr/bin/python2.7 /home/hal9000/tf_vs_th/comparison.py /home/hal9000/Pictures/eeeeeeeeeeeeeeeeee.png /home/hal9000/Pictures/joe_camel2.png
# Using TensorFlow backend.
# Using backend tf
# Computing features...
# Finished computing features!
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
# DeprecationWarning)
# /usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
# DeprecationWarning)
# Distance between features: 261.067047119
#
# Process finished with exit code 0
1 Answer
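It turns out the problem was in how I was computing the L2-normalized version of the output features.
Instead of the feature-wise normalization used in the script above (`normalize(feature_1.flatten(), norm='l2', axis=0)`),
I should have been doing sample-wise normalization, that is, scaling each flattened feature vector as a whole to unit L2 norm.
With that change, the TF and TH models produce the same distance measurements (up to differences in how the two libraries handle rounding).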
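The difference between the two normalizations can be sketched with plain NumPy. This is a minimal illustration, not the exact code from the fix: the helper names `l2_normalize_featurewise` and `l2_normalize_samplewise` are hypothetical, and the feature-wise variant assumes the flattened 1-D vector ends up being treated as a single row (one sample with N features), which is what the deprecation warning in the logs hints at.

```python
import numpy as np


def l2_normalize_featurewise(feature):
    # Assumption: the flattened vector is treated as ONE row (1 sample, N features).
    row = np.asarray(feature, dtype=np.float64).reshape(1, -1)
    # axis=0 yields one norm per column; with a single row that is just |value|.
    norms = np.linalg.norm(row, axis=0)
    norms[norms == 0] = 1.0  # leave zero entries at zero
    return (row / norms).ravel()


def l2_normalize_samplewise(feature):
    row = np.asarray(feature, dtype=np.float64).reshape(1, -1)
    # axis=1 yields one norm for the whole row, so the vector keeps its direction.
    norms = np.linalg.norm(row, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave an all-zero vector unchanged
    return (row / norms).ravel()


feature = np.array([[3.0, 0.0], [4.0, 0.0]])
print(l2_normalize_featurewise(feature))  # every nonzero entry collapses to 1
print(l2_normalize_samplewise(feature))   # unit-length vector
```

With scikit-learn, the sample-wise version corresponds to `normalize(feature.reshape(1, -1), norm='l2', axis=1)`. Under the assumption above, the feature-wise version discards magnitudes entirely: the distance then depends only on which activations are zero or nonzero (and on their signs), which would make it very sensitive to tiny numerical differences between backends.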