How do I reshape a vector into TensorFlow filters?

I want to transfer some weights trained by another network into TensorFlow. The weights are stored in a single vector, like this:

[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]

Using numpy, I can reshape it into two 3-by-3 filters, like this:

1 2 3     10 11 12
4 5 6     13 14 15
7 8 9     16 17 18

So my filters have shape (1,2,3,3). In TensorFlow, however, the filter has shape (3,3,2,1):

tf_weights = tf.Variable(tf.random_normal([3,3,2,1]))

After I reshape the weights to the shape tf_weights expects, the values end up scrambled and I cannot get the expected convolution result.

Specifically, I wrote a convolution function for images and filters shaped [number, channel, size, size]. It gives the correct answer, but it is too slow:

import numpy as np

def convol(images, weights, biases, stride):
    """
    Args:
      images: input images or features, 4-D array shaped [n, c, h, w]
      weights: weights, 4-D array shaped [n, c, size, size]
      biases: biases, 1-D array
      stride: stride, an integer
    Returns:
      conv_features: convolved feature maps, shaped [n_images, n_weights, out_h, out_w]
    """
    image_num = images.shape[0]    # number of input images or feature maps
    channel = images.shape[1]      # channels per image
    weight_num = weights.shape[0]  # number of filters
    ksize = weights.shape[2]
    h = images.shape[2]
    w = images.shape[3]
    pad = ksize // 2
    # integer output size for symmetric padding of `pad` pixels per side
    out_h = (h + 2 * pad - ksize) // stride + 1
    out_w = out_h  # assumes square images

    conv_features = np.zeros([image_num, weight_num, out_h, out_w])
    for i in range(image_num):
        image = images[i]
        for j in range(weight_num):
            sum_convol_feature = np.zeros([out_h, out_w])
            for c in range(channel):
                # extract a single-channel image
                channel_image = image[c]
                # pad the image (im_pad is a helper that is not shown here)
                padded_image = im_pad(channel_image, pad)
                # unroll the image patches into rows (im2col is a helper that is not shown here)
                im_col = im2col(padded_image, ksize, stride)

                weight = weights[j, c]
                weight_col = np.reshape(weight, [-1])
                mul = np.dot(im_col, weight_col)
                convol_feature = np.reshape(mul, [out_h, out_w])
                sum_convol_feature = sum_convol_feature + convol_feature
            conv_features[i, j] = sum_convol_feature + biases[j]
    return conv_features
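
The output-size arithmetic used in `convol` can be checked on its own. The sketch below assumes symmetric zero padding of `ksize // 2` pixels per side and integer division; `conv_output_size` is a hypothetical helper name, not part of the original code.

```python
def conv_output_size(size, ksize, stride, pad):
    """Spatial output size of a strided convolution with symmetric zero padding."""
    return (size + 2 * pad - ksize) // stride + 1


# Numbers taken from the question: 224x224 input, 7x7 kernel, stride 2.
out = conv_output_size(224, ksize=7, stride=2, pad=7 // 2)
print(out)  # 112
```

With these numbers the result matches what `padding='SAME'` with stride 2 would produce for a 224-pixel side.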

By contrast, when I use TensorFlow's conv2d like this:

img = np.zeros([1,3,224,224])
img = img - 1
img = np.rollaxis(img, 1, 4)

weight_array = googleNet.layers[1].weights
weight_array = np.reshape(weight_array,[64,3,7,7])

biases_array = googleNet.layers[1].biases

tf_weight = tf.Variable(weight_array)

tf_img = tf.Variable(img)
tf_img = tf.cast(tf_img,tf.float32)

tf_biases = tf.Variable(biases_array)

conv_feature = tf.nn.bias_add(tf.nn.conv2d(tf_img,tf_weight,strides=[1,2,2,1],padding='SAME'),tf_biases)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
feature = sess.run(conv_feature)

the feature map I get is wrong.

2 Answers

  • 6

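    Don't use np.reshape to change the axis order. It keeps the values in their flat (row-major) order and only regroups them, so it will mess up the order of your values.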
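
A small sketch of the problem, using the 18-element vector from the question. It shows that reshaping straight to (3,3,2,1) does not put the first filter's values 1..9 into the first 3x3 slice; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(1, 19).reshape((1, 2, 3, 3))

# reshape keeps the flat order 1..18 and only regroups it,
# so the first "filter" slice picks up every second value.
wrong = a.reshape((3, 3, 2, 1))
print(wrong[:, :, 0, 0])
# [[ 1  3  5]
#  [ 7  9 11]
#  [13 15 17]]
```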

    Use np.rollaxis instead:

    >>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18])
    >>> a = a.reshape((1,2,3,3))
    >>> a
    array([[[[ 1,  2,  3],
             [ 4,  5,  6],
             [ 7,  8,  9]],
    
            [[10, 11, 12],
             [13, 14, 15],
             [16, 17, 18]]]])
    >>> b = np.rollaxis(a, 1, 4)
    >>> b.shape
    (1, 3, 3, 2)
    >>> b = np.rollaxis(b, 0, 4)
    >>> b.shape
    (3, 3, 2, 1)
    

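    Note that the order of the two axes of size 3 is unchanged. If I label them 3_1 and 3_2, the two rollaxis operations take the shape from (1, 2, 3_1, 3_2) -> (1, 3_1, 3_2, 2) -> (3_1, 3_2, 2, 1). Your final array looks like this: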

    >>> b
    array([[[[ 1],
             [10]],
    
            [[ 2],
             [11]],
    
            [[ 3],
             [12]]],
    
    
           [[[ 4],
             [13]],
    
            [[ 5],
             [14]],
    
            [[ 6],
             [15]]],
    
    
           [[[ 7],
             [16]],
    
            [[ 8],
             [17]],
    
            [[ 9],
             [18]]]])
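
The two rollaxis calls amount to a single axis permutation, so the same result can be sketched with np.transpose. `to_tf_filter` is a hypothetical helper name, and the second half assumes weights laid out as [out, in, height, width], like the [64,3,7,7] array in the question.

```python
import numpy as np


def to_tf_filter(weights_oihw):
    """Reorder [out, in, height, width] weights to [height, width, in, out]."""
    return np.transpose(weights_oihw, (2, 3, 1, 0))


a = np.arange(1, 19).reshape((1, 2, 3, 3))

# Same result as the two rollaxis calls shown above.
b_roll = np.rollaxis(np.rollaxis(a, 1, 4), 0, 4)
b = to_tf_filter(a)

print(b.shape)                    # (3, 3, 2, 1)
print(np.array_equal(b, b_roll))  # True
print(b[:, :, 0, 0])              # first 3x3 filter, values 1..9 in order

# The same reordering for weights shaped like the question's first layer.
w = np.zeros((64, 3, 7, 7))
print(to_tf_filter(w).shape)      # (7, 7, 3, 64)
```

A single transpose states the target layout in one place, which makes it easier to check against the [height, width, in_channels, out_channels] order that tf.nn.conv2d expects for its filter.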
    
  • 0

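    Sample tensor operations

    I don't know whether this helps. Consider the Reshape, Gather, Dynamic_partition and Split operations and adapt them to your needs. Below are illustrations of these operations that you can adapt to your case. I copied this from my git repo. I believe that if you run these examples in IPython, you can work out what you really want and gain better insight.

    Reshape, Gather, Dynamic_partition and Split

    The gather operation (tf.gather())

    Generate an array and test the gather operation. Note this quick-prototyping approach:

    • We generate an array in NumPy and test TensorFlow operations on it.

    Purpose: gather slices from params according to indices.

    indices must be an integer tensor of any dimension (usually 0-D or 1-D). This is best illustrated with an example: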

    array = np.array([[1,2,3],[4,9,6],[2,3,4],[7,8,0]])
    
    array.shape
    
    
    (4, 3)
    
    In [27]:
    
    gather_output0  = tf.gather(array,1)
    gather_output01  = tf.gather(array,2)
    gather_output02  = tf.gather(array,3)
    
    gather_output11  = tf.gather(array,[1,2])
    gather_output12  = tf.gather(array,[1,3])
    gather_output13  = tf.gather(array,[3,2])
    
    
    
    
    gather_output  = tf.gather(array,[1,0,2])
    gather_output1  = tf.gather(array,[1,1,2])
    gather_output2  = tf.gather(array,[1,2,1])
    
    In [28]:
    
    with tf.Session() as sess:
        print (gather_output0.eval());print("\n")
        print (gather_output01.eval());print("\n")
        print (gather_output02.eval());print("\n")  
        print (gather_output11.eval());print("\n")
        print (gather_output12.eval());print("\n")
        print (gather_output13.eval());print("\n")
    
        print (gather_output.eval());print("\n")
        print (gather_output1.eval());print("\n")
        print (gather_output2.eval());print("\n")
        #print (gather_output2.eval());print("\n")
    
    [4 9 6]
    
    
    [2 3 4]
    
    
    [7 8 0]
    
    
    [[4 9 6]
     [2 3 4]]
    
    
    [[4 9 6]
     [7 8 0]]
    
    
    [[7 8 0]
     [2 3 4]]
    
    
    [[4 9 6]
     [1 2 3]
     [2 3 4]]
    
    
    [[4 9 6]
     [4 9 6]
     [2 3 4]]
    
    
    [[4 9 6]
     [2 3 4]
     [4 9 6]]
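
For comparison, the same row selections can be sketched in plain NumPy, where integer-array indexing along axis 0 plays the role that tf.gather plays above. This is an analogy for quick checking, not a statement about how TensorFlow implements the op.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 9, 6], [2, 3, 4], [7, 8, 0]])

row = array[1]                                 # like tf.gather(array, 1)
rows = array[[1, 0, 2]]                        # like tf.gather(array, [1, 0, 2])
rows_take = np.take(array, [1, 0, 2], axis=0)  # explicit-axis form of the same selection

print(row)   # [4 9 6]
print(rows)
# [[4 9 6]
#  [1 2 3]
#  [2 3 4]]
```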
    

    Here is a simpler example:

    • Initialise a simple array

    • Test the gather operation

    In [11]:

    array_simple = np.array([1,2,3])
    
    In [15]:
    
    print("shape of simple array is: ", array_simple.shape)
    
    shape of simple array is:  (3,)
    
    In [57]:
    
    gather1  = tf.gather(array_simple,[0])
    gather01 = tf.gather(array_simple,[1])
    gather02 = tf.gather(array_simple,[2])
    
    gather2 = tf.gather(array_simple,[1,2])
    gather3 = tf.gather(array_simple,[0,1])
    
    with tf.Session() as sess:
        print (gather1.eval());print("\n")
        print (gather01.eval());print("\n")
        print (gather02.eval());print("\n")
        print (gather2.eval());print("\n")
        print (gather3.eval());print("\n")
    
    [1]
    
    
    [2]
    
    
    [3]
    
    
    [2 3]
    
    
    [1 2]
    
    
    tf.reshape( )
    
    Note:
    
    *  Use the same array that was initialised earlier
    *  Do reshape using tf.reshape( )
    
    In [64]:
    
    array.shape # Confirm array shape
    
    Out[64]:
    
    (4, 3)
    
    In [74]:
    
    print ("This is the array\n" ,array) # see the output and compare with the initial array,
    
    This is the array
     [[1 2 3]
     [4 9 6]
     [2 3 4]
     [7 8 0]]
    
    In [84]:
    
    reshape_ops= tf.reshape(array,[-1,4]) # Note the parameters in reshape
    reshape_ops1= tf.reshape(array,[-1,3]) # Note the parameters in reshape
    reshape_ops2= tf.reshape(array,[-1,6]) # Note the parameters in reshape
    
    reshape_ops_back1= tf.reshape(array,[6,-1]) # Note the parameters in reshape
    reshape_ops_back2= tf.reshape(array,[3,-1]) # Note the parameters in reshape
    reshape_ops_back3= tf.reshape(array,[4,-1]) # Note the parameters in reshape
    
    In [86]:
    
    with tf.Session() as sess:
        print(reshape_ops.eval());print("\n")
        print(reshape_ops1.eval());print("\n")
        print(reshape_ops2.eval());print("\n")
        print ("Output when we reverse the parameters:");print("\n")
        print(reshape_ops_back1.eval());print("\n")
        print(reshape_ops_back2.eval());print("\n")
        print(reshape_ops_back3.eval());print("\n")
    
    [[1 2 3 4]
     [9 6 2 3]
     [4 7 8 0]]
    
    
    [[1 2 3]
     [4 9 6]
     [2 3 4]
     [7 8 0]]
    
    
    [[1 2 3 4 9 6]
     [2 3 4 7 8 0]]
    
    
    Output when we reverse the parameters:
    
    
    [[1 2]
     [3 4]
     [9 6]
     [2 3]
     [4 7]
     [8 0]]
    
    
    [[1 2 3 4]
     [9 6 2 3]
     [4 7 8 0]]
    
    
    [[1 2 3]
     [4 9 6]
     [2 3 4]
     [7 8 0]]
    

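    Note: the input and the output must have the same number of elements, otherwise you get an error. A simple way to check this is to multiply the dimensions out and make sure the input's element count divides evenly by the dimensions given to reshape.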
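
The multiplication check described in the note can be sketched in plain Python. `resolve_reshape` is a hypothetical helper written for illustration; it only mimics the size rule, including a single -1 placeholder, and is not a TensorFlow function.

```python
from math import prod


def resolve_reshape(shape, new_shape):
    """Return new_shape with a single -1 filled in, or raise ValueError."""
    new_shape = list(new_shape)
    if new_shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    total = prod(shape)
    known = prod(d for d in new_shape if d != -1)
    if -1 in new_shape:
        # The unknown dimension must come out as a whole number.
        if known == 0 or total % known != 0:
            raise ValueError(f"cannot reshape {tuple(shape)} into {new_shape}")
        return [total // known if d == -1 else d for d in new_shape]
    if known != total:
        raise ValueError(f"cannot reshape {tuple(shape)} into {new_shape}")
    return new_shape


print(resolve_reshape((4, 3), [-1, 6]))  # [2, 6]
print(resolve_reshape((4, 3), [3, -1]))  # [3, 4]
```

A target such as [-1, 5] is rejected for the (4, 3) array, because 12 elements cannot be divided into rows of 5.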

    tf.dynamic_partition()

    This is declared as:
    
    tf.dynamic_partition(data, partitions, num_partitions, name=None)
    
    Note:
    
    * We declare num_partitions, the number of partitions.
    * We use the array initialised earlier.
    * We declare partitions as, for example, [0, 0, 1, 1]: rows labelled 0 fall into one partition and rows labelled 1 into the other, given num_partitions=2.
    
    * The output is a list.
    
    In [96]:
    
        print ("This is the array\n" ,array) # This is the input array
    
        This is the array
         [[1 2 3]
         [4 9 6]
         [2 3 4]
         [7 8 0]]
    
        We show how to make two and three partitions below
        In [123]:
    
        num_partitions = 2
        num_partitions1 = 3
    
        partitions = [0, 0, 1, 1]
        partitions1 = [0 ,1 ,1, 2 ]
    
        In [119]:
    
        dynamic_ops =tf.dynamic_partition(array, partitions, num_partitions, name=None) # 2 partitions
        dynamic_ops1 =tf.dynamic_partition(array, partitions1, num_partitions1, name=None) # 3 partitions
    
        In [125]:
    
        with tf.Session() as sess:
            run = sess.run(dynamic_ops)
            run1 = sess.run(dynamic_ops1)
            print("Output for 2 partitions: ")
            print (run[0]);print("\n")
            print(run[1]) ;print("\n")# Compare result with initial array. Out is list
            print("Output for three partitions: ")
    
            print (run1[0]);print("\n")
            print (run1[1]);print("\n")
            print (run1[2]);print("\n")
    
        Output for 2 partitions: 
        [[1 2 3]
         [4 9 6]]
    
    
        [[2 3 4]
         [7 8 0]]
    
    
        Output for three partitions: 
        [[1 2 3]]
    
    
        [[4 9 6]
         [2 3 4]]
    
    
        [[7 8 0]]
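
The grouping rule can be mimicked in NumPy with boolean masks, which makes it easy to check by hand which rows land in which partition. `partition_rows` is a hypothetical helper for illustration and only covers the case shown here, one label per row.

```python
import numpy as np


def partition_rows(data, partitions, num_partitions):
    """Group rows of `data` by their partition label, one list entry per label."""
    data = np.asarray(data)
    labels = np.asarray(partitions)
    return [data[labels == k] for k in range(num_partitions)]


array = np.array([[1, 2, 3], [4, 9, 6], [2, 3, 4], [7, 8, 0]])
parts = partition_rows(array, [0, 1, 1, 2], 3)

for part in parts:
    print(part)
# [[1 2 3]]
# [[4 9 6]
#  [2 3 4]]
# [[7 8 0]]
```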
    

    tf.split()

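    Make sure you are using a recent TensorFlow version; in older versions this implementation raises an error.

    The documentation specifies it as follows:

    tf.split(value, num_or_size_splits, axis=0, num=None, name='split')

    It splits a tensor into sub-tensors. This is best illustrated with an example: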

    * We define a (5,30) array in NumPy.
    * We split the array along axis 1.
    * We specify the split sizes as a 1-D tensor along axis 1, so we get 3 splits.
    
    Specify an array
    
        Create a (5 by 30) NumPy array. The syntax using NumPy is shown below
        In [2]:
    
        ArrayBeforeSplitting = np.arange(150).reshape(5,30) 
        print ("Array shape without split operation is : " ,ArrayBeforeSplitting.shape)
    
        ('Array shape without split operation is : ', (5, 30))
    
        Specify the split sizes
        In [3]:
    
        split_1D = tf.Variable([8,13,9])
        print("specify number of partitions using 1-Dimen Variable:" , tf.shape(split_1D))
    
        ('specify number of partitions using 1-Dimen Variable:', <tf.Tensor 'Shape:0' shape=(1,) dtype=int32>)
    
        Use tf.split
    
        Make 3 splits along axis 1 so that we get pieces of shape (5,8), (5,13) and (5,9). Axis 1 has 30 elements, so the split sizes along that axis must add up to 30; otherwise tf.split raises an error.
        In [6]:
    
        split1,split2,split3 = tf.split(ArrayBeforeSplitting,split_1D,1)
        # we have 3 splits along axis 1, with sizes given by split_1D.
        # That is, axis 1 (with 30 elements) is split into pieces of 8, 13 and 9 elements
        # while axis 0 stays unchanged
    
        In [7]:
    
        # Initialise global variables, because split_1D is a variable and needs to be initialised before being
        # used in a computational graph
        init_op = tf.global_variables_initializer()
    
        In [16]:
    
        with tf.Session() as sess:
            sess.run(init_op) # run variable initialisation.
            result=split1.eval();print("\n")
            print(result)
            print("the shape of the first split operation is : ",result.shape)
            result2=split2.eval();print("\n")
            print(result2)
            print("the shape of the second split operation is : ",result2.shape)
    
            result3=split3.eval();print("\n")
            print(result3)
            print("the shape of the third split operation is : ",result3.shape)
    
    
        [[  0   1   2   3   4   5   6   7]
         [ 30  31  32  33  34  35  36  37]
         [ 60  61  62  63  64  65  66  67]
         [ 90  91  92  93  94  95  96  97]
         [120 121 122 123 124 125 126 127]]
        ('the shape of the first split operation is : ', (5, 8))
    
    
        [[  8   9  10  11  12  13  14  15  16  17  18  19  20]
         [ 38  39  40  41  42  43  44  45  46  47  48  49  50]
         [ 68  69  70  71  72  73  74  75  76  77  78  79  80]
         [ 98  99 100 101 102 103 104 105 106 107 108 109 110]
         [128 129 130 131 132 133 134 135 136 137 138 139 140]]
        ('the shape of the second split operation is : ', (5, 13))
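
The same 8/13/9 split can be sketched in NumPy to check the shapes without a session. Note one difference, stated here as an assumption of this sketch: np.split takes cut positions rather than piece sizes, so the sizes are converted with a cumulative sum.

```python
import numpy as np

arr = np.arange(150).reshape(5, 30)
sizes = [8, 13, 9]  # must add up to arr.shape[1], which is 30

# Convert piece sizes to cut positions: [8, 21].
cuts = np.cumsum(sizes)[:-1]
split1, split2, split3 = np.split(arr, cuts, axis=1)

print(split1.shape, split2.shape, split3.shape)  # (5, 8) (5, 13) (5, 9)
print(split3[0, 0])                              # 21
```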
    

    Hope this helps!
