
Indexing a batched 4D tensor in TensorFlow


Given

  • batch_images: a 4D tensor of shape (B, H, W, C)

  • x: a 3D tensor of shape (B, H, W)

  • y: a 3D tensor of shape (B, H, W)

Goal

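How can I index batch_images with the x and y coordinates to obtain a 4D tensor of shape (B, H, W, C)? That is, for each image in the batch and for each (x, y) pair, I want to obtain a tensor of shape C.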

In numpy this would be done with input_img[np.arange(B)[:,None,None], y, x], but I can't seem to make it work in TensorFlow.
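For reference, the NumPy behaviour described above can be checked with a small sketch. The sizes and the random test data are assumptions chosen only for illustration.

```python
import numpy as np

# Illustrative sizes (assumed, not from the question).
B, H, W, C = 2, 3, 4, 5
rng = np.random.default_rng(0)
input_img = rng.uniform(size=(B, H, W, C))
x = rng.integers(0, W, size=(B, H, W))  # column index per output pixel
y = rng.integers(0, H, size=(B, H, W))  # row index per output pixel

# np.arange(B)[:, None, None] has shape (B, 1, 1) and broadcasts against
# y and x, so every output pixel is read from its own image in the batch.
batch_idx = np.arange(B)[:, None, None]
out = input_img[batch_idx, y, x]
print(out.shape)  # (2, 3, 4, 5)
```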

My attempt so far

def get_pixel_value(img, x, y):
    """
    Utility function to get pixel value for 
    coordinate vectors x and y from a  4D tensor image.
    """
    H = tf.shape(img)[1]
    W = tf.shape(img)[2]
    C = tf.shape(img)[3]

    # flatten image
    img_flat = tf.reshape(img, [-1, C])

    # flatten idx
    idx_flat = (x*W) + y

    return tf.gather(img_flat, idx_flat)

This returns a tensor with an incorrect shape, (B, H, W).

1 Answer

  • 1

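    It should be possible to do this by flattening the tensor, but you have to build an additional dummy batch-index tensor, with the same shape as x and y, that always holds the index of the current batch. This is essentially the np.arange(B) from your numpy example, which is missing from your TensorFlow code.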
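A minimal sketch of the flattening route, assuming TensorFlow 2.x with eager execution and row-major (B, H, W, C) layout. The helper name get_pixel_value_flat is hypothetical, and the flat index uses y * W + x (row times width plus column), which differs from the x * W + y in the attempt above.

```python
import numpy as np
import tensorflow as tf

def get_pixel_value_flat(img, x, y):
    """Hypothetical helper: gather img[b, y, x, :] through a flat index."""
    img = tf.convert_to_tensor(img)
    x = tf.cast(x, tf.int32)
    y = tf.cast(y, tf.int32)
    shape = tf.shape(img)
    B, H, W, C = shape[0], shape[1], shape[2], shape[3]

    # Collapse batch, height and width into one axis: (B*H*W, C).
    img_flat = tf.reshape(img, [-1, C])

    # Offset of the first pixel of each image, shape (B, 1, 1).
    # This plays the role of np.arange(B) in the numpy version.
    base = tf.reshape(tf.range(B) * H * W, [-1, 1, 1])

    # Row-major flat index of pixel (y, x) inside image b.
    idx_flat = base + y * W + x
    return tf.gather(img_flat, idx_flat)

# Illustrative data (assumed sizes).
img = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
x = np.random.randint(0, 4, size=(2, 3, 4))
y = np.random.randint(0, 3, size=(2, 3, 4))
out = get_pixel_value_flat(img, x, y)
print(out.shape)
```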

    You can also use tf.gather_nd to simplify things a bit; it does the index calculation for you.

    Here is an example:

    import numpy as np
    import tensorflow as tf
    
    # Example tensors
    M = np.random.uniform(size=(3, 4, 5, 6))
    x = np.random.randint(0, 5, size=(3, 4, 5))
    y = np.random.randint(0, 4, size=(3, 4, 5))
    
    def get_pixel_value(img, x, y):
        """
        Utility function that composes a new image, with pixels taken
        from the coordinates given in x and y.
        The shapes of x and y have to match.
        The batch order is preserved.
        """
    
        # We assume that x and y have the same shape.
        shape = tf.shape(x)
        batch_size = shape[0]
        height = shape[1]
        width = shape[2]
    
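        # Create a tensor that holds the batch index of every output pixel.
        # gather_nd needs it to pick the right image in the batch.
        batch_idx = tf.range(0, batch_size)
        batch_idx = tf.reshape(batch_idx, (batch_size, 1, 1))
        b = tf.tile(batch_idx, (1, height, width))
    
        # tf.stack replaces the older tf.pack.
        # Cast x and y to int32 so all index components share one dtype.
        indices = tf.stack([b, tf.cast(y, tf.int32), tf.cast(x, tf.int32)], 3)
        return tf.gather_nd(img, indices)
    
    result = get_pixel_value(M, x, y)
    # With TensorFlow 2.x (eager execution) the result is available directly.
    # On TensorFlow 1.x, use print(tf.Session().run(result).shape) instead.
    print(result.shape)
    # Should print (3, 4, 5, 6).
    # We've composed a new image of the same size from randomly picked x and y
    # coordinates of each original image.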
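As a possible shortcut, and assuming a TensorFlow version whose tf.gather_nd accepts the batch_dims argument (TensorFlow 2.x does), the explicit batch-index tensor can be left out. This is a sketch of an alternative, not part of the answer above; the example data is assumed.

```python
import numpy as np
import tensorflow as tf

# Illustrative data with the same shapes as the example tensors above.
img = np.random.uniform(size=(3, 4, 5, 6))
x = np.random.randint(0, 5, size=(3, 4, 5)).astype(np.int32)
y = np.random.randint(0, 4, size=(3, 4, 5)).astype(np.int32)

# Only (y, x) pairs are stacked: shape (B, H, W, 2).
indices = tf.stack([y, x], axis=-1)

# batch_dims=1 tells gather_nd that the leading axis of img and indices
# is a shared batch axis, so each image is indexed by its own coordinates.
out = tf.gather_nd(img, indices, batch_dims=1)
print(out.shape)
```

With batch_dims=1 the result has the shape of indices without its last axis, followed by the remaining channel axis of img.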
    
