Type casting error with numpy.take

I have a lookup table (LUT) that stores 65536 uint8 values:

lut = np.random.randint(256, size=(65536,)).astype('uint8')

I want to use this LUT to convert the values in a uint16 array:

arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

I want to do the conversion in place, because this last array can get very large. When I try it, the following happens:

>>> np.take(lut, arr, out=arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 103, in take
    return take(indices, axis, out, mode)
TypeError: array cannot be safely cast to required type

I don't understand what is going on. I know that, without the out argument, the result has the same dtype as lut, so uint8. But why can't a uint8 be cast to a uint16? If you ask numpy:

>>> np.can_cast('uint8', 'uint16')
True

Obviously, the following works:

>>> lut = lut.astype('uint16')
>>> np.take(lut, arr, out=arr)
array([[173, 251, 218, ..., 110,  98, 235],
       [200, 231,  91, ..., 158, 100,  88],
       [ 13, 227, 223, ...,  94,  56,  36],
       ..., 
       [ 28, 198,  80, ...,  60,  87, 118],
       [156,  46, 118, ..., 212, 198, 218],
       [203,  97, 245, ...,   3, 191, 173]], dtype=uint16)

But this also works:

>>> lut = lut.astype('int32')
>>> np.take(lut, arr, out=arr)
array([[ 78, 249, 148, ...,  77,  12, 167],
       [138,   5, 206, ...,  31,  43, 244],
       [ 29, 134, 131, ..., 100, 107,   1],
       ..., 
       [109, 166,  14, ...,  64,  95, 102],
       [152, 169, 102, ..., 240, 166, 148],
       [ 47,  14, 129, ..., 237,  11,  78]], dtype=uint16)

This really makes no sense, since now an int32 is being cast to uint16, which is definitely not a safe thing to do:

>>> np.can_cast('int32', 'uint16')
False

My code works if I set the dtype of lut to any of uint16, uint32, uint64, int32 or int64, but fails for uint8, int8 and int16.

Am I missing something, or is this simply broken in numpy?

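Workarounds are also welcome... Since the LUT is not that big, I guess it is not so bad to have its type match the array's, even if that takes twice the space, but it just doesn't feel right to do that...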

Is there a way to tell numpy not to worry about casting safety?

1 Answer

Interesting problem. numpy.take(lut, ...) gets translated into lut.take(...), whose source can be viewed here:

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c#L28

I believe the exception is thrown at line 105:
```
obj = (PyArrayObject *)PyArray_FromArray(out, dtype, flags);
if (obj == NULL) {
    goto fail;
}
```
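In your case out is arr, but dtype is the one of lut, i.e. uint8. So it tries to cast arr to uint8, which fails. I have to say that I don't know why it needs to do that, just pointing out that it does... For some reason take seems to assume you want the output array to have the same dtype as lut.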
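If that reading is right, the relevant question is whether arr's dtype (uint16) can be safely cast to lut's dtype, not the other way around. The sketch below only queries np.can_cast for the dtypes listed in the question; it does not call take itself, and the names index_dtype, lut_dtypes and castable are made up for illustration. The pattern it prints matches what the question reports (uint8, int8 and int16 fail, the others work), although newer numpy versions may handle out differently.

```python
import numpy as np

# dtype of arr, the array that the question passes as `out`
index_dtype = np.dtype('uint16')

# LUT dtypes the question reports as failing (first three) or working (the rest)
lut_dtypes = ['uint8', 'int8', 'int16', 'uint16', 'uint32', 'uint64', 'int32', 'int64']

# Can a uint16 array be safely cast to each candidate LUT dtype?
castable = {name: bool(np.can_cast(index_dtype, np.dtype(name))) for name in lut_dtypes}

for name, ok in castable.items():
    print(f"uint16 -> {name}: {'safe' if ok else 'not safe'}")
```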

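By the way, in many cases the call to PyArray_FromArray will actually create a new array, and the replacement will not be in place. This is the case, for example, if you call take with mode='raise' (the default, and what happens in your examples), or whenever lut.dtype != arr.dtype. Well, at least it should be, and I can't explain why the output array is still uint16 when you cast lut to int32! It is a mystery to me - maybe it has something to do with the NPY_ARRAY_UPDATEIFCOPY flag.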

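Bottom line:
- the behavior of numpy is indeed hard to understand... Maybe someone else will provide some insight into why it does what it does
- I would not try to process arr in place - it seems that a new array is created under the hood in most cases anyway. I would simply go with arr = lut.take(arr) - which, by the way, will eventually free half of the memory previously used by arr.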
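A minimal sketch of that suggestion, reusing the lut and arr definitions from the question. The names indices and bytes_before are added here only to check the result; in real code you would not keep indices around, since holding that reference keeps the old uint16 buffer alive.

```python
import numpy as np

lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

indices = arr              # extra reference, kept only to verify the lookup below
bytes_before = arr.nbytes  # size of the uint16 buffer

# No out argument: take returns a new array with lut's dtype (uint8),
# and the name arr is rebound to it.
arr = lut.take(arr)

print(arr.dtype, bytes_before, arr.nbytes)
```

Both arrays exist at the same time while take runs, so peak memory is about 1.5 times the size of the original arr; the saving only appears once the old buffer is garbage collected.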
Answered 2024-04-18T12:43:05+08:00
