Argmax of each row or column in a scipy sparse matrix

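Given an axis, scipy.sparse.coo_matrix.max returns the maximum value of each row or column. What I want is not the value but the index of the maximum in each row or column. Any help would be appreciated.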

5 Answers

3

As of scipy version 0.19, both csr_matrix and csc_matrix support the argmax() and argmin() methods.
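As a minimal sketch of that approach (the matrix values are made up for illustration, and a scipy version of 0.19 or later is assumed), the COO matrix from the question can be converted to CSR and queried directly:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Small example matrix; the values are made up for illustration.
dense = np.array([[-3, 0, -7],
                  [ 5, 0,  1],
                  [ 0, 2,  0]])
A = coo_matrix(dense)

# Convert to CSR, one of the formats the answer names as supporting argmax().
A_csr = A.tocsr()

# argmax(axis=1): column index of the maximum in each row.
# argmax(axis=0): row index of the maximum in each column.
# np.asarray(...).ravel() flattens the result to a 1-D array,
# whichever container type scipy returns.
row_argmax = np.asarray(A_csr.argmax(axis=1)).ravel()
col_argmax = np.asarray(A_csr.argmax(axis=0)).ravel()

print(row_argmax)
print(col_argmax)
```

The results should agree with `dense.argmax(axis=1)` and `dense.argmax(axis=0)`, including for the first row, where the maximum is a zero that is not explicitly stored.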

Answered 2024-05-04T12:48:44+08:00
1
I suggest studying the code of
```
moo._min_or_max_axis
```
where moo is a coo_matrix.
```
mat = mat.tocsc()  # for axis=0
mat.sum_duplicates()

major_index, value = mat._minor_reduce(min_or_max)
not_full = np.diff(mat.indptr)[major_index] < N
value[not_full] = min_or_max(value[not_full], 0)

mask = value != 0
major_index = np.compress(mask, major_index)
value = np.compress(mask, value)
return coo_matrix((value, (np.zeros(len(value)), major_index)),
                      dtype=self.dtype, shape=(1, M))
```
Depending on the axis, it works with csc or csr. I am guessing it should be possible to include argmax in this calculation.

This suggestion may not work. The key is the mat._minor_reduce method, which, with some refinement, does:
```
ufunc.reduceat(mat.data, mat.indptr[:-1])
```
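That is, it applies the ufunc to blocks of the matrix's data array, using indptr to define the blocks. np.add (the ufunc behind np.sum) and np.maximum are ufuncs for which this works. I don't know of an equivalent argmax ufunc.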
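A minimal sketch of that reduceat pattern on a CSR matrix follows. The example values are made up, and every row is deliberately non-empty: reduceat does not treat empty blocks as empty, so rows without stored entries would need separate handling. Only stored values are reduced, so implicit zeros are ignored.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Example matrix in which every row has at least one stored entry.
dense = np.array([[0, 4, 1],
                  [7, 0, 2],
                  [3, 9, 0]])
mat = csr_matrix(dense)

# mat.indptr[:-1] holds the start offset of each row's block in mat.data.
starts = mat.indptr[:-1]

# Reduce each block of stored values with a ufunc.
row_max = np.maximum.reduceat(mat.data, starts)
row_sum = np.add.reduceat(mat.data, starts)

print(row_max)
print(row_sum)
```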

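In general, if you want to do something row by row for a csr matrix (or column by column for a csc matrix), you either have to iterate over the rows, which is relatively expensive, or use ufunc.reduceat to do the same thing on the flat mat.data vector.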

The question "group argmax/argmin over partitioning indices in numpy" tries to perform an argmax.reduceat. The solution there might be adaptable to sparse matrices.
Answered 2024-05-04T12:48:44+08:00

If A is your scipy.sparse.coo_matrix, then you get the row and column of the maximum value as follows:

I = A.data.argmax()
maxrow = A.row[I]
maxcol = A.col[I]

EDIT: to get the index of the maximum value in each row:

from scipy.sparse import coo_matrix
import numpy as np
row  = np.array([0, 3, 1, 0])
col  = np.array([0, 2, 3, 2])
data = np.array([-3, 4, 11, -7])
A = coo_matrix((data, (row, col)), shape=(4, 4))
print(A.toarray())

nrRows=A.shape[0]
maxrowind=[]
for i in range(nrRows):
    r = A.getrow(i)  # r is a 1 x A.shape[1] matrix
    # column of the largest stored value, or 0 if the row stores nothing
    maxrowind.append(r.indices[r.data.argmax()] if r.nnz else 0)
print(maxrowind)

r.nnz is the count of explicitly stored values (normally the non-zero values).
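To illustrate what nnz counts, here is a small sketch with made-up values in which a zero is stored explicitly. It suggests that nnz counts stored entries, while count_nonzero() counts entries whose value is actually non-zero:

```python
import numpy as np
from scipy.sparse import csr_matrix

# One row with two stored entries, one of which is an explicit zero.
data = np.array([0, 5])
rows = np.array([0, 0])
cols = np.array([1, 2])
A = csr_matrix((data, (rows, cols)), shape=(1, 3))

stored_before = A.nnz          # counts stored entries, explicit zero included
nonzero = A.count_nonzero()    # counts entries whose value is not zero

A.eliminate_zeros()            # drops explicitly stored zeros in place
stored_after = A.nnz

print(stored_before, nonzero, stored_after)
```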

Answered 2024-05-04T12:48:44+08:00

2
The latest version of the numpy_indexed package (disclaimer: I am its author) can solve this problem in an efficient and elegant manner:
```
import numpy_indexed as npi
col, argmax = npi.group_by(coo.col).argmax(coo.data)
row = coo.row[argmax]
```
Here we group by col, so this is the argmax over the columns; swapping row and col gives you the argmax over the rows.
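For readers without numpy_indexed, the same grouping idea can be sketched in plain NumPy. This is not how numpy_indexed is implemented; the helper name `coo_group_argmax` is hypothetical, and the sketch assumes the COO matrix has no duplicate entries and considers stored values only.

```python
import numpy as np
from scipy.sparse import coo_matrix

def coo_group_argmax(keys, values):
    """For each distinct key, return the key and the position in `values`
    of the largest value in that key's group (hypothetical helper)."""
    if keys.size == 0:
        return keys[:0], np.empty(0, dtype=np.intp)
    # Sort by key first, then by value within each key.
    order = np.lexsort((values, keys))
    sorted_keys = keys[order]
    # The last element of each run of equal keys holds that group's maximum.
    last = np.flatnonzero(np.r_[sorted_keys[1:] != sorted_keys[:-1], True])
    return sorted_keys[last], order[last]

row = np.array([0, 3, 1, 0])
col = np.array([0, 2, 3, 2])
data = np.array([-3, 4, 11, -7])
coo = coo_matrix((data, (row, col)), shape=(4, 4))

# Group by column: for each column with stored entries, find the row of its maximum.
cols, argmax = coo_group_argmax(coo.col, coo.data)
rows = coo.row[argmax]
print(cols, rows)
```

Columns with no stored entries do not appear in `cols` at all, so the caller has to decide how to report them.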
Answered 2024-05-04T12:48:44+08:00

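Expanding on the answers from @hpaulj and @joeln, and using the code suggested in "group argmax/argmin over partitioning indices in numpy", this function computes the argmax over columns (one result per row) for CSR, or the argmax over rows (one result per column) for CSC: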

import numpy as np
import scipy.sparse as sp

def csr_csc_argmax(X, axis=None):
    is_csr = isinstance(X, sp.csr_matrix)
    is_csc = isinstance(X, sp.csc_matrix)
    assert is_csr or is_csc
    # compare against None explicitly so that axis=0 is not treated as "no axis"
    assert axis is None or (is_csr and axis == 1) or (is_csc and axis == 0)

    major_size = X.shape[0 if is_csr else 1]
    major_lengths = np.diff(X.indptr) # group_lengths
    major_not_empty = (major_lengths > 0)

    result = -np.ones(shape=(major_size,), dtype=X.indices.dtype)
    split_at = X.indptr[:-1][major_not_empty]
    maxima = np.zeros((major_size,), dtype=X.dtype)
    maxima[major_not_empty] = np.maximum.reduceat(X.data, split_at)
    all_argmax = np.flatnonzero(np.repeat(maxima, major_lengths) == X.data)
    result[major_not_empty] = X.indices[all_argmax[np.searchsorted(all_argmax, split_at)]]
    return result

For any row (CSR) or column (CSC) that is entirely sparse (i.e., entirely zero after X.eliminate_zeros()), it returns -1 as the argmax.

Answered 2024-05-04T12:48:44+08:00
