R: converting a very large matrix.csr to a matrix: integer overflow

The following code (specifically the as.matrix call) fails only when I open a very large libsvm file. It works for smaller files.

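library(e1071)    # provides read.matrix.csr
library(SparseM)  # provides the matrix.csr class and its as.matrix method
rawmforCluster <- read.matrix.csr(filePath)
sparseforCluster <- rawmforCluster$x  # sparse feature matrix (labels are in $y)
str(sparseforCluster)
sparseMatrixforCluster <- as.matrix(sparseforCluster)  # fails for very large inputs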

The structure of sparseforCluster is

Formal class 'matrix.csr' [package "SparseM"] with 4 slots
  ..@ ra       : num [1:4860285] 1 1 2 1 1 1 1 1 1 1 ...
  ..@ ja       : int [1:4860285] 77 668 716 1086 1202 1306 1527 2184 2545 2729 ...
  ..@ ia       : int [1:659095] 1 18 25 26 31 36 52 59 67 72 ...
  ..@ dimension: int [1:2] 659094 3778

The error I get is

Error in double(nrow * ncol) : vector size cannot be NA
In addition: Warning message:
In nrow * ncol : NAs produced by integer overflow
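For context, the overflow can be reproduced with the dimensions from the str output alone. The sketch below is only an illustration of the arithmetic, not the SparseM source; the variable names are made up for the example.

```r
# Dimensions reported by str(sparseforCluster)
n_rows <- 659094L
n_cols <- 3778L

# Integer multiplication overflows and yields NA (with a warning),
# which is what double(nrow * ncol) then rejects.
cells_int <- suppressWarnings(n_rows * n_cols)
print(cells_int)

# Doing the same product in double precision shows the real cell count.
cells <- as.numeric(n_rows) * n_cols
print(cells)
print(cells > .Machine$integer.max)

# Approximate memory a dense double matrix of this shape would need, in GiB.
print(cells * 8 / 2^30)
```

The dense matrix would hold about 2.49 billion cells, which is above the 32-bit integer limit of 2147483647 and corresponds to roughly 18.6 GiB of doubles, so even without the overflow the dense form would need a lot of memory.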

Question: How can I coerce the data to a matrix or (second best) a data.table? (Or should I look for a different solution?)
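One possible direction, sketched here under assumptions, is to keep the data sparse and move it into a class from the Matrix package rather than a dense matrix. The example uses small hand-written vectors named ra, ja, ia and dimension as stand-ins for the slots of a matrix.csr object; with real data they would be read from the object (for example sparseforCluster@ra). Whether a sparse class is acceptable depends on what the downstream code needs.

```r
library(Matrix)

# Toy stand-ins for the matrix.csr slots (CSR layout, 1-based indices):
#   row 1: (1,1)=1, (1,3)=2
#   row 2: (2,2)=3
#   row 3: (3,1)=4, (3,4)=5
ra        <- c(1, 2, 3, 4, 5)           # non-zero values
ja        <- c(1L, 3L, 2L, 1L, 4L)      # column index of each value
ia        <- c(1L, 3L, 4L, 6L)          # start of each row in ra/ja
dimension <- c(3L, 4L)

# sparseMatrix() accepts column indices plus row pointers;
# the pointer vector p must be zero-based, hence ia - 1L.
m <- sparseMatrix(j = ja, p = ia - 1L, x = ra, dims = dimension)

print(m)
print(class(m))
```

The result stays sparse, so its size scales with the number of non-zero entries rather than with nrow * ncol.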

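Update: I found that the standard solution is to reduce the size of the matrix by removing sparse (low-frequency) terms. In my case this is not an option, because some low-frequency terms may be highly relevant to certain subsets.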

I have also read about the bigmemory package. However, it does not seem to work with matrix.csr.
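If a dense representation is only needed piece by piece, another option to consider is densifying a block of rows at a time, so that no single call has to allocate nrow * ncol cells. The following is a minimal sketch with a toy sparse matrix from the Matrix package; chunk_size and the per-block work (here a running column sum) are hypothetical placeholders, and the same loop is where a block could be written to an on-disk store instead.

```r
library(Matrix)

# Toy sparse matrix standing in for the real data (6 rows, 4 columns).
m <- sparseMatrix(i = c(1, 2, 4, 6, 6),
                  j = c(1, 3, 2, 1, 4),
                  x = c(1, 2, 3, 4, 5),
                  dims = c(6, 4))

chunk_size <- 2L                      # hypothetical; tune to available memory
starts     <- seq(1L, nrow(m), by = chunk_size)
col_sums   <- numeric(ncol(m))

for (s in starts) {
  rows  <- s:min(s + chunk_size - 1L, nrow(m))
  # Only this block of rows is converted to a dense base matrix.
  block <- as.matrix(m[rows, , drop = FALSE])
  # Placeholder for the real per-block work.
  col_sums <- col_sums + colSums(block)
}

print(col_sums)
```

Each dense block has at most chunk_size * ncol cells, so the integer overflow in nrow * ncol does not arise as long as the block is kept small.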
