R - Reducing memory usage when training a random forest with caret

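I am trying to build a random forest from roughly 100,000 inputs. To do this I am using train from the caret package with method = "parRF". Unfortunately, my machine with 128 GB of memory still runs out, so I need to reduce the amount of memory I use.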

Right now, the training call I am running is:

> trControl <- trainControl(method = "LGOCV", p = 0.9, savePredictions = TRUE)
> model_parrf <- train(x = data_preds, y = data_resp, method = "parRF",
                     trControl = trControl)

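However, because every forest is retained, the system runs out of memory very quickly. If my understanding of train and randomForest is correct, each random forest stores at least on the order of 500 * 100,000 doubles. So I would like to throw away the random forests I no longer need. I tried passing keep.forest = FALSE through to randomForest with: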

> model_parrf <- train(x = data_preds, y = data_resp, method = "parRF",
                       trControl = trControl, keep.forest = FALSE)
Error in train.default(x = data_preds, y = data_resp, method = "parRF",  : 
  final tuning parameters could not be determined

In addition, this warning is thrown repeatedly:

In eval(expr, envir, enclos) :
  predictions failed for Resample01: mtry=2 Error in predict.randomForest(modelFit, newdata) : 
  No forest component in the object

It seems that, for some reason, caret needs to keep the forests in order to compare models. Is there any way to use caret with less memory?

1 Answer

1

Keep in mind that if you use M cores, the data set has to be stored M+1 times. Try using fewer workers.
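As a rough illustration of the M+1 rule, the sketch below estimates how much memory the data copies alone would take for a few worker counts, then registers a small, explicit number of workers before training. It assumes the doParallel backend is what feeds parRF (the question does not say which backend is registered), and `toy_preds`, `estimate_copies_gb`, and `n_workers` are hypothetical names used only for this example. The estimate covers the data copies only, not the forests themselves.

```r
# Rough estimate: the master process plus each of M workers holds a copy of the data.
estimate_copies_gb <- function(data_bytes, n_workers) {
  (n_workers + 1) * data_bytes / 1024^3
}

# Hypothetical stand-in for data_preds; substitute the real predictors.
toy_preds <- matrix(rnorm(1000 * 20), nrow = 1000)
data_bytes <- as.numeric(object.size(toy_preds))

for (m in c(2, 4, 8)) {
  cat(sprintf("%d workers: ~%.4f GB for data copies\n",
              m, estimate_copies_gb(data_bytes, m)))
}

# Register a small, explicit number of workers before calling train().
# Assumption: doParallel is the backend; the step is skipped if it is not installed.
n_workers <- 2
if (requireNamespace("doParallel", quietly = TRUE)) {
  cl <- parallel::makeCluster(n_workers)
  doParallel::registerDoParallel(cl)
  # model_parrf <- train(x = data_preds, y = data_resp, method = "parRF",
  #                      trControl = trControl)
  parallel::stopCluster(cl)
  foreach::registerDoSEQ()  # fall back to sequential execution afterwards
}
```

Running `as.numeric(object.size(data_preds))` on the real predictors and plugging the result into `estimate_copies_gb` gives a quick check of whether a given worker count can fit in 128 GB at all.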

Answered 2024-04-28T08:37:05+08:00
