Finding feature importance values from a Spark decision tree using MLlib

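We are running decision trees with MLlib on Spark 1.0 or 1.1.

When I run the example Scala code on the sample data, it completes without errors, but I cannot find the feature importances anywhere in the result.

Does anyone know how to get these values?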

2 Answers

In Spark 2, you can do the following:

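import org.apache.spark.ml.feature.VectorAssembler

// featureArray: Array[String] holding the names of the feature columns
val vectorAssembler = new VectorAssembler().setInputCols(featureArray).setOutputCol("features")
// decisionTree is assumed to be e.g. a DecisionTreeClassifier, and trainingDataset the assembled DataFrame
val decisionTreeModel = decisionTree.fit(trainingDataset)
val featureImportances = decisionTreeModel.featureImportances // Sparse or Dense Vector

// Pair each feature name with its importance, highest first
featureArray.zip(featureImportances.toArray).sortBy(_._2).reverse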

Answered 2024-04-20T13:13:17+08:00

When training finishes, you get a DecisionTreeModel, which is this class:

class DecisionTreeModel(val topNode: Node, val algo: Algo) {
   ...
}

Starting from topNode, you can traverse the nodes and read everything you need from them (predict, InformationGainStats):

class Node (
    val id: Int,
    val predict: Double,
    val isLeaf: Boolean,
    val split: Option[Split],
    var leftNode: Option[Node],
    var rightNode: Option[Node],
    val stats: Option[InformationGainStats])
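
The traversal described above could look roughly like the following. This is only a sketch under stated assumptions: it uses simplified stand-in classes instead of the real MLlib ones so that it runs without Spark, and it assumes that Split exposes a `feature` index and InformationGainStats exposes a `gain`. The names `GainByFeature`, `sumGains` and `normalize` are made up for illustration. Summing the gain per feature is also only a rough proxy: the `featureImportances` value in Spark 2 additionally weights each node by the number of samples reaching it, which this Node class does not record.

```scala
// Simplified stand-ins for the MLlib 1.x classes shown above (assumption:
// only the fields used here; the real Split and InformationGainStats carry more).
case class Split(feature: Int)
case class InformationGainStats(gain: Double)
case class Node(
    id: Int,
    predict: Double,
    isLeaf: Boolean,
    split: Option[Split],
    leftNode: Option[Node],
    rightNode: Option[Node],
    stats: Option[InformationGainStats])

object GainByFeature {
  // Walk the tree from `node` and add up the information gain of every
  // internal node, keyed by the index of the feature it splits on.
  def sumGains(node: Node): Map[Int, Double] = {
    val own: Map[Int, Double] =
      if (node.isLeaf) Map.empty[Int, Double]
      else (for (s <- node.split; st <- node.stats) yield Map(s.feature -> st.gain))
        .getOrElse(Map.empty[Int, Double])
    val children = Seq(node.leftNode, node.rightNode).flatten.map(sumGains)
    (own +: children).foldLeft(Map.empty[Int, Double]) { (acc, m) =>
      m.foldLeft(acc) { case (a, (f, g)) => a.updated(f, a.getOrElse(f, 0.0) + g) }
    }
  }

  // Scale the totals so they sum to 1 (left untouched if the total gain is 0).
  def normalize(gains: Map[Int, Double]): Map[Int, Double] = {
    val total = gains.values.sum
    if (total == 0.0) gains else gains.map { case (f, g) => f -> g / total }
  }
}
```

With a real model you would call something like `GainByFeature.normalize(GainByFeature.sumGains(model.topNode))` after adapting the field accesses to the actual MLlib classes, then sort the resulting map by value to rank the features.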

Answered 2024-04-20T13:13:17+08:00
