I have a random forest model, and I am trying to get the featureImportances vector from it.

Map<Object, Object> categoricalFeaturesParam = new HashMap<>();
// Convert the (empty) Java map into a Scala immutable Map, as required by fromOld()
scala.collection.immutable.Map<Object, Object> categoricalFeatures = (scala.collection.immutable.Map<Object, Object>)
        scala.collection.immutable.Map$.MODULE$.apply(JavaConversions.mapAsScalaMap(categoricalFeaturesParam).toSeq());
int numberOfClasses = 2;
RandomForestClassifier rfc = new RandomForestClassifier();
// Wrap the old mllib model in an ml model to expose featureImportances()
RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures, numberOfClasses);
System.out.println(rfm.featureImportances());
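For context on what that call is supposed to produce: tree-ensemble feature importance is, roughly, each split's impurity gain weighted by the number of instances reaching the node, summed per feature, normalized per tree, averaged across trees, and renormalized. The sketch below is pure Java and is not Spark's actual implementation; the `Split` class and the numbers are made up purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class FeatureImportanceSketch {

    // Hypothetical stand-in for one internal tree node: the feature it splits on
    // and its impurity gain weighted by the number of instances reaching the node.
    static final class Split {
        final int featureIndex;
        final double weightedGain;
        Split(int featureIndex, double weightedGain) {
            this.featureIndex = featureIndex;
            this.weightedGain = weightedGain;
        }
    }

    // Normalize each tree's gains to sum to 1, average across trees, renormalize.
    static double[] importances(List<List<Split>> trees, int numFeatures) {
        double[] total = new double[numFeatures];
        for (List<Split> tree : trees) {
            double[] perTree = new double[numFeatures];
            double treeSum = 0.0;
            for (Split s : tree) {
                perTree[s.featureIndex] += s.weightedGain;
                treeSum += s.weightedGain;
            }
            if (treeSum > 0) {
                for (int f = 0; f < numFeatures; f++) {
                    total[f] += perTree[f] / treeSum;  // per-tree normalized share
                }
            }
        }
        double grand = Arrays.stream(total).sum();
        if (grand > 0) {
            for (int f = 0; f < numFeatures; f++) {
                total[f] /= grand;  // final vector sums to 1
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two toy trees over three features.
        List<List<Split>> trees = Arrays.asList(
                Arrays.asList(new Split(0, 4.0), new Split(1, 1.0)),
                Arrays.asList(new Split(0, 2.0), new Split(2, 2.0)));
        System.out.println(Arrays.toString(importances(trees, 3)));
        // prints [0.65, 0.1, 0.25]
    }
}
```

The point of the sketch is that the computation needs each tree's split metadata; if that metadata is missing on the converted model, the importance computation has nothing to walk, which is consistent with the failure described below.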

When I run the code above, featureImportances comes back null. Do I need to set anything specifically in order to get the feature importances of a random forest model?

I also tried Spark 1.6, whose API takes numberOfFeatures as a fifth parameter, but featureImportances is still null.

RandomForestClassifier rfc = getRandomForestClassifier(numTrees, maxBinSize, maxTreeDepth, seed, impurity);
RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(model, rfc, categoricalFeatures, numberOfClasses, numberOfFeatures);
System.out.println(rfm.featureImportances());

Stack trace:

Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.ml.tree.impl.RandomForest$.computeFeatureImportance(RandomForest.scala:1152)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1111)
    at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1108)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.spark.ml.tree.impl.RandomForest$.featureImportances(RandomForest.scala:1108)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances$lzycompute(RandomForestClassifier.scala:237)
    at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances(RandomForestClassifier.scala:237)
    at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49)