How to amend the splitting criteria (gini / entropy) in a decision tree algorithm in Scikit-Learn?

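I am using a decision tree algorithm on a binary classification problem, and the goal is to minimise false positives (maximise the positive predictive value) of the classification, because the cost of the diagnostic tool is very high.

Is there a way to introduce a weight into the gini / entropy splitting criteria to penalise false-positive misclassifications?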

Here, for example, the modified Gini index is given as:

[Image: modified Gini index formula]

So I am wondering whether there is any way to implement it in Scikit-learn?

EDIT

Playing with class_weight produced the following results:

from sklearn import datasets as dts
from sklearn import tree

iris_data = dts.load_iris()

X, y = iris_data.data, iris_data.target
# take only classes 1 and 2 due to less separability
X = X[y > 0]
y = y[y > 0]
y = y - 1  # make binary labels

# define the decision tree classifier with only two levels at most and no class balance
dt = tree.DecisionTreeClassifier(max_depth=2, class_weight=None)

# fit the model, no train/test for simplicity
dt.fit(X[:55, :2], y[:55])

Plotting the decision boundary and the tree (blue points are positive, class 1):

[Image: decision boundary and tree without class weights]

While outweighing the minority (or more precious) class:

dt_100 = tree.DecisionTreeClassifier(max_depth=2, class_weight={1: 100})

[Image: decision boundary and tree with class_weight={1: 100}]

1 Answer


The decision tree classifier supports the class_weight argument.
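To connect class_weight with the Gini criterion asked about: as far as I understand, scikit-learn folds class weights into the per-sample weights, so the impurity at a node is computed from weighted class totals rather than raw counts. A minimal sketch of that idea in plain NumPy follows. The helper `weighted_gini` and the example node are hypothetical illustrations, not part of scikit-learn, and this is the generic class-weighted Gini, not necessarily the modified index from the question.

```python
import numpy as np

def weighted_gini(y, class_weight):
    """Gini impurity of labels y when each sample counts with its class weight.

    Hypothetical helper for illustration; classes missing from the
    class_weight dict are assumed to have weight 1.0.
    """
    y = np.asarray(y)
    classes = np.unique(y)
    # Weighted class totals: sample count of the class times its weight.
    totals = np.array(
        [np.sum(y == c) * class_weight.get(int(c), 1.0) for c in classes]
    )
    p = totals / totals.sum()
    return 1.0 - np.sum(p ** 2)

# Hypothetical node holding 50 negatives and 5 positives.
y_node = np.array([0] * 50 + [1] * 5)

gini_plain = weighted_gini(y_node, {})          # every sample counts once
gini_weighted = weighted_gini(y_node, {1: 10})  # each positive counts 10 times
print(gini_plain, gini_weighted)
```

With a weight of 10 on the positive class, the 50/5 node looks balanced to the criterion (impurity 0.5 instead of about 0.165), which changes which splits look attractive.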

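In two-class problems this can exactly solve your issue. Typically it is used for unbalanced problems. For more than two classes, it is not possible to provide the individual labels (as far as I know).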
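A quick way to check the effect on positive predictive value is to compare `precision_score` with and without weights. This is only a sketch under stated assumptions: the weight of 10 is arbitrary, precision is measured on the training data for brevity, and the negative class (0) is the one weighted up, since making misclassified negatives costlier is what discourages false positives.

```python
from sklearn import datasets, tree
from sklearn.metrics import precision_score

iris = datasets.load_iris()
X, y = iris.data, iris.target
# Keep classes 1 and 2 as binary labels 0/1 and use the first two features.
X, y = X[y > 0][:, :2], y[y > 0] - 1

results = {}
for name, cw in [("unweighted", None), ("negatives x10", {0: 10, 1: 1})]:
    clf = tree.DecisionTreeClassifier(max_depth=2, class_weight=cw, random_state=0)
    pred = clf.fit(X, y).predict(X)
    # zero_division=0 returns 0 instead of warning if no positives are predicted.
    results[name] = precision_score(y, pred, zero_division=0)
    print(name, "precision:", results[name], "predicted positives:", int(pred.sum()))
```

In practice the comparison should be made on held-out data, and the weight treated as a hyperparameter to tune against the precision that is actually required.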

Answered on 2024-04-28T21:22:39+08:00
