
How to access scikit-learn nested cross-validation scores


I am using Python and I want to use scikit-learn's nested cross-validation. I found a very good example:

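# Setup assumed from the scikit-learn nested cross-validation example
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}
svr = SVC(kernel="rbf")

NUM_TRIALS = 30
non_nested_scores = np.zeros(NUM_TRIALS)
nested_scores = np.zeros(NUM_TRIALS)

for i in range(NUM_TRIALS):
    # Choose cross-validation techniques for the inner and outer loops,
    # independently of the dataset.
    # E.g. "GroupKFold", "LeaveOneOut", "LeaveOneGroupOut", etc.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

    # Non-nested parameter search and scoring
    clf = GridSearchCV(estimator=svr, param_grid=p_grid, cv=inner_cv)
    clf.fit(X_iris, y_iris)
    non_nested_scores[i] = clf.best_score_

    # Nested CV with parameter optimization
    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
    nested_scores[i] = nested_score.mean()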

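How can I access the best parameter set, as well as all parameter sets (and their corresponding scores), from the nested cross-validation?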

1 Answer

  • 4

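    You cannot access the individual parameters or the best parameters through cross_val_score. Internally, cross_val_score clones the estimator you supply and then, for each split, calls the fit and score methods of that clone with the given X and y. The fitted clones are not returned.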
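The cloning behaviour described above can be checked directly. The following is only a minimal sketch: the parameter grid and the variable names `param_grid` and `clf_was_fitted` are illustrative assumptions, not taken from the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Illustrative grid (an assumption, not taken from the question)
param_grid = {"C": [1, 10], "gamma": [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)
clf = GridSearchCV(estimator=SVC(kernel="rbf"), param_grid=param_grid, cv=inner_cv)

# One score per outer fold; the grid search runs inside each fold on a clone
scores = cross_val_score(clf, X=X, y=y, cv=outer_cv)

# The object passed in was never fitted itself, so it has no best_params_
clf_was_fitted = hasattr(clf, "best_params_")
print(scores.shape, clf_was_fitted)
```

Because only the scores come back, anything stored on the fitted grid searches is lost once cross_val_score returns.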

    If you want to access the parameters at each split, you can run the outer loop yourself:

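    # Put the code below inside your NUM_TRIALS for loop.
    # nested_scores_train and nested_scores_test must be created before that
    # loop, e.g. with np.zeros(NUM_TRIALS).
    n_outer = outer_cv.get_n_splits()
    temp_nested_scores_train = np.zeros(n_outer)
    temp_nested_scores_test = np.zeros(n_outer)
    for cv_iter, (train, test) in enumerate(outer_cv.split(X_iris)):
        clf.fit(X_iris[train], y_iris[train])
        # Best mean inner-CV score found on this outer training fold
        temp_nested_scores_train[cv_iter] = clf.best_score_
        # Score of the refitted best estimator on the held-out outer fold
        temp_nested_scores_test[cv_iter] = clf.score(X_iris[test], y_iris[test])
        # You can access the grid search's parameters here,
        # e.g. clf.best_params_ and clf.cv_results_
    nested_scores_train[i] = temp_nested_scores_train.mean()
    nested_scores_test[i] = temp_nested_scores_test.mean()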
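As a self-contained illustration of the manual outer loop, the sketch below records the best parameter set and every candidate parameter set with its mean inner-CV score for each outer fold. It runs a single trial, and the parameter grid and the `fold_records` container are hypothetical names chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Illustrative grid (an assumption, not taken from the question)
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)
clf = GridSearchCV(estimator=SVC(kernel="rbf"), param_grid=param_grid, cv=inner_cv)

fold_records = []  # hypothetical container: one dict per outer fold
for fold, (train, test) in enumerate(outer_cv.split(X)):
    # Grid search (inner CV) on the outer training fold only
    clf.fit(X[train], y[train])
    fold_records.append({
        "fold": fold,
        "best_params": clf.best_params_,
        "inner_best_score": clf.best_score_,
        "outer_test_score": clf.score(X[test], y[test]),
        # Every candidate parameter set paired with its mean inner-CV score
        "all_params": list(zip(clf.cv_results_["params"],
                               clf.cv_results_["mean_test_score"])),
    })

for rec in fold_records:
    print(rec["fold"], rec["best_params"], round(rec["outer_test_score"], 3))
```

The best parameters can differ from one outer fold to the next, which is why they are stored per fold rather than as a single value. If you would rather not write the loop by hand, scikit-learn's cross_validate with return_estimator=True also returns the fitted estimator for each split.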
    
