I am using scikit-learn's linear regression combined with (k-fold) cross-validation. Now, to compute a t-test, I need access to my array of errors (y_test - y_pred).

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Load dataset
smartphone = pd.read_csv('all-users_w4_filtered.csv')

# Define independent variables (x) and dependent variable (y)
x = smartphone[['mood_mean', 'valence_mean', 'app.social_mean', 'app.other_mean']].values
y = smartphone['target'].values
z = smartphone['benchmark'].values

# Reshape data
x = x.reshape(-1, 4)
y = y.reshape(-1, 1)

# Model (its definition was not shown; assumed to be a plain LinearRegression)
regr = LinearRegression()

# First shuffle the data, then perform k-fold cross-validation
kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(regr, x, y, cv=kf)
print('5-fold shuffled cross validation scores: ', scores)
print('Mean of cross-validation:', scores.mean())

# Iterate over the same folds to get the train/test subsets
for train_index, test_index in kf.split(x):
    # print("TRAIN:", train_index, "TEST:", test_index)
    X_train, X_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]

Score output:

5-fold shuffled cross validation scores:  [0.21801002 0.3282497 0.27146692 0.36056872 0.29064657]
Mean of cross-validation: 0.293788384892

How can I access the y_pred array that cross_val_score uses to compute the MSE?

I tried to replicate this by calling regr.fit / regr.predict on the indexed data inside the loop, but that produces different results.
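For reference, here is a minimal sketch of what that replication could look like. It assumes synthetic data in place of the CSV (the array shapes, coefficients, and a 1-D `y` are made up for illustration) and a plain `LinearRegression`. One thing worth noting: when no `scoring` argument is given, `cross_val_score` calls the estimator's own `score` method, which for `LinearRegression` is R², not MSE, so comparing its output against a hand-computed MSE would show different numbers. The sketch compares like with like and also shows `cross_val_predict`, which returns the out-of-fold predictions directly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

# Synthetic stand-in for the CSV data (assumption: 200 rows, 4 features)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4))
y = x @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=1.0, size=200)

regr = LinearRegression()
kf = KFold(n_splits=5, shuffle=True, random_state=0)

# Default scoring for a regressor is estimator.score, i.e. R^2
scores = cross_val_score(regr, x, y, cv=kf)

# Manual loop over the same folds: collect residuals and per-fold R^2
errors = np.empty_like(y)
manual_scores = []
for train_index, test_index in kf.split(x):
    regr.fit(x[train_index], y[train_index])
    y_pred = regr.predict(x[test_index])
    errors[test_index] = y[test_index] - y_pred
    manual_scores.append(regr.score(x[test_index], y[test_index]))

# Alternative: out-of-fold predictions for every sample in one call
y_pred_cv = cross_val_predict(regr, x, y, cv=kf)
errors_cv = y - y_pred_cv

print('cross_val_score R^2:', scores)
print('manual loop R^2:    ', np.array(manual_scores))
```

With a fixed integer `random_state`, `kf.split(x)` yields the same folds on every call, so the per-fold R² from the loop should match `cross_val_score`, and `errors` should match `errors_cv`. If MSE per fold is what is wanted from `cross_val_score` itself, passing `scoring='neg_mean_squared_error'` returns the negated MSE for each fold.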