Predicting with new data using SKLearn

I have been trying out linear regression with SKLearn. I have information like this: Calories Eaten | Weight.

150 | 150

300 | 190

350 | 200

Basically just numbers, but I have already fit the dataset to a linear regression model.

What I'm confused on is, how would I go about predicting with new data, say I got 10 new numbers of Calories Eaten, and I want it to predict Weight?

regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test) ??

But how do I take only my 10 new Calories Eaten values and make them the test set that I want the regressor to predict?

3 Answers

1
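You are right: you simply call the model's predict method and pass in the new, unseen data. It also depends on what you mean by new data. Are you referring to data whose outcome you don't know (i.e. you don't know the weight values), or is this data meant for testing the model's performance?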

For new data (to predict on):

Your approach is correct. You can see all the predictions by printing the y_pred variable.
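As a minimal sketch of this case: the model below is fit on the three rows from the question and then asked to predict weights for ten new calorie values. The ten values and the name new_calories are made up for illustration. The reshape(-1, 1) call is there because scikit-learn estimators expect the features as a 2D array with one row per sample.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# The three rows from the question; a real dataset would be larger.
calories = np.array([150, 300, 350]).reshape(-1, 1)  # shape (3, 1): one feature column
weight = np.array([150, 190, 200])

regressor = LinearRegression()
regressor.fit(calories, weight)

# Ten hypothetical new calorie values whose weights are unknown (invented for this example).
new_calories = np.array([100, 175, 200, 225, 250, 275, 325, 375, 400, 450]).reshape(-1, 1)
new_weight_pred = regressor.predict(new_calories)

for c, w in zip(new_calories.ravel(), new_weight_pred):
    print(f"{c} calories -> predicted weight {w:.1f}")
```

The new values never need to be part of a test set. A test set is only useful when the true weights are known and you want to measure error.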

You know the respective weight values and you want to evaluate the model:

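Make sure you have two separate datasets: x_test (containing the features) and y_test (containing the labels). Generate the predictions into the y_pred variable, then measure their quality with one of the many available performance metrics. The most common is root mean squared error (RMSE), for which you simply pass y_test and y_pred as arguments. Here is a list of all the regression performance metrics provided by sklearn.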
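A small sketch of the RMSE calculation, assuming y_test and y_pred are already available. The numbers here are invented for illustration. It uses mean_squared_error from sklearn.metrics and takes the square root with NumPy.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted weights, made up for this example.
y_test = np.array([150.0, 190.0, 200.0])
y_pred = np.array([152.0, 187.0, 204.0])

mse = mean_squared_error(y_test, y_pred)  # mean of the squared errors
rmse = np.sqrt(mse)                       # back in the same units as weight

print(f"MSE: {mse:.3f}, RMSE: {rmse:.3f}")
```

Because RMSE is in the same units as the target, it reads as a typical prediction error in weight units.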

If you don't know the weight values of the 10 new data points:

Use train_test_split to split the initial dataset into two parts: training and testing. You will end up with 4 datasets: x_train, x_test, y_train, y_test.
```
from sklearn.model_selection import train_test_split
# random_state can be any number (it makes the split reproducible); test_size=0.25 holds out 25% for testing
# note the return order: both feature splits first, then both label splits
x_train, x_test, y_train, y_test = train_test_split(calories_eaten, weight, test_size=0.25, random_state=42)
```
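Train the model by fitting it on x_train and y_train. Then evaluate how well it has learned by predicting on x_test and comparing those predictions with the actual values in y_test. This gives you an idea of how the model performs. After that, you can predict the weight values for the 10 new data points in the same way.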
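Put together, the workflow might look like the sketch below. Only the first three data rows come from the question; the remaining rows and the new_calories values are invented so that there is enough data to split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# First three rows are from the question; the rest are made up for illustration.
calories_eaten = np.array([150, 300, 350, 200, 250, 400, 450, 500]).reshape(-1, 1)
weight = np.array([150, 190, 200, 165, 178, 212, 225, 240])

# Return order: both feature splits first, then both label splits.
x_train, x_test, y_train, y_test = train_test_split(
    calories_eaten, weight, test_size=0.25, random_state=42
)

regressor = LinearRegression()
regressor.fit(x_train, y_train)

# Evaluate on the held-out rows, whose true weights are known.
y_pred = regressor.predict(x_test)
print("test errors:", y_test - y_pred)

# Then predict for genuinely new inputs; these never go through train_test_split.
new_calories = np.array([[180], [320], [420]])
new_pred = regressor.predict(new_calories)
print("new predictions:", new_pred)
```

The held-out test rows are only there to estimate how far off the model tends to be; the new inputs go straight to predict.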

As a beginner, it is also worth reading further on the topic. This is a simple tutorial.
Answered on 2024-05-03T09:32:38+08:00

You have to use model_selection in sklearn to split the data, then train the model by fitting it on the training set.

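from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# train_test_split returns both feature splits first, then both label splits
X_train, X_test, y_train, y_test = train_test_split(eaten, weight)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)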

Answered on 2024-05-03T09:32:38+08:00

1
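What I'm confused on is, how would I go about predicting with new data, say I got 10 new numbers of Calories Eaten, and I want it to predict Weight?

Yes, Calories Eaten is the independent variable, while Weight is the dependent variable.

After splitting the data into training and test sets, the next step is to fit the regressor on the X_train and y_train data.

Once the model is trained, you can call its predict method on X_test, which gives us y_pred.

Now you can compare y_pred (the predicted data) with y_test (the actual data).

You can also use the score method of the fitted linear model to get the model's performance.

score is calculated using the R^2 (R squared) metric, also known as the coefficient of determination.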
```
score = regressor.score(X_test, y_test)
```
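To make the R^2 value less of a black box, the sketch below computes it by hand as 1 - SS_res / SS_tot and checks that it matches both regressor.score and r2_score from sklearn.metrics. All the data here is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Made-up training and test data for illustration.
X_train = np.array([[150], [300], [350], [200]])
y_train = np.array([150, 190, 200, 165])
X_test = np.array([[250], [400]])
y_test = np.array([178, 212])

regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# score() predicts on X_test internally and returns R^2 against y_test.
score = regressor.score(X_test, y_test)

# The same quantity by hand: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
manual_r2 = 1 - ss_res / ss_tot

print(score, manual_r2, r2_score(y_test, y_pred))
```

A score of 1.0 means perfect predictions, and a score can be negative when the model does worse than always predicting the mean of y_test.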
To split the data, you can use the train_test_split function.
```
from sklearn.model_selection import train_test_split
# return order is X_train, X_test, y_train, y_test
X_train, X_test, y_train, y_test = train_test_split(eaten, weight, test_size=0.2, random_state=0)
```
Answered on 2024-05-03T09:32:38+08:00
