
How can I train and test multiple datasets stored in two lists?


I want to write a function that trains and tests 10 independent datasets stored in two lists. Here are the lists:

blend_30_d<-list(desktop_30_1, desktop_30_2, desktop_30_3, desktop_30_4, desktop_30_5, desktop_30_6, desktop_30_7, desktop_30_8, desktop_30_9, desktop_30_10)

blend_30_td<-list(desktop_30_t1, desktop_30_t2, desktop_30_t3, desktop_30_t4, desktop_30_t5, desktop_30_t6, desktop_30_t7, desktop_30_t8, desktop_30_t9, desktop_30_t10)

The column names of each dataset are:

[1] "date" "Wkday" "Imps" "Clicks" "Total_Cost" "Units"
[7] "January" "February" "March" "April" "May" "June"
[13] "July" "August" "September" "October" "November" "December"
[19] "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"
[25] "Sunday" "Vday" "Tgiving" "Xmas" "XmasE" "NYE"
[31] "NYD" "July4" "Labor" "Memorial" "Mob_App_Launch" "Auto_Approve_Launch"

I have built the following function. I want to train on blend_30_d[1] and test on blend_30_td[1].

d_cost <- function(train, test){
    ####Run regression on training
    q<-lm(Total_Cost ~ . -date - Wkday - Imps - Clicks + poly(date, 2), data=train)
    ####Predict values into test set
    test_cost_d <- predict.lm(q, x=test)
    ####Calculate R^2 between predicted vs. actual values
    z<-(cor(test_cost_d, test$Total_Cost))^2
}

d_cost(blend_30_d, blend_30_td)

I get the following error: Error in terms.formula(formula, data = data) : duplicated name 'date' in data frame using '.'

I'm not sure this is the right way to handle two lists... any suggestions? Thanks!

2 Answers

  • 0

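    Your d_cost function is built to take two data frames, one for training and one for testing. You're trying to call it by passing it two lists of data frames. You've built your function to work on one pair of data frames at a time, so you need to give it a single pair, not two lists of them. Try something like this: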

    # one slot per train/test pair
    z = rep(NA, length(blend_30_d))
    for (i in seq_along(blend_30_d)) {
        # [[i]] extracts the i-th data frame from each list
        z[i] = d_cost(blend_30_d[[i]], blend_30_td[[i]])
    }
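
    The same pairwise call can also be written without an explicit loop. The sketch below is only an illustration: `make_df`, `train_list`, `test_list`, and `r2_cost` are hypothetical stand-ins (the question's data frames are not shown), and the model is reduced to a single predictor `x`. It assumes both lists have the same length and are in matching order.

    ```r
    # Toy stand-ins for the real data (hypothetical names and columns)
    set.seed(1)
    make_df <- function(n) {
      x <- rnorm(n)
      data.frame(x = x, Total_Cost = 2 * x + rnorm(n, sd = 0.1))
    }
    train_list <- lapply(1:3, function(i) make_df(50))
    test_list  <- lapply(1:3, function(i) make_df(20))

    # Simplified stand-in for d_cost: fit on train, return R^2 on test
    r2_cost <- function(train, test) {
      fit <- lm(Total_Cost ~ x, data = train)
      pred <- predict(fit, newdata = test)
      cor(pred, test$Total_Cost)^2
    }

    # mapply walks both lists in parallel and returns one R^2 per pair
    r2 <- mapply(r2_cost, train_list, test_list)
    print(r2)
    ```

    With the real data, the call would presumably be `mapply(d_cost, blend_30_d, blend_30_td)`, which is equivalent to the loop above.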
    
  • 0

    I think you may need to add a loop:

    r2 <- numeric(10)
    for(i in 1:10){
        # index each list with [[i]] to get one data frame per call
        r2[i] <- d_cost(blend_30_d[[i]], blend_30_td[[i]])
    }
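
    One more thing may be worth checking once the loop runs. The question's d_cost calls `predict.lm(q, x=test)`, but `predict.lm` takes the new data through its `newdata` argument. An unrecognized `x` is absorbed by `...`, so the call returns fitted values for the training rows. The sketch below shows the difference on made-up data; the data frames and the single predictor `x` are assumptions for illustration, not the asker's columns.

    ```r
    # Hypothetical toy data: 40 training rows, 15 test rows
    set.seed(42)
    train <- data.frame(x = rnorm(40))
    train$Total_Cost <- 3 * train$x + rnorm(40, sd = 0.2)
    test <- data.frame(x = rnorm(15))
    test$Total_Cost <- 3 * test$x + rnorm(15, sd = 0.2)

    fit <- lm(Total_Cost ~ x, data = train)

    # `x = test` is not an argument of predict.lm; it falls into `...`,
    # so this returns fitted values for the 40 training rows
    pred_wrong <- predict(fit, x = test)

    # `newdata = test` produces one prediction per test row
    pred_right <- predict(fit, newdata = test)

    length(pred_wrong)  # 40
    length(pred_right)  # 15
    ```

    If the training and test sets have different row counts, the `cor()` call inside d_cost would then fail on mismatched lengths; if they happen to match, it would silently score the wrong predictions.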
    
