
caret error using GBM, but not without caret


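I've been using gbm through caret without problems, but it started to fail after I removed some variables from my data frame. I've tried both the GitHub and CRAN versions of the packages mentioned.

This is the error: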

> fitRF = train(my_data[trainIndex,vars_for_clust], clusterAssignment[trainIndex], method = "gbm", verbose=T)
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :9     NA's   :9    
Error in train.default(my_data[trainIndex, vars_for_clust], clusterAssignment[trainIndex],  : 
  Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In eval(expr, envir, enclos) :
  model fit failed for Resample01: shrinkage=0.1, interaction.depth=1, n.minobsinnode=10, n.trees=150 Error in gbm.fit(x = structure(list(relatedness_cottle = c(0, 0, 8, 6,  : 
  unused arguments (x = list(relatedness_cottle = c(0, 0, 8, 6, 0, 6, 8, 10, 10, 6, 6, 4, 4, 4, 0, 0, 0, 0, 18, 18, 18, 0, 0, 6, 6, 0, 18, 12, 0, 4, 4, 4, 0, 0, 0, 18, 18, 6, 4, 4, 4, 6, 8, 6, 6, 0, 14, 2, 0, 8, 6, 6, 0, 4, 0, 0, 0, 0, 0, 4, 8, 8, 8, 4, 18, 0, 0, 4, 10, 18, 6, 0, 0, 18, 10, 10, 6, 2, 4, 4, 10, 10, 10, 2, 8, 0, 0, 0, 0, 10, 6, 6, 0, 4, 4, 0, 0, 0, 0, 8, 0, 0, 4, 4, 6, 6, 10, 6, 0, 0, 6, 4, 4, 8, 0, 12, 6, 2, 2, 8, 8, 4, 4, 4, 4, 6, 2, 2, 4, 0, 6, 0, 0, 0, 12, 18, 8, 0, 0, 4, 4, 2, 0, 0, 0, 0, 18, 
12, 6, 6, 4, 4, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 18, 0, 0, 18, 6, 4, 2, 2, 0, 0, 10, 0, 0, 0, 12, 4, 4, 4, 4, 4, 8, 18, 6, 18, 18, 12, 12, 12, 0, 0, 0, 0, 10, 12, 12, 12, 12, 12, 4, 4, 4, 6, 6, 6, 6, 12, 0, 6, 0, 0, 4, 4, 18, 18, 18, 0, 0, 4, 6, 6, 0, 0, 2, 0, 0, 0, 18, 12, 12, 0, 0, 0, 0, 0, 0, 18 [... truncated]

There are no missing values, the response is a 4-level factor, and the input looks like this:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':  1165 obs. of  14 variables:
 $ relatedness_cottle       : num  0 0 8 8 0 6 0 6 6 0 ...
 $ dominance_cottle         : int  4 6 0 6 6 6 6 4 4 4 ...
 $ time_spent               : num  26832 20822 18893 13107 25406 ...
 $ num_color_changes        : num  3.33 2.33 1.33 1 1 ...
 $ num_selects              : num  1 0.667 2 0.667 1.667 ...
 $ show_select_match        : num  1 0.667 0.333 1 1 ...
 $ default_size             : num  0.667 0 0.667 0 0 ...
 $ select_order             : Factor w/ 6 levels "future_past_present",..: 1 4 4 2 5 1 4 6 6 4 ...
 $ order_x                  : Factor w/ 6 levels "future_past_present",..: 4 4 4 4 4 3 4 4 4 4 ...
 $ color_past               : Factor w/ 8 levels "black","blue",..: 5 1 6 8 5 7 1 6 6 5 ...
 $ color_present            : Factor w/ 8 levels "black","blue",..: 1 4 4 4 6 8 4 4 1 4 ...
 $ color_future             : Factor w/ 8 levels "black","blue",..: 2 2 2 2 2 2 1 2 8 2 ...
 $ dominance_cottle_future  : int  0 4 0 4 2 0 4 2 2 0 ...
 $ relatedness_cottle_future: int  0 2 4 4 0 4 0 2 4 0 ...

But if I call gbm directly with the data frame, it works:

summary(gbm(clusterAssignment[trainIndex] ~ ., data = my_data[trainIndex,vars_for_clust]))
Distribution not specified, assuming multinomial ...
                                                var   rel.inf
color_present                         color_present 33.533673
dominance_cottle                   dominance_cottle 33.170138
default_size                           default_size 25.321566
dominance_cottle_future     dominance_cottle_future  5.674563
color_future                           color_future  2.300060
relatedness_cottle               relatedness_cottle  0.000000
time_spent                               time_spent  0.000000
num_color_changes                 num_color_changes  0.000000
num_selects                             num_selects  0.000000
show_select_match                 show_select_match  0.000000
select_order                           select_order  0.000000
order_x                                     order_x  0.000000
color_past                               color_past  0.000000
relatedness_cottle_future relatedness_cottle_future  0.000000

Edit: to reproduce, run the script found here.

3 Answers

  • 4

    For now, converting the data frame from a plyr/dplyr table to a plain data frame with as.data.frame() solves the problem.

    train(as.data.frame(issueDataframe), issueResponse, method="gbm")
    

    See this issue.
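
    The as.data.frame() workaround can be wrapped in a small guard. This is only a minimal sketch: to_plain_df is a hypothetical helper name (not part of caret or dplyr), and the toy data frame, with its class vector set by hand to mimic a dplyr table, stands in for issueDataframe.

    ```r
    # Hypothetical helper (not part of caret or dplyr): drop the extra
    # tbl_df/tbl classes so that only a plain data.frame is passed on.
    to_plain_df <- function(x) {
      if (inherits(x, c("tbl_df", "tbl"))) {
        x <- as.data.frame(x)
      }
      x
    }

    # Toy stand-in for issueDataframe; the class vector mimics a dplyr table.
    toy <- data.frame(a = c(0, 8, 6), b = factor(c("x", "y", "x")))
    class(toy) <- c("tbl_df", "tbl", "data.frame")

    plain <- to_plain_df(toy)
    class(plain)  # "data.frame"

    # With caret and gbm installed, the call from this answer would become:
    # train(to_plain_df(issueDataframe), issueResponse, method = "gbm")
    ```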

  • 0

    Same problem with the glm method. It was solved after removing the verbose option...

  • 2

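    For some caret methods, this problem occurs when the user tries to predict with multinomial classification but the algorithm, or the current parameter set, only allows a binary {0,1} outcome.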
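
    One way to check whether this case applies is to count the classes in the response before calling train(). A minimal sketch, using toy factors in place of clusterAssignment; is_binary_response is a hypothetical helper name, not a caret function.

    ```r
    # Hypothetical helper: TRUE when the response has exactly two observed classes.
    is_binary_response <- function(y) {
      nlevels(droplevels(as.factor(y))) == 2
    }

    # Toy responses: a 4-level factor (as in the question) and a 2-level one.
    y_multi <- factor(c("c1", "c2", "c3", "c4", "c1"))
    y_bin   <- factor(c("yes", "no", "yes"))

    is_binary_response(y_multi)  # FALSE
    is_binary_response(y_bin)    # TRUE
    ```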
