首页 文章

将嵌套的摘要列表(aov())中的值提取到数据帧中

提问于
浏览
1

我在一个数据框内跨多个组运行简单的单向ANOVA .

数据框可在此处获取:https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?dl=0

>download.file('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1', destfile = "cut1.csv", method = "auto")

> data <- read.csv("cut1.csv")
> cut1 <- data %>% mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut)) 

> str(cut1)
'data.frame':   160 obs. of  6 variables:
 $ Plot       : Factor w/ 16 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Block      : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
 $ Treatment  : Factor w/ 4 levels "AN","C","IU",..: 4 2 3 1 1 3 4 2 3 1 ...
 $ Cut        : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
 $ Measurement: Factor w/ 10 levels "ADF","Ash","Crude_Protein",..: 5 5 5 5 5 5 5 5 5 5 ...
 $ Value      : num  956 965 961 963 955 ...

我使用了this SO question中的一些代码来启用aov函数以应用于 Measurement 因子的每个级别:

anova_1<- sapply(unique(as.character(cut1$Measurement)),
                 function(meas)aov(Value~Treatment+Block,cut1,subset=(Measurement==meas)),
                 simplify=FALSE,USE.NAMES=TRUE)
summary_1 <- lapply(anova_1, summary)

我可以通过 summary_1 手动查看,但理想情况下我想要做的是将 Measurement 因子的每个级别的p值提取到一个数据帧中,然后我可以进行过滤,这样我只能看到哪些<0.5 . 然后我会在这些上运行 TukeyHSD .

summary_1 看起来像这样(只显示前2个列表):

> str(summary_1)
List of 10
 $ Dry_matter   :List of 1
  ..$ :Classes ‘anova’ and 'data.frame':    3 obs. of  5 variables:
  .. ..$ Df     : num [1:3] 3 3 9
  .. ..$ Sum Sq : num [1:3] 359 167 612
  .. ..$ Mean Sq: num [1:3] 119.8 55.5 68
  .. ..$ F value: num [1:3] 1.761 0.816 NA
  .. ..$ Pr(>F) : num [1:3] 0.224 0.517 NA
  ..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
 $ Crude_Protein:List of 1
  ..$ :Classes ‘anova’ and 'data.frame':    3 obs. of  5 variables:
  .. ..$ Df     : num [1:3] 3 3 9
  .. ..$ Sum Sq : num [1:3] 306 721 1606
  .. ..$ Mean Sq: num [1:3] 102 240 178
  .. ..$ F value: num [1:3] 0.572 1.347 NA
  .. ..$ Pr(>F) : num [1:3] 0.647 0.319 NA
  ..- attr(*, "class")= chr [1:2] "summary.aov" "listof"

我可以从 summary_1 中的一个列表中提取p值,如下所示:

> summary_1$OAH[[1]][,5][1]
[1] 0.4734992

但是,我不知道如何从所有嵌套列表中提取并放置在数据框中 .

非常需要任何帮助 .

2 回答

  • 0

    您可以将包 broomdplyr 结合使用 Anova 来应用 Anova ,并以整齐的格式将输出分配给 data.frame .

    library(broom)
    library(dplyr)
    
    summaries <- cut1 %>% group_by(Measurement) %>% 
            do(tidy(aov(Value ~ Treatment + Block, data = .)))
    
    head(summaries)
    #  Measurement      term    df      sumsq    meansq statistic    p.value
    #       (fctr)     (chr) (dbl)      (dbl)     (dbl)     (dbl)      (dbl)
    #1         ADF Treatment     3  41.416875 13.805625  3.097871 0.07138437
    #2         ADF     Block     1   8.001125  8.001125  1.795388 0.20729351
    #3         ADF Residuals    11  49.021375  4.456489        NA         NA
    #4         Ash Treatment     3  38.511875 12.837292  1.051787 0.40840601
    #5         Ash     Block     1  34.980125 34.980125  2.865998 0.11856463
    #6         Ash Residuals    11 134.257375 12.205216        NA         NA
    
  • 2

    这是香草R的解决方案:

    # you can shorten your example -- download.file not necessary
    cut1 <- read.csv('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1') %>%
      mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut)) 
    
    # split-apply-combine strategy
    do.call(rbind, lapply(split(cut1,cut1$Measurement),
                          function(x) with(x, summary(aov(Value ~ Treatment + Block)))[[1]]
                          )
            )
    

    收益:

    Df  Sum Sq Mean Sq F value  Pr(>F)  
    ADF.Treatment              3   41.42   13.81  6.7088 0.01133 *
    ADF.Block                  3   38.50   12.83  6.2366 0.01405 *
    ADF.Residuals              9   18.52    2.06                  
    Ash.Treatment              3   38.51   12.84  0.9162 0.47115  
    Ash.Block                  3   43.13   14.38  1.0261 0.42602  
    Ash.Residuals              9  126.11   14.01                  
    Crude_Protein.Treatment    3  306.42  102.14  0.5723 0.64733  
    Crude_Protein.Block        3  721.42  240.47  1.3473 0.31946  
    Crude_Protein.Residuals    9 1606.39  178.49                  
    D.Treatment                3    9.47    3.16  4.5530 0.03331 *
    D.Block                    3    7.57    2.52  3.6383 0.05751 .
    D.Residuals                9    6.24    0.69                  
    Dry_matter.Treatment       3  359.39  119.80  1.7609 0.22432  
    Dry_matter.Block           3  166.62   55.54  0.8164 0.51656  
    Dry_matter.Residuals       9  612.27   68.03                  
    ME.Treatment               3    0.24    0.08  4.5530 0.03331 *
    ME.Block                   3    0.19    0.06  3.6383 0.05751 .
    ME.Residuals               9    0.16    0.02                  
    NCGD.Treatment             3 2777.57  925.86  4.5530 0.03331 *
    NCGD.Block                 3 2219.55  739.85  3.6383 0.05751 .
    NCGD.Residuals             9 1830.17  203.35                  
    NDF.Treatment              3  355.91  118.64  6.8809 0.01050 *
    NDF.Block                  3  336.70  112.23  6.5095 0.01239 *
    NDF.Residuals              9  155.17   17.24                  
    OAH.Treatment              3    1.41    0.47  0.9108 0.47350  
    OAH.Block                  3    1.37    0.46  0.8850 0.48488  
    OAH.Residuals              9    4.65    0.52                  
    Sugar.Treatment            3   86.18   28.73  5.0212 0.02577 *
    Sugar.Block                3   51.64   17.21  3.0085 0.08720 .
    Sugar.Residuals            9   51.49    5.72                  
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

相关问题