
Add regression line equation and R^2 on graph


I wonder how to add the regression line equation and R^2 to a ggplot. My code is:

library(ggplot2)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
p <- ggplot(data = df, aes(x = x, y = y)) +
            geom_smooth(method = "lm", se=FALSE, color="black", formula = y ~ x) +
            geom_point()
p

Any help will be highly appreciated.

5 Answers

  • 82

    Here is one solution:

    # GET EQUATION AND R-SQUARED AS STRING
    # SOURCE: http://goo.gl/K4yh
    
    lm_eqn <- function(df){
        m <- lm(y ~ x, df);
        eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
             # unname() drops the coefficient names so they do not leak into the label
             list(a = format(unname(coef(m)[1]), digits = 2), 
                  b = format(unname(coef(m)[2]), digits = 2), 
                 r2 = format(summary(m)$r.squared, digits = 3)))
        as.character(as.expression(eq));                 
    }
    
    p1 <- p + geom_text(x = 25, y = 300, label = lm_eqn(df), parse = TRUE)
    

    EDIT: I tracked down where I picked this code up from. It comes from an original post in the ggplot2 Google group.
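
A variation on the approach above, offered as a sketch rather than as the original author's code: `annotate()` draws the label exactly once, and `x = -Inf, y = Inf` together with `hjust`/`vjust` pins it to the top-left corner instead of hard-coding data coordinates. The seed, the corner placement, and the use of `deparse()` to produce the string are my own assumptions.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the example is reproducible
df <- data.frame(x = 1:100)
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

# Build a plotmath expression for the fit and return it as a single string
lm_eqn <- function(df) {
  m <- lm(y ~ x, data = df)
  eq <- substitute(
    italic(y) == a + b %.% italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2,
    list(a  = format(unname(coef(m)[1]), digits = 2),
         b  = format(unname(coef(m)[2]), digits = 2),
         r2 = format(summary(m)$r.squared, digits = 3))
  )
  # deparse() is used here only to get a one-line string for parse = TRUE
  paste(deparse(eq, width.cutoff = 500), collapse = "")
}

eq_label <- lm_eqn(df)

p1 <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = y ~ x) +
  # -Inf / Inf place the text at the panel edge; hjust / vjust nudge it inward
  annotate("text", x = -Inf, y = Inf, hjust = -0.1, vjust = 1.5,
           label = eq_label, parse = TRUE)
p1
```

Because the position is relative to the panel, the label stays visible when the data range changes.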


  • 70

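    I changed a few lines of the source of stat_smooth and related functions to create a new function that adds the fitted equation and the R squared value. This also works on facet plots!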

    library(devtools)
    source_gist("524eade46135f6348140")
    df = data.frame(x = c(1:100))
    df$y = 2 + 5 * df$x + rnorm(100, sd = 40)
    df$class = rep(1:2,50)
    ggplot(data = df, aes(x = x, y = y, label=y)) +
      stat_smooth_func(geom="text",method="lm",hjust=0,parse=TRUE) +
      geom_smooth(method="lm",se=FALSE) +
      geom_point() + facet_wrap(~class)
    


    I used the code in @Ramnath's answer to format the equation. The stat_smooth_func function itself is not easy to tinker with.

    https://gist.github.com/kdauria/524eade46135f6348140 . If you get an error, try updating ggplot2.

  • 192

    I have included a statistic, stat_poly_eq(), in my package ggpmisc that allows this answer:

    library(ggplot2)
    library(ggpmisc)
    df <- data.frame(x = c(1:100))
    df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
    my.formula <- y ~ x
    p <- ggplot(data = df, aes(x = x, y = y)) +
       geom_smooth(method = "lm", se=FALSE, color="black", formula = my.formula) +
       stat_poly_eq(formula = my.formula, 
                    aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), 
                    parse = TRUE) +         
       geom_point()
    p
    


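    This statistic works with any polynomial that has no missing terms, and hopefully it is flexible enough to be generally useful. The R^2 or adjusted R^2 labels can be used with any model formula fitted with lm(). Being a ggplot statistic, it behaves as expected with both groups and facets.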
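
To illustrate the points about polynomials, adjusted R^2, and facets, here is a minimal sketch. It assumes a recent ggplot2 (3.3.0 or later, for `after_stat()`) and an installed ggpmisc. The simulated data, the `group` column, and the seed are made up for the example.

```r
library(ggplot2)
library(ggpmisc)

set.seed(123)  # assumption: fixed seed for reproducibility
df <- data.frame(x = rep(1:50, 2), group = rep(c("A", "B"), each = 50))
df$y <- 2 + 3 * df$x - 0.05 * df$x^2 + rnorm(100, sd = 10)

# Second-degree polynomial with no missing terms; raw = TRUE keeps the
# coefficients on the scale of x so they match the printed equation
my.formula <- y ~ poly(x, 2, raw = TRUE)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = my.formula) +
  stat_poly_eq(formula = my.formula,
               aes(label = paste(after_stat(eq.label),
                                 after_stat(adj.rr.label), sep = "~~~")),
               parse = TRUE) +
  facet_wrap(~group)  # one equation is computed per panel
p
```

`after_stat(eq.label)` is the newer spelling of `..eq.label..`. Both refer to the same variable computed by the statistic.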

    The 'ggpmisc' package is available from CRAN.

    Version 0.2.6 was just accepted to CRAN.

    It addresses comments by @shabbychef and @MYaseen208.

    @MYaseen208, this shows how to add a hat.

    library(ggplot2)
    library(ggpmisc)
    df <- data.frame(x = c(1:100))
    df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
    my.formula <- y ~ x
    p <- ggplot(data = df, aes(x = x, y = y)) +
       geom_smooth(method = "lm", se=FALSE, color="black", formula = my.formula) +
       stat_poly_eq(formula = my.formula,
                    eq.with.lhs = "italic(hat(y))~`=`~",
                    aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), 
                    parse = TRUE) +         
       geom_point()
    p
    


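    @shabbychef, it is now possible to match the variables in the equation with those used for the axis labels. To replace x with z and y with h, I would use: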

    p <- ggplot(data = df, aes(x = x, y = y)) +
       geom_smooth(method = "lm", se=FALSE, color="black", formula = my.formula) +
       stat_poly_eq(formula = my.formula,
                    eq.with.lhs = "italic(h)~`=`~",
                    eq.x.rhs = "~italic(z)",
                    aes(label = ..eq.label..), 
                    parse = TRUE) + 
       labs(x = expression(italic(z)), y = expression(italic(h))) +          
       geom_point()
    p
    


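    Since these are ordinary R expressions that get parsed, Greek letters can now also be used in both the lhs and the rhs of the equation.

    [2017-03-08] @elarry: edit to address the original question more precisely, showing how to add a comma between the equation and the R2 label.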

    p <- ggplot(data = df, aes(x = x, y = y)) +
      geom_smooth(method = "lm", se=FALSE, color="black", formula = my.formula) +
      stat_poly_eq(formula = my.formula,
                   eq.with.lhs = "italic(hat(y))~`=`~",
                   aes(label = paste(..eq.label.., ..rr.label.., sep = "*plain(\",\")~")), 
                   parse = TRUE) +         
      geom_point()
    p
    


  • 2

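    I have modified Ramnath's post to a) make it more generic, so that it accepts a linear model as a parameter rather than the data frame, and b) display negatives more appropriately.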

    lm_eqn = function(m) {
    
      # unname() drops the coefficient names so they do not leak into the label
      l <- list(a = format(unname(coef(m)[1]), digits = 2),
          b = format(unname(abs(coef(m)[2])), digits = 2),
          r2 = format(summary(m)$r.squared, digits = 3));
    
      if (coef(m)[2] >= 0)  {
        eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2,l)
      } else {
        eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2,l)    
      }
    
      as.character(as.expression(eq));                 
    }
    

    Usage would change to:

    # constants are passed outside aes() so the label is drawn once, not once per data row
    p1 = p + geom_text(x = 25, y = 300, label = lm_eqn(lm(y ~ x, df)), parse = TRUE)
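
As a quick sanity check of the sign handling, the sketch below fits a model with a negative slope and confirms that the label uses a minus sign and still parses as a plotmath expression. The simulated data, the seed, and the name `lm_eqn_signed` are illustrative assumptions. `deparse()` stands in for `as.character(as.expression())` only to guarantee a one-line string.

```r
# Variant of the function above: takes a fitted model, picks "+" or "-"
# from the sign of the slope, and prints the slope's absolute value
lm_eqn_signed <- function(m) {
  slope <- unname(coef(m)[2])
  l <- list(a  = format(unname(coef(m)[1]), digits = 2),
            b  = format(abs(slope), digits = 2),
            r2 = format(summary(m)$r.squared, digits = 3))
  eq <- if (slope >= 0) {
    substitute(italic(y) == a + b %.% italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2, l)
  } else {
    substitute(italic(y) == a - b %.% italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2, l)
  }
  paste(deparse(eq, width.cutoff = 500), collapse = "")
}

set.seed(1)  # assumption: fixed seed for reproducibility
d <- data.frame(x = 1:100)
d$y <- 2 - 3 * d$x + rnorm(100, sd = 40)  # the true slope is negative

lbl <- lm_eqn_signed(lm(y ~ x, data = d))
lbl                # the string handed to geom_text(..., parse = TRUE)
parse(text = lbl)  # errors if the string is not a valid expression
```

Only the slope gets this treatment here. A negative intercept is still printed with its own sign, as in the answer above.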
    
  • 73

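    I really like @Ramnath's solution. To allow a custom regression formula (instead of being fixed to y and x as literal variable names), and to add the p-value to the printout (as @Jerry T commented), here is the mod: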

    lm_eqn <- function(df, y, x){
        formula = as.formula(sprintf('%s ~ %s', y, x))
        m <- lm(formula, data=df);
        # formatting the values into a summary string to print out
        # ~ gives some space, but the equal sign and the comma need to be quoted
        eq <- substitute(italic(target) == a + b %.% italic(input)*","~~italic(r)^2~"="~r2*","~~p~"="~italic(pvalue), 
             list(target = y,
                  input = x,
                  a = format(as.vector(coef(m)[1]), digits = 2), 
                  b = format(as.vector(coef(m)[2]), digits = 2), 
                 r2 = format(summary(m)$r.squared, digits = 3),
                 # getting the pvalue is painful
                 pvalue = format(summary(m)$coefficients[2,'Pr(>|t|)'], digits=1)
                )
              )
        as.character(as.expression(eq));                 
    }
    
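    # assumption: the opening ggplot() call is missing from the post;
    # hp is plotted against wt to match lm_eqn(mtcars, 'hp', 'wt')
    ggplot(mtcars, aes(x = wt, y = hp)) +
      geom_point() +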
      ggrepel::geom_text_repel(label=rownames(mtcars)) +
      geom_text(x=3,y=300,label=lm_eqn(mtcars, 'hp','wt'),color='red',parse=T) +
      geom_smooth(method='lm')
    

    Unfortunately, this does not work with facet_wrap or facet_grid.
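
One possible way around the facet limitation, sketched under assumptions: compute one label per facet level in a small data frame and hand that data frame to `geom_text()`, so each panel receives its own equation. The helper name `eqn_string`, the choice of `cyl` as the facet variable, and the corner placement are illustrative and not part of the answer above.

```r
library(ggplot2)

# Hypothetical helper: plotmath string for a simple linear fit of y on x
eqn_string <- function(d, y, x) {
  m <- lm(as.formula(sprintf("%s ~ %s", y, x)), data = d)
  b <- unname(coef(m))
  sprintf('italic(%s) == "%s" %s "%s" %%.%% italic(%s) * "," ~~ italic(r)^2 ~ "=" ~ "%s"',
          y,
          format(b[1], digits = 2),
          if (b[2] >= 0) "+" else "-",   # sign of the slope
          format(abs(b[2]), digits = 2),
          x,
          format(summary(m)$r.squared, digits = 3))
}

# One row per facet level: the facet variable, a label, and a panel-relative position
labels <- do.call(rbind, lapply(split(mtcars, mtcars$cyl), function(d) {
  data.frame(cyl = d$cyl[1], x = -Inf, y = Inf,
             label = eqn_string(d, "hp", "wt"),
             stringsAsFactors = FALSE)
}))

p <- ggplot(mtcars, aes(x = wt, y = hp)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  # the cyl column in `labels` routes each row to its matching panel
  geom_text(data = labels, aes(x = x, y = y, label = label),
            hjust = -0.05, vjust = 1.5, parse = TRUE, inherit.aes = FALSE) +
  facet_wrap(~cyl)
p
```

The same `labels` data frame should also work with `facet_grid()`, provided it contains every faceting variable.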
