
Stacked percentage bar chart with error bars in ggplot2


I am trying to create a stacked bar chart that shows, as percentages, the values held in two columns of a data frame.

The stacked bar chart my code produces has two problems, which I think are linked.

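  • ggplot2 will not display %; instead it shows the input values as shares of 1.0. I could not fix this with scale_y_continuous(labels = percent_format()), which I found while searching SO, so I am at a loss as to how to solve it.

  • My error bars are far too long. Perhaps this is because the SEM is calculated in percent while the chart shows shares of 1.0, so every plotted value is 1/100 of the value in my data frame?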

My data frame:

ID Group Labeled Unlabeled
A     0       2        98
B     0       2        98
C     0       4        96
D     0       4        96
E     0       4        96
A     1      50        50
B     1      40        60
C     1      50        50
D     1      40        60
E     1      30        70
A     2      30        70
B     2      30        70
C     2      20        80
D     2      20        80
E     2      20        80
A     3      10        90
B     3      10        90
C     3       5        95
D     3      10        90
E     3       5        95
A     4       2        98
B     4       2        98
C     4       1        99
D     4       1        99
E     4       0       100

My code:

library(ggplot2)
library(plyr)
library(reshape2)


#Calculate means for both groups
melted <- melt(data, id.vars=c("ID", "Group"))
means <- ddply(melted, c("variable", "Group"), summarise,
           mean=mean(value))

#Draw bar plot with ggplot2
plot <- ggplot(data=means, aes(x=Group, y=mean, fill=variable)) + 
  geom_bar(stat="identity",
       position="fill",
       width = 0.4) +                           
  xlab(" ") + ylab("Percentage (%)") + 
  theme_classic(base_size = 16, base_family = "Helvetica") + 
  theme(axis.text.y=element_text(size=16, face="bold")) + 
  theme(axis.title.y=element_text(size=16, face="bold", vjust=1)) + 
  theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1, size=16, face="bold")) +
  theme(legend.position="right")

# Calc SEM  
means.sem <- ddply(melted, c("variable", "Group"), summarise,
               mean=mean(value), sem=sd(value)/sqrt(length(value)))
means.sem <- transform(means.sem, lower=mean-sem, upper=mean+sem)

# Add SEM & change appearance of barplot
plotSEM <- plot + geom_errorbar(data=means.sem, aes(ymax=upper,  ymin=lower), position="fill", width=0.15)

2 Answers

  • 1

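    This should also work: keep the default position (stack) for the bars, and adjust only the error bars of the Labeled variable so that they sit at the top of the stack.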

    plot <- ggplot(data=means, aes(x=Group, y=mean, fill=variable)) + 
      geom_bar(stat="identity",
               width = 0.4) +                           
      xlab(" ") + ylab("Percentage (%)") + 
      theme_classic(base_size = 16, base_family = "Helvetica") + 
      theme(axis.text.y=element_text(size=16, face="bold")) + 
      theme(axis.title.y=element_text(size=16, face="bold", vjust=1)) + 
      theme(axis.text.x=element_text(angle=45,hjust=1,vjust=1, size=16, face="bold")) +
      theme(legend.position="right")
    
    # Calc SEM  
    means.sem <- ddply(melted, c("variable", "Group"), summarise,
                       mean=mean(value), sem=sd(value)/sqrt(length(value)))
    means.sem <- transform(means.sem, lower=mean-sem, upper=mean+sem)
    means.sem[means.sem$variable=='Labeled',5:6] <- means.sem[means.sem$variable=='Labeled',3] + means.sem[means.sem$variable=='Unlabeled',5:6]
    
    # Add SEM & change appearance of barplot
    plotSEM <- plot + geom_errorbar(data=means.sem, aes(ymax=upper,  ymin=lower), 
                                    width=0.15)
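
    The adjustment of the Labeled error bars can be checked by hand. The following is only a sketch in base R, using the Group 1 values from the question; the names sem, top_unlabeled, top_labeled and bars are illustrative and do not appear in the answer, and the stacking order (Unlabeled at the bottom, Labeled on top) is an assumption that depends on the factor levels and the ggplot2 version.

    ```r
    # Group 1 values copied from the question's data frame
    labeled   <- c(50, 40, 50, 40, 30)
    unlabeled <- c(50, 60, 50, 60, 70)

    # Standard error of the mean, as defined in the question's code
    sem <- function(x) sd(x) / sqrt(length(x))

    # Assumption: Unlabeled is drawn at the bottom of the stack,
    # so its segment ends at its own mean
    top_unlabeled <- mean(unlabeled)

    # Assumption: Labeled sits on top of it,
    # so its segment ends at the sum of both means
    top_labeled <- mean(unlabeled) + mean(labeled)

    # Error bar limits expressed in stacked (cumulative) coordinates
    bars <- data.frame(
      variable = c("Unlabeled", "Labeled"),
      top      = c(top_unlabeled, top_labeled),
      lower    = c(top_unlabeled - sem(unlabeled), top_labeled - sem(labeled)),
      upper    = c(top_unlabeled + sem(unlabeled), top_labeled + sem(labeled))
    )
    print(bars)
    ```

    Because Labeled and Unlabeled add up to 100 in every row, the two SEMs are identical and the upper error bar is always centred on 100, so it carries no information beyond the lower one.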
    


  • 1
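    • You would normally need percent_format() from the scales package for this, but we will use a custom function instead

    • I tweaked your code slightly, but the main differences are:

    • position = 'stack' for the bars instead of fill

    • position = 'identity' and stat = 'identity' for the error bars

    • only one error bar is shown per group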
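
    To make the point about the custom label function concrete, here is a small sketch in base R. It assumes the values are already on a 0 to 100 scale, as in the question's data. percent_label mirrors the function passed to scale_y_continuous() in the code further down; rescaling_label is a hypothetical helper written only to show what happens when a labeller multiplies by 100 first, which, to my knowledge, is the default behaviour of percent_format().

    ```r
    # Custom labeller: append a percent sign without rescaling the value
    percent_label <- function(x) paste0(x, "%")

    # Axis breaks on the 0-100 scale used by the data
    breaks <- c(0, 25, 50)
    print(percent_label(breaks))

    # Hypothetical labeller that multiplies by 100 before formatting;
    # it suits proportions (0-1) but inflates values that are already percentages
    rescaling_label <- function(x) paste0(x * 100, "%")
    print(rescaling_label(breaks))

    # The same helper gives sensible labels when fed proportions instead
    print(rescaling_label(breaks / 100))
    ```

    If you would rather stay with the scales package, passing scale = 1 to percent_format() should have the same effect as the custom function, assuming a scales version that supports that argument.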

    Data:

    df <- read.table(text = "ID Group Labeled Unlabeled
                 A     0       2        98
                 B     0       2        98
                 C     0       4        96
                 D     0       4        96
                 E     0       4        96
                 A     1      50        50
                 B     1      40        60
                 C     1      50        50
                 D     1      40        60
                 E     1      30        70
                 A     2      30        70
                 B     2      30        70
                 C     2      20        80
                 D     2      20        80
                 E     2      20        80
                 A     3      10        90
                 B     3      10        90
                 C     3       5        95
                 D     3      10        90
                 E     3       5        95
                 A     4       2        98
                 B     4       2        98
                 C     4       1        99
                 D     4       1        99
                 E     4       0       100", header = TRUE)
    

    Code:

    library(ggplot2)
    library(dplyr)
    library(tidyr)   # provides gather(), used below
    library(scales)
    
    df %>% 
      gather('key','value',-ID, -Group) %>% 
      group_by(Group, key) %>% 
      summarise(mean = mean(value),
                sem = sd(value) / sqrt(n()),
                lower = (mean - sem),
                upper = (mean + sem))-> newdf
    
    #Draw bar plot with ggplot2
    plot <- ggplot(data=newdf, aes(x=Group, y=mean, fill=key)) + 
      geom_bar(stat="identity",
               position="stack",
               width = 0.4) +
      geom_errorbar(data = filter(newdf, key == 'Unlabeled'), aes(ymax=upper,  ymin=lower), stat = 'identity', position = 'identity', width=0.15) +
      xlab(" ") + 
      ylab("Percentage (%)") +
      scale_y_continuous(labels = function(bs) {paste0(bs, '%')}) +
      theme_classic(base_size = 16, base_family = "Helvetica") + 
      theme(axis.text.y=element_text(size=16, face="bold"), 
            axis.title.y=element_text(size=16, face="bold"),
            axis.text.x=element_text(angle=45,hjust=1,vjust=1, size=16, face="bold"),
            legend.position="right")
    

    Result:

