
Can you change the order of factors in a faceted stacked bar chart in ggplot2?


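I'm trying to draw a chart showing the proportion of men and women with children under 18 in different age groups. I want a chart with two bars (one for men, one for women) for each age group, and I want both bars to show the percentage with children at the bottom of the (stacked) bar rather than at the top. I can't figure out how to make such a chart in ggplot2 and would greatly appreciate suggestions.

I calculated my grouped statistics with dplyr: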

kid18summary <- marsub %>% 
group_by(AgeGroup, sex, kid_under_18) %>% 
summarise(n=n()) %>% 
mutate(freq = n/sum(n))

This produced the following:

dput(kid18summary)
structure(list(AgeGroup = c("Age<40", "Age<40", "Age<40", "Age<40", 
"Age41-49", "Age41-49", "Age41-49", "Age41-49", "Age50-64", "Age50-64", 
"Age50-64", "Age50-64"), sex = structure(c(1L, 1L, 2L, 2L, 1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("Male", "Female"), class = "factor"), 
    kid_under_18 = c("No", "Yes", "No", "Yes", "No", "Yes", "No", 
    "Yes", "No", "Yes", "No", "Yes"), freq = c(0.625, 0.375, 
    0.636833046471601, 0.363166953528399, 0.349557522123894, 
    0.650442477876106, 0.444897959183673, 0.555102040816327, 
    0.724852071005917, 0.275147928994083, 0.819548872180451, 
    0.180451127819549)), .Names = c("AgeGroup", "sex", "kid_under_18", 
"freq"), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), vars = list(AgeGroup, sex), drop = TRUE, indices = list(
    0:1, 2:3, 4:5, 6:7, 8:9, 10:11), group_sizes = c(2L, 2L, 
2L, 2L, 2L, 2L), biggest_group_size = 2L, labels = structure(list(
    AgeGroup = c("Age<40", "Age<40", "Age41-49", "Age41-49", 
    "Age50-64", "Age50-64"), sex = structure(c(1L, 2L, 1L, 2L, 
    1L, 2L), .Label = c("Male", "Female"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L), vars = list(AgeGroup, sex), drop = TRUE, .Names = c("AgeGroup", 
"sex")))

I can plot the percentage of people in each age group and sex who do not have children under 18:

ggplot(kid18summary, aes(x = factor(AgeGroup), y = freq, fill = factor(sex)), color = factor(sex)) +
  geom_bar(position = "dodge", stat = "identity") + scale_y_continuous(labels = percent)

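Or I can make a faceted stacked bar chart, which is closer to what I want, because I'd like to show both "Yes" and "No" even though the percentages add up to 100; I think it's easier to compare colored bars than negative space. The only trouble is that no matter what I do, "No" ends up at the bottom and "Yes" at the top, and I want it the other way around. (Ideally, I'd really like different colors for men and women: dark blue for men with children and light blue for men without, dark red for women with children and light red for women without. But I've given up on that for now.)

I have tried to change the order of the factor in various ways, all completely unsuccessful.

As suggested in the ggplot2 documentation, I tried changing the order of the factor levels directly: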

kid18summary$kid_under_18 <- as.factor(kid18summary$kid_under_18)
o <- c("Yes", "No")  # which I've also changed to ("No", "Yes"), which makes no difference; the order of the Yes and No in the legend changes, but the "Yes" bars stay on top
kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels = o)

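kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels(kid18summary$kid_under_18)[c("Yes", "No")])  # changing to [c("No", "Yes")] also only changes the order in the legend

I've also tried the answer suggested in another question and added another ordered factor: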

kid18summary <- transform(kid18summary, stack.ord = factor(kid_under_18, levels = c("Yes", "No"), ordered = TRUE))
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(stack.ord)), color = factor(stack.ord)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

Or just adding another dummy variable:

kid18summary$orderfactor <- "NA"
kid18summary$orderfactor[kid18summary$kid_under_18 == "Yes"] <- 0
kid18summary$orderfactor[kid18summary$kid_under_18 == "No"] <- 1
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(orderfactor)), color = factor(orderfactor)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

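All of these gave me many different ways to switch the colors of the Yes and No groups in the bars, but none of them changed which group is actually on top.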
Plot1

Plot2

1 Answer

  • 1

    Based on the answer suggested by aosmith, I ended up with the following, which is exactly what I wanted:

    ggplot(arrange(df, kid_under_18), aes(x = factor(sex), y = freq, fill = interaction(sex, factor(kid_under_18))), color = factor(kid_under_18)) + 
    geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + 
    facet_wrap(~AgeGroup, nrow=1)
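
The `arrange()` approach depends on row order, which, as far as I know, only drives the stacking in older ggplot2 releases; from ggplot2 2.2.0 onward the stack order should follow the levels of the grouping factor instead. Below is a minimal sketch of the same plot under that assumption. It rebuilds the summary table from the `dput()` output in the question (with `freq` rounded to three decimals), and the column name `fill_group` and the four colors are illustrative choices of mine, not part of the original post.

```r
library(ggplot2)

# Summary table rebuilt from the question's dput() output (freq rounded).
kid18summary <- data.frame(
  AgeGroup = rep(c("Age<40", "Age41-49", "Age50-64"), each = 4),
  sex = factor(rep(rep(c("Male", "Female"), each = 2), times = 3),
               levels = c("Male", "Female")),
  kid_under_18 = rep(c("No", "Yes"), times = 6),
  freq = c(0.625, 0.375, 0.637, 0.363, 0.350, 0.650,
           0.445, 0.555, 0.725, 0.275, 0.820, 0.180)
)

# Make the level order explicit. In recent ggplot2 the last level is
# expected to sit at the bottom of the stack, so "Yes" goes last.
kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18,
                                    levels = c("No", "Yes"))

# One fill group per sex x answer combination (hypothetical column name).
kid18summary$fill_group <- interaction(kid18summary$sex,
                                       kid18summary$kid_under_18)

# Assumed palette: dark shade = has children, light shade = no children.
pal <- c(Male.Yes = "darkblue", Male.No = "lightblue",
         Female.Yes = "darkred", Female.No = "lightpink")

p <- ggplot(kid18summary, aes(x = sex, y = freq, fill = fill_group)) +
  geom_col() +  # same as geom_bar(stat = "identity")
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = pal) +
  facet_wrap(~AgeGroup, nrow = 1)

p
```

If the segments come out in the opposite order on your ggplot2 version, `geom_col(position = position_stack(reverse = TRUE))` flips the stack without touching the factor levels or the legend colors.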
    
