
Stacked bar chart of variable counts within date bins


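Using R, I am trying to make a simple stacked bar chart by date that shows counts of the different settlement types. I have three ways of calculating the date (Start, End and Mid). Below is an example of my database: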

    ID  Settlement  Start   End   Mid
    01  Urban         200   400   300
    02  Rural         450   850   650
    03  Military     1300  1400  1350
    04  Castle          2  1000   501

So far I have:

    count(ratData, vars = "Settlement")

which returns:

       Settlement           freq
    1                         78
    2  Castle                 25
    3  Cave                    3
    4  Fortification           5
    5  Hill Fort               2
    6  Industrial (quarry)     1
    7  Manor                   2
    8  Military                4
    9  Military camp           1
    10 Military Camp           3
    11 Military site           1
    12 Mining                  1
    13 Monastic               15
    14 Monastic/Rural?         1
    15 Port                    5
    16 River-site              2
    17 Roman fort              1
    18 Roman Fort              1
    19 Roman settlement        3
    20 Rural                  22
    21 Settlement              2
    22 urban                   1
    23 Urban                 123
    24 Villa                   4
    25 Wic                    13

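I then plot it with:

    ggplot(v, aes(x = Settlement, y = freq)) +
      geom_bar(stat = 'identity', fill = 'lightblue', color = 'black')

However, this puts the settlement types on the x-axis rather than stacking them, and it leaves out the date data. I would like to split the dates from 1 to 1500 into 100-year bins and make a stacked bar chart of settlement types per bin, to show their presence over time.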

1 Answer


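    This should do the trick. The cut function is very useful in cases like this, where you need to create a categorical variable from ranges of a continuous variable. I have gone the Tidyverse route, but there are base R options too.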
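
To make the role of `cut` concrete, here is a minimal sketch using the four `Mid` values from the sample rows in the question. The 100-year break points from 0 to 1500 are an assumption based on the bins the question asks for, and the `breaks` and `labels` vectors are built here only for illustration.

```r
# Mid values taken from the four example rows in the question
mid <- c(300, 650, 1350, 501)

# Assumed bin edges: 0, 100, ..., 1500. Intervals are right-closed by default,
# so 300 falls in (200, 300] and 501 falls in (500, 600].
breaks <- seq(0, 1500, by = 100)

# Readable labels such as "1-100", "101-200", ... built from the edges
labels <- paste(head(breaks, -1) + 1, tail(breaks, -1), sep = "-")

# cut() returns a factor with one level per bin, including empty bins
bins <- cut(mid, breaks = breaks, labels = labels)
print(data.frame(mid, bins))
```

Passing explicit `breaks` keeps the bin edges fixed regardless of the data, whereas passing a single number asks `cut` to split the observed range into that many equal-width intervals.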

    library(dplyr)
    library(ggplot2)
    
    # Some dummy data that resembles your problem
    s <- data.frame(ID = 1:100,
                    Settlement = c(rep('Urban', 50), rep('Rural', 20), rep('Military', 10), rep('Castle', 20)),
                    Start = signif(rnorm(100, 500, 100), 2),
                    End = signif(rnorm(100, 1000, 100), 2))
    s$Mid <- s$Start + ((s$End - s$Start) / 2)
    
    # Find the range of the Mid variable to decide on cut locations
    r <- range(s$Mid)
    
    # Make a new factor variable based on year bins. cut() splits the range into
    # 5 equal-width bins; these labels are hard-coded placeholders, so change
    # them (or pass explicit breaks) to match your actual data
    s$group <- cut(s$Mid, 5, labels = c('575-640', '641-705', '706-770', '771-835', '836-900'))
    
    # Frequency count per factor level
    grouped <- s %>%
      group_by(group) %>%
      count(Settlement)
    
    # You'll need to clean up axis labels, etc.
    ggplot(grouped, aes(x = group, y = n, fill = Settlement)) +
      geom_bar(stat = 'identity')
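
The answer mentions that base R can do the same job. A rough sketch of that route, assuming fixed 100-year bins from 1 to 1500 as requested in the question, might look like the following. The data frame `d` and its values are made up for illustration (the first four rows echo the sample rows in the question). `table` cross-tabulates settlement type against bin, and `barplot` stacks the rows of a matrix within each bar by default.

```r
# Hypothetical data shaped like the question's table; only Settlement and Mid are used
d <- data.frame(
  Settlement = c("Urban", "Rural", "Military", "Castle", "Urban", "Rural"),
  Mid        = c(300, 650, 1350, 501, 320, 690)
)

# Assumed 100-year bins: (0,100], (100,200], ..., (1400,1500]
breaks <- seq(0, 1500, by = 100)
labels <- paste(head(breaks, -1) + 1, tail(breaks, -1), sep = "-")
d$bin  <- cut(d$Mid, breaks = breaks, labels = labels)

# Rows = settlement types, columns = bins; empty bins stay as zero columns
counts <- table(d$Settlement, d$bin)

# barplot() stacks the rows within each bar unless beside = TRUE
barplot(counts,
        col = seq_len(nrow(counts)),
        legend.text = rownames(counts),
        las = 2, xlab = "Year bin", ylab = "Count")
```

With real data, any `Mid` value outside the break points becomes `NA` in `cut` and is left out of the `table` counts, so it is worth checking the range first.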
    
