
R - Replicating rows based on a range of start and end dates


I have a data frame "DF" like this:

Flight.Start   Flight.End   Device      Partner   Creative   Days.in.Flight 
2015-08-31     2015-10-04   Standard    MSN       Video      35

What I need to do is "blow it up" as follows:

Flight.Start   Flight.End   Date         Device      Partner   Creative   Days.in.Flight 
2015-08-31     2015-10-04   2015-08-31   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-01   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-02   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-03   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-04   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-05   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-06   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-07   Standard    MSN       Video      35

...and so on, until the Date variable reaches 2015-10-04, then on to the next row to be replicated.

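Basically, each row is duplicated (days in flight - 1) times, since the existing row already accounts for one day of the interval, and a new column "Date" is filled in with the respective dates of that flight. So if a row has start and end dates of 9/1 and 9/5, 4 duplicate rows are appended to the existing row, the new Date column is created, and the sequence of dates from the flight start to the flight end is filled in across the original row and its duplicates.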
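
As a quick check of that arithmetic, here is a minimal sketch in base R. The year 2015 and the name `flight_dates` are assumptions for illustration; the example above only gives 9/1 and 9/5.

```r
# Dates covered by a flight running from 9/1 to 9/5 (year 2015 assumed)
flight_dates <- seq(as.Date("2015-09-01"), as.Date("2015-09-05"), by = "day")

# Inclusive count of days: the original row plus 4 duplicates
n_rows <- length(flight_dates)
n_duplicates <- n_rows - 1
```

Each element of `flight_dates` would become the Date value of one output row.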

All date values are of class Date, Days.in.Flight is num, and the rest are factors.

EDIT

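In response to the duplicate flag:

To clarify, this is not the same as the question it was flagged as a duplicate of. My question is not really about how to replicate rows based on days in flight (I already know how to do that!), but about how to then add a column to that output data frame and fill in the dates sequentially within each flight period. Thanks for the heads up...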

3 Answers

  • 1

    Here is one approach in base R:

    mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                       Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                       Device = "Standard",
                       Creative = "Video",
                       Days.in.Flight = c(3, 6),
                       stringsAsFactors = FALSE)
    
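    # Repeat each row Days.in.Flight times
    expanded <- mydf[rep(row.names(mydf), mydf$Days.in.Flight), ]
    # sequence() yields 1..n for each flight; subtract 1 to get day offsets from Flight.Start
    data.frame(expanded, Date = expanded$Flight.Start + (sequence(mydf$Days.in.Flight) - 1))
    
    > data.frame(expanded, Date = expanded$Flight.Start + (sequence(mydf$Days.in.Flight) - 1))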
        Flight.Start Flight.End   Device Creative Days.in.Flight       Date
    1     2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
    1.1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
    1.2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
    2     2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
    2.1   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
    2.2   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
    2.3   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
    2.4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
    2.5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15
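
    A variant of the same idea, in case Days.in.Flight is missing or not trusted: a minimal sketch that derives the row count from the two dates instead. The names `n_days` and `expanded2` are introduced here for illustration, and it assumes Flight.End is never earlier than Flight.Start.

    ```r
    mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                       Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                       Device = "Standard",
                       Creative = "Video",
                       stringsAsFactors = FALSE)

    # Inclusive number of days in each flight, computed from the dates
    n_days <- as.integer(mydf$Flight.End - mydf$Flight.Start) + 1L

    # Repeat each row n_days times, then add day offsets 0, 1, 2, ... per flight
    expanded2 <- mydf[rep(seq_len(nrow(mydf)), n_days), ]
    expanded2$Date <- expanded2$Flight.Start + (sequence(n_days) - 1)
    ```

    Because the offsets come from `sequence(n_days)`, this does not rely on each start/end pair being unique.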
    
  • 6

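    Here is one approach with splitstackshape and dplyr. Using expandRows() from the splitstackshape package, you can expand the data frame as described. Then you want to add a sequence of dates with mutate(). What I did was group the data by the combination of Flight.Start and Flight.End and use seq() to create a date sequence for each group. first() takes the first element of Flight.Start and of Flight.End. That way you can create the sequence you need. I hope this helps.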

    DATA and CODE

    mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                       Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                       Device = "Standard",
                       Creative = "Video",
                       Days.in.Flight = c(3, 6),
                       stringsAsFactors = FALSE)
    
    #  Flight.Start Flight.End   Device Creative Days.in.Flight
    #1   2015-09-01 2015-09-03 Standard    Video              3
    #2   2015-09-10 2015-09-15 Standard    Video              6
    
    library(splitstackshape)
    library(dplyr)
    
    expandRows(mydf, "Days.in.Flight", drop = FALSE) %>%
    group_by(Flight.Start, Flight.End) %>%
    mutate(Date = seq(first(Flight.Start),
                      first(Flight.End),
                      by = 1))
    
    #  Flight.Start Flight.End   Device Creative Days.in.Flight       Date
    #        (date)     (date)    (chr)    (chr)          (dbl)     (date)
    #1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
    #2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
    #3   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
    #4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
    #5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
    #6   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
    #7   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
    #8   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
    #9   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15
    
  • 5

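    Alternatively, with data.table: we convert the 'data.frame' to a 'data.table' ( setDT(mydf) ), replicate the row sequence by 'Days.in.Flight', subset the dataset with that index ( .SD[rep(...)] ), and then, grouped by 'Flight.Start' and 'Flight.End', create the 'Date' column.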

    library(data.table)
    setDT(mydf)[, .SD[rep(1:.N, Days.in.Flight)]][, 
         Date:= seq(Flight.Start , Flight.End, by = '1 day'),
         by = .(Flight.Start, Flight.End)][]
    
