
Splitting a time span across dates in R

I have a data frame with 3 columns: start_time, end_time, and energy, where start_time and end_time are date-times and energy is the energy consumed between those two times.

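My goal is to compute the energy consumed on each day. When start_time and end_time fall on the same date, the energy value is simply assigned to that date. But I need a way to apportion the energy value when start_time and end_time fall on different dates. For example, a row like this in the data frame: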

start_time             end_time               energy
2014-06-09 20:54:10    2014-06-11 05:04:14    1114

should produce rows like these in the output data frame:

date        energy
2014-06-09  <energy consumed between 2014-06-09 20:54:10 to 2014-06-09 23:59:59>
2014-06-10  <energy consumed between 2014-06-10 00:00:00 to 2014-06-10 23:59:59>
2014-06-11  <energy consumed between 2014-06-11 00:00:00 to 2014-06-11 05:04:14>

1 Answer

  • 0

    I haven't tested this much (the data frame provided is a bit sparse), but it seems to work.

    calcEnergy <- function(startCol, endCol, valCol) {
        library(chron)  # fails loudly if chron is not installed
        # calculate start and finish times
        chron.fun <- function(x) chron(x[1], x[2], format=c('y-m-d','h:m:s'))
        starts <- unlist(lapply(strsplit(as.character(startCol), " "), chron.fun))
        ends <- unlist(lapply(strsplit(as.character(endCol), " "), chron.fun))
        # need to expand the data frame to accommodate new rows, so calculate
        # the number of rows (calendar days touched) per original observation
        nrows <- ceiling(ends) - floor(starts)
        # ..& create expanded dataframe based on this
        df.out <- data.frame(start_time = rep(starts, nrows) + sequence(nrows)-1,
                           end_time = rep.int(ends, nrows) - (rep(nrows,nrows) -sequence(nrows)),
                           valCol = rep.int(valCol, nrows),
                           tDiffs = rep.int(ends - starts, nrows))
        # flag rows whose start/end was generated by the expansion: every row
        # but the first (starts) or the last (ends) of each original observation
        startIndex <- sequence(nrows) > 1
        endIndex <- sequence(nrows) < rep(nrows, nrows)
        # floor or ceiling accordingly
        df.out$start_time[startIndex] <- floor(df.out$start_time[startIndex])
        df.out$end_time[endIndex] <- ceiling(df.out$end_time[endIndex])
        # calculate proportion energy per day
        df.out$energy <- with(df.out, valCol*(end_time-start_time)/tDiffs)
        # reformat cols
        df.out$date <- chron(floor(df.out$start_time), out.format='y-m-d')
        df.out[c("date", "energy")]
    }
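
For comparison, here is a minimal base-R sketch of the same idea without chron. It is not the function above, just an alternative written under two stated assumptions: energy is consumed at a constant rate within each interval, and all timestamps are in UTC (so every day is exactly 86400 seconds long). The name `split_energy_by_day` and its `tz` argument are hypothetical.

```r
# Sketch: split each interval at midnight and apportion energy by duration.
# Assumes a constant consumption rate per interval and UTC timestamps.
split_energy_by_day <- function(start_time, end_time, energy, tz = "UTC") {
  start_time <- as.POSIXct(start_time, tz = tz)
  end_time   <- as.POSIXct(end_time, tz = tz)

  pieces <- lapply(seq_along(energy), function(i) {
    # calendar days touched by this interval
    days <- seq(as.Date(start_time[i], tz = tz),
                as.Date(end_time[i], tz = tz), by = "day")

    # work in plain seconds since the epoch from here on
    s <- as.numeric(start_time[i])
    e <- as.numeric(end_time[i])
    day_start <- as.numeric(as.POSIXct(format(days), tz = tz))  # midnight of each day

    # clip each day to the part that lies inside [s, e]
    secs  <- pmin(day_start + 86400, e) - pmax(day_start, s)
    total <- e - s

    # zero-length intervals put all of their energy on their single day
    share <- if (total > 0) secs / total else 1
    data.frame(date = days, energy = energy[i] * share)
  })

  # several intervals may touch the same day, so sum per date
  aggregate(energy ~ date, data = do.call(rbind, pieces), FUN = sum)
}

res <- split_energy_by_day("2014-06-09 20:54:10", "2014-06-11 05:04:14", 1114)
print(res)
```

For the example row in the question, the interval lasts 115804 seconds, of which 11150, 86400 and 18254 fall on the three days, so the 1114 units split into roughly 107.26, 831.14 and 175.60. If the timestamps are in a time zone with daylight saving, the fixed 86400-second day would need to be replaced by the actual next midnight.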
    
