
group_by and coercing to a time series with dplyr in R


I have a data.frame:

df <- data.frame(region = rep(c("a","b","c","d"),12),
                 group = rep(c("A","A","A","B","B","B","C","C","C","D","D","D"),12), 
                 num = rep(c(1:12),12))

I want to group by region, then by group, and coerce num to a time series object. This is how I do it:

df %>%
  group_by(region,group) %>%
  mutate(num = ts(num,f=4))

It works, but I get a whole bunch of warnings:

12: In mutate_impl(.data, dots) :
Vectorizing 'ts' elements may not preserve their attributes

In practice, I am applying this to a large data.frame and need to decompose the time series data. In my simplified example, I do this with stl:

df %>% 
group_by(region,group) %>%
mutate(num = ts(num,f=4)) %>% 
mutate(trendcycle(stl(num, s.window = "per")))

But I get an error saying:

Error in mutate_impl(.data, dots) : 
Evaluation error: series is not periodic or has less than two periods.

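I guess this has something to do with trying to coerce the data to ts format. The thing is, I was able to do this before without any problems.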

I am using R 3.4.1 and dplyr 0.7.1.

1 Answer

  • 0

    I solved this by doing the ts conversion inside the same mutate call as the decomposition, like this:

    df %>%
    group_by(region,group) %>%
    mutate(trendcycle(stl(ts(num,f=4), s.window = "per")))
    

    I arrived at this after first tackling the problem with data.table:

    df1 <- setDT(df)[,trendcycle(stl(ts(num, frequency = 4), s.window ="per")), by = .(region,group)]
    

    That version is faster, but the rest of my program follows tidyverse syntax, so I stuck with the dplyr version for consistency.
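
A likely explanation for why the single-call version works, going by the warning in the question: when mutate stores the ts column it may drop the ts attributes, so the second mutate receives a plain numeric vector whose frequency defaults to 1. The sketch below reproduces that effect in base R only; the 12-point series is made up for illustration, and as.numeric() stands in for whatever dplyr 0.7.x does internally, which is an assumption.

```r
# Base R only; the values are invented for illustration.
x <- c(5, 7, 9, 6, 6, 8, 10, 7, 7, 9, 11, 8)

x_ts    <- ts(x, frequency = 4)   # quarterly series: 3 full periods
x_plain <- as.numeric(x_ts)       # strips the "tsp" and class attributes
                                  # (stand-in for the attribute loss the warning describes)

freq_ts    <- frequency(x_ts)     # 4
freq_plain <- frequency(x_plain)  # 1, the default when no "tsp" attribute is present

# stl() works while the frequency is still attached ...
fit   <- stl(x_ts, s.window = "periodic")
trend <- as.numeric(fit$time.series[, "trend"])

# ... and fails once it is gone, because a frequency of 1 is not periodic.
err <- tryCatch(
  stl(x_plain, s.window = "periodic"),
  error = function(e) conditionMessage(e)
)
print(err)
```

Building the ts object inside the same expression that calls stl means the frequency never has to survive being stored as a data frame column, which is what both the dplyr and the data.table versions above do.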
