I am trying to add_row() to grouped data without using do.
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)
my.data <- data.frame(
  supplier = c("a","a","a","a","a","a","b","b","b","b","b","b"),
  date = rep(c("2017-06-01","2017-03-01","2017-02-01","2017-01-12",
               "2017-05-01","2017-04-01"), 2),
  order = c(1,0,0,1,1,0,0,1,0,0,1,0)
)
Solution with do
my.data %>%
  group_by(supplier) %>%
  do(add_row(., .before = 0))
This gives:
# A tibble: 14 x 3
# Groups:   supplier [3]
   supplier date       order
   <chr>    <chr>      <dbl>
 1 <NA>     <NA>          NA
 2 a        2017-06-01     1
 3 a        2017-03-01     0
 4 a        2017-02-01     0
 5 a        2017-01-12     1
 6 a        2017-05-01     1
 7 a        2017-04-01     0
 8 <NA>     <NA>          NA
 9 b        2017-06-01     0
10 b        2017-03-01     1
11 b        2017-02-01     0
12 b        2017-01-12     0
13 b        2017-05-01     1
14 b        2017-04-01     0
Attempt with nest and mutate or purrr::map
my.data %>%
  group_by(supplier) %>%
  nest() %>%
  mutate(extra.row = add_row(data, .before = 0))
Error in mutate_impl(.data, dots) : Evaluation error: Unsupported index type: NULL.
Any suggestions? do is slow when scaling up.
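For reference, the nest() attempt most likely fails because data is a list-column, so add_row() receives a list rather than a single data frame. A minimal sketch of a working variant, assuming a tidyr version where nest() names the list-column data, applies add_row() to each nested data frame with purrr::map() and then unnests. Note that here the new row keeps its supplier value, unlike the do() output above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

my.data <- data.frame(
  supplier = rep(c("a", "b"), each = 6),
  date = rep(c("2017-06-01", "2017-03-01", "2017-02-01", "2017-01-12",
               "2017-05-01", "2017-04-01"), 2),
  order = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

result <- my.data %>%
  group_by(supplier) %>%
  nest() %>%
  # map() hands add_row() one nested data frame at a time;
  # .before = 1 puts the empty (all-NA) row at the top of each group
  mutate(data = map(data, ~ add_row(.x, .before = 1))) %>%
  unnest(data) %>%
  ungroup()
```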
1 Answer
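You can use bind_rows() to bind a summarized dataset onto the original dataset. You could also use complete(), but as written that only works because every group currently has the same dates; it may not work when the dates differ between groups. I also believe complete() tends to be slow as you scale up.

Both solutions rely on date being an actual date variable in the original dataset.

Summarize and bind with summarize() and bind_rows(). The arrange() call is only there to keep things in order and is most likely not needed in the real case.

If the dates are the same between groups, use complete().

The result from both: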
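A minimal sketch of the two approaches described above. It assumes the new row should be dated one day before each supplier's earliest date; that date, and the object names extra.rows, with.bind and with.complete, are placeholders chosen for illustration and are not taken from the answer.

```r
library(dplyr)
library(tidyr)

my.data <- data.frame(
  supplier = rep(c("a", "b"), each = 6),
  date = rep(c("2017-06-01", "2017-03-01", "2017-02-01", "2017-01-12",
               "2017-05-01", "2017-04-01"), 2),
  order = c(1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Both approaches need a real Date column rather than character strings.
my.data <- mutate(my.data, date = as.Date(date))

# Approach 1: summarize to one row per supplier, then bind it onto the original.
# ASSUMPTION: the new row is dated one day before the group's earliest date.
extra.rows <- my.data %>%
  group_by(supplier) %>%
  summarize(date = min(date) - 1)

with.bind <- bind_rows(my.data, extra.rows) %>%
  arrange(supplier, date)  # only to keep the output in order

# Approach 2: complete() fills in the missing supplier/date combinations.
# This relies on every supplier having the same set of dates.
with.complete <- my.data %>%
  complete(supplier, date = c(min(date) - 1, date))
```

In both results the added rows have NA in order, because neither approach supplies a value for that column.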