df <- structure(list(id = c("A", "A", "A", "B", "B"),
var = c("atc", "atc", "atc", "atc", "atc"),
val = c("aaa", "bbb", "ccc", "aaa", "eee")),
.Names = c("id","var", "val"), class = "data.frame",
row.names = c(NA, -5L))
# var and val are nonsense columns for padding
# How many times does each id appear sequentially?
df$run <- sequence(rle(df$id)$lengths)
df
  id var val run
1  A atc aaa   1
2  A atc bbb   2
3  A atc ccc   3
4  B atc aaa   1
5  B atc eee   2
aggregate(df, by = list(df$id), FUN = max)
  Group.1 id var val run
1       A  A atc ccc   3
2       B  B atc eee   2
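The aggregate() call above takes the maximum of every column, including the padding columns. A minimal sketch that restricts the aggregation to run, assuming the formula interface of aggregate() is acceptable here (the data frame is rebuilt with data.frame() only to keep the example self-contained, and var is left out):

```r
# Rebuild the example data with the same id and val values as above.
df <- data.frame(
  id  = c("A", "A", "A", "B", "B"),
  val = c("aaa", "bbb", "ccc", "aaa", "eee"),
  stringsAsFactors = FALSE
)

# Position of each row within its run of identical consecutive ids.
df$run <- sequence(rle(df$id)$lengths)

# Longest run per id, aggregating only the run column.
longest <- aggregate(run ~ id, data = df, FUN = max)
print(longest)
#   id run
# 1  A   3
# 2  B   2
```

If an id occurs in several separate runs, this returns the longest of them, because max is taken over all rows sharing that id.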
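2 Answers
As mentioned above, rle() computes the run lengths. You can then use aggregate() to get the maximum run length for each grouping factor.
In Excel, this can be done with an array formula.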
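Assuming your values are in column B, for example in the range B2:B31, and the value you want to check is in cell E3, you can use the following formula:
Enter it as an array formula (that is, after typing it, press CTRL+SHIFT+ENTER).
Hope this solves the problem!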
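The Excel formula itself is not reproduced in this answer. As a rough R analogue only, assuming the formula is meant to find the longest consecutive run of one given value in a column, the same rle() idea can be sketched as follows. The names values and target are hypothetical stand-ins for the range B2:B31 and the cell E3, and the data are made up for illustration:

```r
# Hypothetical stand-in for the worksheet range B2:B31.
values <- c("x", "y", "y", "y", "x", "y", "y")
# Hypothetical stand-in for the lookup cell E3.
target <- "y"

# rle() on the logical vector yields alternating runs of TRUE and FALSE.
r <- rle(values == target)

# Keep only the lengths of the TRUE runs; the leading 0 covers the case
# where the target never appears.
longest_run <- max(c(0L, r$lengths[r$values]))
print(longest_run)
# [1] 3
```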