Averaging neighbours within a vector
My data:
data <- c(1,5,11,15,24,31,32,65)
There are 2 neighbours: 31 and 32. I want to remove them and keep only their mean (i.e. 31.5), so that the data becomes:
data <- c(1,5,11,15,24,31.5,65)
This looks simple, but I want it done automatically, and sometimes the vector will contain more neighbours. For example:
data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140)
Answers (4)
I have a data.table-based solution; I guess it could be translated to dplyr as well:
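library(data.table)
df <- data.table(data2 = c(1,5,11,15,24,31,32,65,99,100,101,140))
# flag values that are exactly 1 above the previous value
df[, neighbours := ifelse(c(0, diff(data2)) == 1, 1, 0)]
# also flag the last row of each run, i.e. the value just before a run of 1s;
# single-row runs are left unchanged so the replacement always has .N values
df[, neighbours := if (.N > 1L) c(neighbours[-.N], 1) else neighbours, by = rleid(neighbours)]
df[, neigh_seq := rleid(neighbours)]
# average within runs of neighbours, keep other values, then drop duplicates
unique(df[, ifelse(neighbours == 1, mean(data2), data2), by = neigh_seq])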
   neigh_seq    V1
1:         1   1.0
2:         1   5.0
3:         1  11.0
4:         1  15.0
5:         1  24.0
6:         2  31.5
7:         3  65.0
8:         4 100.0
9:         5 140.0
What it does: the first line sets neighbours to 1 wherever the difference from the preceding number is 1:
    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          0
 7:    32          1
 8:    65          0
 9:    99          0
10:   100          1
11:   101          1
12:   140          0
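To see where these flags come from, the first step can be reproduced in base R on a plain vector. This is only a small sketch: the names `x`, `step` and `flag` are illustrative and do not appear in the answer.

```r
# Illustrative vector with the same values as data_2 in the question
x <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

# diff(x) has length(x) - 1 entries, so a leading 0 is prepended
# to line each difference up with the later element of the pair
step <- c(0, diff(x))

# 1 where a value is exactly 1 larger than its predecessor, 0 otherwise
flag <- ifelse(step == 1, 1, 0)

print(flag)
print(x[flag == 1])  # the values flagged so far
```

Note that only the second and later members of each group are flagged at this point, which is why a further step is needed to flag the first member as well.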
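I want to group the rows so that every neighbour has the neighbours variable set to 1. For that I need to add a 1 at the end of each run, so that the value just before a run of 1s is flagged as well: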
df[, neighbours := if (.N > 1L) c(neighbours[-.N], 1) else neighbours, by = rleid(neighbours)]
    data2 neighbours
 1:     1          0
 2:     5          0
 3:    11          0
 4:    15          0
 5:    24          0
 6:    31          1
 7:    32          1
 8:    65          0
 9:    99          1
10:   100          1
11:   101          1
12:   140          0
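Then I simply group on each change of the neighbours value and set the result depending on whether the rows are neighbours (the group mean) or not (the original value):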
df[,ifelse(neighbours == 1,mean(data2),data2),by = rleid(neighbours)]
    rleid    V1
 1:     1   1.0
 2:     1   5.0
 3:     1  11.0
 4:     1  15.0
 5:     1  24.0
 6:     2  31.5
 7:     2  31.5
 8:     3  65.0
 9:     4 100.0
10:     4 100.0
11:     4 100.0
12:     5 140.0
And take the unique values. Voilà.
Here is my solution, which uses run-length encoding to identify the groups:
foo <- function(x) {
  y <- x - seq_along(x) # normalize to zero differences in groups
  ind <- rle(y) # run-length encoding
  ind$values <- ind$lengths != 1 # to find groups
  ind$values[ind$values] <- cumsum(ind$values[ind$values]) # group ids
  ind <- inverse.rle(ind)
  xnew <- x
  xnew[ind != 0] <- ave(x, ind, FUN = mean)[ind != 0] # calculate means
  xnew[!(duplicated(ind) & ind != 0)] # remove duplicates from groups
}
foo(data)
#[1] 1.0 5.0 11.0 15.0 24.0 31.5 65.0
foo(data_2)
#[1] 1.0 5.0 11.0 15.0 24.0 31.5 65.0 100.0 140.0
data_3 <- c(1, 2, 4, 1, 2)
foo(data_3)
#[1] 1.5 4.0 1.5
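The key trick in foo is subtracting the position from each value: consecutive integers then collapse to the same number, so rle() sees them as one run. The sketch below prints the intermediate vectors for the data_2 values. It restates the idea with illustrative names (`y`, `runs`, `is_group`, `ids`) and a slightly different way of building the ids, so it is not a copy of the function above.

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

# Consecutive integers differ by 1, and so do their positions,
# so value minus position is constant inside a group of neighbours
y <- x - seq_along(x)
print(y)

# Runs of equal y values; a run longer than 1 is a group of neighbours
runs <- rle(y)
is_group <- runs$lengths > 1

# Number the groups 1, 2, ... and give every stand-alone value the id 0
runs$values <- ifelse(is_group, cumsum(is_group), 0)

# Expand back to one id per element of x
ids <- inverse.rle(runs)
print(ids)
```

With these ids, averaging per non-zero id and dropping the repeated entries gives the collapsed vector.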
I am assuming that you would rather not use a simple C for loop in Rcpp.
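Here is another idea: create an id with cumsum(c(TRUE, diff(a) > 1)), where 1 is the gap threshold. You can also wrap this in a function; I left the gap as a parameter so that you can adjust it.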
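As a rough sketch of how such a wrapper might look (the function name `avg_neighbours` and the argument name `gap` are assumptions made for illustration, not the answerer's original code), the id can be passed to tapply() to average each group:

```r
# Hypothetical helper: averages runs of values whose successive
# differences do not exceed `gap`. Assumes `a` is sorted in
# increasing order, as in the question's examples.
avg_neighbours <- function(a, gap = 1) {
  # A new id starts at the first element and wherever the jump exceeds `gap`
  id <- cumsum(c(TRUE, diff(a) > gap))
  # Mean per id; as.vector() drops the names and dim that tapply() adds
  as.vector(tapply(a, id, mean))
}

avg_neighbours(c(1, 5, 11, 15, 24, 31, 32, 65))
avg_neighbours(c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140))
```

Because the test is diff(a) > gap rather than diff(a) == 1, this variant also merges duplicates and, with a larger gap, values that are further apart, which may or may not be what is wanted.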
DATA: