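I am trying to add a new column to my df that is simply my function hardfunct applied to 'value' for the rows where param is 'hardness'. I then want that result to fill every row of the new column that matches on 'site' and 'dates'. How do I fill in the remaining rows? I have tried using summary, rowwise and mutate. Sample data are below.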
site=c(rep("River A",4),rep("River B",4))
dates=as.Date(c("01/01/2001","01/01/2001","01/01/2001","01/01/2001","05/08/2001","05/08/2001","05/08/2001","05/08/2001"), format = "%m/%d/%Y")
param=c("lead","hardness","mercury","cadmium","lead","hardness","mercury","cadmium")
value=c("0.2","45","0.9","1.2","0.5","1800","0.6","0.8")
df=data.frame(site,param,dates,value)
hardfunct=function(x){
  if (x>=400) {
    print(400)
  } else if (x<=25) {
    print(25)
  } else {
    return(x)
  }
}
# Trying to use group_by and mutate
library(dplyr)
df %>% group_by(site,dates) %>%
  mutate(New_Hardness=sapply(df[df$param=="hardness","value"],hardfunct))
This is what the data frame should look like with the new column:
site     param     dates     value  New_Hardness
River A  lead      1/1/2001  0.2    45
River A  hardness  1/1/2001  45     45
River A  mercury   1/1/2001  0.9    45
River A  cadmium   1/1/2001  1.2    45
River B  lead      5/8/2001  0.5    400
River B  hardness  5/8/2001  1800   400
River B  mercury   5/8/2001  0.6    400
River B  cadmium   5/8/2001  0.8    400
2 Answers
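In base R, you can use a split/apply/combine strategy.
Note that the pmax and pmin idea is @Frank's. Note that you have to change print to return in your function, otherwise you will also get the printed values before the data frame output. Also note that you need character variables rather than factors, because as.numeric applied to a factor gives you the internal integer codes, not the numbers you expect.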
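The answer's own code is not reproduced in this thread. As a minimal sketch of the split/apply/combine approach it describes, the following assumes that every site/dates group contains exactly one hardness row and that value is kept as character via stringsAsFactors = FALSE; the names pieces and result are illustrative only.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps value as character
site  <- c(rep("River A", 4), rep("River B", 4))
dates <- as.Date(c(rep("01/01/2001", 4), rep("05/08/2001", 4)), format = "%m/%d/%Y")
param <- rep(c("lead", "hardness", "mercury", "cadmium"), 2)
value <- c("0.2", "45", "0.9", "1.2", "0.5", "1800", "0.6", "0.8")
df <- data.frame(site, param, dates, value, stringsAsFactors = FALSE)

# Vectorised clamp to the range [25, 400] using pmax/pmin (no print calls)
hardfunct <- function(x) pmin(pmax(x, 25), 400)

# Split: one data frame per site/dates combination (drop unused combinations)
pieces <- split(df, list(df$site, df$dates), drop = TRUE)

# Apply: compute the clamped hardness once per group and recycle it over the group
# (assumption: each group has exactly one row with param == "hardness")
pieces <- lapply(pieces, function(g) {
  h <- as.numeric(g$value[g$param == "hardness"])
  g$New_Hardness <- hardfunct(h)
  g
})

# Combine: stack the groups back into a single data frame
result <- do.call(rbind, pieces)
rownames(result) <- NULL
result
```

If a group could lack a hardness row, the assignment inside lapply would fail, so that case would need explicit handling (for example, filling with NA).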