How can I use approx() inside mutate_at() with a conditional statement in dplyr?

I want to use dplyr, piping, and approx() to interpolate missing values.

Data:

test <- structure(list(site = structure(c(3L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L), .Label = c("lake", "stream", "wetland"), class = "factor"), 
    depth = c(0L, -3L, -4L, -8L, 0L, -1L, -3L, -5L, 0L, -2L, 
    -4L, -6L), var1 = c(1L, NA, 3L, 4L, 1L, 2L, NA, 4L, 1L, NA, 
    NA, 4L), var2 = c(1L, NA, 3L, 4L, NA, NA, NA, NA, NA, 2L, 
    NA, NA)), .Names = c("site", "depth", "var1", "var2"), class = "data.frame", row.names = c(NA, 
-12L))

This code works:

library(tidyverse)

# interpolate missing var1 values for each site using approx()
test_int <- test %>% 
  group_by(site) %>% 
  mutate_at(vars(c(var1)),
            funs("i" = approx(depth, ., depth, rule=1, method="linear")[["y"]]))

However, the code no longer works once it encounters a grouping (site & var) that does not have at least 2 non-NA values, e.g.,

# here I'm trying to interpolate missing values for var1 & var2
test_int2 <- test %>% 
  group_by(site) %>% 
  mutate_at(vars(c(var1, var2)),
            funs("i" = approx(depth, ., depth, rule=1, method="linear")[["y"]]))

R appropriately throws this error: Error in mutate_impl(.data, dots) : Evaluation error: need at least two non-NA values to interpolate.
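The error comes from approx() itself rather than from dplyr, so it can be reproduced in base R. The sketch below is only an illustration: try_approx is a hypothetical helper name, and the vectors are copied from the lake and stream rows of the data above.

```r
# Hypothetical helper: run approx() and return the error message instead of stopping.
try_approx <- function(x, y) {
  tryCatch(
    approx(x, y, xout = x, rule = 1, method = "linear")[["y"]],
    error = function(e) conditionMessage(e)
  )
}

lake_depth   <- c(0, -1, -3, -5)
stream_depth <- c(0, -2, -4, -6)

# lake/var1 has three non-NA values, so interpolation succeeds.
res_ok <- try_approx(lake_depth, c(1, 2, NA, 4))

# lake/var2 has no non-NA values, so approx() signals an error.
res_all_na <- try_approx(lake_depth, c(NA_real_, NA_real_, NA_real_, NA_real_))

# stream/var2 has a single non-NA value, which is also not enough.
res_one <- try_approx(stream_depth, c(NA, 2, NA, NA))

print(res_ok)
print(res_all_na)
print(res_one)
```

Any guard therefore has to count the non-NA values in each group before approx() is called.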

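How can I include a conditional statement or filter so that it only attempts to interpolate where a site has at least 2 non-NA values, and skips the rest or returns NA for them?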

1 Answer

    This should do what you want:

    test_int2 <- test %>% 
                 group_by(site) %>% 
                 mutate_at(vars(c(var1, var2)),
                           funs("i"=if(sum(!is.na(.))>1) 
                                      approx(depth, ., depth, rule=1, method="linear")[["y"]] 
                                    else 
                                      NA))
    
    test_int2
    # A tibble: 12 x 6
    # Groups:   site [3]
          site depth  var1  var2 var1_i var2_i
        <fctr> <int> <int> <int>  <dbl>  <dbl>
     1 wetland     0     1     1    1.0    1.0
     2 wetland    -3    NA    NA    2.5    2.5
     3 wetland    -4     3     3    3.0    3.0
     4 wetland    -8     4     4    4.0    4.0
     5    lake     0     1    NA    1.0     NA
     6    lake    -1     2    NA    2.0     NA
     7    lake    -3    NA    NA    3.0     NA
     8    lake    -5     4    NA    4.0     NA
     9  stream     0     1    NA    1.0     NA
    10  stream    -2    NA     2    2.0     NA
    11  stream    -4    NA    NA    3.0     NA
    12  stream    -6     4    NA    4.0     NA
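For readers on newer dplyr releases, where funs() is deprecated and mutate_at() has been superseded by across(), the same guard could be sketched as below. This is an adaptation, not the answer's original code: interp_or_na is a hypothetical helper name, the data frame is rebuilt with plain numeric columns for brevity, and dplyr 1.0.0 or later is assumed.

```r
library(dplyr)

# Hypothetical helper: interpolate y over x only when y has at least
# two non-NA values; otherwise return a vector of NA of the same length.
interp_or_na <- function(x, y) {
  if (sum(!is.na(y)) < 2) {
    return(rep(NA_real_, length(y)))
  }
  approx(x, y, xout = x, rule = 1, method = "linear")[["y"]]
}

# Same values as the question's data, with site as a character column.
test <- data.frame(
  site  = rep(c("wetland", "lake", "stream"), each = 4),
  depth = c(0, -3, -4, -8, 0, -1, -3, -5, 0, -2, -4, -6),
  var1  = c(1, NA, 3, 4, 1, 2, NA, 4, 1, NA, NA, 4),
  var2  = c(1, NA, 3, 4, NA, NA, NA, NA, NA, 2, NA, NA)
)

test_int2 <- test %>%
  group_by(site) %>%
  # .names appends "_i" so the original columns are kept alongside the results
  mutate(across(c(var1, var2), ~ interp_or_na(depth, .x), .names = "{.col}_i")) %>%
  ungroup()

print(test_int2)
```

Returning rep(NA_real_, length(y)) instead of a bare NA keeps every group's result a double vector of the right length, so the pieces combine without relying on recycling or type coercion.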
    
