Non-standard evaluation in dplyr works with filter_ and count_ but not distinct_

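I am trying to write a function that uses dplyr to count all the unique values of z. My function works fine when the variable is actually named z. However, if the variable is named x, I get an error (shown below the code).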

test.data<-data.frame(y=c(1:10),
                  x=c(letters[1:10]))
test.data$x<-as.character(test.data$x)
obsfunction<-function(z,y,data){
filter_(data,
          !is.na(deparse(substitute(y))))%>%
    distinct_(., deparse(substitute(z)))%>% #the line that breaks it
    count_(.)
}
obsfunction(z=x,y,data=test.data)

So the code above does not work and gives this error:

>Error in eval(substitute(expr), envir, enclos) : unknown column 'z'

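Changing z to x inside the function (or renaming x to z) makes it work, but I would rather not rename everything, especially since y works with a different name.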

Following the vignette, this question and this question, I tried lazyeval::interp and quote().

distinct_(lazyeval::interp(as.name(z)))%>%
>Error in as.name(z) : object 'x' not found 

distinct_(quote(z))%>%
>Error in eval(substitute(expr), envir, enclos) : unknown column 'z'

What am I missing? How can I get z to accept x as a column name?

3 Answers

Another lazyeval/dplyr variant: the variables are passed as formulas, and f_interp substitutes them into the formula through uq(x), similar to deparse(substitute(x)).

library(dplyr)
library(lazyeval)

test.data<-data.frame(y=c(1:10),
                  x=c(letters[1:10]))
test.data$x<-as.character(test.data$x)


obsfunction<-function(z, y, data){
  data %>% filter_(f_interp(~!is.na(uq(y)))) %>%
    distinct_(f_interp(~uq(z))) %>% count()
}

obsfunction(z=~x,~y,data=test.data)

 #A tibble: 1 × 1
 #     n
 #  <int>
 #1    10

test.data.NA <- data.frame(
  y=c(1:4, NA, NA, 7:10),
  x=c(letters[c(1:8, 8, 8)]),
  stringsAsFactors = FALSE)


obsfunction(z=~x,~y,data=test.data.NA)
 # # A tibble: 1 × 1
 #        n
 #      <int>
 # 1      6
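
For readers on a current dplyr, here is a minimal sketch of the same function written with tidy evaluation. It is not part of the answer above. It assumes a dplyr/rlang version recent enough to support the {{ }} operator (the underscore verbs such as filter_ and distinct_ are deprecated in later dplyr releases), and the name obsfunction_tidy is hypothetical.

```r
library(dplyr)

# Hypothetical tidy-evaluation rewrite of obsfunction (name is illustrative).
# {{ }} forwards the bare column names supplied by the caller to dplyr verbs.
obsfunction_tidy <- function(data, z, y) {
  data %>%
    filter(!is.na({{ y }})) %>%  # drop rows where the y column is NA
    distinct({{ z }}) %>%        # keep one row per unique value of the z column
    count()                      # count the remaining rows
}

# Same NA test data as in the answer above
test.data.NA <- data.frame(
  y = c(1:4, NA, NA, 7:10),
  x = letters[c(1:8, 8, 8)],
  stringsAsFactors = FALSE
)

result <- obsfunction_tidy(test.data.NA, z = x, y = y)
print(result)
```

With this form the caller passes bare column names (z = x) rather than formulas (z = ~x).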

Answered 2024-04-19T08:52:09+08:00

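Since dplyr's standard-evaluation functions understand strings, I tried the code below with additional test data, and it seems to work. I first extract the variable names, then build the expressions from strings: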

test.data<-data.frame(y=c(1:10),
                      x=c(letters[1:10]))
test.data$x<-as.character(test.data$x)

f <- function(z, y, data){
    z <- deparse(substitute(z))
    y <- deparse(substitute(y))
    res <- data %>% filter_(
        paste('!is.na(', y, ')', sep = '')) %>%
        distinct_(z) %>%
        count_(.)
}


x <- f(z = x, y, test.data)
# # A tibble: 1 × 1
#       n
# <int>
# 1    10



test.data <- data.frame(
    y=c(1:4, NA, NA, 7:10),
    x=c(letters[c(1:8, 8, 8)]),
    stringsAsFactors = FALSE)

x <- f(z = x, y, test.data)
# # A tibble: 1 × 1
#       n
# <int>
# 1     6
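
A likely reason this approach works where the question's code fails is that the names are captured before the pipeline starts. The base R sketch below illustrates the difference. It assumes (this is not confirmed in the thread) that the pipe evaluated its right-hand side inside a separate function, so substitute(z) there no longer saw the caller's argument. The function names capture_early, capture_late and build_filter are hypothetical.

```r
# Capturing early: substitute() runs directly in the function that received z,
# so it sees the expression the caller supplied.
capture_early <- function(z) {
  z_name <- deparse(substitute(z))
  inner <- function(.) z_name
  inner(NULL)
}

# Capturing late: substitute() runs inside a nested function where z is not
# an argument, so the symbol z is returned unchanged.
capture_late <- function(z) {
  inner <- function(.) deparse(substitute(z))
  inner(NULL)
}

# Building the filter condition as a string, as the answer does with paste()
build_filter <- function(y) {
  paste0("!is.na(", deparse(substitute(y)), ")")
}

early <- capture_early(x)
late <- capture_late(x)
cond <- build_filter(y)

print(early)  # "x"
print(late)   # "z"
print(cond)   # "!is.na(y)"
```

The "z" returned by capture_late matches the column name reported in the question's error message, which is why the cause proposed here seems plausible.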

Answered 2024-04-19T08:52:09+08:00

You can use match.call to capture the function arguments and convert them to character before passing them to the dplyr SE functions:

obsfunction<-function(z, y, data){
    cl = match.call()
    y = as.character(cl['y'])
    z = as.character(cl['z'])

    data %>% filter_(paste('!is.na(', y, ')', sep = '')) %>%
             distinct_(z) %>%
             count_(.)
}

obsfunction(z = x, y = y, data = test.data)

# A tibble: 1 × 1
#      n
#  <int>
#1    10

obsfunction(x, y, test.data)

# A tibble: 1 × 1
#      n
#  <int>
#1    10
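
To see what match.call captures, the following base R sketch inspects the call without involving dplyr. The helper show_call is hypothetical, and using [[ ]] to extract a single argument is a stylistic choice made for this sketch, not a correction of the answer's code.

```r
# Hypothetical helper: returns the matched call and two string forms of z.
show_call <- function(z, y, data) {
  cl <- match.call()                        # arguments are matched to their names
  list(
    call       = cl,
    z_char     = as.character(cl[["z"]]),   # splits a compound expression into parts
    z_deparsed = deparse(cl[["z"]])         # keeps the expression as one string
  )
}

# A bare column name: both conversions give the same string
simple <- show_call(x, y, data = NULL)
print(simple$z_char)      # "x"
print(simple$z_deparsed)  # "x"

# A compound expression: as.character() returns several pieces
compound <- show_call(a + b, y, data = NULL)
print(compound$z_char)      # "+" "a" "b"
print(compound$z_deparsed)  # "a + b"
```

For plain column names the two conversions agree. If callers may pass expressions, deparse is the safer choice because it returns a single string.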

Answered 2024-04-19T08:52:09+08:00
