Non-standard evaluation in data.table

I am lost on how `by` is evaluated in data.table. What is the correct way to merge the functionality of LJ and LJ2 into one function, so that the column can be passed either as a string or as a bare name?

library(data.table)

LJ <- function(dt_x_, dt_y_, by_)
{
    merge(
        dt_x_,
        dt_y_,
        by = eval(substitute(by_)), all.x = TRUE, sort = FALSE)
}
LJ2 <- function(dt_x_, dt_y_, by_)
{
    merge(
        dt_x_,
        dt_y_,
        by = deparse(substitute(by_)), all.x = TRUE, sort = FALSE)
}
LJ(
    data.table(A = c(1,2,3)),
    data.table(A = c(1,2,3), B = c(11,12,13)), 
    "A")
LJ2(
    data.table(A = c(1,2,3)),
    data.table(A = c(1,2,3), B = c(11,12,13)), 
    A)

1 Answer

  • 3

    I think this is a bad idea. Have the user always pass a character value. That said, you could do this:

    LJ3 <- function(dt_x_, dt_y_, by_)
    { 
      by_ <- gsub('\"', "", deparse(substitute(by_)), fixed = TRUE)
      dt_y_[dt_x_, on = by_] 
    }
    
    LJ3(
      data.table(A = c(4,1,2,3)),
      data.table(A = c(1,2,3), B = c(11,12,13)), 
      A)
    #   A  B
    #1: 4 NA
    #2: 1 11
    #3: 2 12
    #4: 3 13
    
    LJ3(
      data.table(A = c(4,1,2,3)),
      data.table(A = c(1,2,3), B = c(11,12,13)), 
      "A")
    #   A  B
    #1: 4 NA
    #2: 1 11
    #3: 2 12
    #4: 3 13
    

    This problem is not specific to data.table. The `by` argument of merge.data.table always expects a character value, and so does `on`.
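
    To make the point above concrete, here is a minimal sketch of what `substitute()` captures in each call style, and why neither LJ nor LJ2 alone covers both. The helper `captured_class` and the variables `x`, `y` and `res` are illustrative names, not part of the original code.

    ```r
    library(data.table)

    # Hypothetical helper: reports the class of the unevaluated argument.
    captured_class <- function(by_) class(substitute(by_))

    cls_bare   <- captured_class(A)    # a bare name is captured as a symbol
    cls_quoted <- captured_class("A")  # a string is captured as a character vector

    # deparse() on a string keeps the surrounding quotes, which is why LJ2
    # breaks when called with "A": the result has 3 characters, not 1.
    deparsed_len <- nchar(deparse("A"))

    # merge() itself just needs a plain character column name.
    x <- data.table(A = c(1, 2, 3))
    y <- data.table(A = c(1, 2, 3), B = c(11, 12, 13))
    res <- merge(x, y, by = "A", all.x = TRUE, sort = FALSE)
    ```

    With a bare `A`, LJ's `eval()` would try to look up an object named `A` in the calling environment, so it only works when a string is passed.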

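    Edit: @eddi pointed out that the above will fail if your column names actually contain a `"` character (something you should generally avoid, but it can happen if you work with input files prepared by someone else).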

    An alternative that handles this edge case:

    LJ4 <- function(dt_x_, dt_y_, by_)
    { 
      by_ <- substitute(by_)
      if (!is.character(by_)) by_ <- deparse(by_)
      dt_y_[dt_x_, on = by_] 
    }
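
    As a quick check, the sketch below calls LJ4 with both a bare name and a string, reusing the input tables from the LJ3 example. The definition is repeated so the block runs on its own; `x`, `y`, `res_bare` and `res_quoted` are illustrative names.

    ```r
    library(data.table)

    # LJ4 as defined in this answer, repeated to keep the block self-contained.
    LJ4 <- function(dt_x_, dt_y_, by_)
    {
      by_ <- substitute(by_)                       # capture the argument unevaluated
      if (!is.character(by_)) by_ <- deparse(by_)  # symbol -> column name string
      dt_y_[dt_x_, on = by_]                       # left join of dt_x_ onto dt_y_
    }

    x <- data.table(A = c(4, 1, 2, 3))
    y <- data.table(A = c(1, 2, 3), B = c(11, 12, 13))

    res_bare   <- LJ4(x, y, A)    # bare column name
    res_quoted <- LJ4(x, y, "A")  # character column name

    # Both call styles are expected to give the same joined table.
    same_result <- isTRUE(all.equal(res_bare, res_quoted))
    ```

    Note that this only covers a single join column passed as a literal name or string, which is all the question asks for.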
    
