
Nested ifelse calls in dplyr::mutate return the same value for every row


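Dear Stack Overflow community,

I am still a beginner in R and have run into the following problem, for which I could not find a solution on Stack Overflow or the wider web. It looks straightforward to me, but I don't know what I am missing or which coding convention I am violating. The problem below is part of a larger function, but the example below reproduces it.

I have two data frames, a and b, and want to create a new variable foo1 using nested ifelse statements, where the conditions are based on elements of both a and b.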

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
b <- data.frame(foo=c(1,NA,2,3), bar=c(1,2,3,4))

a <- mutate(a, foo1 = ifelse(is.na(b$foo[1]), NA,
                             ifelse(a$foo == "a", "a", "f")))

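What I expect, or am looking for, is this: the first ifelse statement checks whether the value in the first row of b is NA. Since it is not NA in this case, it should move on to the second ifelse statement and give me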

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"), foo1=c("a","f","f","f"))

because the first row of a$foo is "a" while the others (b, c, d) are not.

Instead, what it gives me is

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"), foo1=c("a","a","a","a"))

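It puts "a" in every row of foo1, instead of recognising that rows 2 to 4 should fall through to the else branch and be assigned "f". I think this is due to the different dimensions of the ifelse conditions: the first ifelse condition is based on a single element, whereas the second should evaluate each row of a$foo individually, which it does not seem to do.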
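The length mismatch described above can be reproduced with base R alone. The following is a minimal sketch, not part of the original post: ifelse() returns a result with the same length as its test, so a single-element test yields a single value, which mutate() then recycles to every row. The variable names scalar_test and inner are introduced here for illustration only.

```r
a <- data.frame(foo = c("a", "b", "c", "d"), bar = c("e", "f", "g", "h"))
b <- data.frame(foo = c(1, NA, 2, 3), bar = c(1, 2, 3, 4))

# The inner ifelse() on its own is evaluated per row and has four elements.
inner <- ifelse(a$foo == "a", "a", "f")
length(inner)        # 4

# The outer test is.na(b$foo[1]) has one element, so the whole expression
# collapses to one element: the first value of the inner result.
scalar_test <- ifelse(is.na(b$foo[1]), NA, inner)
length(scalar_test)  # 1
```

Inside mutate(), that single value is what ends up in every row of foo1.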

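The larger function not shown here uses an is.na() condition inside the first ifelse call. However, I suspect the cause is not the is.na statement but rather that I am using two ifelse conditions that refer to elements from two different data frames.

UPDATE: Prem's solution of adding rowwise() to the pipe fixes the problem for the simplified example above, but unfortunately not for the more complex one. The more complex example uses lapply to apply a function to a list of data frames (a, b, c and d). As in the simplified example, it uses a second data frame as a lookup table for the first ifelse statement. Here is the code: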

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
b <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
c <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
d <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
LookUp <- data.frame(foo=c(1,NA,2,3), bar=c("a","b","c","d"))

List <- list(a,b,c,d)
names(List) <- c("a","b","c","d")

library(dplyr)

List2 <- lapply(seq_along(List), function(i) {
  a <- filter(LookUp, bar == names(List[i]))
  temp <- List[[i]] %>%
    rowwise() %>%
    mutate(foo1 = ifelse(is.na(LookUp$foo[1]), NA,
                         ifelse(List[[i]]$foo == "a", "a", "f"))) %>%
    data.frame()
})

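Now, for all data frames, every value in the new column foo1 is assigned "a". What I want is foo1 = c("a","f","f","f") for all list elements except list element 2, which should give foo1 = c(NA,NA,NA,NA) because of the first ifelse statement.

In addition, some of the data frames in my list are very large, and rowwise() slows the function down considerably. Is there a better or faster way to code my function?

UPDATE 2: I apologise for making this question even more complicated. Prem's second solution, which uses Map(), works like a charm for the example I gave. Unfortunately, I made a mistake in the more complex example: in the first ifelse statement I wrote is.na(LookUp$foo[1]) instead of is.na(a$foo[1]). Here a is the subsetted lookup table, which stores information about the variable names for each element in the list. However, if I change the code to is.na(a$foo[1]), the Map() solution no longer works, because the function does not specify how to loop over i. I want the code to subset the lookup table differently on each run of the function. The updated values for b$bar should therefore be c(NA,NA,NA,NA). The updated code follows.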

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
b <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
c <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
d <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
LookUp <- data.frame(foo=c(1,NA,2,3), bar=c("a","b","c","d"))

List <- list(a,b,c,d)
names(List) <- c("a","b","c","d")

library(dplyr)

List2 <- lapply(seq_along(List), function(i) {
  LookUp2 <- filter(LookUp, bar == names(List[i]))
  temp <- List[[i]] %>%
    rowwise() %>%
    mutate(foo1 = ifelse(is.na(LookUp2$bar[1]), NA,
                         ifelse(List[[i]]$foo == "a", "a", "f"))) %>%
    data.frame()
})

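I tried adding the names as a second vector, which should let me change my function dynamically, as suggested in the post "How do I extract the index or name of the list item within FUN of lapply?", but without success. It keeps giving the same row values within and across list elements. Thank you for your help and patience.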

a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
b <- data.frame(foo=c("a","a","c","d"), bar=c("e","f","g","h"))
c <- data.frame(foo=c("a","a","a","d"), bar=c("e","f","g","h"))
d <- data.frame(foo=c("a","b","a","d"), bar=c("e","f","g","h"))
LookUp <- data.frame(foo=c(1,NA,2,3), bar=c("a","b","c","d"))
List <- list(a,b,c,d)
names(List) <- c("a","b","c","d")

library(dplyr)
List_new <- Map(function(x, name) {
  i = which(LookUp$bar == name)
  Lookup2 <- filter(LookUp, bar == names(List[i]))
  x %>% 
    rowwise() %>%
    mutate(foo1=ifelse(is.na(Lookup2$bar[1]), NA,
                       ifelse(foo == "a", "a", "f")))
}, List, names(List))
List_new

Any help would be greatly appreciated.

1 Answer

  • 0

    Hope this helps!

    #sample data   
    a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
    b <- data.frame(foo=c(1,NA,2,3), bar=c(1,2,3,4))
    
    library(dplyr)
    a %>%
      rowwise() %>%
      mutate(foo1 = ifelse(is.na(b$foo[1]), NA,
                                 ifelse(foo == "a", "a", "f"))) %>%
      data.frame()
    

    Output is:

      foo bar foo1
    1   a   e    a
    2   b   f    f
    3   c   g    f
    4   d   h    f
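
A possible alternative that avoids rowwise(), offered as a sketch rather than as part of the original answer: because the first condition looks at a single value (b$foo[1]), it can be decided once with a plain if/else, while the per-row condition stays in a vectorised ifelse(). The name result is introduced here for illustration. This assumes the intent is "all rows NA when the first lookup value is NA, otherwise classify each row".

```r
library(dplyr)

a <- data.frame(foo = c("a", "b", "c", "d"), bar = c("e", "f", "g", "h"))
b <- data.frame(foo = c(1, NA, 2, 3), bar = c(1, 2, 3, 4))

result <- a %>%
  mutate(foo1 = if (is.na(b$foo[1])) {
    NA_character_                   # scalar branch, recycled to every row
  } else {
    ifelse(foo == "a", "a", "f")    # vectorised branch, one value per row
  })

result$foo1
```

Since nothing is evaluated row by row, this form should scale better on large data frames than the rowwise() version, although that has not been benchmarked here.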
    

    UPDATE: addressing the added requirement

    #sample data
    a <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
    b <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
    c <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
    d <- data.frame(foo=c("a","b","c","d"), bar=c("e","f","g","h"))
    LookUp <- data.frame(foo=c(1,NA,2,3), bar=c("a","b","c","d"))
    List <- list(a,b,c,d)
    names(List) <- c("a","b","c","d")
    
    library(dplyr)
    List_new <- Map(function(x) {
      x %>% 
        rowwise() %>%
        mutate(foo1=ifelse(is.na(LookUp$foo[1]), NA,
                         ifelse(foo == "a", "a", "f")))
    }, List)
    List_new
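
For the requirement in UPDATE 2, where each list element should consult its own row of the lookup table, one possible sketch is to pass the element names to Map() and look up the value by name. This is an assumption about the intended behaviour, not the answerer's code: it assumes the relevant lookup column is LookUp$foo (the one containing NA) and that a name missing from LookUp$bar should be treated like NA. The variable key is hypothetical.

```r
library(dplyr)

a <- data.frame(foo = c("a", "b", "c", "d"), bar = c("e", "f", "g", "h"))
b <- data.frame(foo = c("a", "a", "c", "d"), bar = c("e", "f", "g", "h"))
c <- data.frame(foo = c("a", "a", "a", "d"), bar = c("e", "f", "g", "h"))
d <- data.frame(foo = c("a", "b", "a", "d"), bar = c("e", "f", "g", "h"))
LookUp <- data.frame(foo = c(1, NA, 2, 3), bar = c("a", "b", "c", "d"))
List <- list(a = a, b = b, c = c, d = d)

List_new <- Map(function(x, name) {
  # Lookup value for this list element, matched by its name.
  # match() returns NA for an unknown name, so key is NA in that case too.
  key <- LookUp$foo[match(name, LookUp$bar)]
  x %>%
    mutate(foo1 = if (is.na(key)) {
      NA_character_                  # whole element flagged as NA
    } else {
      ifelse(foo == "a", "a", "f")   # per-row classification
    })
}, List, names(List))

List_new$b$foo1
```

With this lookup table, only element b has an NA key, so only its foo1 column is entirely NA; the other elements are classified row by row without rowwise().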
    
