
How to get the total count of a variable in each row


I have a data frame named sp that looks like this:

Join  p1  sp1   p2  sp2  p3  sp3
1     0   0     0   0    0   0
2     1   pine  0   0    1   Aspen
3     2   pine  0   0    0   0

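The data frame continues for 100 rows, where p1 is the number of trees of the species given in column sp1, and so on. Now I want to create a new variable, pine, that holds the total number of pine trees in each row (each Join).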

2 Answers

  • 0

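    A simple row-wise apply will do. I use grep to subset the data.frame and keep only the columns whose names start with "sp". Note that this counts how many sp columns equal "pine" in each row; it does not add up the matching p counts.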

    pine <- apply(sp[grep("^sp", names(sp))], 1, function(x) sum(x == "pine"))
    pine
    #[1] 0 1 1
    

    Data.

    sp <- 
    structure(list(Join = 1:3, p1 = 0:2, sp1 = structure(c(1L, 2L, 
    2L), .Label = c("0", "pine"), class = "factor"), p2 = c(0L, 0L, 
    0L), sp2 = c(0L, 0L, 0L), p3 = c(0L, 1L, 0L), sp3 = structure(c(1L, 
    2L, 1L), .Label = c("0", "Aspen"), class = "factor")), class = "data.frame", row.names = c(NA, 
    -3L))
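
    The question describes p1 as the number of trees of the species in sp1, so the total for pine may need to be weighted by the p columns rather than counted per column. The following is a minimal base R sketch under that reading. It assumes each spN column pairs with the pN column carrying the same number, and it rebuilds the example data with character columns so that it runs on its own.

    ```r
    # Example data rebuilt from the question (character species columns).
    sp <- data.frame(
      Join = 1:3,
      p1 = c(0, 1, 2), sp1 = c("0", "pine", "pine"),
      p2 = c(0, 0, 0), sp2 = c("0", "0", "0"),
      p3 = c(0, 1, 0), sp3 = c("0", "Aspen", "0"),
      stringsAsFactors = FALSE
    )

    # Species columns, and the count columns assumed to pair with them
    # (sp1 -> p1, sp2 -> p2, ...).
    sp_cols <- grep("^sp[0-9]+$", names(sp), value = TRUE)
    p_cols  <- sub("^s", "", sp_cols)

    # TRUE where the species is pine; multiplying by the counts keeps
    # only the pine counts, which are then summed per row.
    is_pine <- as.matrix(sp[sp_cols]) == "pine"
    counts  <- as.matrix(sp[p_cols])
    sp$pine <- rowSums(is_pine * counts)

    sp$pine
    # 0 1 2
    ```

    For row 3 this gives 2 (two pines in p1) where the per-column count gives 1.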
    
  • 0

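    You can convert the data to long format to do the calculation. Once the data is in long format, fuzzyjoin::regex_inner_join lets you join the rows that hold paired values (for example p1 with sp1).

    An option using tidyverse could be: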

    library(tidyverse)
    library(fuzzyjoin)         
    
    # Count of trees per row (Join), broken down by species
    
    df %>% gather(Species, value, -Join) %>% 
      mutate(Join = as.character(Join))  %>% {
        regex_inner_join(filter(., grepl("^s",Species)),
                  filter(.,grepl("^p",Species)),
                  by = c("Join", "Species"))
    } %>%
      filter(value.x != "0") %>%
      group_by(Join.x, value.x) %>%
      summarise(count = sum(as.numeric(value.y))) %>% as.data.frame()
    
    #   Join.x value.x count
    # 1      2   Aspen     1
    # 2      2    pine     1
    # 3      3    pine     2
    
    # Total count of trees per row (Join), all species combined
    df %>% gather(Species, value, -Join) %>% 
      mutate(Join = as.character(Join))  %>% {
        regex_inner_join(filter(., grepl("^s",Species)),
                  filter(.,grepl("^p",Species)),
                  by = c("Join", "Species"))
    } %>%
      group_by(Join.x) %>%
      summarise(count = sum(as.numeric(value.y))) %>% as.data.frame()
    
    #   Join.x count
    # 1      1     0
    # 2      2     2
    # 3      3     2
    

    Data:

    df <- read.table(text = 
    "Join      p1     sp1       p2      sp2     p3      sp3
    1          0        0           0         0        0          0
    2          1        pine     0         0       1         Aspen
    3           2        pine     0        0       0          0",
    header = TRUE, stringsAsFactors = FALSE)
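
    As an alternative sketch, not the method above, the same pairing can be done with tidyr::pivot_longer, which supersedes gather. Pairing p1 with sp1 through the column names avoids the regex join altogether; since a regex join also treats Join as a pattern, ids such as 1 and 10 could plausibly match each other on the full 100-row data. The column names slot, per_species and pine_per_row are made up for this illustration.

    ```r
    library(dplyr)
    library(tidyr)

    df <- read.table(text =
    "Join p1 sp1  p2 sp2 p3 sp3
    1     0  0    0  0   0  0
    2     1  pine 0  0   1  Aspen
    3     2  pine 0  0   0  0",
    header = TRUE, stringsAsFactors = FALSE)

    # sp2 is read as integer (all zeros), so make every species column
    # character before stacking them into one column.
    long <- df %>%
      mutate(across(starts_with("sp"), as.character)) %>%
      pivot_longer(-Join,
                   names_to = c(".value", "slot"),
                   names_pattern = "^(sp|p)([0-9]+)$")
    # long has one row per Join and slot, with columns p (count) and sp (species).

    # Count per row and species, skipping the "0" placeholder.
    per_species <- long %>%
      filter(sp != "0") %>%
      group_by(Join, sp) %>%
      summarise(count = sum(p), .groups = "drop")

    # Pine total for every row, including rows without pine.
    pine_per_row <- long %>%
      group_by(Join) %>%
      summarise(pine = sum(p[sp == "pine"]), .groups = "drop")

    pine_per_row$pine
    # 0 1 2
    ```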
    
