
Insert a new observation per group that is a sum (or weighted sum) in R


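I am still new to R, and many things are still hard for me to do. The community here has been very helpful! I have another question. I want to: 1. create a new observation for each group that is the sum (or weighted sum) of certain variables; 2. create a weighted sum for a variable that sometimes contains NA.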

My dataset:

df = structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L), ID_name = c("AA", "AA", "BB", "BB", "CC", "CC", "DD", "DD", "DD"),
    Volume = c(10L, 20L, 30L, 50L, 50L, 40L, 20L,
    30L, 10L), Score = c(0.1, 0.3, 0.5, NA, 0.6, NA,
    0.6, 0.2, 0.6)), .Names = c("ID", "ID_name", "Volume", "Score"), class = "data.frame", row.names = c(NA, -9L))

I would like to:

1. Create a new observation for each unique ID, that is ID 1, ID 2, ID 3, and ID 4

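2. Have these new observations be as follows:

ID  ID_name  Volume              Score (weighted average)
1   AA       30 (i.e. 10 + 20)   (10 * 0.1 + 0.3 * 20) / (10 + 20) = 0.23
2   BB       80 (30 + 50)        (30 * 0.5) / 30 = 0.5 (NA row ignored in the score calculation)
3   CC       90 (50 + 40)        (50 * 0.6) / 50 = 0.6 (NA row ignored in the score calculation)
4   DD       60 (20 + 30 + 10)   (20 * 0.6 + 30 * 0.2 + 10 * 0.6) / 60 = 0.4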

I tried the mutate function, but it did not seem to work. Any hints would be much appreciated. Thanks.

1 Answer

  • 0
    library(dplyr)
    
    df = data.frame(ID = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L), 
                    ID_name = c("AA", "AA", "BB", "BB", "CC", "CC", "DD", "DD", "DD"), 
                    Volume = c(10L, 20L, 30L, 50L, 50L, 40L, 20L, 30L, 10L), 
                    Score = c(0.1, 0.3, 0.5, NA, 0.6, NA, 0.6, 0.2, 0.6))
    
    
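    df %>%
      # flag rows that have a score, so NA rows drop out of the weight total
      mutate(HasScore = ifelse(is.na(Score), 0, 1)) %>%
      group_by(ID, ID_name) %>%
      # weighted average of Score by Volume, ignoring NA scores;
      # WA is computed before Volume is replaced by the group total
      summarise(WA = sum(Volume*Score, na.rm = TRUE)/sum(Volume*HasScore),
                Volume = sum(Volume)) %>%
      ungroup()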
    
    # # A tibble: 4 x 4
    #      ID ID_name        WA Volume
    #   <int>  <fctr>     <dbl>  <int>
    # 1     1      AA 0.2333333     30
    # 2     2      BB 0.5000000     80
    # 3     3      CC 0.6000000     90
    # 4     4      DD 0.4000000     60
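
The same result can probably be reached a little more compactly with base R's `weighted.mean()`, which with `na.rm = TRUE` drops the NA scores together with their weights, so no helper column is needed. This is only a sketch of an alternative, not part of the original answer; the name `res` is made up for illustration.

```r
library(dplyr)

df <- data.frame(ID = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
                 ID_name = c("AA", "AA", "BB", "BB", "CC", "CC", "DD", "DD", "DD"),
                 Volume = c(10L, 20L, 30L, 50L, 50L, 40L, 20L, 30L, 10L),
                 Score = c(0.1, 0.3, 0.5, NA, 0.6, NA, 0.6, 0.2, 0.6))

# `res` is a hypothetical name for the per-group summary
res <- df %>%
  group_by(ID, ID_name) %>%
  summarise(
    # na.rm = TRUE removes NA scores and the matching volumes (weights)
    WA = weighted.mean(Score, Volume, na.rm = TRUE),
    # summed after WA, so WA still sees the per-row volumes
    Volume = sum(Volume)
  ) %>%
  ungroup()

print(res)
```

The order inside `summarise()` matters here: if `Volume = sum(Volume)` came first, `weighted.mean()` would receive the single group total instead of the per-row weights.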
    
