R: Deleting certain rows from a data.frame using dplyr

dat <- data.frame(ID = c(1, 2, 2, 2), Gender = c("Both", "Both", "Male", "Female"))
> dat
  ID Gender
1  1   Both
2  2   Both
3  2   Male
4  2 Female

For each ID, if Gender takes all three values Both, Male and Female, I want to delete the Both row. In other words, the data I want looks like this:

ID Gender
1  1   Both
2  2   Male
3  2 Female

I tried to do this with the following code:

library(dplyr)
> dat %>% 
  group_by(ID) %>% 
  mutate(A = ifelse(length(unique(Gender)) >= 3 & Gender == 'Both', F, T)) %>% 
  filter(A) %>% 
  select(-A)

# A tibble: 2 x 2
# Groups:   ID [1]
     ID Gender
  <dbl> <fctr>
1     2   Male
2     2 Female

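I am declaring a dummy variable named A, where A = F if, for a given ID, all 3 values of Gender are present ("Both", "Male" and "Female"; these are the only values Gender can take, no others are possible) and the corresponding row has Gender == Both. I then delete that row.

However, it seems that A = F is being assigned to the first row, even though its ID has only "Both" for Gender, not all of "Both", "Male" and "Female"?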

2 Answers

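After grouping by 'ID', create a logical condition that keeps a row when 'Gender' is not 'Both' and the number of distinct elements in 'Gender' is 3, i.e. 'Male', 'Female' and 'Both' (as the OP mentioned there are no other values), or (|) when the group has only 1 row: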

dat %>% 
  group_by(ID) %>% 
  filter((Gender != "Both" & n_distinct(Gender)==3)| n() ==1 )
# A tibble: 3 x 2
# Groups:   ID [2]
#    ID Gender
#  <dbl> <fct> 
#1     1 Both  
#2     2 Male  
#3     2 Female

Or another option is:

dat %>%
   group_by(ID) %>% 
   filter(Gender %in% c("Male", "Female")| n() == 1)
# A tibble: 3 x 2
# Groups:   ID [2]
#     ID Gender
#  <dbl> <fct> 
#1     1 Both  
#2     2 Male  
#3     2 Female
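
As a hedged variation on the two filters above: if some IDs could hold only part of the set (say "Male" and "Female" without "Both", a case the question's data does not contain), the first condition would drop those groups entirely. A minimal sketch that instead negates the exact rule from the question, i.e. removes a row only when it is "Both" and its ID has all three values. The data frame dat2 and its ID 3 rows are hypothetical test data, not from the question:

```r
library(dplyr)

# Data from the question plus a hypothetical ID 3 that has only Male/Female
dat2 <- data.frame(
  ID = c(1, 2, 2, 2, 3, 3),
  Gender = c("Both", "Both", "Male", "Female", "Male", "Female"),
  stringsAsFactors = FALSE
)

res <- dat2 %>%
  group_by(ID) %>%
  # drop a row only if it is "Both" AND its ID contains all three values
  filter(!(Gender == "Both" & n_distinct(Gender) == 3)) %>%
  ungroup()

res
```

Here ID 1 keeps its single "Both" row, ID 2 loses only its "Both" row, and the hypothetical ID 3 is left untouched.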

Answered on 2024-05-01T15:10:55+08:00

In base R, using ave:

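# Assumes Gender is a character column (the data.frame() default in R >= 4.0), so ave() returns the
# per-ID count of distinct values as a string; drop "Both" rows where that count is not '1'
dat[!(ave(dat$Gender, dat$ID, FUN = function(x) length(unique(x))) != '1' & dat$Gender == 'Both'), ]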
  ID Gender
1  1   Both
3  2   Male
4  2 Female
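
A sketch of the same idea that avoids comparing against the string '1': calling ave() on an integer vector keeps the per-group counts numeric, so it should work whether Gender is a factor or a character column. It also tests for exactly 3 distinct values, matching the rule in the question rather than "more than one". The names n_vals and keep are illustrative, not from the answer:

```r
dat <- data.frame(ID = c(1, 2, 2, 2),
                  Gender = c("Both", "Both", "Male", "Female"))

# Number of distinct Gender values per ID, repeated for every row of that ID
n_vals <- ave(as.integer(factor(dat$Gender)), dat$ID,
              FUN = function(x) length(unique(x)))

# Keep a row unless it is "Both" in an ID that has all three values
keep <- !(dat$Gender == "Both" & n_vals == 3)
dat[keep, ]
```

Splitting the condition into a named logical vector also makes it easy to inspect which rows are about to be dropped before subsetting.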

Answered on 2024-05-01T15:10:55+08:00
