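I have two simple data frames. Using dplyr and the tidyverse, I want to find the categories in "Task2" of the second data frame (Df2) that do not appear in "Task" of the first data frame (Df). I would like to use dplyr's "setdiff" function. I also want to keep the corresponding times from the "Time" column of the second data frame (Df2).
So the final product should have two rows: one for client "Chris" with "Iron shirt" and a total time of 30, and one for client "Eric" with "Buy groceries" and a corresponding time of 8.
I also want to drop the Date column.
I think one way to do this is to use dplyr's "setdiff" function to isolate the two rows (I realize the Task and Task2 column names would have to be changed so that they match), and then bring the total times back in with a join.
Finally, I would like this to be a custom function, because I will have to repeat this task many times. I want a function like "Difference(Df1, Df2)", so that I can pass in the two data frames and get the result.
I hope this is not asking too much! I am new to custom functions, especially ones that involve dplyr and pipes.
Hope someone can help me!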
CaseWorker<-c("John","John","Kim")
Client<-c("Chris","Chris","Eric")
Task<-c("Feed cat","Make dinner","Do homework")
Date<-c("10/27/2016","09/22/2016","10/11/2016")
Df<-data.frame(CaseWorker,Client,Date,Task)
The second data frame:
CaseWorker<-c("John","John","John","John","John","John","John","John","John",
"John","Kim","Kim","Kim")
Client<-c("Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Chris","Eric","Eric","Eric")
Date<-c("11/10/2016","10/10/2016","11/13/2016","09/18/2016","11/11/2016","09/19/2016","08/08/2016","10/10/2016","08/05/2016","11/12/2016","09/09/2016","11/11/2016","09/10/2016")
Task2<-c("Feed cat","Feed cat","Feed cat","Feed cat","Feed cat","Make dinner","Make dinner","Make dinner","Iron shirt","Iron shirt","Do homework",
"Do homework","Buy groceries")
Time<-c(20,34,11,10,5,6,55,30,20,10,12,10,8)
Df2<-data.frame(CaseWorker,Client,Date,Task2,Time)
1 Answer
We can use dplyr's anti_join.
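A minimal sketch of how that could look, assuming the goal is to keep the rows of Df2 whose Task2 value never appears in Df$Task and then total Time per case worker, client and task. The grouping columns and the choice to join on the task column alone are assumptions; adding "Client" to `by` would compare tasks per client instead. The data frames are rebuilt here only so the snippet runs on its own.

```r
library(dplyr)

# Data from the question, rebuilt so this snippet is self-contained
Df <- data.frame(
  CaseWorker = c("John", "John", "Kim"),
  Client     = c("Chris", "Chris", "Eric"),
  Date       = c("10/27/2016", "09/22/2016", "10/11/2016"),
  Task       = c("Feed cat", "Make dinner", "Do homework")
)
Df2 <- data.frame(
  CaseWorker = rep(c("John", "Kim"), times = c(10, 3)),
  Client     = rep(c("Chris", "Eric"), times = c(10, 3)),
  Date       = c("11/10/2016", "10/10/2016", "11/13/2016", "09/18/2016",
                 "11/11/2016", "09/19/2016", "08/08/2016", "10/10/2016",
                 "08/05/2016", "11/12/2016", "09/09/2016", "11/11/2016",
                 "09/10/2016"),
  Task2      = c(rep("Feed cat", 5), rep("Make dinner", 3),
                 rep("Iron shirt", 2), rep("Do homework", 2),
                 "Buy groceries"),
  Time       = c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12, 10, 8)
)

result <- Df2 %>%
  # keep only rows of Df2 whose Task2 has no match in Df$Task
  anti_join(Df, by = c("Task2" = "Task")) %>%
  # summarise keeps only the grouping columns and Time, so Date is dropped
  group_by(CaseWorker, Client, Task2) %>%
  summarise(Time = sum(Time), .groups = "drop")

result
```

Because anti_join accepts differently named key columns through `by`, there is no need to rename Task2 first, which is the step setdiff would have required.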
This can also be wrapped in a function if needed.
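One way the wrapper might be written is sketched below. The name Difference comes from the question; the hard-coded column names (Task, Task2, CaseWorker, Client, Time) are an assumption that every pair of inputs has the same layout as Df and Df2. The data is repeated only to keep the example runnable by itself.

```r
library(dplyr)

# Sketch of a reusable wrapper; assumes df1 has a Task column and
# df2 has CaseWorker, Client, Task2 and Time columns
Difference <- function(df1, df2) {
  df2 %>%
    anti_join(df1, by = c("Task2" = "Task")) %>%   # tasks absent from df1
    group_by(CaseWorker, Client, Task2) %>%
    summarise(Time = sum(Time), .groups = "drop")  # total time, Date dropped
}

# Data from the question
Df <- data.frame(
  CaseWorker = c("John", "John", "Kim"),
  Client     = c("Chris", "Chris", "Eric"),
  Date       = c("10/27/2016", "09/22/2016", "10/11/2016"),
  Task       = c("Feed cat", "Make dinner", "Do homework")
)
Df2 <- data.frame(
  CaseWorker = rep(c("John", "Kim"), times = c(10, 3)),
  Client     = rep(c("Chris", "Eric"), times = c(10, 3)),
  Date       = c("11/10/2016", "10/10/2016", "11/13/2016", "09/18/2016",
                 "11/11/2016", "09/19/2016", "08/08/2016", "10/10/2016",
                 "08/05/2016", "11/12/2016", "09/09/2016", "11/11/2016",
                 "09/10/2016"),
  Task2      = c(rep("Feed cat", 5), rep("Make dinner", 3),
                 rep("Iron shirt", 2), rep("Do homework", 2),
                 "Buy groceries"),
  Time       = c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12, 10, 8)
)

out <- Difference(Df, Df2)
out
```

If the column names vary between calls, they would need to be passed in as arguments; that is left out here to keep the sketch short.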