reuss - 3 months ago 71

R Question

I have a problem counting the number of combinations of drugs. My data is organized in two data frames.

df1 contains the ID and the drugs found, like this:

ID | drug
---|---
1 | drug1
1 | drug2
1 | drug3
2 | drug3
2 | drug5
3 | drug1
3 | drug3
3 | drug4
3 | drug5

df2 lists all possible combinations of two different drugs, like this:

combi1 | combi2
---|---
drug1 | drug2
drug1 | drug3
drug1 | drug4
drug2 | drug3
drug2 | drug4
drug2 | drug5

There are 7140 possible combinations in total. What I want is to find out how many IDs have each combination, for example `drug1-drug2` or `drug1-drug3`.

I have been trying a double `for` loop:

    counter=0
    for(com in 1:nrow(df2)){
      for(id in 1:unique(df1$ID)){
        if(df2$combi1[com] %in% df1$drug[id] & df2$combi2[com] %in% df1$drug[id]) {
          counter=counter+1
        }
      }
      df2$count=counter
      counter=0
    }

But it doesn't work, because it only checks one row at a time. I have also tried the solution in "Find Most Frequent Combination within a Vector by Group", but without any luck.

Furthermore, I need to do the same with combinations of three drugs.

EDIT:

I would like the output to look like this in df2, so that I can see how many IDs have each combination. For example, only one ID had both drug1 and drug2, whereas two IDs had drug1 and drug3:

combi1 | combi2 | count
---|---|---
drug1 | drug2 | 1
drug1 | drug3 | 2
drug1 | drug4 | 1
drug2 | drug3 | 1
drug2 | drug4 | 0
drug2 | drug5 | 0

Answer

For this one, I reached for `data.table`, but you could use `tidyr` just as easily.

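```
library(data.table)

set.seed(213)  # make the random example data reproducible

# Example data: 3 IDs with 3 randomly sampled drugs each (duplicates possible)
d <- data.table(ID = rep(1:3, each = 3),
                drug = paste0("drug", sample(1:5, 9, replace = TRUE)))

# Return every unordered pair of distinct drugs in x, one pair per row.
# Sorting first keeps each pair in the same order across IDs.
get_combs <- function(x) {
  uniq_x <- sort(unique(x))
  if (length(uniq_x) < 2L) {
    return(NULL)  # fewer than two distinct drugs: no pair to report
  }
  as.data.frame(t(combn(uniq_x, 2)), stringsAsFactors = FALSE)
}

# All pairs per ID, then the number of IDs in which each pair occurs
combi <- d[, get_combs(drug), by = ID][order(V1, V2)]
combi[, .N, by = .(V1, V2)]

# Output shown in the original answer (the sampled data, and so the
# counts, depend on the R version's random number generator):
#       V1    V2 N
# 1: drug1 drug2 2
# 2: drug1 drug4 2
# 3: drug2 drug4 2
# 4: drug3 drug5 1
```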
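
To get the layout asked for in the EDIT (counts attached to `df2`, with 0 for pairs that never occur) and to cover the three-drug case, the same idea can be sketched as below. This is only a sketch under a few assumptions: `df1` and `df2` are rebuilt here from the example rows in the question, and `get_combs_m`, `counts`, `result` and `triples` are names made up for illustration. It also assumes each row of `df2` lists the pair in sorted order (`combi1` before `combi2`), as in the example.

```r
library(data.table)

# Example data copied from the question (df2 is only the excerpt shown there)
df1 <- data.table(
  ID   = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  drug = c("drug1", "drug2", "drug3", "drug3", "drug5",
           "drug1", "drug3", "drug4", "drug5")
)
df2 <- data.table(
  combi1 = c("drug1", "drug1", "drug1", "drug2", "drug2", "drug2"),
  combi2 = c("drug2", "drug3", "drug4", "drug3", "drug4", "drug5")
)

# Hypothetical helper: every combination of m distinct drugs in x,
# one combination per row (columns V1, V2, ...), sorted within the row
get_combs_m <- function(x, m = 2) {
  uniq_x <- sort(unique(x))
  if (length(uniq_x) < m) return(NULL)  # too few drugs for this ID
  as.data.table(t(combn(uniq_x, m)))
}

# Number of IDs per observed pair, named to match df2's columns
counts <- df1[, get_combs_m(drug, 2), by = ID][
  , .(count = .N), by = .(combi1 = V1, combi2 = V2)]

# Left join onto df2 so that unobserved pairs are kept, then set them to 0
result <- merge(df2, counts, by = c("combi1", "combi2"), all.x = TRUE)
result[is.na(count), count := 0L]
print(result)

# Same approach for combinations of three drugs
triples <- df1[, get_combs_m(drug, 3), by = ID][
  , .(count = .N), by = .(V1, V2, V3)]
print(triples)
```

On the example data, `result` gives counts of 1, 2, 1, 1, 0 and 0 for the six pairs in `df2`. Note that drug1-drug4 comes out as 1, because ID 3 has both drugs. Building the combinations per ID and joining once avoids looping over all 7140 rows of `df2` for every ID.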