Jason_K - 2 months ago
R Question

Creating boolean expressions from columns

I have a data frame like this:

set1,set2,set3
"test1","test12","test13"
"test2","test22","test23"


I would like to create boolean expressions that join values with AND across all possible combinations of the columns, using the first column as the base.

Example of output based on the above df:

("test1" AND "test12" AND "test13")
("test1" AND "test22" AND "test23")
("test2" AND "test12" AND "test13")
("test2" AND "test22" AND "test23")


Is there an easy way to do this? I tried this:

set1 <- read.csv("C:/Users/Desktop/set.csv", header=TRUE, sep=",")

df <- data.frame()

i <- 1

for (i in 1:nrow(set1$set1)) {
  j <- 1
  for (j in 1:nrow(set1$set2)) {
    k <- 1
    for (k in 1:nrow(set1$set3)) {
      df <- paste(set1$set1[i]," AND ",set1$set2[j]," AND ", set1$set3[k])
    }
  }
}

Answer

One idea: first create a new column that pastes set2 and set3 together row by row, which avoids mixed strings such as ("test1" AND "test22" AND "test13"). Then build the combinations with expand.grid and paste, i.e.

df1$new <- do.call(paste, c(df1[,(2:3)], sep = ' AND '))
do.call(paste, c(expand.grid(df1[,-(2:3)]), sep = ' AND '))
#[1] "test1 AND test12 AND test13" "test2 AND test12 AND test13" "test1 AND test22 AND test23" "test2 AND test22 AND test23"
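
The same pairing step can also be done without adding a column to df1. This is only a sketch: the names tail_pairs, combos and res are illustrative, and df1 is rebuilt from the values in the question so the block runs on its own.

```r
# Same values as df1 in the question, rebuilt so the sketch is self-contained.
df1 <- data.frame(set1 = c("test1", "test2"),
                  set2 = c("test12", "test22"),
                  set3 = c("test13", "test23"),
                  stringsAsFactors = FALSE)

# Paste set2 and set3 row by row so each pair stays together.
tail_pairs <- paste(df1$set2, df1$set3, sep = " AND ")

# Cross every set1 value with every pair (set1 varies fastest).
combos <- expand.grid(set1 = df1$set1, pair = tail_pairs,
                      stringsAsFactors = FALSE)

res <- paste(combos$set1, combos$pair, sep = " AND ")
res
```

Because df1 is left untouched here, it can still be passed directly to expand.grid afterwards.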

If you want all combinations, then (on the original df1, i.e. without the new column added above)

do.call(paste, c(expand.grid(df1), sep = ' AND '))
#[1] "test1 AND test12 AND test13" "test2 AND test12 AND test13" "test1 AND test22 AND test13" "test2 AND test22 AND test13"
#[5] "test1 AND test12 AND test23" "test2 AND test12 AND test23" "test1 AND test22 AND test23" "test2 AND test22 AND test23"
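
To see what the do.call line is doing, it may help to compare it with an explicit paste call. In this sketch the names g, via_do_call and explicit are illustrative; c(g, sep = " AND ") turns the grid into a list of columns plus a sep entry, and do.call passes each list element to paste as a separate argument.

```r
df1 <- data.frame(set1 = c("test1", "test2"),
                  set2 = c("test12", "test22"),
                  set3 = c("test13", "test23"),
                  stringsAsFactors = FALSE)

# One row per combination: 2 * 2 * 2 = 8 rows, first column varying fastest.
g <- expand.grid(df1, stringsAsFactors = FALSE)

# do.call unpacks the list, so this is paste(set1 = ..., set2 = ..., set3 = ..., sep = " AND ").
via_do_call <- do.call(paste, c(g, sep = " AND "))

# The same call written out by hand for three columns.
explicit <- paste(g$set1, g$set2, g$set3, sep = " AND ")

identical(via_do_call, explicit)
```

The do.call form has the advantage of not hard-coding the number of columns.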

DATA

dput(df1)
structure(list(set1 = c("test1", "test2"), set2 = c("test12", 
"test22"), set3 = c("test13", "test23")), .Names = c("set1", 
"set2", "set3"), class = "data.frame", row.names = c(NA, -2L))

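If you are starting from the CSV in the question rather than from the dput output, a sketch like the following should give an equivalent df1. The csv_text variable is just an inline stand-in for the file; stringsAsFactors = FALSE is set explicitly because read.csv defaulted to factors before R 4.0.0.

```r
# Inline copy of the CSV shown in the question (stand-in for the file on disk).
csv_text <- 'set1,set2,set3
"test1","test12","test13"
"test2","test22","test23"'

# Read the columns as plain character vectors rather than factors.
df1 <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
str(df1)
```
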
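EDIT: Since you want to keep quotes around each value, then (note that dQuote may produce typographic quotes, as in the output below, depending on the useFancyQuotes option)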

# as before
v1 <- do.call(paste, c(expand.grid(df1), sep = ' AND '))
# split each string into its terms, quote each term, rejoin with AND, wrap in parentheses
v2 <- paste0('(', sapply(lapply(strsplit(v1, ' AND '), function(i) dQuote(i)),
                         function(j) paste(j, collapse = ' AND ')), ')')

#1 (“test1” AND “test12” AND “test13”)
#2 (“test2” AND “test12” AND “test13”)
#3 (“test1” AND “test22” AND “test13”)
#4 (“test2” AND “test22” AND “test13”)
#5 (“test1” AND “test12” AND “test23”)
#6 (“test2” AND “test12” AND “test23”)
#7 (“test1” AND “test22” AND “test23”)
#8 (“test2” AND “test22” AND “test23”)
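
If plain ASCII double quotes are needed, as in the question, one option is to quote the values before building the combinations, which also avoids the strsplit round trip. This is a sketch under that assumption; quoted and all_combos are illustrative names. In R 3.6.0 or later, dQuote(x, FALSE) should produce the same straight quotes.

```r
df1 <- data.frame(set1 = c("test1", "test2"),
                  set2 = c("test12", "test22"),
                  set3 = c("test13", "test23"),
                  stringsAsFactors = FALSE)

# Wrap every value in straight double quotes, column by column.
quoted <- lapply(df1, function(x) paste0('"', x, '"'))

# Build all combinations of the quoted values.
all_combos <- expand.grid(quoted, stringsAsFactors = FALSE)

# Join each row with AND and wrap the result in parentheses.
v2 <- paste0("(", do.call(paste, c(all_combos, sep = " AND ")), ")")

# writeLines prints the strings without escaping the embedded quotes.
writeLines(v2)
```

The same quoting step can be applied to the paired set2/set3 version if only the four row-wise combinations are wanted.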