KalC - 10 days ago
R Question

Creating a combination of 2 and 3 variables from 4 or more vectors

I am looking for some directions as I am fairly new to R. Any help would be greatly appreciated.

I have the following vectors:

> types <- c("A", "B", "C", "D", "E")
> regions <- c("Atlantic", "Central", "Western")
> categories <- c("AA", "AB", "MN", "XY")
> market <- c("Small", "Medium", "Large")


I am trying to calculate YOY (year-over-year) values for all combinations of the values in these vectors. Combinations can be doubles or triples. Here are some examples ...

("A", "Atlantic", "AA")
("A", "Atlantic", "Small")
("A", "AB", "Small")
...
("A", "Small")
("B", "Western")


I intend to use dplyr for the summarizations but I won't be able to filter my main dataset if I don't know the keys. For example, I would need the doubles to be like ...


("types:A", "market:Small")


so that I can use strsplit() to get the variable name.

Is it even possible to achieve this (creating all these named combinations) using R?

Answer

I think this will do what you want:

combos2 <- combn(c('types', 'regions', 'categories', 'market'), 2)
combos3 <- combn(c('types', 'regions', 'categories', 'market'), 3)

c(unlist(apply(combos2, 2, function(x) apply(expand.grid(get(x[1]), get(x[2])), 1, paste, collapse=':'))),
  unlist(apply(combos3, 2, function(x) apply(expand.grid(get(x[1]), get(x[2]), get(x[3])), 1, paste, collapse=':'))))

Including the names can be achieved (even less elegantly) thus:

c(unlist(apply(combos2, 2, function(x) apply(expand.grid(get(x[1]), get(x[2])), 1, function(y) paste(x[1],y[1],x[2],y[2], sep=':')))),
  unlist(apply(combos3, 2, function(x) apply(expand.grid(get(x[1]), get(x[2]), get(x[3])), 1, function(y) paste(x[1],y[1],x[2],y[2],x[3],y[3], sep=':')))))

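This gives you all two- and three-way combinations, using get() to fetch each vector by its name and pass it to expand.grid(). Note that the named version returns one flat string per combination (for example "types:A:regions:Atlantic"), so the variable names sit at the odd positions after strsplit(). The approach does not generalize elegantly to longer combinations (4, 5, and so on), since each size needs its own hard-coded call, but it works.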
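
If you want something that scales to any combination size, here is a rough sketch of one way to do it. It is not the only approach, and it makes a few assumptions: the four vectors are kept in a named list (vars) instead of being looked up with get(), the helper name named_combos() is made up for this example, and none of the values contain a ":" character. Each combination comes back as a character vector of "name:value" keys, which is the shape asked for in the question.

```r
types <- c("A", "B", "C", "D", "E")
regions <- c("Atlantic", "Central", "Western")
categories <- c("AA", "AB", "MN", "XY")
market <- c("Small", "Medium", "Large")

# Assumption: keep the vectors in a named list rather than using get()
vars <- list(types = types, regions = regions,
             categories = categories, market = market)

# named_combos() is a hypothetical helper, not part of any package.
# It returns every k-way combination as a character vector of
# "name:value" keys.
named_combos <- function(vars, k) {
  # every way to choose k of the variable names
  name_sets <- combn(names(vars), k, simplify = FALSE)
  out <- lapply(name_sets, function(nms) {
    # all value combinations for the chosen variables
    grid <- expand.grid(vars[nms], stringsAsFactors = FALSE)
    lapply(seq_len(nrow(grid)), function(i) {
      paste(nms, unlist(grid[i, ], use.names = FALSE), sep = ":")
    })
  })
  # flatten one level: a single list with one entry per combination
  unlist(out, recursive = FALSE)
}

keys <- c(named_combos(vars, 2), named_combos(vars, 3))

length(keys)   # 83 doubles + 201 triples = 284
keys[[1]]      # "types:A" "regions:Atlantic"

# Recover the variable name and value from each key
strsplit(keys[[1]], ":", fixed = TRUE)
```

Adding 4-way combinations is then just another named_combos(vars, 4) call. Keeping each combination as a vector of separate keys also means strsplit() always yields exactly two pieces per key, name first and value second.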