joon - 6 months ago 24

R Question

I am struggling to obtain all possible permutations within each group using data.table.

To explain my problem, let me show you an example.

```
x <- c(1, 1, 1, 2, 2)
y <- c('red', 'blue', 'black', 'orange', 'red')
dt1 <- as.data.table(cbind(x,y))
dt1
   x      y
1: 1    red
2: 1   blue
3: 1  black
4: 2 orange
5: 2    red
```

Now I want to see every possible ordered pair of colors (`y`) within each group (`x`). So my ideal result would be:

```
x y1     y2
1 black  blue
1 black  red
1 blue   black
1 blue   red
1 red    black
1 red    blue
2 orange red
2 red    orange
```

Searching for a solution, I found the function `permutations`, which does what I am looking for, but I find it hard to fit it into the data.table framework.

```
y <- c('red', 'blue', 'black')
permutations(n=3, r=2, v=y, repeats.allowed=F)
     [,1]    [,2]
[1,] "black" "blue"
[2,] "black" "red"
[3,] "blue"  "black"
[4,] "blue"  "red"
[5,] "red"   "black"
[6,] "red"   "blue"
```

So I tried the following, but it fails with errors:

`dt1[, .(j = lapply(.SD, permutations, n=.N, r=2, v=y, repeats.allowed=F)), by=x]`

Any suggestions?

I would really appreciate it.

Answer

First off, don't use `as.data.table(cbind(...))` to create the data table. You will get unexpected column classes because `cbind` coerces its arguments to a matrix, and a matrix can hold only one type. Use

```
dt1 <- data.table(x, y)
```
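The difference in column classes can be checked directly. This is a minimal sketch using the `x` and `y` vectors from the question; the names `bad` and `good` are made up for illustration.

```
library(data.table)

x <- c(1, 1, 1, 2, 2)
y <- c('red', 'blue', 'black', 'orange', 'red')

# cbind() builds a character matrix first, so the numeric x is stored as character
bad <- as.data.table(cbind(x, y))    # hypothetical name

# data.table() keeps each vector's own type
good <- data.table(x, y)             # hypothetical name

sapply(bad, class)   # class of each column after the cbind() route
sapply(good, class)  # class of each column when built directly
```

With `x` stored as character, grouping still works, but sorting and numeric comparisons on `x` would behave like string operations.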

That said, you can do

```
dt1[, {
  # All ordered pairs of the .N colors in this group (matrix, one pair per row)
  p <- gtools::permutations(.N, 2, y, repeats.allowed = FALSE)
  .(y1 = p[, 1], y2 = p[, 2])
}, by = x]
```

which gives

```
   x     y1     y2
1: 1  black   blue
2: 1  black    red
3: 1   blue  black
4: 1   blue    red
5: 1    red  black
6: 1    red   blue
7: 2 orange    red
8: 2    red orange
```

There is no need to loop since we are operating on groups. `permutations` creates a matrix, so we create our desired result columns from the resulting matrix columns of `permutations`.
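To see the matrix-to-columns step in isolation, here is a sketch for a single group (the colors of `x == 1`), assuming the gtools package is installed; `p` mirrors the variable used in the grouped call and `one_group` is a name made up for illustration.

```
library(data.table)

y <- c('red', 'blue', 'black')   # colors of group x == 1

# permutations() returns a character matrix with one ordered pair per row
p <- gtools::permutations(n = length(y), r = 2, v = y, repeats.allowed = FALSE)
dim(p)                           # rows = number of ordered pairs, 2 columns

# Each matrix column becomes one result column
one_group <- data.table(y1 = p[, 1], y2 = p[, 2])   # hypothetical name
one_group
```

Inside `dt1[, ..., by = x]` the same steps run once per group, with `.N` taking the place of `length(y)`.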