mpalanco - 1 year ago 86

R Question

I'm trying to solve or implement an approximation to the set cover problem in R. Given a data frame like this.

`sets n`

1 s1 1

2 s1 2

3 s1 3

4 s2 2

5 s2 4

6 s3 3

7 s3 4

8 s4 4

9 s4 5

The unique number of elements in column

`n`

`unique(d$n)`

[1] 1 2 3 4 5

I'd like to calculate the smaller number of sets (column

`sets`

`LPsolve`

`Rsymphony`

Any help or guide about how to start or proceed will be very much appreciated.

`d <- structure(list(sets = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L,`

4L, 4L), .Label = c("s1", "s2", "s3", "s4"), class = "factor"),

n = c(1, 2, 3, 2, 4, 3, 4, 4, 5)), .Names = c("sets", "n"

), row.names = c(NA, -9L), class = "data.frame")

Answer Source

The `lpSolve`

package is available on CRAN for linear programing problems. Using your link which had a response from the very reputable Hans Borchers, as well as a slightly more complex example (starting on pg 4/5) in http://math.mit.edu/~goemans/18434S06/setcover-tamara.pdf as templates to understand teh proper structure of the setup, and then following along with modifications to the first example in `?lp`

:

```
library( lpSolve)
?lp
# In Details: "Note that every variable is assumed to be >= 0!"
# go from your long-form rep of the sets to a wide form for a matrix representation
( items.mat<- t(table(d$sets,d$n)) ) # could have reversed order of args to skip t()
#---------
> dimnames(items.mat) = list( items=1:5, sets=paste0("s", 1:4) )
> items.mat
sets
items s1 s2 s3 s4
1 1 0 0 0
2 1 1 0 0
3 1 0 1 0
4 0 1 1 1
5 0 0 0 1
#---------
f.obj <- rep(1,4) # starting values of objective parameters by column (to be solved)
f.dir <- rep(">=",5) # the constraint "directions" by row
f.rhs <- rep(1,5) # the inequality values by row (require all items to be present)
lp ("min", f.obj, items.mat, f.dir, f.rhs)$solution
#[1] 1 0 0 1
```

So sets `s1`

and `s4`

are a minimal cover. The "column coefficients" determine choice of "sets".