tlev - 4 years ago 71

R Question

I'm looking for a solution of how to find the subset of numbers within a larger set that gets as close as possible to a target number. So for instance, I have a set of values:

`c(55.14, 26.22, 76.69, 37.77, 32.7, 48.71, 7.59, 21.37, 33.54, 3.95, 16.41,`

20.56, 24.74, 26.5, 4.72, 32.99, 130.15, 27.27, 20.56, 41.21, 13, 16.41, 88.25,

1.95, 68.2, 34.3, 51.75, 8.93, 8.38, 30.45, 34.89, 42.91, 19.42, 13.62, 9.73,

20.86, 21.5, 37.46, 14.4, 26.61, 55.31, 24.03)

And my target is 1262.2.

How can I find the subset of values in the full set that minimizes the difference between my target, 1262.2, and the sum of the subset?

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

Let `v`

be the vector of values shown in the question.

Let `x`

be an unknown vector of the same length containing 0-1 variables.

Then run these two integer linear programs over `x`

choosing the solution from the one whose objective is closest to 1262.2. (Here * means the inner product.)

```
max(v*x) such that v*x <= 1262.2
min(v*x) such that v*x >= 1262.2
```

Using Rglpk for the first one we get this and since we achieved 1262.2 exactly we don't need to bother with the second integer linear program:

```
library(Rglpk)
ans <- Rglpk_solve_LP(v, t(v), "<=", 1262.2, max = TRUE, types = "B")
```

giving:

```
> ans[1:3]
$optimum
[1] 1262.2
$solution
[1] 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1
$status
[1] 0
```

**Note** The first integer linear program is known as a knapsack problem and we can solve it using `knapsack`

in adagio.

```
library(adadgio)
v100 <- as.integer(100 * v)
knapsack(v100, v100, 126220L)
```

(adagio gives the wrong answer if we don't scale the problem as shown.)

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**