Jash Shah - 1 year ago

R Question

I would like to create a vector of length thirty that contains every value from 1 to 20 at least once, with counts that are not uniform.

For example:

There can be four 1s, one 2, two 3s, etc., but the counts of all the numbers must add up to thirty and all 20 distinct values must appear.

I tried:

```
set.seed(3)
sample(x = 1:20, size = 30, replace = TRUE)
```

But it does not always give all the values from 1 to 20. Some values are returned a higher number of times and some values are not returned at all.
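To make the problem concrete, here is a small sketch that counts how many distinct values a plain `sample()` call returns and which ones are missing. The variable names (`n_distinct`, `missing_values`, `expected_distinct`) are illustrative only. The last line uses the standard formula for the expected number of distinct values when drawing 30 times uniformly with replacement from 20 values, which works out to about 15.7, so gaps are the norm rather than the exception.

```r
set.seed(3)
x <- sample(x = 1:20, size = 30, replace = TRUE)

# How many of the 20 values actually showed up in this draw?
n_distinct <- length(unique(x))

# Which values from 1 to 20 are absent from this draw?
missing_values <- setdiff(1:20, x)

# Expected number of distinct values over many such draws:
# each value is missed by all 30 draws with probability (19/20)^30
expected_distinct <- 20 * (1 - (19 / 20)^30)

n_distinct
missing_values
expected_distinct
```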

The vector must contain all 20 distinct values, and every element must be an integer.

Answer Source

You can do it in three steps:

1. Generate a size-20 sample without replacement, so that every value appears exactly once.

2. Generate a size-10 sample with replacement.

3. Shuffle the two samples together.

Here is the result:

```
a <- sample(1:20, 20)                  # every value exactly once
b <- sample(1:20, 10, replace = TRUE)  # 10 extra draws, repeats allowed
result <- sample(c(a, b), 30)          # shuffle the 30 values together
result
# [1] 1 10 20 11 16 12 9 8 20 4 15 2 7 5 19 18 6 13 14 17 11 5 1 7 4 19 6 16 3 3
table(result)  # every value appears at least once
# result
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
# 2 1 2 2 2 2 2 1 1 1 2 1 1 1 1 2 1 1 2 2
```
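The same construction can also be read in terms of counts, which matches how the question is phrased: each value gets a count of 1, plus its share of 10 extra draws spread uniformly over the 20 values. The sketch below is an equivalent formulation rather than part of the original answer. It uses `rmultinom()` from the base `stats` package, and the names `counts` and `result_from_counts` are illustrative.

```r
# One guaranteed occurrence per value, plus 10 extra occurrences
# distributed at random over the 20 values
counts <- 1 + as.vector(rmultinom(1, size = 10, prob = rep(1 / 20, 20)))

# Expand the counts into a vector of length 30, then shuffle it
result_from_counts <- sample(rep(1:20, times = counts))

sum(counts)                # total number of elements
table(result_from_counts)  # per-value counts, all at least 1
```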

Note that you can do it as a one-liner:

```
sample(c(sample(1:20, 20), sample(1:20, 10, replace = TRUE)), 30)
# [1] 4 13 15 20 6 5 9 11 11 14 17 1 10 9 3 10 11 12 18 17 8 7 18 12 19 16 2 13 13 4
```

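Thanks to James's comment, you can use a faster solution. Because the outer `sample()` already shuffles the combined vector, the first 20 values do not need to be shuffled beforehand: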

```
sample(c(1:20,sample(20,10,replace=TRUE)))
```
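If the same pattern is needed for other value sets or lengths, it could be wrapped in a small function. The helper below, `sample_covering()`, is a hypothetical name and not part of the original answer. It indexes with `sample.int()` rather than calling `sample()` on the values directly, to avoid the well-known surprise where `sample(x)` treats a single number `x` as `1:x`.

```r
# Hypothetical helper: returns a shuffled vector of length `size`
# in which every element of `values` appears at least once.
sample_covering <- function(values, size) {
  n <- length(values)
  stopifnot(size >= n)  # cannot cover all values with fewer draws

  # Extra draws with replacement to fill the remaining positions
  extra <- values[sample.int(n, size - n, replace = TRUE)]

  # Combine the guaranteed values with the extras and shuffle
  pool <- c(values, extra)
  pool[sample.int(length(pool))]
}

x <- sample_covering(1:20, 30)
table(x)  # all 20 values present, counts sum to 30
```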

Here is the `microbenchmark` comparison:

```
# Unit: relative
#     expr      min       lq     mean   median     uq       max neval
#  etienne 1.727202 1.538411 1.529077 1.571341 1.5998 0.6855444  1000
#    james 1.000000 1.000000 1.000000 1.000000 1.0000 1.0000000  1000
```
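The call that produced this table is not shown, so the following is only a sketch of how such a comparison might be reproduced. It assumes that `etienne` refers to the one-liner with the inner shuffle and `james` to the shorter version, and that the `microbenchmark` package is installed. Timings will vary between machines and runs.

```r
# Assumed definitions of the two benchmarked expressions
etienne <- function() {
  sample(c(sample(1:20, 20), sample(1:20, 10, replace = TRUE)), 30)
}
james <- function() {
  sample(c(1:20, sample(20, 10, replace = TRUE)))
}

# Run the benchmark only if the package is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  bench <- microbenchmark::microbenchmark(
    etienne = etienne(),
    james   = james(),
    times   = 1000,       # matches neval in the table above
    unit    = "relative"  # report timings relative to the fastest
  )
  print(bench)
}
```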