isjoy isjoy - 2 months ago 5
R Question

R code for repeating value into column

I am new to R.

I have a column of repeating codes (numeric or categorical) from an Excel file. I need to add another column of values (they can be random) so that every occurrence of the same code gets the same value.

Codes Value
1 122
1 122
2 155
2 155
2 155
4 101
4 101
5 251
5 251


Thank you.

Answer

We can use match:

n <- length(code0 <- unique(code))
value <- sample(4 * n, n)[match(code, code0)]

or factor:

n <- length(unique(code))
value <- sample(4 * n, n)[factor(code)]

The random integers are drawn without replacement from 1 to 4 * n, so distinct codes get distinct values. The multiplier 4 is arbitrary; you can use a larger number such as 100 instead.
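To attach the values to the data itself, the match approach can be applied to a data frame column. This is a minimal sketch, assuming the Excel sheet has already been read into a data frame; the name `df` is hypothetical, and the column names `Codes` and `Value` are borrowed from the table in the question.

```r
# Hypothetical data frame standing in for the imported Excel sheet
df <- data.frame(Codes = c(1, 1, 2, 2, 2, 4, 4, 5, 5))

code0 <- unique(df$Codes)     # distinct codes, in order of first appearance
n <- length(code0)
lookup <- sample(4 * n, n)    # one random integer per distinct code, no repeats

# match() gives, for each row, the position of its code in code0
df$Value <- lookup[match(df$Codes, code0)]

# Sanity check: every code maps to exactly one value
stopifnot(all(tapply(df$Value, df$Codes, function(v) length(unique(v))) == 1))
```

Keeping `code0` and `lookup` around also gives a small lookup table, which may be handy if new rows with the same codes arrive later.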


Example

set.seed(0); code <- rep(1:5, sample(5))

code
# [1] 1 1 1 1 1 2 2 3 3 3 3 4 4 4 5

n <- length(code0 <- unique(code))
sample(4 * n, n)[match(code, code0)]

# [1]  5  5  5  5  5 18 18 19 19 19 19 12 12 12 11

Comment

The above is the most general treatment: it does not assume that code is sorted or that it takes consecutive values.

If code is sorted (no matter what value it takes), we can also use rle:

if (!is.unsorted(code)) {
  n <- length(k <- rle(code)$lengths)
  value <- rep.int(sample(4 * n, n), k)
}
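To see why rle fits the sorted case, it may help to look at what it returns. The sketch below uses the sorted codes from the question's table; the variable name `runs` is introduced here for illustration only.

```r
code <- c(1, 1, 2, 2, 2, 4, 4, 5, 5)   # sorted codes from the question

runs <- rle(code)
runs$values    # 1 2 4 5  (one entry per run of equal codes)
runs$lengths   # 2 3 2 2  (how many times each code repeats)

k <- runs$lengths
n <- length(k)
# Draw one random value per run, then repeat it to fill that run
value <- rep.int(sample(4 * n, n), k)
```

Because the codes are sorted, each distinct code forms exactly one run, so one random value per run is the same as one random value per code.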

If code takes consecutive values 1, 2, ..., n (but not necessarily sorted), we can skip match or factor and do:

n <- max(code)
value <- sample(4 * n, n)[code]
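A small sketch of this shortcut on unsorted input follows. The `setequal` guard is an addition, not part of the answer above; it simply checks the stated condition that the codes are exactly 1, 2, ..., n before indexing directly.

```r
code <- c(3, 1, 2, 2, 1, 3)   # unsorted, but takes exactly the values 1..3

n <- max(code)
# Added guard (an assumption check): codes must be exactly 1, 2, ..., n
if (setequal(code, seq_len(n))) {
  value <- sample(4 * n, n)[code]   # the code itself serves as the index
}
```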

Note: if code is categorical (character or factor) rather than numeric, the match and factor methods still work.
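As an illustration of the categorical case, here is a short sketch with made-up character codes; the names `value_match` and `value_factor` are hypothetical and only serve to keep the two results apart.

```r
code <- c("b", "a", "b", "c", "a")   # hypothetical character codes

# match method: position of each code among the distinct codes
code0 <- unique(code)
n <- length(code0)
value_match <- sample(4 * n, n)[match(code, code0)]

# factor method: indexing by a factor uses its integer level codes
value_factor <- sample(4 * n, n)[factor(code)]
```

The two results generally differ from each other because each call to sample draws fresh random numbers, but within each result the same code always receives the same value.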