Phdaml - 1 month ago
R Question

as.numeric(as.factor(x)): why does this expression re-rank a non-consecutive sequence?

I have a grouping variable whose numeric labels are not consecutive:

user_id <- c(2, 5, 7, 9)


I want to relabel user_id with consecutive numbers. The following code works, but I want to know why. Is there any other way to do it?

new_id <- as.numeric(as.factor(user_id))
new_id
# [1] 1 2 3 4

Answer

If each ID occurs only once, you can use seq_along(user_id), which numbers the elements by position, to create the new ID:

user_id <- c(2,5,7,9)
new_id  <- seq_along(user_id)
# [1] 1 2 3 4
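
On the "why" part of the question: a factor stores its distinct values as sorted levels and represents each element internally as an integer code pointing into those levels. Converting the factor with as.numeric() returns those codes rather than the original values, which is why the result is 1, 2, 3, 4. A minimal sketch in base R; the variable names f, lvls, codes, shuffled and restored are only for illustration:

```r
user_id <- c(2, 5, 7, 9)

# as.factor() collects the distinct values, sorts them, and stores them as levels
f    <- as.factor(user_id)
lvls <- levels(f)
lvls
# [1] "2" "5" "7" "9"

# Each element is stored as the position of its value among the levels
codes <- as.integer(f)
codes
# [1] 1 2 3 4

# The codes follow the sorted order of the values, not their position in the vector
shuffled <- as.numeric(as.factor(c(9, 2, 7, 5)))
shuffled
# [1] 4 1 3 2

# To get the original values back, convert the levels, not the factor itself
restored <- as.numeric(levels(f))[f]
restored
# [1] 2 5 7 9
```

Note that this differs from seq_along() when the input is not sorted: seq_along() numbers by position, while the factor codes number by rank of the value.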

EDIT

As a follow-up to the comment by @MatthewLundberg, here is a version that takes duplicate user IDs into account, using the dplyr function dense_rank. It assumes that duplicates should get the same new_id.

library(dplyr)

user_id <- c(2, 5, 7, 9, 2, 2, 7)
new_id  <- dense_rank(user_id)
new_id
# [1] 1 2 3 4 1 1 3
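
If you would rather not load dplyr, the same relabelling can be sketched in base R. The two variants below are my own suggestion rather than part of the original answer, and the names id_factor and id_match are only for illustration. For this input, both give the same result as dense_rank:

```r
user_id <- c(2, 5, 7, 9, 2, 2, 7)

# Variant 1: the factor codes from the question; duplicates share a level
id_factor <- as.integer(factor(user_id))
id_factor
# [1] 1 2 3 4 1 1 3

# Variant 2: look up each ID's position among the sorted distinct IDs
id_match <- match(user_id, sort(unique(user_id)))
id_match
# [1] 1 2 3 4 1 1 3
```

The match() version makes the lookup explicit: sort(unique(user_id)) is the table of distinct IDs, and each element is replaced by its index in that table.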