Gab_27 - 2 months ago
R Question

How to create an index from a variable in a dataframe

I have a data frame (all_data) in which I have a list of sites (1 to n) and their scores, e.g.:

site score
1 10
1 11
1 12
4 10
4 11
4 11
8 9
8 8
8 7


What I want to do is create another column in the data frame that numbers each site in numerical order, e.g. from 1 to 3 in the above example. So all_data would look like:

site score number
1 10 1
1 11 1
1 12 1
4 10 2
4 11 2
4 11 2
8 9 3
8 8 3
8 7 3


I am sure this must be easily solved, but I have not found a way yet.

Answer

Try the following, where Data stands for your data frame (all_data in the question):

Data$number <- as.numeric(as.factor(Data$site))
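
As a quick check against the example in the question, here is a minimal sketch. It rebuilds all_data by hand (the values are copied from the question's table) and applies the same as.factor idea, so treat it as an illustration rather than the only way to do it.

```r
# Rebuild the example data from the question
all_data <- data.frame(
  site  = c(1, 1, 1, 4, 4, 4, 8, 8, 8),
  score = c(10, 11, 12, 10, 11, 11, 9, 8, 7)
)

# Factor levels are sorted, so the underlying integer codes
# follow the numerical order of the site values (1, 4, 8 -> 1, 2, 3)
all_data$number <- as.numeric(as.factor(all_data$site))

print(all_data)
```

The new number column runs 1, 1, 1, 2, 2, 2, 3, 3, 3, matching the layout asked for in the question.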

On a side note: the difference between my solution and @Chase's on the one hand, and @DWin's on the other, is the ordering of the numbers. Both as.factor and factor automatically sort the levels, whereas @DWin's solution (match against unique) numbers the sites in order of first appearance:

Dat <- data.frame(site = rep(c(1,8,4), each = 3), score = runif(9))

Dat$number <- as.numeric(factor(Dat$site))
Dat$sitenum <- match(Dat$site, unique(Dat$site) ) 

Gives

> Dat
  site     score number sitenum
1    1 0.7377561      1       1
2    1 0.3131139      1       1
3    1 0.7862290      1       1
4    8 0.4480387      3       2
5    8 0.3873210      3       2
6    8 0.8778102      3       2
7    4 0.6916340      2       3
8    4 0.3033787      2       3
9    4 0.6552808      2       3
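
If you want the order-of-first-appearance numbering but would still rather go through a factor, one option is to pass the levels explicitly. This is only a sketch and not part of the original answers; the column name byfirst is made up for illustration.

```r
# Same layout as above: sites appear in the order 1, 8, 4
# (the random scores do not affect the numbering)
Dat <- data.frame(site = rep(c(1, 8, 4), each = 3), score = runif(9))

# Default: levels are sorted (1, 4, 8), so site 8 gets number 3
Dat$number <- as.numeric(factor(Dat$site))

# Explicit levels in order of first appearance (1, 8, 4),
# so site 8 gets number 2; 'byfirst' is a hypothetical column name
Dat$byfirst <- as.numeric(factor(Dat$site, levels = unique(Dat$site)))

# match() against unique() yields the same appearance-order numbering
Dat$sitenum <- match(Dat$site, unique(Dat$site))

print(Dat)
```

When the sites are already sorted in the data frame, as in the question, all three columns come out identical, so the choice only matters for unsorted data.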