Asiack Asiack - 2 months ago 7
R Question

Converting factor / nominal variables into numeric in R

My question seems to be related to this thread.

However, the method given there does not work for me.

I define a vector from a dataset as:

eduyears1994 <- year1994$q131ed

and receive a vector that looks like:

[1] 17 lat/9 1O lat/3,4 1O lat/3,4 17 lat/9 17 lat/9 12 lat/5,6
1O lat/3,4 1O lat/3,4 12 lat/5,6
9 Levels: Brak formal wykształcenia 4 lata/1 8 lat/2 1O lat/3,4 12 lat/5,6
14 lat/7,8 ... BRAK DANYCH


where e.g. "10 lat" stands for 10 years (of education) and "/3,4" most likely stands for the factor label.

I would simply like to have a numeric variable where I have e.g. "10" instead of "10 years" in the column.

I tried the following and received this warning:


eduyears1994n <- as.numeric(as.character(eduyears1994))

Warning message:

NAs introduced by coercion


I also tried to do it manually:

eduyears1994[eduyears1994== "4 lata/1"] <- 4
eduyears1994[eduyears1994== "2"] <- 8
eduyears1994[eduyears1994== "17 lat"] <- 17


but that produces the warning:


In `[<-.factor`(`*tmp*`, eduyears1994 == "9", value = 17) :

invalid factor level, NA generated


When I open the file in SPSS, I see numbers rather than labels, but the variable is somehow specified as nominal, which might be the cause of the problem.

dput(eduyears1994)
c("17 lat/9", "1O lat/3,4", "1O lat/3,4", "17 lat/9", "17 lat/9",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "14 lat/7,8",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "17 lat/9", "12 lat/5,6",
"12 lat/5,6", "17 lat/9", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "17 lat/9", "1O lat/3,4",
"1O lat/3,4", "14 lat/7,8", "17 lat/9", "1O lat/3,4", "1O lat/3,4",
"12 lat/5,6", "12 lat/5,6", "17 lat/9", "17 lat/9", "17 lat/9",
"17 lat/9", "12 lat/5,6", "12 lat/5,6", "14 lat/7,8", "12 lat/5,6",
"8 lat/2", "1O lat/3,4", "12 lat/5,6", "8 lat/2", "17 lat/9",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"17 lat/9", "8 lat/2", "8 lat/2", "1O lat/3,4", "1O lat/3,4",
"12 lat/5,6", "12 lat/5,6", "17 lat/9", "1O lat/3,4", "14 lat/7,8",
"1O lat/3,4", "14 lat/7,8", "1O lat/3,4", "1O lat/3,4", "17 lat/9",
"12 lat/5,6", "1O lat/3,4", "14 lat/7,8", "1O lat/3,4", "12 lat/5,6",
"12 lat/5,6", "1O lat/3,4", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"17 lat/9", "8 lat/2", "17 lat/9", "17 lat/9", "12 lat/5,6",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"17 lat/9", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4",
"8 lat/2", "8 lat/2", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"4 lata/1", "12 lat/5,6", "1O lat/3,4", "14 lat/7,8", "12 lat/5,6",
"17 lat/9", "12 lat/5,6", "1O lat/3,4", "8 lat/2", "12 lat/5,6",
"17 lat/9", "17 lat/9", "17 lat/9", "1O lat/3,4", "17 lat/9",
"17 lat/9", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "14 lat/7,8", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "8 lat/2", "17 lat/9", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "17 lat/9", "1O lat/3,4",
"8 lat/2", "14 lat/7,8", "1O lat/3,4", "8 lat/2", "1O lat/3,4",
"12 lat/5,6", "12 lat/5,6", "8 lat/2", "17 lat/9", "12 lat/5,6",
"12 lat/5,6", "12 lat/5,6", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "12 lat/5,6", "17 lat/9", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "14 lat/7,8",
"8 lat/2", "8 lat/2", "1O lat/3,4", "1O lat/3,4", "8 lat/2",
"4 lata/1", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"1O lat/3,4", "8 lat/2", "8 lat/2", "14 lat/7,8", "12 lat/5,6",
"8 lat/2", "8 lat/2", "14 lat/7,8", "8 lat/2", "14 lat/7,8",
"17 lat/9", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4", "17 lat/9",
"8 lat/2", "14 lat/7,8", "1O lat/3,4", "17 lat/9", "1O lat/3,4",
"8 lat/2", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4",
"4 lata/1", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "17 lat/9",
"17 lat/9", "17 lat/9", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"1O lat/3,4", "8 lat/2", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"12 lat/5,6", "17 lat/9", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6",
"1O lat/3,4", "17 lat/9", "17 lat/9", "8 lat/2", "12 lat/5,6",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "17 lat/9",
"12 lat/5,6", "1O lat/3,4", "8 lat/2", "1O lat/3,4", "1O lat/3,4",
"17 lat/9", "12 lat/5,6", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "8 lat/2", "17 lat/9",
"1O lat/3,4", "1O lat/3,4", "14 lat/7,8", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "8 lat/2",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "14 lat/7,8", "12 lat/5,6",
"12 lat/5,6", "14 lat/7,8", "1O lat/3,4", "17 lat/9", "17 lat/9",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "17 lat/9",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "17 lat/9", "17 lat/9",
"1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4", "17 lat/9",
"1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "8 lat/2", "12 lat/5,6",
"12 lat/5,6", "14 lat/7,8", "8 lat/2", "14 lat/7,8", "1O lat/3,4",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "8 lat/2", "12 lat/5,6",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "14 lat/7,8", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "8 lat/2", "12 lat/5,6",
"12 lat/5,6", "14 lat/7,8", "12 lat/5,6", "14 lat/7,8", "17 lat/9",
"17 lat/9", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "14 lat/7,8",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "14 lat/7,8",
"12 lat/5,6", "1O lat/3,4", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"8 lat/2", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"14 lat/7,8", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "8 lat/2",
"17 lat/9", "17 lat/9", "8 lat/2", "14 lat/7,8", "1O lat/3,4",
"8 lat/2", "17 lat/9", "17 lat/9", "17 lat/9", "12 lat/5,6",
"17 lat/9", "12 lat/5,6", "12 lat/5,6", "17 lat/9", "1O lat/3,4",
"12 lat/5,6", "12 lat/5,6", "8 lat/2", "1O lat/3,4", "8 lat/2",
"8 lat/2", "8 lat/2", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "8 lat/2", "17 lat/9",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "1O lat/3,4", "8 lat/2",
"17 lat/9", "17 lat/9", "14 lat/7,8", "17 lat/9", "1O lat/3,4",
"17 lat/9", "17 lat/9", "8 lat/2", "1O lat/3,4", "17 lat/9",
"1O lat/3,4", "12 lat/5,6", "8 lat/2", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "17 lat/9", "17 lat/9", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "12 lat/5,6", "8 lat/2", "12 lat/5,6", "8 lat/2",
"8 lat/2", "14 lat/7,8", "8 lat/2", "17 lat/9", "12 lat/5,6",
"1O lat/3,4", "14 lat/7,8", "17 lat/9", "1O lat/3,4", "17 lat/9",
"17 lat/9", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "8 lat/2",
"17 lat/9", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"8 lat/2", "8 lat/2", "1O lat/3,4", "14 lat/7,8", "1O lat/3,4",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "17 lat/9", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"14 lat/7,8", "12 lat/5,6", "8 lat/2", "1O lat/3,4", "12 lat/5,6",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "14 lat/7,8", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "17 lat/9", "12 lat/5,6",
"8 lat/2", "17 lat/9", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"17 lat/9", "12 lat/5,6", "14 lat/7,8", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "12 lat/5,6", "17 lat/9", "1O lat/3,4", "17 lat/9",
"17 lat/9", "12 lat/5,6", "8 lat/2", "1O lat/3,4", "1O lat/3,4",
"17 lat/9", "14 lat/7,8", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "14 lat/7,8", "8 lat/2",
"12 lat/5,6", "12 lat/5,6", "8 lat/2", "8 lat/2", "1O lat/3,4",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "17 lat/9", "8 lat/2", "1O lat/3,4", "17 lat/9",
"17 lat/9", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "8 lat/2",
"1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"8 lat/2", "12 lat/5,6", "14 lat/7,8", "1O lat/3,4", "1O lat/3,4",
"12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "17 lat/9", "17 lat/9",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "12 lat/5,6", "8 lat/2",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "12 lat/5,6", "1O lat/3,4",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "14 lat/7,8", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "17 lat/9", "12 lat/5,6",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "17 lat/9",
"12 lat/5,6", "12 lat/5,6", "8 lat/2", "1O lat/3,4", "8 lat/2",
"12 lat/5,6", "8 lat/2", "17 lat/9", "8 lat/2", "12 lat/5,6",
"1O lat/3,4", "17 lat/9", "1O lat/3,4", "17 lat/9", "12 lat/5,6",
"14 lat/7,8", "17 lat/9", "17 lat/9", "12 lat/5,6", "1O lat/3,4",
"8 lat/2", "8 lat/2", "8 lat/2", "4 lata/1", "12 lat/5,6", "17 lat/9",
"12 lat/5,6", "17 lat/9", "14 lat/7,8", "14 lat/7,8", "1O lat/3,4",
"12 lat/5,6", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4",
"8 lat/2", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6",
"12 lat/5,6", "8 lat/2", "12 lat/5,6", "1O lat/3,4", "8 lat/2",
"8 lat/2", "1O lat/3,4", "8 lat/2", "1O lat/3,4", "14 lat/7,8",
"12 lat/5,6", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4",
"12 lat/5,6", "17 lat/9", "17 lat/9", "12 lat/5,6", "1O lat/3,4",
"17 lat/9", "1O lat/3,4", "12 lat/5,6", "12 lat/5,6", "12 lat/5,6",
"1O lat/3,4", "1O lat/3,4", "8 lat/2", "1O lat/3,4", "17 lat/9",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "12 lat/5,6", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "12 lat/5,6", "8 lat/2", "1O lat/3,4",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "14 lat/7,8", "12 lat/5,6",
"14 lat/7,8", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4",
"1O lat/3,4", "1O lat/3,4", "17 lat/9", "17 lat/9", "1O lat/3,4",
"8 lat/2", "1O lat/3,4", "1O lat/3,4", "8 lat/2", "8 lat/2",
"12 lat/5,6", "12 lat/5,6", "14 lat/7,8", "14 lat/7,8", "1O lat/3,4",
"17 lat/9", "17 lat/9", "12 lat/5,6", "12 lat/5,6", "1O lat/3,4",
"1O lat/3,4", "8 lat/2", "1O lat/3,4", "12 lat/5,6", "8 lat/2",
"1O lat/3,4", "12 lat/5,6", "1O lat/3,4", "1O lat/3,4", "1O lat/3,4",
"12 lat/5,6", "12 lat/5,6", "12 lat/5,6", "8 lat/2", "1O lat/3,4",
"1O lat/3,4", "8 lat/2", "12 lat/5,6", "8 lat/2", "1O lat/3,4",
"12 lat/5,6", "8 lat/2", "1O lat/3,4", "1O lat/3,4", "12 lat/5,6",
"14 lat/7,8", "1O lat/3,4", "17 lat/9", "1O lat/3,4", "1O lat/3,4"
)

Answer

You could try

c(17,8,4)[as.numeric(eduyears1994)]
#[1] 17  4 17  4 17 17  4  4 17 17 17 17  4  8  4  4  8  4  8  8

or

 unname(c('4 lata/1'=4, '2'=8, '17 lat' =17)[as.character(eduyears1994)])
 #[1] 17  4 17  4 17 17  4  4 17 17 17 17  4  8  4  4  8  4  8  8
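Both variants depend on how a factor is stored: integer codes that point into levels(), which are sorted alphabetically by default. A minimal sketch of that, using the three toy labels from the data section (the names f and years are illustrative and do not come from the question):

```r
# Toy factor built from the same three labels as the sample data
f <- factor(c("17 lat", "2", "4 lata/1", "17 lat"))

levels(f)       # "17 lat" "2" "4 lata/1" (sorted alphabetically)
as.integer(f)   # 1 2 3 1 (positions in levels(f))

# Positional lookup: the values must be listed in the order of levels(f)
by_position <- c(17, 8, 4)[as.integer(f)]

# Named lookup: labels are matched by name, so the order does not matter
years <- c("4 lata/1" = 4, "2" = 8, "17 lat" = 17)
by_name <- unname(years[as.character(f)])

by_position     # 17 8 4 17
by_name         # 17 8 4 17
```

The named version is probably the safer of the two, since it keeps working if the levels are reordered or a new level is added.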

If 8 was in fact a typo, you could use

 library(stringi)
 as.numeric(unlist(stri_extract_all_regex(eduyears1994, '^\\d+')))
 #[1] 17  4 17  4 17 17  4  4 17 17 17 17  4  2  4  4  2  4  2  2
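One caveat for the labels shown in the question's dput() output: the tens appear to be written with a capital letter O ("1O lat/3,4"), so a pattern like '^\\d+' would return 1 rather than 10 for them. A base R sketch that works around this, assuming the "O" is a mistyped zero (the vector x below is a hand-picked sample of the question's labels, not the full data):

```r
# A few labels copied from the question's dput() output and level list
x <- c("17 lat/9", "1O lat/3,4", "12 lat/5,6", "8 lat/2", "4 lata/1",
       "BRAK DANYCH")

# Keep everything before the first space, e.g. "1O" or "BRAK"
lead <- sub(" .*$", "", x)

# Assumption: a capital "O" in this leading token stands for a zero
lead <- chartr("O", "0", lead)

# Tokens made up only of digits are converted; anything else becomes NA
lead[!grepl("^[0-9]+$", lead)] <- NA_character_
years <- as.numeric(lead)
years
# [1] 17 10 12  8  4 NA
```

Labels without a leading number, such as "BRAK DANYCH", end up as NA here; whether that is the right coding for them depends on the analysis.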

data

set.seed(21)
eduyears1994 <- factor(sample(c('4 lata/1', 2, '17 lat'), 20, replace=TRUE))