Laura R. - 1 month ago
R Question

Change numeric code of two variables set differently in two df in r

I use R. I hope my question will not be considered too "stupid", but I really can't understand the error I am making.

I have a national survey covering 2002 to 2014. Each year it asks for the size (number of workers) of the company where the interviewee works.
A numeric code (1, 2, ...) is associated with each size class. From 2002 to 2006 there are six size classes, whereas from 2008 to 2014 there are seven:

2002-2006                   2008-2014
0-4 workers     -> 1        0-4 workers     -> 1
5-19 workers    -> 2        5-15 workers    -> 2
20-49 workers   -> 3        16-19 workers   -> 3
50-99 workers   -> 4        20-49 workers   -> 4
100-499 workers -> 5        50-99 workers   -> 5
>500 workers    -> 6        100-499 workers -> 6
                            >500 workers    -> 7


First, I changed code 3 (16-19 workers) in the 2008-14 data to code 2, so that code 2 covers the same size class (5-19 workers) as in 2002-06. Here is a small example data set:

d.d <- data.frame(id=c(1,2,3,4,5,6), yr=c("2002", "2004", "2006", "2008", "2010", "2014"), dim=c(1,2,3,3,4,7))

It looks like this:

id yr dim
1 2002 1
2 2004 2
3 2006 3
4 2008 3
5 2010 4
6 2014 7


The desired output is:

id yr dim
1 2002 1
2 2004 2
3 2006 3
4 2008 2
5 2010 3
6 2014 6


COMMAND 1

d.d$dim2 <- ifelse(d.d$dim=="3" & d.d$yr=="2008",2,
            ifelse(d.d$dim=="3" & d.d$yr=="2010",2,
            ifelse(d.d$dim=="3" & d.d$yr=="2012",2,
            ifelse(d.d$dim=="3" & d.d$yr=="2014",2,
                   d.d$dim))))


Here dim is the company size code and yr is the year. This correctly changes class 3 to class 2 for the years 2008 to 2014.

Since the remaining codes still do not refer to the same size class (in 2002-06 code 3 means 20-49 workers, whereas in 2008-14 code 4 means 20-49 workers), I tried to align the codes in the same way:

COMMAND 2

d.d$dim2 <- ifelse(d.d$dim=="4" & d.d$yr=="2008",3,
            ifelse(d.d$dim=="4" & d.d$yr=="2010",3,
            ifelse(d.d$dim=="4" & d.d$yr=="2012",3,
            ifelse(d.d$dim=="4" & d.d$yr=="2014",3,
                   d.d$dim))))


I noticed that the second command also changes the value that COMMAND 1 had already changed:

RESULT WITH COMMAND 1

d.d

id yr dim dim2
1 1 2002 1 1
2 2 2004 2 2
3 3 2006 3 3
**4 4 2008 3 2**
5 5 2010 4 4
6 6 2014 7 7


RESULT AFTER APPLYING COMMAND 2 (AFTER COMMAND 1)

d.d

id yr dim dim2
1 1 2002 1 1
2 2 2004 2 2
3 3 2006 3 3
**4 4 2008 3 3**
5 5 2010 4 3
6 6 2014 7 7


I can't understand the error.

Answer

Try this:

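# Convert both columns to numeric so they can be compared with >=
d.d$yr <- as.numeric(d.d$yr)
d.d$dim <- as.numeric(d.d$dim)

# Rows from 2008 onward whose code is 3 or more: shift the code down by 1
sel <- d.d$dim >= 3 & d.d$yr >= 2008
d.d$dim[sel] <- d.d$dim[sel] - 1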

First, change the year and dim information to numeric. This will simplify the condition for the subset you want modified.

Then subtract 1 from dim in every row where dim is 3 or more and the year is 2008 or later.
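
Applied to the sample data frame from the question, the two steps might look like the sketch below. It assumes yr is stored as character (hence stringsAsFactors = FALSE), and it writes the result to a new column dim2 rather than overwriting dim; the helper name shift is only illustrative.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps yr as character
d.d <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                  yr = c("2002", "2004", "2006", "2008", "2010", "2014"),
                  dim = c(1, 2, 3, 3, 4, 7),
                  stringsAsFactors = FALSE)

# Step 1: make the year numeric so it can be compared with >=
d.d$yr <- as.numeric(d.d$yr)

# Step 2: flag the rows whose code has to move down by one class
shift <- d.d$dim >= 3 & d.d$yr >= 2008

# Keep the original codes in dim and store the harmonised codes in dim2
d.d$dim2 <- ifelse(shift, d.d$dim - 1, d.d$dim)

d.d$dim2
# [1] 1 2 3 2 3 6
```

The values in dim2 match the desired output given in the question, and the original dim column stays available for checking.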

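If year or dim are factors, convert them with as.numeric(as.character(...)) instead, because as.numeric() applied directly to a factor returns the internal level codes (1, 2, 3, ...) rather than the values shown in the labels.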
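
As for why COMMAND 2 seemed to undo COMMAND 1 in the question: the final fallback in both commands is d.d$dim, the original column. COMMAND 2 therefore rebuilds dim2 from the untouched codes and discards what COMMAND 1 wrote. A minimal sketch of one way to chain the two recodes, assuming yr is character and using a hypothetical helper vector named late, is to let the second step fall back to dim2:

```r
# Sample data from the question; stringsAsFactors = FALSE keeps yr as character
d.d <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                  yr = c("2002", "2004", "2006", "2008", "2010", "2014"),
                  dim = c(1, 2, 3, 3, 4, 7),
                  stringsAsFactors = FALSE)

# TRUE for the survey years that use the seven-class coding
late <- d.d$yr %in% c("2008", "2010", "2012", "2014")

# COMMAND 1, condensed: code 3 becomes 2 in the later years
d.d$dim2 <- ifelse(d.d$dim == 3 & late, 2, d.d$dim)

# COMMAND 2 with the fallback changed from d.d$dim to d.d$dim2,
# so the recode made by COMMAND 1 is kept instead of being overwritten
d.d$dim2 <- ifelse(d.d$dim == 4 & late, 3, d.d$dim2)

d.d$dim2
# [1] 1 2 3 2 3 7
```

Codes 5, 6 and 7 would still need the same treatment (the last row is still 7 here), which is why subtracting 1 from every code of 3 or more is the shorter route.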