S Das S Das - 1 month ago 11
R Question

Recoding from separate code table

I have a dataset like the following:

dat1 <- read.table(header=TRUE, text="
ID Age Align Weat
8645 15-24 A 1
6228 15-24 B 1
5830 15-24 A 3
1844 25-34 B 1
4461 35-44 B 2
2119 35-44 C 2
2115 45-54 A 1
")
dat1
ID Age Align Weat
1 8645 15-24 A 1
2 6228 15-24 B 1
3 5830 15-24 A 3
4 1844 25-34 B 1
5 4461 35-44 B 2
6 2119 35-44 C 2
7 2115 45-54 A 1


The attributes of the columns
Age
,
Align
, and
Weat
are described in a code dataframe:

dat2 <- read.table(header=TRUE, text="
Code Desc Column
15-24 Young Age
25-34 Young Age
35-44 Middle Age
45-54 Middle Age
A Straight Align
B Curve Align
C Hill Align
1 Clear Weat
2 Cloudy Weat
3 Rain Weat
")
dat2
Code Desc Column
1 15-24 Young Age
2 25-34 Young Age
3 35-44 Middle Age
4 45-54 Middle Age
5 A Straight Align
6 B Curve Align
7 C Hill Align
8 1 Clear Weat
9 2 Cloudy Weat
10 3 Rain Weat


I want to match the code dataframe to get my dataset like the following:

ID Age Align Weat
1 8645 Young Straight Clear
2 6228 Young Curve Clear
3 5830 Young Straight Rain
4 1844 Young Curve Clear
5 4461 Middle Curve Cloudy
6 2119 Middle Hill Cloudy
7 2115 Middle Straight Clear


I am currently using the following codes to do my task, which is not efficient for a large dataset with 500 columns and code table for those columns.

age <- subset(dat2, Column=="Age")
age
Code Desc Column
1 15-24 Young Age
2 25-34 Young Age
3 35-44 Middle Age
4 45-54 Middle Age

align <- subset(dat2, Column=="Align")
align
Code Desc Column
5 A Straight Align
6 B Curve Align
7 C Hill Align

weat <- subset(dat2, Column=="Weat")
weat
Code Desc Column
8 1 Clear Weat
9 2 Cloudy Weat
10 3 Rain Weat

dat1$Age <- age$Desc[match(dat1$Age, age$Code)]
dat1$Align <- align$Desc[match(dat1$Align, align$Code)]
dat1$Weat <- weat$Desc[match(dat1$Weat, weat$Code)]

dat1
ID Age Align Weat
1 8645 Young Straight Clear
2 6228 Young Curve Clear
3 5830 Young Straight Rain
4 1844 Young Curve Clear
5 4461 Middle Curve Cloudy
6 2119 Middle Hill Cloudy
7 2115 Middle Straight Clear

Answer

You can use for loop on your variables in dat1:

# 'intersect' is needed to recode only those columns which have description
for (each_column in intersect(colnames(dat1), dat2$Column)){
    curr_dict = dat2$Column %in% each_column
    code = dat2$Code[curr_dict]
    descr = dat2$Desc[curr_dict]
    dat1[[each_column]] = descr[match(dat1[[each_column]], code)]
}