Graciela Carrillo - 2 months ago
R Question

Matching "uniqid" with corresponding sex and age.

I have a data frame 'imcds' from a survey in which the householder reported the sex and age of every person in the household. The householder is Person 1 and the other household members are Person 2, 3, 4, and so on:

uniqid Age1 Age2 Age3 Sex1 Sex2 Sex3

1012501 9 7 5 1 2 1
1012502 9 7 5 1 2 1
1012503 9 7 5 1 2 1
1012601 8 5 NA 2 1 NA
1012602 8 5 NA 2 1 NA


The first five digits of 'uniqid' are the household ID and the last two are the person identifier. Therefore, the Age value for person 1012503 is 'Age3' (5), and the Sex value is 'Sex3' (1). What I want to do is reshape the data frame 'imcds' into something like this:

uniqid Age Sex

1012501 9 1
1012502 7 2
1012503 5 1
1012601 8 2
1012602 5 1


Each 'uniqid' should be paired with its corresponding 'Sex' and 'Age' values. The data frame has 2095 observations of 583 variables. Do I need a loop? What can I do?

Answer

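No loop is needed. We extract characters 6 to 7 of the 'uniqid' column (the person number) and pair them with the row numbers to create a two-column row/column index ('ind'). Indexing the 'Age' columns and the 'Sex' columns with 'ind' extracts one element per row, and we cbind the results with the first column of the dataset. Here 'df1' stands for the data frame from the question.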

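# row/column index: row number paired with the person number (characters 6-7 of uniqid)
ind <- cbind(1:nrow(df1), as.numeric(substr(df1$uniqid, 6,7)))
# pick one Age and one Sex value per row using the index matrix
Age <- df1[grep("Age", names(df1))][ind]
Sex <- df1[grep("Sex", names(df1))][ind]
# combine with the uniqid column
df2 <- cbind(df1[1], Age, Sex)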
df2
#   uniqid Age Sex
#1 1012501   9   1
#2 1012502   7   2
#3 1012503   5   1
#4 1012601   8   2
#5 1012602   5   1
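
For anyone who wants to try this on the sample data, here is a minimal self-contained sketch of the same matrix-indexing idea. The data frame is typed in from the question. The anchored patterns ("^Age[0-9]+$", "^Sex[0-9]+$") and the sorting by numeric suffix are my own assumptions, added because a frame with 583 variables might contain other columns whose names include "Age" or "Sex", or might not store the columns in person order. Adjust them to the real column names.

```r
# Sample data copied from the question (the asker's frame is called imcds).
df1 <- data.frame(
  uniqid = c(1012501, 1012502, 1012503, 1012601, 1012602),
  Age1 = c(9, 9, 9, 8, 8),
  Age2 = c(7, 7, 7, 5, 5),
  Age3 = c(5, 5, 5, NA, NA),
  Sex1 = c(1, 1, 1, 2, 2),
  Sex2 = c(2, 2, 2, 1, 1),
  Sex3 = c(1, 1, 1, NA, NA)
)

# Person number: characters 6 to 7 of the ID.
person <- as.integer(substr(as.character(df1$uniqid), 6, 7))

# Assumed naming scheme: "Age" or "Sex" followed only by digits.
age_cols <- grep("^Age[0-9]+$", names(df1), value = TRUE)
sex_cols <- grep("^Sex[0-9]+$", names(df1), value = TRUE)

# Sort by the numeric suffix so that column k belongs to person k.
age_cols <- age_cols[order(as.integer(sub("^Age", "", age_cols)))]
sex_cols <- sex_cols[order(as.integer(sub("^Sex", "", sex_cols)))]

# Two-column index matrix: (row number, person number).
ind <- cbind(seq_len(nrow(df1)), person)

# Indexing a matrix with a two-column matrix returns one element per row.
df2 <- data.frame(
  uniqid = df1$uniqid,
  Age = as.matrix(df1[age_cols])[ind],
  Sex = as.matrix(df1[sex_cols])[ind]
)
df2
```

This assumes every person number has a matching Age and Sex column. A person number larger than the number of columns would raise a subscript error, so it may be worth checking max(person) against length(age_cols) first.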