refroll - 1 year ago 119
R Question

# For each row extract the value in the column name that match another value in the cell

I have a question which can be easily solved with a for-loop. However, since I have hundred-thousands rows in a dataframe, this would take very long computational time, and thus I am looking for a quick and smart solution.

For each row in my dataframe, I would like to paste the value of the cell whose column name matches the one from the first column (INDEX)

The dataframe looks like this

``````> mydata
INDEX    1   2    3   4    5   6
1     2 18.9 9.5 22.6 4.7 16.2 7.4
2     2 18.9 9.5 22.6 4.7 16.2 7.4
3     2 18.9 9.5 22.6 4.7 16.2 7.4
4     4 18.9 9.5 22.6 4.7 16.2 7.4
5     4 18.9 9.5 22.6 4.7 16.2 7.4
6     5 18.9 9.5 22.6 4.7 16.2 7.4
``````

Here's the code for reproducing it:

``````mydata <- data.frame(INDEX=c(2,2,2,4,4,5), ONE=(rep(18.9,6)), TWO=(rep(9.5,6)),
THREE=(rep(22.6,6)), FOUR=(rep(4.7,6)), FIVE=(rep(16.2,6)), SIX=(rep(7.4,6)))
colnames(mydata) <- c("INDEX",1,2,3,4,5,6)
``````

And this is the new dataframe with the newly calculated variable:

``````> new_mydf
INDEX    1   2    3   4    5   6 VARIABLE
3     2 18.9 9.5 22.6 4.7 16.2 7.4      9.5
2     2 18.9 9.5 22.6 4.7 16.2 7.4      9.5
1     2 18.9 9.5 22.6 4.7 16.2 7.4      9.5
5     4 18.9 9.5 22.6 4.7 16.2 7.4      4.7
4     4 18.9 9.5 22.6 4.7 16.2 7.4      4.7
6     5 18.9 9.5 22.6 4.7 16.2 7.4     16.2
``````

I solved it using the for-loop here below, but, as I wrote above, I am looking for a more straightforward solution (maybe using packages like dplyr, or other functions?), as the loop is to slow for my extended dataset

``````id = mydata\$INDEX
new_mydf <- data.frame()
for (i in 1:length(id)) {
mydata_row <- mydata[i,]
value <- mydata_row\$INDEX
mydata_row["VARIABLE"] <- mydata_row[,names(mydata_row) == value]
new_mydf <- rbind(mydata_row,new_mydf)
}
new_mydf <- new_mydf[ order(new_mydf[,1]), ]
``````

Based on your loop, this use of `apply` with an anonymous function may be faster (with your `mydata` initial definition) :

``````mydata\$VARIABLE<-apply(mydata, 1, function(x) { x[names(x)==x[names(x)=="INDEX"]] })
``````

Edit : And it works even with `INDEX` in characters :

``````mydata <- data.frame(INDEX=c("B","B","B","D","D","E"), "A"=(rep(18.9,6)), "B"=(rep(9.5,6)),
"C"=(rep(22.6,6)), "D"=(rep(4.7,6)), "E"=(rep(16.2,6)), "F"=(rep(7.4,6)))

mydata\$VARIABLE<-apply(mydata, 1, function(x) { x[names(x)==x[names(x)=="INDEX"]] })
``````

```> mydata INDEX A B C D E F VARIABLE 1 B 18.9 9.5 22.6 4.7 16.2 7.4 9.5 2 B 18.9 9.5 22.6 4.7 16.2 7.4 9.5 3 B 18.9 9.5 22.6 4.7 16.2 7.4 9.5 4 D 18.9 9.5 22.6 4.7 16.2 7.4 4.7 5 D 18.9 9.5 22.6 4.7 16.2 7.4 4.7 6 E 18.9 9.5 22.6 4.7 16.2 7.4 16.2```

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download