DRozen - 1 year ago
R Question

R: Looping through multiple columns and using all columns in a function?

In R, how do I loop through multiple columns and apply a custom function that takes one argument from each of those columns and modifies the columns accordingly?

For example, I have the following data frame:

> head(runTimeSep)
  hours   h minutes  min
1    70 min      NA <NA>
2    21 min      NA <NA>
3   106 min      NA <NA>
4    75 min      NA <NA>
5    14 min      NA <NA>
6    82 min      NA <NA>
7     1   h      11  min

My goal is to end up with the total minutes for each row in the hours column. If the h column contains "h" (for example, a row reading 1 h), then convert the hours to minutes and add the minutes from the minutes column (or add nothing if it's a whole hour with NA in the minutes column).

Therefore I have created the following function to apply to the dataframe:

# convert hours to minutes function
hoursToMins = function(hours, h, minutes, min) {
  if (h == 'h' && min == "min") {
    hours = as.numeric(hours)*60 + as.numeric(minutes)
  }
  if (h == "h" && min != "min") {
    hours = as.numeric(hours)*60
  }
  hours
}

How do I apply this function across all columns in the data frame? E.g. with lapply, ddply, etc.
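One way to apply a scalar function like this row by row is mapply, which passes one element from each column per call. The following is only a sketch: hoursToMinsSafe is a hypothetical NA-tolerant variant of the function above (not the asker's code), and the three-row data frame rt is made up for illustration.

```r
# Hypothetical NA-tolerant variant of the function above
hoursToMinsSafe <- function(hours, h, minutes, min) {
  total <- as.numeric(hours)
  # Guard each comparison with !is.na() so if() never sees NA
  if (!is.na(h) && h == "h") total <- total * 60
  if (!is.na(min) && min == "min") total <- total + as.numeric(minutes)
  total
}

# Made-up sample: minutes only, a whole hour, an hour plus minutes
rt <- data.frame(
  hours = c(70, 2, 1), h = c("min", "h", "h"),
  minutes = c(NA, NA, 11), min = c(NA, NA, "min"),
  stringsAsFactors = FALSE
)

# mapply() calls the function once per row, one argument per column
rt$total <- mapply(hoursToMinsSafe, rt$hours, rt$h, rt$minutes, rt$min)
rt$total
```

For a data frame of any real size, a vectorised expression will usually be faster than calling a function once per row.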

Edit: I also attempted the following:

finalRunTime = ifelse(runTimeSep$h == "h", runTimeSep$hours*60, runTimeSep$hours)
runTimeSep$hours = finalRunTime

which worked fine. But when I tried to apply the second round of ifelse:

finalRunTime = ifelse(runTimeSep$min == "min", runTimeSep$hours + runTimeSep$minutes, runTimeSep$hours)
runTimeSep$hours = finalRunTime

the second round turns the else case (rows with no value in the min column) into NA. Please help. Thanks.
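A likely explanation for the NA results: where min is NA, the test min == "min" is itself NA, and ifelse returns NA for an NA test instead of falling through to the else value. The sketch below reproduces this on a made-up two-row data frame rt (an assumption, modelled on the first and last rows of the printout) and shows one possible workaround using %in%, which never returns NA.

```r
# Hypothetical copy of the first and last rows shown in the printout
rt <- data.frame(
  hours   = c(70, 1),
  h       = c("min", "h"),
  minutes = c(NA, 11),
  min     = c(NA, "min"),
  stringsAsFactors = FALSE
)

# First round: convert hours to minutes where h is "h"
rt$hours <- ifelse(rt$h == "h", rt$hours * 60, rt$hours)

# The test is NA for row 1, so ifelse() returns NA for that row
naive <- ifelse(rt$min == "min", rt$hours + rt$minutes, rt$hours)

# %in% yields FALSE instead of NA, so row 1 takes the else branch
fixed <- ifelse(rt$min %in% "min", rt$hours + rt$minutes, rt$hours)
```

Here naive has NA in its first position, while fixed keeps the original 70 for that row.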

In response to @Sandipan's answer:
How do I use which to distinguish whether the min column is 'min' or NA?

I tried:

indices <- which(runTimeSep$h == 'h' && runTimeSep$min != 'min')
runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours

indices <- which(runTimeSep$h == 'h' && runTimeSep$min == 'min')
runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours +

However, both sets of indices came back empty.
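A probable cause of the empty index sets: && is the scalar operator (older R versions compare only the first elements, and recent versions raise an error for longer vectors), whereas which needs the element-wise &. In addition, min != 'min' evaluates to NA where min is NA, and which drops NA entries. A minimal sketch, assuming a made-up three-row data frame rt:

```r
# Hypothetical sample: minutes only, a whole hour, an hour plus minutes
rt <- data.frame(
  hours   = c(70, 2, 1),
  h       = c("min", "h", "h"),
  minutes = c(NA, NA, 11),
  min     = c(NA, NA, "min"),
  stringsAsFactors = FALSE
)

# Element-wise & combined with is.na() picks out whole hours (no minutes part)
whole <- which(rt$h == "h" & is.na(rt$min))
# Hours that also carry a minutes part
mixed <- which(rt$h == "h" & !is.na(rt$min) & rt$min == "min")

rt$hours[whole] <- 60 * rt$hours[whole]
rt$hours[mixed] <- 60 * rt$hours[mixed] + rt$minutes[mixed]
rt$hours
```

Both index sets are computed before either assignment, so the second update still sees the original hour values.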

42-
Answer

This gives you a vector of minutes by row; if you want the total, just wrap sum() around it:

with( dat, (h=="h")*60*hours + (h=="min")*hours +
           ifelse( is.na(minutes), 0, minutes) )

[1]  70  21 106  75  14  82  71

It substitutes 0 for NA when minutes is NA.
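To try the expression end to end, here is a self-contained sketch. The data frame dat is an assumed reconstruction of the seven rows printed in the question, not the asker's actual object. The expression works because TRUE and FALSE behave as 1 and 0 in arithmetic, so each row keeps exactly one of the first two terms.

```r
# Assumed reconstruction of the seven rows printed in the question
dat <- data.frame(
  hours   = c(70, 21, 106, 75, 14, 82, 1),
  h       = c("min", "min", "min", "min", "min", "min", "h"),
  minutes = c(NA, NA, NA, NA, NA, NA, 11),
  min     = c(NA, NA, NA, NA, NA, NA, "min"),
  stringsAsFactors = FALSE
)

# (h == "h") is 1 for hour rows and 0 otherwise; (h == "min") is the reverse
total <- with(dat, (h == "h") * 60 * hours + (h == "min") * hours +
                   ifelse(is.na(minutes), 0, minutes))

total       # minutes per row
sum(total)  # grand total across all rows
```

Note that this version relies on h never being NA; a row with a missing h would produce NA in the result.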