DRozen - 1 year ago 86

R Question

In R: How do I loop through multiple columns and use a custom made function that takes in an argument from each of those columns and modifies those columns accordingly?

For example I have the following dataframe:

    > head(runTimeSep)
      hours   h minutes  min
    1    70 min      NA <NA>
    2    21 min      NA <NA>
    3   106 min      NA <NA>
    4    75 min      NA <NA>
    5    14 min      NA <NA>
    6    82 min      NA <NA>
    7     1   h      11  min

My goal is to end up with the total number of minutes in the hours column. If a row has "h" in the h column (e.g. "1 h"), the hours should be converted to minutes and the value from the minutes column added on (or nothing added if it's an exact hour with NA in the minutes column).

Therefore I have created the following function to apply to the dataframe:

    # convert hours to minutes function
    hoursToMins = function(hours, h, minutes, min) {
      if (h == 'h' && min == "min") {
        (hours = as.numeric(hours)*60+as.numeric(minutes))
      }
      if (h=="h" && min != "min") {
        (hours = as.numeric(hours)*60)
      }
    }

How do I apply this function across all columns in the data frame? E.g. with lapply, ddply, etc.

Edit: I also attempted the following:

    finalRunTime = ifelse(runTimeSep$h == "h", runTimeSep$hours*60, runTimeSep$hours)
    head(finalRunTime)
    runTimeSep$hours = finalRunTime

which worked fine. But when I tried to apply the second round of ifelse:

    finalRunTime = ifelse(runTimeSep$min == "min", runTimeSep$hours + runTimeSep$minutes, runTimeSep$hours)
    head(finalRunTime)
    runTimeSep$hours = finalRunTime

The second round turns the else case (rows where the min column is NA) into NA. Please help. Thanks.
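A possible explanation for the NA, as a small sketch rather than a confirmed diagnosis: `NA == "min"` evaluates to NA, and `ifelse()` returns NA wherever its test is NA, so the else branch is never reached for those rows. The two-row data frame below is a hypothetical stand-in for the real data (one minutes-only row, and the "1 h 11 min" row after the first conversion); testing with `%in%` instead of `==` is one way to get FALSE rather than NA.

```r
# Hypothetical two-row stand-in for runTimeSep after the first ifelse():
# row 1 is a plain "70 min" row, row 2 is "1 h 11 min" with hours already * 60.
runTimeSep <- data.frame(
  hours   = c(70, 60),
  minutes = c(NA, 11),
  min     = c(NA, "min"),
  stringsAsFactors = FALSE
)

# NA == "min" is NA, so ifelse() yields NA for row 1.
naive <- ifelse(runTimeSep$min == "min",
                runTimeSep$hours + runTimeSep$minutes,
                runTimeSep$hours)

# %in% returns FALSE (not NA) for missing values, so row 1 takes the else branch.
fixed <- ifelse(runTimeSep$min %in% "min",
                runTimeSep$hours + runTimeSep$minutes,
                runTimeSep$hours)
```

Here `naive` has NA in its first element, while `fixed` keeps the original 70.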

In response to @Sandipan's answer:

How do I use `which` to discriminate whether the min column is 'min' or NA?

I tried:

    indices <- which(runTimeSep$h == 'h' && runTimeSep$min != 'min')
    runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours
    indices <- which(runTimeSep$h == 'h' && runTimeSep$min == 'min')
    runTimeSep[indices,]$hours <- 60*runTimeSep[indices, ]$hours +
      runTimeSep[indices,]$minutes

However, both sets of indices came back empty.
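One likely reason for the empty index sets: `&&` is not element-wise. In older versions of R it looks only at the first element of each side (recent versions raise an error for longer vectors), and the first row has "min" in the h column, so the whole test is FALSE. The sketch below uses the element-wise `&` instead, and `%in%` so that NA in the min column counts as "not min". The data frame is a hypothetical reconstruction from the printout above, and `hasMin`, `onlyH` and `hAndMin` are names made up for illustration.

```r
# Hypothetical reconstruction of the data frame shown in the question.
runTimeSep <- data.frame(
  hours   = c(70, 21, 106, 75, 14, 82, 1),
  h       = c("min", "min", "min", "min", "min", "min", "h"),
  minutes = c(NA, NA, NA, NA, NA, NA, 11),
  min     = c(NA, NA, NA, NA, NA, NA, "min"),
  stringsAsFactors = FALSE
)

# %in% never returns NA, so rows with NA in `min` become FALSE here.
hasMin <- runTimeSep$min %in% "min"

# Element-wise `&` gives one logical value per row for which() to index.
onlyH   <- which(runTimeSep$h == "h" & !hasMin)  # whole hours, no minutes
hAndMin <- which(runTimeSep$h == "h" & hasMin)   # hours plus minutes

runTimeSep$hours[onlyH]   <- 60 * runTimeSep$hours[onlyH]
runTimeSep$hours[hAndMin] <- 60 * runTimeSep$hours[hAndMin] +
  runTimeSep$minutes[hAndMin]

runTimeSep$hours
```

With this sample only row 7 matches, and its hours value becomes 71.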

Answer Source

This would give you a vector of minutes by row; if you want the total, just wrap `sum()` around it:

```
# dat is the data frame (runTimeSep in the question); hours and minutes must be numeric
with(dat, (h == "h") * 60 * hours + (h == "min") * hours +
            ifelse(is.na(minutes), 0, minutes))
# [1]  70  21 106  75  14  82  71
```

It substitutes 0 for NA when minutes is NA.
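If you would rather keep a scalar function along the lines of the one in the question, `mapply()` can call it once per row by walking the columns in parallel. This is only a sketch: `dat` is rebuilt here from the printed sample, and `toMinutes` is a hypothetical helper, not the original `hoursToMins`.

```r
# Hypothetical stand-in for the question's data (values copied from the printout).
dat <- data.frame(
  hours   = c(70, 21, 106, 75, 14, 82, 1),
  h       = c("min", "min", "min", "min", "min", "min", "h"),
  minutes = c(NA, NA, NA, NA, NA, NA, 11),
  stringsAsFactors = FALSE
)

# Scalar helper (hypothetical name): one row's values in, total minutes out.
toMinutes <- function(hours, h, minutes) {
  extra <- if (is.na(minutes)) 0 else minutes   # treat missing minutes as 0
  if (h == "h") hours * 60 + extra else hours + extra
}

# mapply() passes the i-th element of each column to the i-th call.
totalMins <- mapply(toMinutes, dat$hours, dat$h, dat$minutes)
totalMins        # 70 21 106 75 14 82 71
sum(totalMins)   # 439
```

The vectorised expression above is usually the faster choice on large data; the `mapply()` form mainly helps when the per-row logic is easier to read as plain `if`/`else`.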