giacomoV - 3 months ago
R Question

R - greatest common divisor dplyr routine

I need to find the greatest common divisor (gcd) for a set of durations, dur.

My data look like this:

            actrec dur
1  c Personal Care 120
2      c Free Time  10
3      c Free Time  70
4      c Free Time  40
5         b Unpaid  10
6      c Free Time  20
7  c Personal Care  30
8      c Free Time  40
9      c Free Time  40
10     c Free Time  10


I am using the function gcd from the schoolmath library. I loop through my data, compute the gcd of each duration with the previous one, and store the values in the vector v. Finally, I take the min of v (ignoring the unused first element) to find the gcd of my data.

library(schoolmath)

l = length(dt$dur)
v = array(0, l)

for(i in 2:l){
v[i] = gcd(dt$dur[i], dt$dur[i-1])
}

minV = min(v[-1])
minV


Which gives 10.

However, I have trouble translating this routine into dplyr.

I thought of something like this, using lag in place of the for loop:

dt %>% mutate(gcd(dur, lag(dur, 0)))


But it isn't working, and I am unsure how to insert min.

Any clue?

Answer

We can use rowwise to apply the gcd function on each row after taking the lag of 'dur', then extract 'new1', drop its first element, and get the min:

dt %>%
   mutate(dur1 = lag(dur, default = dur[1])) %>%
   rowwise() %>%
   mutate(new1 = gcd(dur, dur1)) %>%
   .$new1 %>%
   tail(., -1) %>%
   min
#[1] 10

Or we can create a vectorized version of 'gcd' with Vectorize and apply it to the 'dur' column:

gcdV <- Vectorize(function(x, y) gcd(x, y))
dt %>%
   mutate(new1 = gcdV(dur, lag(dur, default = dur[1])))

and then get the min as in the solution above.
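For completeness, here is a minimal sketch of the vectorized pipeline carried through to the final min, followed by an alternative that folds the gcd over the whole column with Reduce. It is only an illustration under a few assumptions: the data frame is rebuilt from the 'dur' values in the question, and gcd2 is a hypothetical base-R stand-in (Euclid's algorithm) for schoolmath's gcd, used so the example runs without that package.

```r
library(dplyr)

# Sample data copied from the question (only the dur column is needed here).
dt <- data.frame(dur = c(120, 10, 70, 40, 10, 20, 30, 40, 40, 10))

# Hypothetical stand-in for schoolmath::gcd: Euclid's algorithm for two
# non-negative numbers. Not part of the original answer.
gcd2 <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}
gcdV <- Vectorize(gcd2)

# Pairwise gcd of each duration with the previous one, then the minimum.
pairwise_min <- dt %>%
  mutate(new1 = gcdV(dur, lag(dur, default = dur[1]))) %>%
  slice(-1) %>%   # drop the first row, which is compared with itself
  pull(new1) %>%
  min()

# Alternative: fold gcd over the whole column in one step.
overall <- Reduce(gcd2, dt$dur)

pairwise_min
overall
```

Both values are 10 for this data. Note that the minimum of pairwise gcds is not always the gcd of the whole set: for the durations 6, 10, 15 the pairwise gcds are 2 and 5, so the min is 2, whereas the gcd of all three is 1. The Reduce form avoids that case, so it may be the safer choice if the data can look like that.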