RHA RHA - 10 days ago 6
R Question

Calculating distance between grouped locations (transects)

I have a data.frame with GPS locations of transects. Here's a small part of it (output of dput):

mydf <- structure(list(X = c(432532.3435, 432533.3316, 432534.3198, 432535.3068,
432536.339, 432528.3127, 432529.2956, 432530.2271, 432531.2019,
432532.1574, 432533.1353, 432534.0987, 432703.2786, 432702.2761,
432701.4092, 432700.3743, 432699.4523), Y = c(179892.6113, 179892.7918,
179892.9953, 179893.2271, 179893.3646, 179931.3134, 179931.5124,
179931.7763, 179932.0264, 179932.256, 179932.5104, 179932.7853,
179432.1222, 179432.2754, 179432.5235, 179432.7024, 179432.9146
), plot_raai = c("F1", "F1", "F1", "F1", "F1", "F6", "F6", "F6",
"F6", "F6", "F6", "F6", "A3", "A3", "A3", "A3", "A3")), .Names = c("X",
"Y", "plot_raai"), row.names = c(1L, 2L, 3L, 4L, 5L, 200L, 201L,
202L, 203L, 204L, 205L, 206L, 1039L, 1040L, 1041L, 1042L, 1043L
), class = "data.frame")


I want to add a column with the distance from every row (location) to the first row of its transect. The expected result would be:

            X        Y plot_raai     dist
1    432532.3 179892.6        F1 0.000000
2    432533.3 179892.8        F1 1.004451
3    432534.3 179893.0        F1 2.013260
4    432535.3 179893.2        F1 3.026608
5    432536.3 179893.4        F1 4.065892
200  432528.3 179931.3        F6 0.000000
201  432529.3 179931.5        F6 1.002843
202  432530.2 179931.8        F6 1.969569
203  432531.2 179932.0        F6 2.975877
204  432532.2 179932.3        F6 3.958562
205  432533.1 179932.5        F6 4.968931
206  432534.1 179932.8        F6 5.970284
1039 432703.3 179432.1        A3 0.000000
1040 432702.3 179432.3        A3 1.014138
1041 432701.4 179432.5        A3 1.911988
1042 432700.4 179432.7        A3 2.961687
1043 432699.5 179432.9        A3 3.907489


Here's what I have tried:

#created distance function (Pythagoras)
distance <- function(x1,y1,x2,y2) {sqrt((x2-x1)^2+(y2-y1)^2)}

#applied that to the rows with sapply (however, no grouping yet)
sapply(2:nrow(mydf), function(x) distance(mydf$X[x],mydf$Y[x],mydf$X[1], mydf$Y[1]))

#then tried grouping using dplyr
library(dplyr)
test1 <- mydf %>%
  group_by(., plot_raai) %>%
  mutate(dist = c(0, sapply(2:nrow(.), function(x)
    distance(X[x], Y[x], X[1], Y[1]))))


However, this calculates distances to the first row in the data frame, not the first row in the group:

          X        Y plot_raai       dist
1  432532.3 179892.6        F1   0.000000
2  432533.3 179892.8        F1   1.004451
3  432534.3 179893.0        F1   2.013260
4  432535.3 179893.2        F1   3.026608
5  432536.3 179893.4        F1   4.065892
6  432528.3 179931.3        F6  38.911437
7  432529.3 179931.5        F6  39.020319
8  432530.2 179931.8        F6  39.222141
9  432531.2 179932.0        F6  39.431629
10 432532.2 179932.3        F6  39.645137
11 432533.1 179932.5        F6  39.906956
12 432534.1 179932.8        F6  40.212324
13 432703.3 179432.1        A3 491.191429
14 432702.3 179432.3        A3 490.699734
15 432701.4 179432.5        A3 490.167313
16 432700.4 179432.7        A3 489.643284
17 432699.5 179432.9        A3 489.128211


I know it must be simple, but I've tried several other ways and have been struggling with this for over an hour. Can anyone help me with this?

Answer

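Your distance() function is already vectorised, so you don't need sapply. Inside a grouped mutate(), first(X) and first(Y) give the first coordinates of the current group, and X and Y are the group's columns. Try this:

mydf %>%
  group_by(plot_raai) %>%
  mutate(dist = distance(first(X), first(Y), X, Y))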

It gives me:

          X        Y plot_raai     dist
      <dbl>    <dbl>     <chr>    <dbl>
1  432532.3 179892.6        F1 0.000000
2  432533.3 179892.8        F1 1.004451
3  432534.3 179893.0        F1 2.013260
4  432535.3 179893.2        F1 3.026608
5  432536.3 179893.4        F1 4.065892
6  432528.3 179931.3        F6 0.000000
7  432529.3 179931.5        F6 1.002843
8  432530.2 179931.8        F6 1.969569
9  432531.2 179932.0        F6 2.975877
10 432532.2 179932.3        F6 3.958562
11 432533.1 179932.5        F6 4.968931
12 432534.1 179932.8        F6 5.970284
13 432703.3 179432.1        A3 0.000000
14 432702.3 179432.3        A3 1.014138
15 432701.4 179432.5        A3 1.911988
16 432700.4 179432.7        A3 2.961687
17 432699.5 179432.9        A3 3.907489
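
For reference, here is a minimal self-contained sketch of the same idea, with a base R cross-check. The small `mydf` below is a stand-in with rounded coordinates, not the full data from the question, and the `ave()` variant is just one possible alternative to dplyr. Note that in the original attempt `.` is the whole piped data frame, so `nrow(.)` is not the size of the current group; inside `mutate()`, `n()` would give the group size, but with a vectorised function no explicit indexing is needed.

```r
library(dplyr)

# Euclidean distance; vectorised, so x2 and y2 may be whole columns
distance <- function(x1, y1, x2, y2) sqrt((x2 - x1)^2 + (y2 - y1)^2)

# Stand-in data (assumption: rounded coordinates, a few rows per transect)
mydf <- data.frame(
  X = c(432532.3, 432533.3, 432534.3, 432528.3, 432529.3, 432703.3, 432702.3),
  Y = c(179892.6, 179892.8, 179893.0, 179931.3, 179931.5, 179432.1, 179432.3),
  plot_raai = c("F1", "F1", "F1", "F6", "F6", "A3", "A3")
)

# dplyr: first() is evaluated separately within each group
res_dplyr <- mydf %>%
  group_by(plot_raai) %>%
  mutate(dist = distance(first(X), first(Y), X, Y)) %>%
  ungroup()

# Base R: ave() fills each row with the first X (or Y) of its transect
x0 <- ave(mydf$X, mydf$plot_raai, FUN = function(v) v[1])
y0 <- ave(mydf$Y, mydf$plot_raai, FUN = function(v) v[1])
res_base <- distance(x0, y0, mydf$X, mydf$Y)

# Both approaches should agree
all.equal(res_dplyr$dist, res_base)
```

Both versions assume the rows are already ordered along each transect, so that the first row of a group really is the starting point.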