user2716568 - 3 months ago 14
R Question

Apply dplyr over select number of rows to calculate angular velocity

An example of my dataset is structured as follows:

dput(head(MovementAnalysis,10))
structure(list(Name = c("Amber", "Amber", "Amber", "Amber", "Amber",
"Jeff", "Jeff", "Jeff", "Jeff", "Jeff"), Sample = c(1, 2, 3, 4, 5, 1, 2,
3, 4, 5), X = c(26.66, 26.66, 26.65, 26.64, 26.64, 26.47, 26.46, 26.45,
26.43, 26.42), Y = c(-12.38, -12.37, -12.36, -12.36, -12.35, -12.23,
-12.22, -12.22, -12.22, -12.22)), .Names = c("Name", "Sample", "X", "Y"), row.names = c(NA, 10L), class = "data.frame")


I wish to calculate the Angular Velocity for each Name using the following calculation, where k = 2.

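# X and Y are the coordinate vectors for a single Name
k <- 2
AngularVelocity <- rep(NA_real_, length(X))
i <- 2
while (i < length(X) - k) {
  if (i > k) {
    # a: displacement from sample i-k to i; b: displacement from i to i+k
    a <- c(X[i] - X[i-k], Y[i] - Y[i-k])
    b <- c(X[i+k] - X[i], Y[i+k] - Y[i])
    # angle between a and b, converted from radians to degrees
    AngularVelocity[i] <- acos(sum(a * b) / (sqrt(sum(a * a)) * sqrt(sum(b * b)))) * (180 / pi)
  }
  i <- i + 1
}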
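
Written as a formula, the loop body computes the angle between two displacement vectors at each sample i. The symbols a_i, b_i and theta_i below are notation introduced here for illustration; they restate the code above and add nothing new.

```latex
\mathbf{a}_i = \begin{pmatrix} X_i - X_{i-k} \\ Y_i - Y_{i-k} \end{pmatrix},
\qquad
\mathbf{b}_i = \begin{pmatrix} X_{i+k} - X_i \\ Y_{i+k} - Y_i \end{pmatrix},
\qquad
\theta_i = \frac{180}{\pi}\,
\arccos\!\left(
  \frac{\mathbf{a}_i \cdot \mathbf{b}_i}
       {\lVert \mathbf{a}_i \rVert \, \lVert \mathbf{b}_i \rVert}
\right)
```

The value is only defined where both i - k and i + k fall inside the data, which is why the first and last k rows of each Name have no result.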


I attempted to do this in dplyr (as shown below) because my actual dataset has 1 million+ rows, but I am stuck on how to skip the first 2 rows (k) and then iterate from row 3 onwards to calculate Angular Velocity for each Name.

Output <- MovementAnalysis %>%
arrange(Name,Sample) %>%
group_by(Name) %>%
mutate(An = (X - (lag(X)-2) + (Y - (lag(Y)-2))))

Output <- MovementAnalysis %>%
arrange(Name,Sample) %>%
group_by(Name) %>%
mutate(Bn = (X - (lag(X)+2) + (Y - (lag(Y)+2))))


I understand this is a lengthy question, so welcome any feedback on how to improve the question.

UPDATED

I had been using the code in the answer below successfully for a while. However, I now get an error when running it on a new dataset. An example of this dataset is below:

# Create list of individuals, drill number and practical or criterion measure
ID = c("Gus_D1_Practical", "Gus_D1_Criterion", "Hudson_D1_Practical", "Hudson_D1_Criterion")
# Set the seed
set.seed(300)
# Create a data.frame of dummy peak velocity data from two different tracking systems
ExampleDataset <- data.frame(ID = rep((ID), each = 300),
Sample = rep(1:300, each = 1),
X = runif(300, 4.5, 6.7),
Y = runif(300, 4.1, 8))
# Set the SampleRate
SampleRate <- 100
k <- as.integer(SampleRate)
# Calculate Angular Velocity
library(dplyr)
Output <- ExampleDataset %>%
arrange(ID,Sample) %>%
group_by(ID) %>%
do( { a = diff(cbind(.$X, .$Y),lag=2)
b = tail(a, -k)
a = head(a, -k)
ang_vel = acos(rowSums(a*b)/(sqrt(rowSums(a^2))*sqrt(rowSums(b^2)))) * (180 / pi)
data_frame(Sample=head(tail(.$Sample,-k),-k), ang_vel) }) %>%
right_join(ExampleDataset, by = c("ID","Sample"))


Unfortunately, when I now try to calculate Angular Velocity the following error is returned:

Error in data_frame_(lazyeval::lazy_dots(...)) :
arguments imply differing number of rows: 100, 198


Any thoughts on what I may be doing wrong?

Answer

I suspect this is a slightly different kind of application for dplyr. You might try something like

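library(dplyr)

k <- 2  # number of samples on either side of each point, as in the question

Output <- MovementAnalysis %>%
  arrange(Name, Sample) %>%
  group_by(Name) %>%
  do({
    # displacements between points k samples apart (lag tied to k)
    a <- diff(cbind(.$X, .$Y), lag = k)
    b <- tail(a, -k)   # displacement from sample i to i + k
    a <- head(a, -k)   # displacement from sample i - k to i
    # angle between a and b, in degrees
    ang_vel <- acos(rowSums(a * b) / (sqrt(rowSums(a^2)) * sqrt(rowSums(b^2)))) * (180 / pi)
    # the first and last k samples of each Name have no angle
    data_frame(Sample = head(tail(.$Sample, -k), -k), ang_vel)
  }) %>%
  right_join(MovementAnalysis, by = c("Name", "Sample"))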
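
Regarding the error in the update: the two numbers in the message appear to come from `lag=2` being hard-coded in the `diff()` call of the update's code while `k` is now 100. With 300 samples per ID, the lag-2 difference has 298 rows, and dropping `k` of them leaves 198 angles, whereas dropping `k` samples from each end of `Sample` leaves only 100 values. One way around this, offered as a sketch rather than a definitive fix, is to tie the lag to `k`. The helper name `angular_velocity` and the clamping step are assumptions introduced here, not part of the original code.

```r
# Hypothetical helper: angle in degrees between the displacement over the
# previous k samples and the displacement over the next k samples.
angular_velocity <- function(x, y, k) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n <= 2 * k) return(out)        # too few samples for any angle
  d <- diff(cbind(x, y), lag = k)    # row j is P[j + k] - P[j]
  a <- head(d, -k)                   # P[i] - P[i - k]
  b <- tail(d, -k)                   # P[i + k] - P[i]
  cosine <- rowSums(a * b) / (sqrt(rowSums(a^2)) * sqrt(rowSums(b^2)))
  # clamp to [-1, 1] so rounding error cannot push acos() outside its domain
  out[(k + 1):(n - k)] <- acos(pmin(pmax(cosine, -1), 1)) * (180 / pi)
  out
}

# Small check: a path that turns through a right angle at the third sample
corner <- angular_velocity(x = c(0, 1, 2, 2, 2), y = c(0, 0, 0, 1, 2), k = 2)
print(corner)  # NA NA 90 NA NA
```

Because the result has the same length as the input, it could be used inside the pipeline as `mutate(ang_vel = angular_velocity(X, Y, k))` after `group_by(ID)`, which would also make the `do()` block and the `right_join()` unnecessary.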
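
On the original question of how to skip the first `k` rows: `lag()` and `lead()` in dplyr take an offset `n` and pad with `NA`, so the first and last `k` rows of each group drop out without an explicit loop. Below is a sketch of the same calculation written that way, using the sample data from the question. The intermediate column names `a_x`, `a_y`, `b_x` and `b_y` are hypothetical, and performance on a million rows is not tested here.

```r
library(dplyr)

# Sample data from the question
MovementAnalysis <- data.frame(
  Name   = rep(c("Amber", "Jeff"), each = 5),
  Sample = rep(1:5, times = 2),
  X = c(26.66, 26.66, 26.65, 26.64, 26.64, 26.47, 26.46, 26.45, 26.43, 26.42),
  Y = c(-12.38, -12.37, -12.36, -12.36, -12.35, -12.23, -12.22, -12.22, -12.22, -12.22)
)

k <- 2

Output <- MovementAnalysis %>%
  arrange(Name, Sample) %>%
  group_by(Name) %>%
  mutate(
    # displacement from sample i - k to i (NA for the first k rows of a group)
    a_x = X - lag(X, k),
    a_y = Y - lag(Y, k),
    # displacement from sample i to i + k (NA for the last k rows of a group)
    b_x = lead(X, k) - X,
    b_y = lead(Y, k) - Y,
    # angle between the two displacements, in degrees
    ang_vel = acos((a_x * b_x + a_y * b_y) /
                   (sqrt(a_x^2 + a_y^2) * sqrt(b_x^2 + b_y^2))) * (180 / pi)
  ) %>%
  ungroup() %>%
  select(-a_x, -a_y, -b_x, -b_y)

print(Output)
```

With five samples per Name and k = 2, only Sample 3 of each Name has an angle; the other rows are `NA`.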