Kevin Moreau - 1 year ago 85
R Question

# How to determine the average distance between occurrences of a known motif in a list of DNA sequences

Here is my problem: I am looking for the average distance between occurrences of a known motif within a sequence, and I want to extend this to a list of sequences. The first part is done; the second part (extending to a list of sequences) is the problematic one. Here is how I am doing the first part:

``````
source("motifOccurrence.R") # https://www.r-bloggers.com/calculate-the-average-distance-between-a-given-dna-motif-within-dna-sequences-in-r/
library("seqinr")
df2 <- df[[1]]
motif <- c("T", "C", "C", "A")
coord <- coordMotif(df2, motif)
motidist <- computeDistance(coord)
motidist

[1] 152
``````

It appears that the first sequence of my FASTA list has an average distance of 152 nucleotides between two TCCA motifs. However, I don't know how to automate this for every sequence in `df`.

Thanks in advance for the help.

Kévin

This is untested, but should work. `sapply` walks over each list element and applies the function to it (we could also use `lapply` here).
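``````
sapply(df, FUN = function(x, motif) {
  # x is one sequence from the list; reuse the helpers from motifOccurrence.R
  coord <- coordMotif(x, motif)
  computeDistance(coord)
}, motif = c("T", "C", "C", "A"))
``````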
The result will be a vector. If you would rather keep it as a list, use `sapply(..., simplify = FALSE)`. `lapply` never simplifies, so it always returns a list. Consider either behavior a convenience. :)
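
To see the difference between the simplified and list results without needing `motifOccurrence.R`, here is a minimal, self-contained sketch in base R. The helpers `find_motif_starts()` and `mean_motif_gap()` are hypothetical stand-ins written for this illustration; they are not the `coordMotif()` and `computeDistance()` functions from the linked post, and they assume "distance" means the gap between the start positions of consecutive occurrences. The toy sequences are made up as well.

```r
# Hypothetical stand-in for coordMotif(): start positions of `motif` in `dna`.
# Both arguments are character vectors of single nucleotides.
find_motif_starts <- function(dna, motif) {
  n <- length(dna)
  k <- length(motif)
  if (n < k) return(integer(0))
  starts <- seq_len(n - k + 1)
  hits <- vapply(starts,
                 function(i) all(dna[i:(i + k - 1)] == motif),
                 logical(1))
  starts[hits]
}

# Hypothetical stand-in for computeDistance(): mean gap between
# consecutive start positions (NA when there are fewer than two hits).
mean_motif_gap <- function(starts) {
  if (length(starts) < 2) return(NA_real_)
  mean(diff(starts))
}

# Toy data: a named list of sequences, each split into single characters.
seqs <- list(
  seq1 = strsplit("TCCAGGTCCAAATCCA", "")[[1]],
  seq2 = strsplit("AATCCATCCA", "")[[1]],
  seq3 = strsplit("GGGG", "")[[1]]
)
motif <- c("T", "C", "C", "A")

avg_gap <- function(x, motif) mean_motif_gap(find_motif_starts(x, motif))

as_vector <- sapply(seqs, avg_gap, motif = motif)                   # named numeric vector
as_list   <- sapply(seqs, avg_gap, motif = motif, simplify = FALSE) # named list
same_list <- lapply(seqs, avg_gap, motif = motif)                   # also a named list

print(as_vector)
```

In this sketch, a sequence with fewer than two occurrences of the motif yields `NA`, so the vector keeps one entry per input sequence. Whether the real `computeDistance()` behaves the same way in that case is something to check against your own data.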