Cebs Cebs - 2 months ago 9
R Question

Sliding window in a data frame r

I’m trying to obtain the proportions of individuals that that shares certain DNA sequences between two given points. And I want to use a specific sliding window. In order to show the problem I create this example. First I create a data frame with four columns.

x<-c(rep("sc256",times=2000),rep("sc784",times=2000))
pos1<-round(runif(2000,100,5000),digits=0)
pos2<-round(runif(2000,100,5000),digits=0)
y3<-rep(c(2,1),times=2000)
M1<-data.frame(x,pos1,pos2,y3)
colnames(M1)=c("iid","pos1","pos2","chr")


I also create a function to obtain the proportion of individuals that have sequences in a particular interval.

roh_island<-function(pop,chr,p1,p2){
a<-pop[pop$chr==chr,]
island<-subset(a,pos1>=p1 & pos2<=p2)
n<-nrow(island)/length(M1$iid)
return(n)
}

roh_island(M1,1,345,700)


Now I want to transform this interval into a sliding window of size 10 that moves between values 0 and 7000. So this window will take positions [0,10);(10,20),…,(6990,7000]. I also need that the new function with the slide window stores all the windows and proportion of individuals in each in a data frame to afterwards plot it. I try some solutions that I have found regarding sliding windows I saw but I could not make them work. Thanks

Answer

This code will slide p1 from 0 to 6990 in steps of 10 while p2 slides from 10 to 7000 in steps of 10:

output = apply(data.frame(seq(0,6990,10), seq(10,7000,10)), MARGIN=1,
           function(x,y,z,a) roh_island(M1, 1, x[1], x[2]))
plot(output, col="blue")
grid(5, 5)

enter image description here