Alberto Mnemon - 1 year ago 116
R Question

Testing CSR on lpp with R

I have recently posted a "very newie to R" question about the correct way of doing this, if you are interested in it you can find it [here].1

I have now managed to develop a simple R script that does the job, but now the results are what troubles me.

Long story short I'm using R to analyze

`lpp`
(Linear Point Pattern) with
`mad.test`
.That function performs an hypothesis test where the null hypothesis is that the points are randomly distributed. Currently I have 88 lpps to analyze, and according to the
`p.value`
86 of them are randomly distributed and 2 of them are not.

These are the two not randomly distributed lpps.

Looking at them you can see some kind of clusters in the first one, but the second one only has three points, and seems to me that there is no way one can assure only three points are not corresponding to a random distribution. There are other tracks with one, two, three points but they all fall into the "random" lpps category, so I don't know why this one is different.

So here is the question: how many points are too little points for CSR testing?

I have also noticed that these two lpps have a much lower
`\$statistic\$rank`
than the others. I have tried to find what that means but I'm clueless righ now, so here is another newie question: Is the
`\$statistic\$rank`
some kind of quality analysis indicator, and thus can I use it to group my lpp analysis into "significant ones" and "too little points" ones?

My R script and all the shp files can be downloaded from here(850 Kb).

Thank you so much for your help.

Regarding your question about the rank you might want to read a bit more up on the Monte Carlo test you are preforming. Basically, you summarise the point pattern by a single number (maximum absolute deviation of linear K) and then you look at how extreme this number is compared to numbers generated at random from CSR. Assuming you use 99 simulations of CSR you have 100 numbers in total. If your data ranks as the most extreme (`\$statistic\$rank==1`) among these it has p-value 1%. If it ranks as the 50th number the p-value is 50%. If you used another number of simulations you have to calculate accordingly. I.e. with 199 simulations rank 1 is 0.5%, rank 2 is 1%, etc.