My data is in the following format and includes a particular statistic
You now have a sampling distribution of LR values. The quantile function in R will give you an estimate of whatever "critical value" you prefer. If, for instance, you decided you wanted the conventional 0.05 "p-value", you could take your data frame, named LR_df for illustration, and issue this command:
quantile( LR_df[ , 'LRStat'] , 0.95)
If you wanted all of those "probabilities" on the figure, you would use a vector of their complements (one minus each value). This gives you the LRStat values above which the given proportion of the sample falls.
quantile( LR_df[ , 'LRStat'] , c(0.9, 0.95, 0.99, 0.999, 0.9999) )
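To see what those quantiles mean, here is a minimal sketch on simulated data. The chi-squared draws are only a stand-in for your real LRStat column (an assumption for illustration, not a claim about your data), and LR_df is rebuilt here so the snippet runs on its own.

```r
set.seed(42)                                   # reproducible illustration
# Hypothetical stand-in for the real data: 10,000 simulated statistics
LR_df <- data.frame(LRStat = rchisq(10000, df = 1))

probs <- c(0.9, 0.95, 0.99, 0.999, 0.9999)
crit  <- quantile(LR_df[ , 'LRStat'], probs)   # critical values, named "90%", "95%", ...

# Proportion of the sample strictly above each critical value;
# these should be close to 1 - probs
prop_above <- sapply(crit, function(q) mean(LR_df[ , 'LRStat'] > q))
print(cbind(critical_value = crit, prop_above = prop_above))
```

The extreme quantiles (0.999 and 0.9999) rest on only a handful of observations in a sample this size, so treat them as rough estimates unless your sampling distribution is much larger.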
P-values are just tail probabilities from the sampling distribution of a test statistic under a null hypothesis. Your null hypothesis in this case is that the LRStats are uniformly distributed. (I know it sounds strange to put it that way, but if you want to argue with the statisticians then get a copy of http://amstat.tandfonline.com/doi/pdf/10.1198/000313008X332421 .) The choice of p-value for a cutoff will depend on the scientific or business setting. If you were assessing an investment opportunity the cutoff might be 0.15, but if you are trying to find new scientific knowledge, I think it should be stricter (that is, a smaller p-value). The field of molecular genetics has a lot of junk (i.e. results that fail to reproduce) in its literature because researchers were not strict enough in their statistical methods.
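Going the other way, from an observed statistic to an empirical p-value, is the same idea read in reverse. A minimal sketch, assuming a hypothetical observed value obs_LR and simulated chi-squared draws standing in for the real LRStat column (both are illustrative assumptions):

```r
set.seed(1)
# Stand-in for the null sampling distribution (hypothetical data)
LR_df  <- data.frame(LRStat = rchisq(10000, df = 1))
obs_LR <- 5.2                                  # hypothetical observed statistic

# Empirical p-value: share of the null sample at least as large as obs_LR
p_emp <- mean(LR_df[ , 'LRStat'] >= obs_LR)

alpha <- 0.05                                  # cutoff chosen for the setting
print(p_emp)
print(p_emp < alpha)                           # TRUE means obs_LR lies beyond the cutoff
```

Changing alpha is all it takes to apply a looser or stricter cutoff to the same sampling distribution.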