A D - 1 year ago 105
R Question

Understanding an RLE coverage value

Using R and Bioconductor.

I'm not sure how to interpret an integer Rle like the one you'd get from functions such as `coverage()`, for example:

```
integer-Rle of length 3312 with 246 runs
Lengths:  25  34 249  16   7  11  16 ...   2  32   2  26  34  49
Values :   0   1   0   1   2   3   2 ...   1   2   1   0   1   0
```

Okay, so I get that it represents the coverage of one range by other ranges, in this case reads from an experiment over a given range. What do the 'runs' mean? What about the 'Lengths' and 'Values'? I thought that maybe Lengths represent a position and Values represent the number of times it's covered, but then why would there be multiples of the same position, such as 2 above? Why would they be out of order?

I've been using `sum(coverage)` to compare the coverage of one range to another of a different length, and I was wondering if that was appropriate.

```
# Expand the Rle to one integer per position and plot depth along the range
plot(as.integer(coverage))
```

Maybe `sum(coverage)` is appropriate; a more usual metric is to count reads rather than coverage, e.g., with `summarizeOverlaps()` (originally in GenomicRanges, now in GenomicAlignments), illustrated in this DESeq2 workflow in the context of RNA-seq.
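
To make the Lengths/Values display concrete: each run is a stretch of consecutive positions that share one coverage value. In the output above, the first 25 positions have coverage 0, the next 34 have coverage 1, the next 249 have coverage 0, and so on. The lengths add up to 3312, so they are run widths, not positions, and that is why they repeat and look unordered. The sketch below uses base R's `rle()` as a stand-in for the Bioconductor `Rle` class, on a made-up 10-base `depth` vector (the data and variable names are illustrative assumptions, not from the question).

```r
# Hypothetical per-position coverage for a 10-base region (made-up numbers).
depth <- c(0, 0, 1, 1, 1, 2, 2, 1, 0, 0)

# Base R's rle() collapses consecutive equal values into runs.
enc <- rle(depth)
enc$lengths  # 2 3 2 1 2 : how many consecutive positions each run spans
enc$values   # 0 1 2 1 0 : the coverage shared by the positions in that run

# Start position of each run, recovered from the cumulative run lengths.
run_start <- cumsum(c(1, head(enc$lengths, -1)))  # 1 3 6 8 9

# Expanding the runs gives back one value per position.
stopifnot(identical(inverse.rle(enc), depth))

# Summing coverage gives total covered bases, which grows with region length;
# dividing by the region length gives a mean depth comparable across regions.
total_bases <- sum(enc$lengths * enc$values)    # 8, same as sum(depth)
mean_depth  <- total_bases / sum(enc$lengths)   # 0.8
```

If the goal is to compare ranges of different lengths, something like `mean_depth` may be a fairer summary than the raw sum, since a longer range accumulates a larger sum at the same depth.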