user4050 user4050 - 3 months ago 14
R Question

What is the meaning of include.lowest in reclassify raster package [r]

Given the definition of an open interval (does not include end points) and a closed interval (includes end points), it's easy to understand the right argument in reclassify. But I am confused about the include.lowest argument. The help page describes it as


indicating if a value equal to the lowest value in rcl (or highest
value in the second column, for right = FALSE) should be included


The lowest value in rcl would be the first value, which according to right is not included by default, so setting include.lowest to TRUE will include the lowest value. But I don't understand what the part about the "highest value in the second column" refers to. And what does "for right = FALSE" mean? The highest value in the second column should already be included anyway.

So if I have rcl = c(0,1,5, 1,Inf,10), by default it means 0 < x <= 1 becomes 5, and x > 1 becomes 10. What happens if include.lowest is TRUE? 0 <= x <= 1 and ...?

I find it confusing because the example given on the reclassify help file says that


all values >= 0 and <= 0.25 become 1, etc.
m <- c(0, 0.25, 1, 0.25, 0.5, 2, 0.5, 1, 3)


but the reclassify call in that example doesn't set include.lowest, so it should be all values > 0, not >= 0.

EDIT: I find the help page very confusing and, given the answer, the explanation of the example in the help page is wrong.

Answer

As I said in my comment, right and include.lowest work exactly the same way as in the base R function cut. For a simple illustration, I will use cut below, with the vector 1:10 and break points 1, 5, 10.

By default, right = TRUE, so all intervals will be left open and right closed, thus we have two intervals: (1, 5], (5, 10]. Note these together give another left open right closed interval (1, 10], where the lowest 1 is not included. include.lowest = TRUE will consider [1, 10] and do [1,5], (5,10]. Compare

cut(1:10, right = TRUE, breaks = c(1, 5, 10))
# [1] <NA>   (1,5]  (1,5]  (1,5]  (1,5]  (5,10] (5,10] (5,10] (5,10] (5,10]
#Levels: (1,5] (5,10]

cut(1:10, right = TRUE, breaks = c(1, 5, 10), include.lowest = TRUE)
# [1] [1,5]  [1,5]  [1,5]  [1,5]  [1,5]  (5,10] (5,10] (5,10] (5,10] (5,10]
#Levels: [1,5] (5,10]

Now, if we set right = FALSE, all intervals will be left closed and right open: [1, 5), [5, 10). In this case, include.lowest = TRUE essentially includes the highest value. Compare

cut(1:10, right = FALSE, breaks = c(1, 5, 10))
# [1] [1,5)  [1,5)  [1,5)  [1,5)  [5,10) [5,10) [5,10) [5,10) [5,10) <NA>  
#Levels: [1,5) [5,10)

cut(1:10, right = FALSE, breaks = c(1, 5, 10), include.lowest = TRUE)
# [1] [1,5)  [1,5)  [1,5)  [1,5)  [5,10] [5,10] [5,10] [5,10] [5,10] [5,10]
#Levels: [1,5) [5,10]
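The same comparison can be run on the break points from the question, rcl = c(0,1,5, 1,Inf,10). This is only a sketch using cut to mimic the interval logic. The helper name classify and the test vector x are made up for illustration, and an NA here just means that no interval matched; it says nothing about how reclassify treats cells outside every interval.

```r
# Break points and new values taken from the question's rcl = c(0,1,5, 1,Inf,10)
breaks  <- c(0, 1, Inf)
new_val <- c(5, 10)
x       <- c(0, 0.5, 1, 2, Inf)   # illustrative values, including both end points

# Hypothetical helper: find the interval of each x with cut(), then look up
# the replacement value for that interval
classify <- function(x, right, include.lowest) {
  code <- cut(x, breaks = breaks, right = right,
              include.lowest = include.lowest, labels = FALSE)
  new_val[code]   # NA means no interval matched
}

res <- rbind(
  right_TRUE_lowest_FALSE  = classify(x, TRUE,  FALSE),  # (0,1], (1,Inf]
  right_TRUE_lowest_TRUE   = classify(x, TRUE,  TRUE),   # [0,1], (1,Inf]
  right_FALSE_lowest_FALSE = classify(x, FALSE, FALSE),  # [0,1), [1,Inf)
  right_FALSE_lowest_TRUE  = classify(x, FALSE, TRUE)    # [0,1), [1,Inf]
)
colnames(res) <- x
print(res)
```

With right = TRUE only the column for 0 changes when include.lowest is switched on (NA becomes 5). With right = FALSE only the column for Inf changes (NA becomes 10), which is the "highest value in the second column" case from the help text.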

Back to raster::reclassify.

I find it confusing because the example given on the reclassify help file says that

all values >= 0 and <= 0.25 become 1, etc.
m <- c(0, 0.25, 1, 0.25, 0.5, 2, 0.5, 1, 3)

Why? With above m, you have rcl matrix:

matrix(m, ncol = 3L, byrow = TRUE, dimnames = list(NULL, c("from", "to", "value")))
#     from   to value
#[1,] 0.00 0.25     1
#[2,] 0.25 0.50     2
#[3,] 0.50 1.00     3

With right = TRUE and include.lowest = FALSE (default behaviour), you have

(0.00, 0.25]   --->   1
(0.25, 0.50]   --->   2
(0.50, 1.00]   --->   3

With right = TRUE and include.lowest = TRUE, you have

[0.00, 0.25]   --->   1
(0.25, 0.50]   --->   2
(0.50, 1.00]   --->   3
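A quick way to check where the value 0 lands is to mimic this rcl matrix with cut. The sketch below assumes the intervals are contiguous, so the break points can be built from the first "from" value plus the "to" column. The vector x is an illustrative choice, and NA only marks "no interval matched" in this cut-based imitation.

```r
# rcl matrix from the help file example
m   <- c(0, 0.25, 1,  0.25, 0.5, 2,  0.5, 1, 3)
rcl <- matrix(m, ncol = 3, byrow = TRUE)

# Assumes contiguous intervals: first "from" value, then every "to" value
breaks <- c(rcl[1, 1], rcl[, 2])   # 0, 0.25, 0.5, 1
x      <- c(0, 0.25, 0.3, 0.5, 1)  # illustrative values, including end points

# right = TRUE is the default in cut(), as in the tables above
default_vals <- rcl[, 3][cut(x, breaks, labels = FALSE)]
lowest_vals  <- rcl[, 3][cut(x, breaks, include.lowest = TRUE, labels = FALSE)]

print(rbind(x, default_vals, lowest_vals))
```

Under the default settings 0 falls outside (0, 0.25] and gets no class; only with include.lowest = TRUE does it map to 1, so "all values >= 0" describes the include.lowest = TRUE case.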