adam.888 - 4 months ago 13

R Question

I have a numeric vector that I want to convert to five numeric levels.

I can get the five levels using `cut`:

`dx <- data.frame(x=1:100)`

`dx$cut <- cut(dx$x,5)`

But I am now having problems extracting the lower and upper boundaries of the levels.

So, for example, `(0.901,20.8]` would give 0.901 in `dx$min` and 20.8 in `dx$max`.

I tried:

`dx$min <- pmin(dx$cut)`

`dx$max <- pmax(dx$cut)`

`dx`

But this does not work.

Thank you for any help.

Answer

You can try splitting the labels at the comma (after converting them to `character` and stripping all punctuation except `,` and `.`) and then creating two columns:

```
# The regex replaces every punctuation mark except "." and "," with an empty string;
# strsplit() then splits each cleaned label at the comma.
min_max <- unlist(strsplit(gsub("(?![,.])[[:punct:]]", "", as.character(dx$cut), perl=TRUE), ","))
dx$min <- min_max[seq(1, length(min_max), by=2)]
dx$max <- min_max[seq(2, length(min_max), by=2)]
head(dx)
#   x          cut   min  max
# 1 1 (0.901,20.8] 0.901 20.8
# 2 2 (0.901,20.8] 0.901 20.8
# 3 3 (0.901,20.8] 0.901 20.8
# 4 4 (0.901,20.8] 0.901 20.8
# 5 5 (0.901,20.8] 0.901 20.8
# 6 6 (0.901,20.8] 0.901 20.8
```
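Note that the columns created this way are `character`, not numeric. If you need numbers, a possible variant (a sketch, not the only way to do it) is to parse the five distinct labels once and then map them back to the rows through the factor's integer codes. The names `labs`, `lower` and `upper` are made up for this illustration. It trims the brackets by position instead of stripping punctuation, so a minus sign in a negative boundary would survive. Keep in mind that the values are the rounded ones shown in the labels (3 significant digits by default, see `dig.lab` in `?cut`), not the exact break points.

```r
# Rebuild the example data from the question.
dx <- data.frame(x = 1:100)
dx$cut <- cut(dx$x, 5)

# Work on the distinct labels only, e.g. "(0.901,20.8]".
labs <- levels(dx$cut)

# Lower bound: drop the opening bracket, then everything from the comma on.
lower <- as.numeric(sub(",.*$", "", substring(labs, 2)))

# Upper bound: drop the closing bracket, then everything up to the comma.
upper <- as.numeric(sub("^.*,", "", substring(labs, 1, nchar(labs) - 1)))

# as.integer() on a factor gives the level index of each row,
# which is used here to look up that row's bounds.
dx$min <- lower[as.integer(dx$cut)]
dx$max <- upper[as.integer(dx$cut)]

head(dx)
```

For `x = 1` this gives `min` 0.901 and `max` 20.8, as in the output above, but both columns are now numeric and can be used in comparisons or arithmetic.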