Marta Karas - 24 days ago
R Question

R dplyr - categorize numeric variable with mutate

I would like to categorize a numeric variable in my data.frame object using dplyr (and have no idea how to do it).

Without dplyr, I would probably do something like:

df <- data.frame(a = rnorm(1e3), b = rnorm(1e3))
df$a <- cut(df$a, breaks = quantile(df$a, probs = seq(0, 1, 0.2)))


and it would be done. However, I would strongly prefer to do it with a dplyr function (mutate, I suppose) as one step in the chain of other operations I perform on my data.frame.

Answer
library(dplyr)

set.seed(123)
df <- data.frame(a = rnorm(10), b = rnorm(10))

df %>% mutate(a = cut(a, breaks = quantile(a, probs = seq(0, 1, 0.2))))

giving:

                 a          b
1  (-0.586,-0.316]  1.2240818
2   (-0.316,0.094]  0.3598138
3      (0.68,1.72]  0.4007715
4   (-0.316,0.094]  0.1106827
5     (0.094,0.68] -0.5558411
6      (0.68,1.72]  1.7869131
7     (0.094,0.68]  0.4978505
8             <NA> -1.9666172
9   (-1.27,-0.586]  0.7013559
10 (-0.586,-0.316] -0.4727914
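
Note the <NA> in row 8 of the output above. By default cut() builds intervals that are open on the left, so the smallest value of a (which equals the lowest break) falls outside the first bin. A minimal sketch of one way around this, assuming you want the minimum to land in the first bin, is to pass include.lowest = TRUE. The name binned is only illustrative and not part of the original answer:

```r
library(dplyr)

set.seed(123)
df <- data.frame(a = rnorm(10), b = rnorm(10))

# include.lowest = TRUE closes the first interval on the left, so the
# minimum of a is assigned to the first bin instead of becoming NA.
binned <- df %>%
  mutate(a = cut(a,
                 breaks = quantile(a, probs = seq(0, 1, 0.2)),
                 include.lowest = TRUE))

# Number of rows that were not assigned to any bin
sum(is.na(binned$a))
```

If only a bucket number is needed rather than interval labels, dplyr's ntile(a, 5) is a related option, though it assigns buckets by rank and returns integers, so the results are not guaranteed to match cut() on quantile breaks.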