Bhail - 1 year ago 57
R Question

# What is `factor(1 * (hp>200), labels=c("weak","good"))` doing?

I am trying to work out where I should look to understand what `1 * (hp>200)` is doing inside `factor()`, and, for that matter, how I can use it.

```
> test <- mutate(mtcars, HPcat = factor(1 * (hp > 175), labels = c("weak", "good")))
> str(test['HPcat'])
'data.frame':   32 obs. of  1 variable:
 $ HPcat: Factor w/ 2 levels "weak","good": 1 1 1 1 1 1 2 1 1 1 ...
>
> # So changing the 1 to a different number doesn't change anything
>
> test <- mutate(mtcars, HPcat = factor(100 * (hp > 175), labels = c("weak", "good")))
> str(test['HPcat'])
'data.frame':   32 obs. of  1 variable:
 $ HPcat: Factor w/ 2 levels "weak","good": 1 1 1 1 1 1 2 1 1 1 ...
```

R allows a boolean true/false value to be used in numeric expressions, treating `FALSE` as `0` and `TRUE` as `1`. So, the expression `1*(hp>200)` (often written in alternate form `0+(hp>200)`) is a way of performing this conversion -- whenever `hp` exceeds 200, the value is 1, otherwise it's 0.
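
The coercion described above can be seen on a small vector. This is only a sketch: the `hp` values here are made up for illustration and are not taken from `mtcars`.

```r
# Made-up horsepower values (an assumption for illustration only).
hp <- c(110, 175, 245, 93, 264)

# The comparison yields a logical vector.
over <- hp > 200
over                     # FALSE FALSE  TRUE FALSE  TRUE

# Arithmetic coerces FALSE to 0 and TRUE to 1.
one_times <- 1 * over    # 0 0 1 0 1
zero_plus <- 0 + over    # 0 0 1 0 1
explicit  <- as.numeric(over)  # the same conversion, spelled out

# Any other multiplier just rescales the 1s.
hundred_times <- 100 * over    # 0 0 100 0 100
```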
The `factor` function, when given a vector of 0s and 1s, turns it into a factor with two levels, `0` and `1` in order. The `labels=` argument relabels the levels to `weak` and `good`. If you use `100*(hp>200)`, the vector of 0s and 100s turns into a factor with two levels `0` and `100` which are relabeled by the `labels=` argument, giving the same final answer.
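
To see why the multiplier does not matter, it may help to look at the levels before and after relabeling. The vectors below are hand-written stand-ins for the results of `1*(hp>200)` and `100*(hp>200)`, not output from the question's data.

```r
# Stand-ins for 1 * (hp > 200) and 100 * (hp > 200) (illustrative values).
x1   <- c(0, 0, 1, 0, 1)
x100 <- c(0, 0, 100, 0, 100)

# Without labels, the levels are the sorted distinct values as text.
levels(factor(x1))     # "0" "1"
levels(factor(x100))   # "0" "100"

# labels= renames the levels by position, so both versions end up the same.
f1   <- factor(x1,   labels = c("weak", "good"))
f100 <- factor(x100, labels = c("weak", "good"))
identical(f1, f100)    # TRUE
```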
This code will fail if all `hp` values are <=200 or all are >200, because it relies on the converted vector containing both 0s and 1s. With only one distinct value, `factor` finds a single level and cannot match it to the two labels.
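
The failure case is easy to reproduce with a vector that contains only 0s. The sketch below catches the error with `tryCatch` so that the script keeps running. The last step, passing `levels = c(0, 1)` alongside `labels`, is one possible way to keep the `1 * (...)` idiom working and is an addition for illustration, not part of the original answer.

```r
# What 1 * (hp > 200) would yield if no car exceeded 200 hp (illustrative).
all_low <- c(0, 0, 0)

# Only one distinct value, so two labels cannot be matched: this is an error.
result <- tryCatch(
  factor(all_low, labels = c("weak", "good")),
  error = function(e) conditionMessage(e)
)
is.character(result)   # TRUE: we got the error message, not a factor

# Stating both levels up front makes the call independent of the data.
fixed <- factor(all_low, levels = c(0, 1), labels = c("weak", "good"))
levels(fixed)          # "weak" "good"
```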
A version that works regardless of the data sets the levels explicitly:

```
factor(ifelse(hp > 175, "good", "weak"), levels = c("weak", "good"))
```