Craig - 1 month ago
R Question

dplyr conditional mutate on itself

I have a data frame with a character variable consisting mostly of numeric values, with occasional known character strings and some NA values. I want to conditionally reformat the numeric values to have one decimal place, but leave the character and NA values alone.

This code works on a toy data frame and produces the desired output:

library(dplyr)

df <- data.frame(a = c("1", "2", "3", "none", NA),
                 stringsAsFactors = FALSE)

test <- df %>%
  mutate(a = ifelse(is.na(a) | a == "none",
                    a,
                    format(round(as.numeric(a), 1), nsmall = 1)))

test
# a
# 1 1.0
# 2 2.0
# 3 3.0
# 4 none
# 5 <NA>


But it throws a warning message:

Warning message:
In format(round(as.numeric(c("1", "2", "3", "none", NA)), 1), nsmall = 1) :
NAs introduced by coercion


which I believe is because format(round(as.numeric(a), 1), nsmall = 1) still acts on the entire vector, even though its results are only used in the mutate statement where the ifelse condition is false.
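A minimal base-R sketch of that behaviour, outside of dplyr: ifelse() is handed its "no" argument as a vector computed from every element, so as.numeric() also sees "none" and warns, even though that element is never taken from the "no" branch. The names warned and res are just illustrative.

```r
a <- c("1", "2", "3", "none", NA)

# Record whether a warning is raised while ifelse() evaluates its arguments.
warned <- FALSE
res <- withCallingHandlers(
  ifelse(is.na(a) | a == "none", a, as.numeric(a)),
  warning = function(w) {
    warned <<- TRUE                  # as.numeric("none") triggers the coercion warning
    invokeRestart("muffleWarning")   # keep the demo quiet
  }
)

warned  # TRUE: the "no" branch was computed on the whole vector
res     # the "none" and NA entries are still taken from the "yes" branch
```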

I can wrap the whole thing in suppressWarnings(), but is there some other way to generate the desired output without warnings within the dplyr framework? I'm sure there's a data.table way to do it, but this is part of a package that doesn't need data.table for anything else, and it seems silly to make it a dependency for such a small piece.

Answer

Use replace() so that as.numeric() is applied only to the numeric entries of column a. Because "none" and NA never reach as.numeric(), no coercion warning is raised:

test <- df %>%
  mutate(a = replace(a, !is.na(a) & a != "none",
                     format(round(as.numeric(a[!is.na(a) & a != "none"]), 1), nsmall = 1)))

test
#     a
#1  1.0
#2  2.0
#3  3.0
#4 none
#5 <NA>
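The same idea can be sketched as a small helper that builds the selection mask once instead of repeating the condition. The function name format_numeric_strings and its keep and digits arguments are hypothetical, introduced only for this illustration; the body uses base R only.

```r
# Hypothetical helper (not part of the original answer): format only entries
# that are neither NA nor one of the known non-numeric strings in `keep`.
format_numeric_strings <- function(x, keep = "none", digits = 1) {
  is_num <- !is.na(x) & !(x %in% keep)   # entries that should be converted
  # as.numeric() only ever sees the selected entries, so nothing is coerced to NA
  x[is_num] <- format(round(as.numeric(x[is_num]), digits), nsmall = digits)
  x
}

df <- data.frame(a = c("1", "2", "3", "none", NA),
                 stringsAsFactors = FALSE)
df$a <- format_numeric_strings(df$a)
df
```

Since the helper takes and returns a plain character vector, it should also work inside a pipeline as mutate(a = format_numeric_strings(a)). Note that format() pads its results to a common width, so values with different numbers of digits may come back with leading spaces.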