Thomas Thomas - 1 month ago 5
R Question

R: How to include NA in ifelse?

I am trying to create a column ID based on logical statements for values of other columns. For example, in the following data frame:

test <- structure(list(time = c(10L, 20L, NA, 30L), type = structure(c(1L,
2L, 3L, NA), .Label = c("A", "B", "C"), class = "factor"), ID = c(NA,
"1", NA, NA)), .Names = c("time", "type", "ID"), row.names = c(NA,
-4L), class = "data.frame")


which looks like

time type
1 10 A
2 20 B
3 NA C
4 30 NA


I want to make a new column ID containing a value of 1 for all rows where time is not NA and type is not A. I am using the following code for this:

test$ID <- ifelse(is.na(test$time) | test$type == "A", NA, "1")


This gives the result as

time type ID
1 10 A NA
2 20 B 1
3 NA C NA
4 30 NA NA


However, this code mishandles the NA in column type, resulting in a value of NA in column ID for that row. I need this to be a value of 1, so the result I need is:

time type ID
1 10 A NA
2 20 B 1
3 NA C NA
4 30 NA 1


Can anyone tell me how I might do this? I could get this to work with my existing code if I could somehow change the result of is.na(test$type) to return FALSE instead of TRUE, but I'm not sure how to do that. Or maybe the structure of my existing code needs to be changed entirely? I appreciate any help!

Answer

Comparing NA with another value using == does not return TRUE or FALSE; it returns NA, and ifelse() then returns NA for that element. Consider the following:

NA == NA
# [1] NA
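To see why row 4 of the question's result ends up as NA, it may help to look at the condition on its own. The following is a minimal sketch; the standalone vectors time and type are copies of the question's columns, introduced here only for illustration.

```r
# Copies of the question's columns (names chosen for this sketch only)
time <- c(10L, 20L, NA, 30L)
type <- factor(c("A", "B", "C", NA))

# The condition from the question's ifelse() call
cond <- is.na(time) | type == "A"
cond
# values: TRUE FALSE TRUE NA
# Row 4 is NA because FALSE | NA cannot be decided and stays NA.

# ifelse() returns NA wherever its test is NA, whatever yes/no are
id <- ifelse(cond, NA, "1")
id
# values: NA "1" NA NA
```

So the NA in type is not skipped; it propagates through == and | into the final result.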

You can just change your comparison from == to %in%, which returns FALSE instead of NA when the value on the left is NA:

ifelse(is.na(test$time) | test$type %in% "A", NA, "1")
# [1] NA  "1" NA  "1"
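As a rough illustration of the difference, the sketch below runs both operators on the same factor. The vector name type is only a stand-in for test$type, and the last expression is one possible explicit spelling, not something taken from the question.

```r
# Stand-in for test$type from the question
type <- factor(c("A", "B", "C", NA))

eq_result <- type == "A"
eq_result
# values: TRUE FALSE FALSE NA   (NA propagates)

in_result <- type %in% "A"
in_result
# values: TRUE FALSE FALSE FALSE   (%in% returns only TRUE or FALSE)

# An explicit alternative that gives the same answer as %in% here
explicit_result <- !is.na(type) & type == "A"
explicit_result
# values: TRUE FALSE FALSE FALSE
```

%in% is built on match(), which treats a missing value on the left as "no match" unless the table on the right also contains NA.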

Regarding your other question,

I could get this to work with my existing code if I could somehow change the result of is.na(test$type) to return FALSE instead of TRUE, but I'm not sure how to do that.

just use ! to negate the result of is.na(), shown here on the time column:

!is.na(test$time)
# [1]  TRUE  TRUE FALSE  TRUE
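If you prefer to state the condition positively ("time is present and type is not A"), the negation can be used in the full assignment. This is only a sketch of one way to write it; the data frame is rebuilt with data.frame() so the snippet runs on its own, and it is assumed to match the test data from the question.

```r
# Rebuild the question's data (assumed equivalent to the structure() call)
test <- data.frame(
  time = c(10L, 20L, NA, 30L),
  type = factor(c("A", "B", "C", NA))
)

# Keep rows where time is present and type is not "A";
# %in% makes a missing type count as "not A"
keep <- !is.na(test$time) & !(test$type %in% "A")

test$ID <- ifelse(keep, "1", NA)
test$ID
# values: NA "1" NA "1"
```

Swapping the yes and no arguments of ifelse() along with the negated condition gives the same result as the %in% version shown earlier in this answer.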