JavierM88 - 1 year ago
R Question

Error when replacing a value in a data frame with NAs

Say I have this dataframe


x <- c("NS","NS",NA)
y <- c("yes","yes","b")
z <- as.data.frame(cbind(x,y), stringsAsFactors=FALSE)

> z
x y
1 NS yes
2 NS yes
3 <NA> b

I just want to change the values in the rows that contain the "NS" element to "a". If I do this:

z[z$x == "NS", "yes"] <- "a"

I get an error:

Error in `[<-.data.frame`(`*tmp*`, z$x == "NS", "yes", value = "a") :
missing values are not allowed in subscripted assignments of data frames

For some reason the subset includes the row with NA even though I only subset by "NS". If I remove the NAs with na.omit, I get another error:

Error in na.omit(z[z$x == "NS", "a"]) <- "no" :
could not find function "na.omit<-"

Answer

The first problem is to specify the variable name correctly, that is, with the column name and not one of its values (probably just a typo in your question): "y" and not "yes".

Then another problem arises when you use ==. Comparing the NA in the third row with "NS" gives NA, which is neither TRUE nor FALSE. Hmm, should that row be kept or not? R cannot "decide", so the assignment just gives an error.
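To see this for yourself, here is a small sketch using the same data as in the question. The names cmp and res are just illustrative, and the tryCatch wrapper is only there so the failing assignment doesn't stop the script:

```r
x <- c("NS", "NS", NA)
y <- c("yes", "yes", "b")
z <- data.frame(x = x, y = y, stringsAsFactors = FALSE)

# Comparing NA with a string gives NA, not FALSE
cmp <- z$x == "NS"
print(cmp)  # TRUE, TRUE, NA

# Using that logical vector as a row index in an assignment fails;
# capture the error message instead of aborting
res <- tryCatch({
  z[cmp, "y"] <- "a"
  "ok"
}, error = function(e) conditionMessage(e))
print(res)
```

If the assignment fails as described in the question, res holds the "missing values are not allowed" message rather than "ok".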

With %in% instead (which is defined as match(x, table, nomatch = 0) > 0), we get:

x %in% "NS"

There you go: NA doesn't match the value "NS", so match returns 0 for it, which is FALSE as a logical, and the row is not kept.
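A minimal sketch of the two steps, again on the x vector from the question (pos and keep are illustrative names):

```r
x <- c("NS", "NS", NA)

# match() gives the position of each element in the table,
# or the nomatch value (0 here) when there is no match
pos <- match(x, "NS", nomatch = 0)
print(pos)   # 1, 1, 0

# %in% turns those positions into a logical vector with no NA in it
keep <- x %in% "NS"
print(keep)  # TRUE, TRUE, FALSE
```

Because keep never contains NA, it is safe to use as a row index in an assignment.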

Thus, to get what you want:

z[z$x %in% "NS", "y"] <- "a"
#     x y
#1   NS a
#2   NS a
#3 <NA> b
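If you would rather keep ==, two other ways should give the same result, assuming the same z as in the question: wrap the comparison in which(), which drops the NA, or exclude the NA explicitly with is.na(). This is only a sketch, and z1 and z2 are just copies made for the comparison:

```r
z <- data.frame(x = c("NS", "NS", NA),
                y = c("yes", "yes", "b"),
                stringsAsFactors = FALSE)

# which() returns only the positions that are TRUE, so the NA is dropped
z1 <- z
z1[which(z1$x == "NS"), "y"] <- "a"

# FALSE & NA is FALSE, so the NA row is excluded here too
z2 <- z
z2[!is.na(z2$x) & z2$x == "NS", "y"] <- "a"

print(z1)
print(identical(z1, z2))
```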