VincentH - 11 days ago
R Question

Change stringsAsFactors settings for data.frame

I have a function in which I define a data.frame that I fill with data using loops. At some point I get the warning message:


Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "CHANGE") :
  invalid factor level, NAs generated


Therefore, when I define my data.frame, I'd like to set the option stringsAsFactors to FALSE, but I don't understand how to do it.

I have tried:

DataFrame = data.frame(stringsAsFactors=FALSE)


and also:

options(stringsAsFactors=FALSE)


What is the correct way to set the stringsAsFactors option?

MvG
Answer

It depends on how you fill your data frame, and you haven't shown any code for that. When you construct a new data frame, you can do it like this:

x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)

In this case, if aVector is, for example, a character vector, then the data frame column x$aName will be a character vector as well, and not a factor. Combining that with an existing data frame (using rbind, cbind or similar) should preserve that type.
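As a minimal sketch of the above (the contents of aVector and bVector are made up for illustration, since the question does not show them), this is roughly how the column type behaves:

```r
# Hypothetical example data; the question does not show the real vectors.
aVector <- c("KEEP", "DROP")
bVector <- c(1, 2)

# Build the data frame without converting strings to factors.
x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)
class(x$aName)  # the column is a character vector, not a factor

# Assigning a string that was not present before works on a character column;
# on a factor column this is what triggers "invalid factor level, NAs generated".
x$aName[2] <- "CHANGE"

# Appending a row with rbind keeps the column as character.
y <- rbind(x, data.frame(aName = "NEW", bName = 3, stringsAsFactors = FALSE))
class(y$aName)
```

If the loop in the question appends rows one at a time, each piece passed to rbind would need stringsAsFactors = FALSE as well (on R versions where the default is TRUE).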

When you execute

options(stringsAsFactors = FALSE)

you change the global default setting. So every data frame you create after executing that line will not auto-convert strings to factors unless explicitly told to do so. If you only need to avoid conversion in a single place, then I'd rather not change the default. However, if this affects many places in your code, changing the default seems a good idea.
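One possible way to limit the effect of the global setting is to save the previous options and restore them afterwards. This is only a sketch; the data frame and its column name label are hypothetical. Note that on R 4.0.0 and later the default is already FALSE, so the call mainly matters on older versions.

```r
# options() returns the previous values of the options it changes,
# so they can be restored later.
old <- options(stringsAsFactors = FALSE)

# Hypothetical data frame created while the option is FALSE.
df <- data.frame(label = c("a", "b"))
is.character(df$label)  # strings were not converted to a factor

# Restore whatever the setting was before.
options(old)
```

Inside a function, on.exit(options(old)) can be used in place of the final line so the setting is restored even if the function stops with an error.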

One more thing: if your vector is already a factor, then neither of the above will change it back into a character vector. To do so, you should explicitly convert it back using as.character or similar.
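A short sketch of that conversion, using a made-up data frame with a single factor column named status (both names are assumptions for illustration):

```r
# Hypothetical data frame whose column is already a factor.
df <- data.frame(status = factor(c("KEEP", "DROP")))
is.factor(df$status)  # a factor stays a factor regardless of stringsAsFactors

# Convert the single column back to character.
df$status <- as.character(df$status)

# New values can now be assigned without the "invalid factor level" warning.
df$status[1] <- "CHANGE"

# To convert every factor column of a data frame in one go:
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], as.character)
```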