Dr. Manhattan - 1 month ago
R Question

How to Address Factor Values with Quotations in R

I'd appreciate it if someone could explain this to me! My mind is about to blow up over this seemingly fundamental logical inconsistency!

> class(trlog$X.sce_status.[1])
[1] "factor"
> trlog$X.sce_status.[1]
[1] "Successful"
Levels: "Failed-CMD INF ERROR" "Failed-TRANS EXPIRED" "Successful"
> trlog$X.sce_status.[1] == as.character("Successful")
[1] FALSE

Answer

The key to the confusion here is the way that R prints out elements of a factor variable. If you construct a simple factor variable:

f <- factor("Successful")

and print it

f[1]
## [1] Successful
## Levels: Successful

you can see that R prints out the level name without quotation marks. On the other hand, if you have a (slightly weird) factor where the labels actually contain quotation marks, you get a reasonable-seeming but subtly different result printed:

g <- factor("\"Successful\"")
g
## [1] "Successful"
## Levels: "Successful"

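This becomes a little bit clearer (?) if you print the results of as.character. It returns a plain character vector, and character vectors do print with quotation marks by default, so any quotation marks stored inside the value show up escaped as \":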

as.character(f)
## [1] "Successful"
as.character(g)
## [1] "\"Successful\""
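
This also accounts for the FALSE in the question: == compares against the stored level text, and for a factor like g that text includes the quotation mark characters. A small sketch, with f and g rebuilt so the snippet runs on its own:

```r
f <- factor("Successful")
g <- factor("\"Successful\"")

# The comparison uses the stored level text, not the printed form
f == "Successful"
## [1] TRUE
g == "Successful"
## [1] FALSE
g == "\"Successful\""
## [1] TRUE

# nchar() counts the two extra quote characters in g
nchar(as.character(f))
## [1] 10
nchar(as.character(g))
## [1] 12
```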

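To control the quotation marks in the printed representation, use print(as.character(g), quote=FALSE) to suppress the ones that print adds to a character vector, or print(g, quote=TRUE) to add them when printing a factor. Neither changes the stored values, so the comparison in the question still returns FALSE.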
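
If the goal is to get rid of the embedded quotation marks rather than change how they print, one option is to rewrite the factor levels. This is only a sketch: it assumes the quotes are literal " characters at the start and end of each level, and status is a made-up stand-in for trlog$X.sce_status. from the question.

```r
# Hypothetical stand-in for the column in the question
status <- factor(c("\"Successful\"", "\"Failed-TRANS EXPIRED\"", "\"Successful\""))

# Strip one leading and one trailing double quote from each level name
levels(status) <- gsub("^\"|\"$", "", levels(status))

status[1] == "Successful"
## [1] TRUE
levels(status)
## [1] "Failed-TRANS EXPIRED" "Successful"
```

Embedded quotes like these can appear when a file is read with quote handling turned off (for example quote="" in read.table), so fixing the import step may be cleaner. That is a guess about where the data came from, not something the question shows.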