Dmitry Onishchenko Jr. - 1 year ago 84
R Question

# One of the factor's levels is an empty string; how to replace it with non-missing value?

Data frame AEbySOC contains two columns - factor SOC with character levels and integer count Count:

``````
> str(AEbySOC)
'data.frame':   19 obs. of  2 variables:
 $ SOC  : Factor w/ 19 levels "","Blood and lymphatic system disorders",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Count: int  25 50 7 3 1 49 49 2 1 9 ...
``````

One of the levels of SOC is an empty character string:

``````
> l = levels(AEbySOC$SOC)
> l[1]
[1] ""
``````

I want to replace the value of this level by a non-empty string, say, "Not specified". This does not work:

``````
> library(plyr)
> revalue(AEbySOC$SOC, c(""="Not specified"))
Error: attempt to use zero-length variable name
``````

Neither does this:

``````
> AEbySOC$SOC[AEbySOC$SOC==""] = "Not specified"
Warning message:
In `[<-.factor`(`*tmp*`, AEbySOC$SOC == "", value = c(NA, 2L, 3L,  :
  invalid factor level, NA generated
``````

What's the right way to implement this? I appreciate any input/comment.

``````
levels(AEbySOC$SOC)[1] <- "Not specified"
``````

Created a toy example:

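``````
# stringsAsFactors = TRUE is needed in R >= 4.0, where the default is FALSE
df <- data.frame(a = c("", "a", "b"), stringsAsFactors = TRUE)

df
#  a
#1
#2 a
#3 b

levels(df$a)
#[1] ""  "a" "b"

# Rename the first level in place; the data rows follow automatically
levels(df$a)[1] <- "Not specified"

levels(df$a)
#[1] "Not specified" "a"             "b"
``````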
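For context on why the direct assignment in the question warns: a factor only accepts values that are already among its levels, so assigning a new string produces `NA`, whereas renaming the level relabels every matching element at once. A minimal sketch of the difference, using a made-up factor `f` (not the OP's data):

```r
# Hypothetical toy factor with two empty-string entries
f <- factor(c("", "a", "b", ""))

# Element-wise assignment of a string that is not a level:
# R emits "invalid factor level, NA generated" (suppressed here) and stores NA
g <- f
suppressWarnings(g[g == ""] <- "Not specified")
is.na(g)
# [1]  TRUE FALSE FALSE  TRUE

# Renaming the level itself changes the label of every matching element
levels(f)[levels(f) == ""] <- "Not specified"
as.character(f)
# [1] "Not specified" "a"             "b"             "Not specified"
```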

EDIT

As per the OP's comments, if we need to locate the level by its value rather than by its position, we can try

``````
levels(AEbySOC$SOC)[levels(AEbySOC$SOC) == ""] <- "Not specified"
``````
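If this comes up for several columns, the value-based version could be wrapped in a small function. The sketch below is only an illustration: `rename_empty_level` is a hypothetical helper (not part of base R or plyr), and `toy` is a made-up stand-in for `AEbySOC`. It leaves the factor untouched when no empty level exists.

```r
# Hypothetical helper: rename the "" level of a factor, if present
rename_empty_level <- function(f, new_label = "Not specified") {
  stopifnot(is.factor(f))
  idx <- which(levels(f) == "")
  if (length(idx) == 1L) {
    levels(f)[idx] <- new_label
  }
  f
}

# Made-up data frame shaped like the one in the question
toy <- data.frame(
  SOC   = factor(c("", "Cardiac disorders", "Eye disorders")),
  Count = c(25L, 50L, 7L)
)

toy$SOC <- rename_empty_level(toy$SOC)
levels(toy$SOC)
# [1] "Not specified"     "Cardiac disorders" "Eye disorders"
```

The counts stay attached to their rows because only the level label changes, not the underlying integer codes.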