Lily Lily - 9 months ago 64
R Question

R function gsub replace only cases with "."

My vector has some missing values that are marked by a dot ".". I want to replace each "." with NA without affecting the decimal point in the other values.

For example:

vect <- c( 1.1, ".", 2.5, ".", 3.0)
> vect
[1] "1.1" "." "2.5" "." "3"

I've used the gsub function to do the replacement and I'd like to get something like:

[1] 1.1 NA 2.5 NA 3.0

I've tried these commands below:

> gsub(".", NA, vect)
[1] NA NA NA NA NA

> gsub(".","NA", vect)
[1] "NANANA" "NA" "NANANA" "NA" "NA"


> gsub("\\.\\b","NA", vect)
[1] "1NA1" "NA" "2NA5" "NA" "3"

How can I tell R to replace only the missing values marked by "." without changing the decimal point in the other values? Thanks :)

Answer Source

We can use sub. Specify the pattern so that . is the only character in the string, and replace it with NA. The . is a regex metacharacter that matches any character, so we either escape it (\\.) or use fixed = TRUE. However, because we also need the start (^) and end ($) anchors, which fixed = TRUE would treat as literal characters, escaping is the safest route.

as.numeric(sub("^\\.$", NA, vect))
#[1] 1.1  NA 2.5  NA 3.0
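
To see why the anchors matter, here is a small sketch that runs a few variants on the same vector. The variable names (unanchored, anchored, fixed_attempt, by_compare) are only illustrative, and the last variant, a plain comparison with replace, is an alternative that is not part of the answer above.

```r
# The vector as R stores it: mixing numbers and "." coerces everything to character
vect <- c("1.1", ".", "2.5", ".", "3")

# Escaped but unanchored: every literal dot is replaced, decimal points included
unanchored <- gsub("\\.", "NA", vect)

# Escaped and anchored: only strings that consist of a single dot match
anchored <- sub("^\\.$", NA, vect)

# fixed = TRUE takes "^.$" as three literal characters, so nothing matches
fixed_attempt <- sub("^.$", NA, vect, fixed = TRUE)

# Alternative without any regex: compare whole strings and replace the hits
by_compare <- replace(vect, vect == ".", NA)

print(unanchored)
print(anchored)
print(fixed_attempt)
print(by_compare)
```

Only the anchored pattern and the plain comparison leave "1.1" and "2.5" intact while turning the lone dots into NA; either result can then be passed to as.numeric.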

The usual way, though, is simply to call as.numeric, which converts any string that is not a valid number (including ".") to NA and issues a warning.

as.numeric(vect)
#[1] 1.1  NA 2.5  NA 3.0
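
If the coercion warning is unwanted, one option is to wrap the call in suppressWarnings. The sketch below assumes that "." is the only non-numeric marker in the data; the names num_quiet, num_explicit and unexpected are made up for illustration. Because as.numeric turns any unparseable string into NA, the last lines check that no value other than "." was lost.

```r
vect <- c("1.1", ".", "2.5", ".", "3")

# as.numeric() alone warns "NAs introduced by coercion"; silence it here
num_quiet <- suppressWarnings(as.numeric(vect))

# The explicit replacement from the answer gives the same numbers, with no warning
num_explicit <- as.numeric(sub("^\\.$", NA, vect))

# Sanity check: flag any NA that did not come from a "." entry
unexpected <- is.na(num_quiet) & vect != "."

print(num_quiet)
print(identical(num_quiet, num_explicit))
print(any(unexpected))
```

The explicit sub approach is stricter: a stray entry such as "abc" would still trigger the warning there, whereas suppressWarnings would hide it.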