learnerX - 3 months ago
R Question

How to make R insert a '0' in place of missing values while reading a CSV?

We have a multi-column CSV file of the following format:

id1,id2,id3,id4
1,2,3,4
,,3,4,6
2,,3,4


These missing values are to be assumed as a '0' when reading the CSV column by column. The following is the script we currently have:

data <- read.csv("data.csv")

dfList <- lapply(seq_along(data), function(i) {
  seasonal_per <- msts(data[, i], seasonal.periods=c(24,168))
  best_model <- tbats(seasonal_per)
  fcst <- forecast.tbats(best_model, h=24, level=90)
  dfForec <- print(fcst)
  result <- cbind(0:23, dfForec[, 1])
  result$id <- names(df)[i]

  return(result[c("id", "V1", "V2")])
})

finaldf <- do.call(rbind, dfList)
write.csv(finaldf, file = "out.csv", row.names = FALSE)


This script breaks when the CSV has missing values, giving the error:

Error in tau + 1 + adj.beta + object$p :
  non-numeric argument to binary operator

How do we tell R to assume a '0' when it encounters a missing value?

I tried the following:

library("forecast")
D <- read.csv("data.csv",na.strings=".")
D[is.na(D)] <- 0

dfList <- lapply(seq_along(data), function(i) {
  seasonal_per <- msts(data[, i], seasonal.periods=c(24,168))
  best_model <- tbats(seasonal_per)
  fcst <- forecast.tbats(best_model, h=24, level=90)
  dfForec <- print(fcst)
  result <- cbind(0:23, dfForec[, 1])
  result$id <- names(df)[i]

  return(result[c("id", "V1", "V2")])
})

finaldf <- do.call(rbind, dfList)
write.csv(finaldf, file = "out.csv", row.names = FALSE)


but it gives the following error:

Error in data[, i] : object of type 'closure' is not subsettable

Answer

If you're certain that every NA value should be 0, and that's the only issue, then replace the NAs right after reading the file:

data <- read.csv("data.csv")
data[is.na(data)] <- 0
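
The "object of type 'closure' is not subsettable" error in the second attempt most likely comes from the variable names: the file is read into D, but the loop still refers to data. When no variable called data exists, R finds the built-in function data() instead, and a function cannot be indexed with [, i]. Separately, empty fields in numeric columns are already read as NA by default, so na.strings="." should not be needed for them. Below is a minimal sketch of the read, replace, and loop steps using one name throughout. The sample rows are a simplified version of the file in the question (every row has four fields, which is an assumption), and the repeated column mean is only a hypothetical stand-in for the tbats forecast so that the snippet runs without the forecast package.

```r
# Write a small sample file so the sketch is self-contained.
# Assumption: every row has exactly four fields, matching the header.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id1,id2,id3,id4",
             "1,2,3,4",
             ",,3,4",
             "2,,3,4"), csv_path)

# Empty fields in numeric columns are read as NA by default.
D <- read.csv(csv_path)

# Replace every NA with 0.
D[is.na(D)] <- 0

# Loop over the columns of D (not `data`) and label with names(D)[i].
dfList <- lapply(seq_along(D), function(i) {
  # Hypothetical placeholder for the forecast step:
  # repeat the column mean for 24 hours.
  fc <- rep(mean(D[, i]), 24)
  # Build a data frame directly; `$<-` on a cbind() matrix does not
  # produce a data frame.
  data.frame(id = names(D)[i], hour = 0:23, forecast = fc)
})

finaldf <- do.call(rbind, dfList)
head(finaldf)
```

In the real script, the placeholder line would be replaced by the msts/tbats/forecast calls from the question, with the point forecasts extracted into fc.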