Javier Semprun Javier Semprun - 25 days ago 6
R Question

R - SQLDF issue with special characters

I'm having an issue running sqldf in R and special characters.

Here is a small detail of the code I'm running to give you an idea of what's going on:

First I read my data from an excel sheet (using R's xlsx package), the method xlsx2 seems to get the data correctly and characters seem to be showing special characters such as 'Ñ'

verif_oblig <- try(read.xlsx2("My computer/Filename.xlsx", sheetName = 'VERIF_OBLIG'))
if("try-error" %in% class(verif_oblig))
verif_oblig <- Empty()


Then I start running my sql query using sqldf and the resulting table seems to replace Ñ characters for Ñ. Here's the query:

verif_oblig_v2 <- sqldf("
select
a.*,
case when b.Estado is null then 'NO GENERADO'
else b.Estado end as ESTADO,
case when resultado_operacion in ('EXITO','CORRECTO')
then 'EXITO'
else 'SIN EXITO'
end as RESULTADO_ACTUAL
from
verif_oblig a left join fin2016 b
on
a.CUPS = b.CUPS_Largo and a.DIVISION = b.DIVISION")


Can anyone help me find a solution for this?

Thank you very much

Answer

I ended up solving it by simply replacing the characters that were bugging after the sql query using gsub like this:

clear_errors <- function(table, campo){
table <- as.data.frame(table)
table[,campo] <- c(gsub("Ã'","Ñ",c(tabla_entrada[,campo])))
table[,campo]<- c(gsub("é","é",c(tabla_entrada[,campo])))
table[,campo]<- c(gsub("ó", "ó",c(tabla_entrada[,campo])))
table[,campo] <- c(gsub("ú","ú",c(tabla_entrada[,campo])))
table[,campo] <- c(gsub("ñ","ñ",c(tabla_entrada[,campo])))
table[,campo] <- c(gsub("Ã","í",c(tabla_entrada[,campo])))
table[,campo] <- c(gsub("O","A",c(tabla_entrada[,campo])))
return(table)

}

It isn't the most elegant solution but it works.

I think the issue happens because xlsx formats characters as factors and probably uses a different encoding than sqldf does for them. If someone can find out exactly what's going on I'd very much appreciate it (just out of curiosity)