bici.sancta bici.sancta -4 years ago 152
R Question

read_excel 'expecting numeric' ..... and value is numeric

I didn't find an answer to this question, so hopefully this is the place to get some help on this.

I am reading in many Excel files contained in .zip files. Each .zip that I have has about 40 excel files that I want to read. I am trying to create a list of data frames, but encounter an error on reading some files based on file content.

This is the read statment, inside a for loop:

library(readxl)
df[[i]] <- read_excel(xls_lst[i],
skip = 4,
col_names = FALSE,
na = "n/a",
col_types = data_types)


data_types has these values :
> data_types
[1] "text" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"


which is correct for this file.

The read_excel statement works well on some files, but returns warning message on others :

In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,... :
[54, 7]: expecting numeric: got '9999.990000'


Well, the value '9999.99000' looks like a numeric to me.
When I open the Excel file that creates this warning, the file indeed shows these values, and also shows that the column is formatted as text in Excel.
When I change the column formatting to numeric, re-save the Excel sheet, then the data is read in correctly.

However, I have several hundreds of these files to read ... how can
read_excel
ignore the column format indicated by Excel, and instead use the
col_type
defintion that I supply in the calling statement ?

Thanks,

Answer Source

I tried to build a toy example.

My xlsx file contains:

3   1
3   3
4   4
5   5
7   '999
6   3

Reading in it your way:

data_types<-c("numeric","numeric")
a<-read_excel("aa.xlsx",
       col_names = FALSE,
       na = "n/a",
       col_types = data_types 
       )
Warning message:
In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  :
[5, 2]: expecting numeric: got '999'

Reading in everything as text

data_types<-c("text","text")
dat<-read_excel("aa.xlsx",
          col_names = FALSE,
          na = "n/a",
          col_types = data_types
          )

And using type.convert:

dat[]<-lapply(dat, type.convert)

works at least for this simple example.

*Edited:

There was a mistake in the code.

*Edit in response to comment:

Another toy example demonstrating how you could apply type.convert to your data:

#list of data frames
l<-list()
l[[1]]<-data.frame(matrix(rep(as.character(1:5),2), ncol = 2), stringsAsFactors = FALSE)
l<-rep(l,3)

#looping over your list to encode columns correctly:
for (i in 1: length(l)){ 
l[[i]][]<-lapply(l[[i]], type.convert)
}

There might be better solutions. But I think this should work.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download