ssdecontrol - 1 year ago
R Question

read.xlsx takes a very long time and tons of memory

I'm trying to load an .xlsx file into R that has one sheet and is about 31 MB in size.

I run the following

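# Set the JVM heap size; this must run before rJava/xlsx starts the JVM.
options(java.parameters = "-Xmx6g")
library(xlsx)
# Read the first sheet of the workbook.
yt <- read.xlsx("big_spreadsheet.xlsx", sheetIndex = 1)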


and I get nothing. My system monitor shows that the allotted memory slowly fills up and then just stays full. I haven't let it run for hours, but ten minutes should be sufficient, especially when I could have loaded the file into Numbers (I'm on Mavericks) and saved it as a CSV in that time.

Yes, I have much more than 6 GB of memory. 2 GB doesn't seem to be enough and yields the error:

Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.lang.OutOfMemoryError: Java heap space


I did, however, make the mistake of letting the rJava package install its own version of Java. I downloaded JDK 8 after the fact but I have no idea how to check if this is being used.
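
For what it's worth, one way to check which Java runtime rJava has picked up is to ask the running JVM directly. This is a minimal sketch, assuming rJava is installed and can start a JVM; the variable names are just for illustration.

```r
library(rJava)

# Start the JVM if it is not already running.
.jinit()

# Ask the running JVM for its version and install location
# via the static method java.lang.System.getProperty().
java_version <- .jcall("java/lang/System", "S", "getProperty", "java.version")
java_home    <- .jcall("java/lang/System", "S", "getProperty", "java.home")

cat("Java version:", java_version, "\n")
cat("Java home:   ", java_home, "\n")
```

If the reported home directory is not the JDK 8 install, then R is presumably still using the other Java.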

So why does it take 6 GB of RAM to (fail to) load a 31 MB file? Can I fix this somehow?

Answer

I never got this to work. I've lately been using the readxl package for reading from Excel spreadsheets, which has no Java dependency and seems to work just fine.
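
A rough sketch of the readxl equivalent of the read.xlsx call from the question. It uses the sample workbook that ships with readxl so that it runs as-is; for the real file you would presumably pass "big_spreadsheet.xlsx" as the path instead.

```r
library(readxl)

# readxl bundles a small example workbook; replace this with your own
# path (e.g. "big_spreadsheet.xlsx") for real use.
path <- readxl_example("datasets.xlsx")

# Read the first sheet, the counterpart of read.xlsx(path, 1).
yt <- read_excel(path, sheet = 1)

# read_excel returns a tibble; convert it if a plain data.frame is needed.
yt <- as.data.frame(yt)
str(yt)
```

No java.parameters option is needed here, since readxl does not start a JVM.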
