I need to store a huge amount of data in memory by looping over a result set (Oracle database).
I carried out some tests and, by using a profiler, I noticed that there is a considerable difference between the
Looking at your memory chart, it seems much of the data is of a temporary nature and can be removed from the heap at some point. The final ratio of used heap vs. its total size says it all.
It seems like the temporary data's time to live (e.g. that of buffered data from an Oracle `ResultSet`) is too high, or the eden space is too small, and thus data is being moved from the eden and/or survivor space to the old generation space, where it is collected once the JVM detects the need to run the GC on the old generation. This possibly happens as you iterate over your `ResultSet` and the Oracle driver needs to fetch the next chunk of data from the database, which can be fairly large.
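If the driver's row buffer is the culprit, one knob worth trying before touching the GC is the statement fetch size, which bounds how many rows the driver buffers per round trip (Oracle's default is 10). A minimal sketch, with made-up table and column names:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;

public class FetchSizeExample {
    // Sketch: stream rows with a bounded driver-side buffer.
    // Table and column names are hypothetical.
    static void scan(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            // Rows buffered per database round trip; trades round trips
            // against the size of the byte buffer held on the heap.
            stmt.setFetchSize(500);
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT id, created_at FROM big_table")) {
                while (rs.next()) {
                    long id = rs.getLong(1);
                    Timestamp created = rs.getTimestamp(2);
                    // process the row, then let these references go:
                    // holding them keeps the extracted objects alive
                }
            }
        }
    }
}
```

A larger fetch size means fewer round trips but a bigger transient buffer per fetch, so it interacts directly with how fast eden fills up.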
At this point I should go into a little more detail about the Oracle `ResultSet` buffer. It's basically just a chunk of bytes on the heap. Depending on the column type, the data is stored as something different from what you'd read from the `ResultSet`. Take `java.sql.Timestamp` for instance: inside the buffer it's stored as an `oracle.sql.TIMESTAMP` or even just plain bytes. This means that whenever you extract a `java.sql.Timestamp` from a `ResultSet`, another object needs to be allocated. And this object is most likely the "final" object you want to keep eventually.
I suggest tuning the JVM's GC to your needs. Maybe you can figure out which data is being constantly collected. Try adjusting the eden size so the JVM doesn't need to promote so much to the old generation. You can also adjust how much new space the JVM allocates on demand, and how it shrinks the heap when it detects a gap between usage and allocated size.
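For example, a starting point using standard HotSpot flags (the sizes below are placeholders to illustrate the knobs, not recommendations):

```
# -Xmn sizes the young generation (eden + survivor spaces)
# -XX:MinHeapFreeRatio / -XX:MaxHeapFreeRatio control when the heap
#   grows or shrinks on demand
# -verbose:gc logs collections so you can see what is being collected
java -Xms512m -Xmx1g \
     -Xmn256m \
     -XX:MinHeapFreeRatio=40 \
     -XX:MaxHeapFreeRatio=70 \
     -verbose:gc \
     -jar app.jar
```

Raising `-Xmn` gives short-lived `ResultSet` buffers room to die in eden instead of being promoted, which is exactly the pattern your chart suggests.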
You can find a list of JVM options here.