Grey Panther - 1 year ago
Java Question

How to estimate if the JVM has enough free memory for a particular data structure?

I have the following situation: a couple of machines form a cluster. Clients can load datasets, and we need to select the node on which a dataset will be loaded, and refuse to load it (avoiding an OOM error) if no single machine can fit the dataset.

What we do currently: we know the entry count in the dataset and estimate the memory to be used as entry count * empirical factor (determined manually). We then check whether this is lower than the free memory (obtained via freeMemory()) and, if so, load it (otherwise we redo the process on other nodes or report that there is no free capacity).
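
For illustration, the check described above could look roughly like the sketch below. The class name CapacityCheck, its method names, and the figure of 48.5 bytes per entry in the usage note are hypothetical and not taken from our code base; only Runtime.getRuntime().freeMemory() is the real JVM call.

```java
// Hypothetical helper; class and method names are illustrative only.
final class CapacityCheck {

    /** Estimated bytes needed: entry count * empirical bytes-per-entry factor, rounded up. */
    static long estimateBytes(long entryCount, double bytesPerEntry) {
        return (long) Math.ceil(entryCount * bytesPerEntry);
    }

    /** True if the estimate is no larger than the given number of free bytes. */
    static boolean fits(long entryCount, double bytesPerEntry, long freeBytes) {
        return estimateBytes(entryCount, bytesPerEntry) <= freeBytes;
    }

    /** Free heap as currently reported by the JVM; uncollected garbage is not counted as free. */
    static long reportedFreeBytes() {
        return Runtime.getRuntime().freeMemory();
    }
}
```

A node would then accept a dataset only if CapacityCheck.fits(entryCount, 48.5, CapacityCheck.reportedFreeBytes()) returns true, where 48.5 stands in for whatever factor was determined manually.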

The problems with this approach are:

  • the empirical factor needs to be revisited and updated manually

  • freeMemory() may sometimes underreport because of garbage that has not been cleaned up yet (this could be avoided by running System.gc() before each such call, however that would slow down the server and could also lead to premature promotion)

  • an alternative would be to "just try to load the dataset" (and back out if an OOM is thrown), however once an OOM is thrown you have potentially corrupted other threads running in the same JVM, and there is no graceful way of recovering from that.

Are there better solutions to this problem?


The empirical factor can be calculated as a build step and placed in a properties file.
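
A minimal sketch of such a build step, under some assumptions: the class name EmpiricalFactor, the property key dataset.bytesPerEntry and the sample builder are all hypothetical, and the heap-difference measurement is only a rough approximation (System.gc() is a hint, and other threads can allocate in the meantime). Reference.reachabilityFence requires Java 9 or later.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.lang.ref.Reference;
import java.util.Properties;
import java.util.function.IntFunction;

// Hypothetical build-time helper; names and the property key are illustrative only.
final class EmpiricalFactor {
    static final String KEY = "dataset.bytesPerEntry"; // assumed key name

    /** Roughly measures heap growth per entry while a sample structure is built. */
    static double measureBytesPerEntry(int entryCount, IntFunction<Object> sampleBuilder) {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint, so the result is approximate
        long before = rt.totalMemory() - rt.freeMemory();
        Object sample = sampleBuilder.apply(entryCount);
        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();
        Reference.reachabilityFence(sample); // keep the sample alive until measured
        return Math.max(0, after - before) / (double) entryCount;
    }

    /** Writes the factor in properties-file format. */
    static void store(double bytesPerEntry, Writer out) {
        Properties p = new Properties();
        p.setProperty(KEY, Double.toString(bytesPerEntry));
        try {
            p.store(out, "generated at build time");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the factor back at runtime. */
    static double load(Reader in) {
        Properties p = new Properties();
        try {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Double.parseDouble(p.getProperty(KEY));
    }
}
```

The build would call measureBytesPerEntry with a representative sample dataset and store the result; the server only needs load at startup.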

While freeMemory() almost always reports less than the amount that would be free after a GC, you can check it first to see whether enough memory is already available, and call System.gc() only if it is not and maxMemory() indicates there might still be plenty.
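
One way this could be sketched, assuming that "plenty" means the maximum heap minus the memory currently in use. The class name HeapProbe and its methods are hypothetical; the Runtime and System.gc() calls are the standard ones.

```java
// Hypothetical helper; the decision logic is one possible reading, not a definitive rule.
final class HeapProbe {

    /** Upper bound on what the heap could still provide: max heap minus memory in use. */
    static long potentiallyAvailable(long max, long total, long free) {
        long used = total - free;
        return max - used;
    }

    /** Cheap check first; a GC is requested only if the cheap check fails. */
    static boolean canFit(long requiredBytes) {
        Runtime rt = Runtime.getRuntime();
        if (requiredBytes <= rt.freeMemory()) {
            return true; // already enough reported as free, no GC needed
        }
        if (requiredBytes > rt.maxMemory()) {
            return false; // can never fit, so a GC would be pointless
        }
        System.gc(); // a hint only; the JVM may ignore it
        return requiredBytes
                <= potentiallyAvailable(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    }
}
```

The point of the ordering is that the expensive System.gc() call is skipped both when the dataset obviously fits and when it obviously cannot.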

NOTE: Using System.gc() in production only makes sense in very rare situations; in general it is often used incorrectly, reducing performance and obscuring the real problem.

I would avoid triggering an OOME unless you are running in a JVM you can restart as required.