Question

The initCoreNLP() method call from Stanford's R coreNLP package throws an error

I am trying to use the coreNLP package. I ran the following commands and encountered the GC overhead limit exceeded error.

library(rJava)
library(coreNLP)  # load the coreNLP package itself as well

downloadCoreNLP()

initCoreNLP()


The error looks like this:


Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... Error in rJava::.jnew("edu.stanford.nlp.pipeline.StanfordCoreNLP", basename(path)) :
java.lang.OutOfMemoryError: GC overhead limit exceeded
Error during wrapup: cannot open the connection


I don't know much Java. Can someone help me with this?

Answer

@indi I ran into the same problem (see R's coreNLP::initCoreNLP() throws java.lang.OutOfMemoryError) but was able to come up with a more repeatable solution than simply rebooting.

The full syntax for the init command is:

initCoreNLP(libLoc, parameterFile, mem = "4g", annotators)
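
For example, raising the heap is just a matter of passing a bigger mem value; this was the first thing I tried (the 8g here is only an illustration, and as noted below it did not solve the problem for me):

initCoreNLP(mem = "8g")  # give the JVM a larger heap; did not fix the ner classifier load in my case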

Increasing mem did not help me, but I realized that you and I were both getting stuck on one of the classifiers in the ner annotator (named-entity recognition). Since all I needed was part-of-speech tagging, I replaced the init command with the following:

initCoreNLP(mem = "8g", annotators = c("tokenize", "ssplit", "pos"))

This caused the init command to execute in a flash, with no memory problems. BTW, I increased mem to 8g just because I have that much RAM; I'm sure I could have left it at the default 4g and it would have been fine.
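As a quick sanity check that the lighter pipeline still does its job, you can annotate a sample string and inspect the tokens; annotateString() and getToken() are both exported by the coreNLP package (the sample sentence is mine):

library(coreNLP)
initCoreNLP(mem = "8g", annotators = c("tokenize", "ssplit", "pos"))
anno <- annotateString("Stanford CoreNLP makes part-of-speech tagging easy.")
getToken(anno)  # data frame of tokens; the POS column holds the tags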

I don't know if you need the ner annotator. If not, then explicitly pass the annotators argument, listing only the ones you absolutely need to get your job done; the possible values are listed at http://stanfordnlp.github.io/CoreNLP/annotators.html. If you do need ner, then again figure out the minimal set of annotators required and specify those, as in the sketch below.
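My reading of the dependency table on that page is that ner also needs tokenize, ssplit, pos, and lemma, so a minimal ner init might look like this (a sketch I have not benchmarked; whether it avoids the memory error may depend on your machine):

initCoreNLP(mem = "8g", annotators = c("tokenize", "ssplit", "pos", "lemma", "ner"))
anno <- annotateString("Barack Obama was born in Hawaii.")
getToken(anno)$NER  # per-token named-entity labels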

So there you (and hopefully others) go!
