Ahmet Karakaya - 4 years ago
Java Question

Java Substring memory leak

Following the discussion in "Java String.split memory leak?" about taking a substring of a String, I have been analyzing two sample usages of substring.

It is said that objects don't get garbage collected if the caller stores a substring of a field in the object.
When I run the code below I get an OutOfMemoryError, and while monitoring it with VisualVM I see the allocated size of char[] keep increasing.

public class TestGC {
    private String largeString = new String(new byte[100000]);

    String getString() {
        return this.largeString.substring(0,2);
        //return new String(this.largeString.substring(0,2));
    }

    public static void main(String[] args) {
        java.util.ArrayList<String> list = new java.util.ArrayList<String>();
        for (int i = 0; i < 100000; i++) {
            TestGC gc = new TestGC();
            list.add(gc.getString());
        }
    }
}


With the following code I did not get an error. Monitoring memory usage with VisualVM, I saw the allocated char[] size increase, drop at some point, then increase and drop again (the GC doing its job), and this pattern continues for the whole run.

public class TestGC {
    private String largeString = new String(new byte[100000]);

    String getString() {
        //return this.largeString.substring(0,2);
        return new String(this.largeString.substring(0,2));
    }

    public static void main(String[] args) {
        java.util.ArrayList<String> list = new java.util.ArrayList<String>();
        for (int i = 0; i < 100000; i++) {
            TestGC gc = new TestGC();
            list.add(gc.getString());
        }
    }
}


I really want to understand what the GC collects and removes from heap memory in the second example.
Why can't the GC collect the same objects in the first example?

In the first example,
largeString.substring(0,2);
returns a reference, and in the second example
new String(this.largeString.substring(0,2));
creates a new object. Shouldn't both cases be equally unproblematic for the GC?

Answer

In the first example, each time around the loop you create a new TestGC object, and with it a new String initialised from the 100000-byte array. When you call String.substring, the String you get back shares the char[] of the big string, just with the offset set to 0 and the count set to 2. So all the data is still in memory, even though you only see the 2 characters specified in the substring call when you use the String. Because the list keeps a reference to every one of these substrings, none of the big arrays can be garbage collected.

In the second example you are again creating a new large String every time around the loop, but by calling new String(largeString.substring(0,2)) you copy only the 2 characters into a new, small array. Nothing refers to the large array once its TestGC object is discarded, so it can be garbage collected.
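The difference between the two cases can be sketched with a small model. This is not the JDK source: SharedSlice, trimmedCopy and retainedChars are hypothetical names, and the class only assumes the layout described above for older JVMs (a backing char[] plus an offset and a count).

```java
import java.util.Arrays;

// Hypothetical model of the pre-1.7.0_06 String layout; not the real JDK class.
final class SharedSlice {
    final char[] value;  // backing array, possibly much larger than the visible text
    final int offset;    // index of the first visible character
    final int count;     // number of visible characters

    SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Old-style substring: a new object that points at the SAME backing array.
    SharedSlice substring(int begin, int end) {
        return new SharedSlice(value, offset + begin, end - begin);
    }

    // Models the effect of new String(substring): copy only the visible characters.
    SharedSlice trimmedCopy() {
        char[] small = Arrays.copyOfRange(value, offset, offset + count);
        return new SharedSlice(small, 0, count);
    }

    // Number of chars kept reachable by holding on to this object.
    int retainedChars() {
        return value.length;
    }

    public static void main(String[] args) {
        SharedSlice large = new SharedSlice(new char[100000], 0, 100000);
        SharedSlice shared = large.substring(0, 2);
        SharedSlice copied = shared.trimmedCopy();
        System.out.println(shared.retainedChars()); // prints 100000
        System.out.println(copied.retainedChars()); // prints 2
    }
}
```

Storing `shared` in a list keeps the whole 100000-char array reachable, while storing `copied` keeps only 2 chars, which matches the two VisualVM observations in the question.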

As the links in the comments say, this behaviour changed in Java 1.7.0_06: the String returned by String.substring no longer shares the char[] of the original String.
