Basil Bourque Basil Bourque - 1 year ago 102
Java Question

Is String Deduplication feature of the G1 garbage collector enabled by default?

JEP 192: String Deduplication in G1 implemented in Java 8 Update 20 added the new String deduplication feature:


Reduce the Java heap live-data set by enhancing the G1 garbage collector so that duplicate instances of String are automatically and continuously deduplicated.


The JEP page mentions that a command-line option
UseStringDeduplication (bool)
allows the dedup feature to be enabled or disabled. But the JEP page does not go so far as to indicate the default.

➠ Is the dedup feature ON or OFF by default in the G1 garbage collector bundled with Java 8 and with Java 9?

➠ Is there a “getter” method to verify the current setting at runtime?

I do not know where to look for documentation beyond the JEP page.

In at least the HotSpot-equipped implementations of Java 9, the G1 garbage collector is enabled by default. That fact prompted this Question now. For more info on String interning and deduplication, see this 2014-10 presentation by Aleksey Shipilev at 29:00.

Answer Source

String deduplication off by default

For the versions of Java 8 and Java 9 seen below, UseStringDeduplication is false (disabled) by default.

One way to verify the feature setting: list out all the final flags for JVM and then look for it.

build 1.8.0_131-b11

    $ java -XX:+UseG1GC  -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i 'duplicat'
     bool PrintStringDeduplicationStatistics        = false                               {product}
    uintx StringDeduplicationAgeThreshold           = 3                                   {product}
     bool StringDeduplicationRehashALot             = false                               {diagnostic}
     bool StringDeduplicationResizeALot             = false                               {diagnostic}
     bool UseStringDeduplication                    = false                               {product}
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

build 9+18

    $ java -XX:+UseG1GC  -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep -i 'duplicat'
    uintx StringDeduplicationAgeThreshold          = 3                                        {product} {default}
     bool StringDeduplicationRehashALot            = false                                 {diagnostic} {default}
     bool StringDeduplicationResizeALot            = false                                 {diagnostic} {default}
     bool UseStringDeduplication                   = false                                    {product} {default}
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

Another way to test it is with

package jvm;

import java.util.ArrayList;
import java.util.List;

public class StringDeDuplicationTester {

    public static void main(String[] args) throws Exception {
        List<String> strings = new ArrayList<>();
        while (true) {
            for (int i = 0; i < 100_00; i++) {
                strings.add(new String("String " + i));
            }
            Thread.sleep(100);
        }
    }
}

run without explicitly specifying it.

$ java  -Xmx256m -XX:+UseG1GC -XX:+PrintStringDeduplicationStatistics jvm.StringDeDuplicationTester
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at jvm.StringDeDuplicationTester.main(StringDeDuplicationTester.java:12)

Run with explicitly turning it ON.

$ java  -Xmx256m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics jvm.StringDeDuplicationTester
[GC concurrent-string-deduplication, 5116.7K->408.7K(4708.0K), avg 92.0%, 0.0246084 secs]
   [Last Exec: 0.0246084 secs, Idle: 1.7075173 secs, Blocked: 0/0.0000000 secs]
      [Inspected:          130568]
         [Skipped:              0(  0.0%)]
         [Hashed:          130450( 99.9%)]
         [Known:                0(  0.0%)]
         [New:             130568(100.0%)   5116.7K]
      [Deduplicated:       120388( 92.2%)   4708.0K( 92.0%)]
         [Young:                0(  0.0%)      0.0B(  0.0%)]
         [Old:             120388(100.0%)   4708.0K(100.0%)]
   [Total Exec: 1/0.0246084 secs, Idle: 1/1.7075173 secs, Blocked: 0/0.0000000 secs]
      [Inspected:          130568]
         [Skipped:              0(  0.0%)]
         [Hashed:          130450( 99.9%)]
         [Known:                0(  0.0%)]
         [New:             130568(100.0%)   5116.7K]
      [Deduplicated:       120388( 92.2%)   4708.0K( 92.0%)]
         [Young:                0(  0.0%)      0.0B(  0.0%)]
         [Old:             120388(100.0%)   4708.0K(100.0%)]
   [Table]
      [Memory Usage: 264.9K]
      [Size: 1024, Min: 1024, Max: 16777216]
      [Entries: 10962, Load: 1070.5%, Cached: 0, Added: 10962, Removed: 0]
      [Resize Count: 0, Shrink Threshold: 682(66.7%), Grow Threshold: 2048(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]
[GC concurrent-string-deduplication, deleted 0 entries, 0.0000008 secs]
...
output truncated

Note: this output is from build 1.8.0_131-b11. Looks like Java 9 has no option to print String de-duplication statistics. Potential bug ? No. Unified logging killed this specific option.

$ java  -Xmx256m -XX:+UseG1GC -XX:+PrintStringDeduplicationStatistics -version
Unrecognized VM option 'PrintStringDeduplicationStatistics'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download