Pillar Pillar - 6 months ago 7
Java Question

Why doesn't G1 start a marking cycle when the InitiatingHeapOccupancyPercent is achieved?

According to the documentation,

XX:InitiatingHeapOccupancyPercent



Sets the Java heap occupancy threshold that triggers a marking cycle.
The default occupancy is 45 percent of the entire Java heap.


In my current environment, that does not happen.

My G1 garbage collection configuration is as follows

-Xms25000m
-Xmx25000m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=1000
-XX:GCTimeRatio=99
-XX:InitiatingHeapOccupancyPercent=70
-XX:MaxTenuringThreshold=8
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCCountTarget=16
-XX:G1OldCSetRegionThresholdPercent=3
-XX:G1NewSizePercent=30
-XX:G1RSetUpdatingPauseTimePercent=5


With a 25g heap and an
XX:InitiatingHeapOccupancyPercent
of 70%, you would expect a marking cycle to begin when 18g is occupied. I'm tailing the garbage collection logs and that doesn't happen.

Here's an extract:

{Heap before GC invocations=592 (full 0):
garbage-first heap total 25600000K, used 22802164K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
region size 8192K, 1526 young (12500992K), 25 survivors (204800K)
Metaspace used 37386K, capacity 37948K, committed 38144K, reserved 1083392K
class space used 3948K, capacity 4080K, committed 4096K, reserved 1048576K
2016-04-20T22:06:38.272+0000: 4213.406: [GC pause (GCLocker Initiated GC) (young)
Desired survivor size 801112064 bytes, new threshold 8 (max 8)
- age 1: 98537800 bytes, 98537800 total
- age 2: 7053912 bytes, 105591712 total
- age 3: 6556320 bytes, 112148032 total
- age 4: 8836064 bytes, 120984096 total
- age 5: 5725448 bytes, 126709544 total
- age 6: 6702728 bytes, 133412272 total
- age 7: 3831920 bytes, 137244192 total
- age 8: 4166336 bytes, 141410528 total
4213.406: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 184844, predicted base time: 44.67 ms, remaining time: 955.33 ms, target pause time: 1000.00 ms]
4213.406: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1501 regions, survivors: 25 regions, predicted young region time: 21.21 ms]
4213.406: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1501 regions, survivors: 25 regions, old: 0 regions, predicted pause time: 65.88 ms, target pause time: 1000.00 ms]
4213.475: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 1.40 %, threshold: 1.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 0.0687163 secs]
[Parallel Time: 61.7 ms, GC Workers: 28]
[GC Worker Start (ms): Min: 4213406.9, Avg: 4213407.1, Max: 4213407.3, Diff: 0.4]
[Ext Root Scanning (ms): Min: 6.0, Avg: 6.2, Max: 6.4, Diff: 0.4, Sum: 173.1]
[Update RS (ms): Min: 33.5, Avg: 34.0, Max: 34.6, Diff: 1.1, Sum: 951.9]
[Processed Buffers: Min: 27, Avg: 36.6, Max: 48, Diff: 21, Sum: 1024]
[Scan RS (ms): Min: 0.1, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: 6.3]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 20.1, Avg: 20.6, Max: 20.8, Diff: 0.7, Sum: 577.5]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.7]
[Termination Attempts: Min: 1, Avg: 13.2, Max: 19, Diff: 18, Sum: 371]
[GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 4.7]
[GC Worker Total (ms): Min: 60.9, Avg: 61.2, Max: 61.6, Diff: 0.6, Sum: 1714.2]
[GC Worker End (ms): Min: 4213468.2, Avg: 4213468.3, Max: 4213468.5, Diff: 0.3]
[Code Root Fixup: 0.4 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 1.2 ms]
[Other: 5.4 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.5 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.8 ms]
[Humongous Register: 0.2 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 2.4 ms]
[Eden: 11.7G(11.7G)->0.0B(11.7G) Survivors: 200.0M->200.0M Heap: 21.7G(24.4G)->10.0G(24.4G)]
Heap after GC invocations=593 (full 0):
garbage-first heap total 25600000K, used 10516798K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
region size 8192K, 25 young (204800K), 25 survivors (204800K)
Metaspace used 37386K, capacity 37948K, committed 38144K, reserved 1083392K
class space used 3948K, capacity 4080K, committed 4096K, reserved 1048576K
}
[Times: user=1.70 sys=0.01, real=0.07 secs]
2016-04-20T22:06:38.342+0000: 4213.475: Total time for which application threads were stopped: 0.0701353 seconds, Stopping threads took: 0.0001600 seconds


I'll draw your attention to

{Heap before GC invocations=592 (full 0):
garbage-first heap total 25600000K, used 22802164K [0x00000001a5800000, 0x00000001a60061a8, 0x00000007c0000000)
[...]
[Eden: 11.7G(11.7G)->0.0B(11.7G) Survivors: 200.0M->200.0M Heap: 21.7G(24.4G)->10.0G(24.4G)]


More than 70% of the heap is occupied before this collection. Why did that not trigger a marking cycle?

The application continues doing young generation collections, filling up the old regions, eventually leading to an allocation failure and a lengthy Full GC.




Reducing
InitiatingHeapOccupancyPercent
to 55 had no discernible effect.

At 20, it did start doing mixed collections, but only when ~80% of the heap was occupied.

Answer

JDK-6976060 suggests that the need for a marking cycle is calculated at the end of a young GC. Depending on whether it uses the occupancy stats before or after the young GC this may or may not mean that eden space is always considered as 0% occupied for the purpose of IHOP calculation. With a 45% eden size that would mean 70% occupancy could never be reached, the maximum possible occupancy would be 55% and at that point the heap would be completely full and it would be too late for a mixed collection.

But I am doubtful whether this is truly the case because in the face of dynamic young generation sizing this would make the documentation misleading and IHOP tuning far more difficult. It should be fairly easy to verify this with an artificial test-case and manually sized generations.

If that's not the issue then GC pause (GCLocker Initiated GC) (young) might point to bug 8140597, which is fixed in jdk9b94.


Update: The description in Bug 8151176 indeed suggests that for the purpose of the IHO-percentage calculation it calculates oldgen occupancy/overall heap size. Which means young generation occupancy is completely ignored, which in turn means if young gen fraction > IHOP then it can never initiate a concurrent cycle.

The reason is that static IHOP starts if old gen occupancy exceeds a fixed percentage of the current heap capacity. If either the user or ergonomics decide that the old gen can not be larger than that fraction of the heap capacity that triggers concurrent mark, marking will never start.

So currently available solutions are

  • constrain the young gen fraction < IHOP
  • decrease IHOP to take the minimum possible old gen fraction into account
  • let the JVM adjust the IHOP dynamically
Comments