Clashsoft - 16 days ago
Java Question

Why does the Java Compiler copy finally Blocks?

When I compile the following code with a simple try/finally block, the compiler produces the output below (viewed in the ASM Bytecode Viewer):

Code:

try
{
    System.out.println("Attempting to divide by zero...");
    System.out.println(1 / 0);
}
finally
{
    System.out.println("Finally...");
}


Bytecode:

TRYCATCHBLOCK L0 L1 L1
L0
LINENUMBER 10 L0
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "Attempting to divide by zero..."
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L2
LINENUMBER 11 L2
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
ICONST_1
ICONST_0
IDIV
INVOKEVIRTUAL java/io/PrintStream.println (I)V
L3
LINENUMBER 12 L3
GOTO L4
L1
LINENUMBER 14 L1
FRAME SAME1 java/lang/Throwable
ASTORE 1
L5
LINENUMBER 15 L5
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "Finally..."
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L6
LINENUMBER 16 L6
ALOAD 1
ATHROW
L4
LINENUMBER 15 L4
FRAME SAME
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "Finally..."
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L7
LINENUMBER 17 L7
RETURN
L8
LOCALVARIABLE args [Ljava/lang/String; L0 L8 0
MAXSTACK = 3
MAXLOCALS = 2


When adding a catch block in between, I noticed that the Compiler copied the finally block 3 times (not posting the bytecode again), which seems like a waste of space. The copying also doesn't seem to be limited to a maximum number of instructions (similar to how inlining works), since it even duplicated the finally block when I added a few more calls to System.out.println.




However, a custom compiler of mine uses a different approach to compile the same code. Its output works exactly the same when executed but requires less space, because it uses the GOTO instruction:

public static main([Ljava/lang/String;)V
// parameter args
TRYCATCHBLOCK L0 L1 L1
L0
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "Attempting to divide by zero..."
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
ICONST_1
ICONST_0
IDIV
INVOKEVIRTUAL java/io/PrintStream.println (I)V
GOTO L2
L1
FRAME SAME1 java/lang/Throwable
POP
L2
FRAME SAME
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "Finally..."
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L3
RETURN
LOCALVARIABLE args [Ljava/lang/String; L0 L3 0
MAXSTACK = 3
MAXLOCALS = 1


Why does the Java Compiler (or the Eclipse Compiler) copy the bytecode of the finally block multiple times, even using athrow to rethrow exceptions? Is this part of the optimization process, or is my compiler doing it wrong?




(The output in both cases is...)

Attempting to divide by zero...
Finally...
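
A note on the second listing, offered as a reading of the bytecode rather than something stated in the question: its handler at L1 executes POP and then falls through to the shared block at L2, which ends in RETURN. If that reading is right, the Throwable is discarded instead of rethrown, so the text written to System.out matches but the method completes normally, whereas the javac listing ends its handler with ALOAD 1 / ATHROW. The sketch below models both behaviours at source level. The class name, the method names and the StringBuilder that stands in for System.out are all hypothetical.

```java
class SwallowedExceptionSketch {

    // Models the javac listing: the handler runs the finally code, then rethrows (ATHROW).
    static void rethrowing(StringBuilder out) {
        int zero = 0; // a variable, so the division is evaluated at run time
        try {
            out.append("Attempting to divide by zero...\n");
            out.append(1 / zero).append('\n');
        } finally {
            out.append("Finally...\n");
        }
    }

    // Models the GOTO listing as read above: the handler drops the Throwable (POP)
    // and falls through to the single copy of the finally code.
    static void discarding(StringBuilder out) {
        int zero = 0;
        try {
            out.append("Attempting to divide by zero...\n");
            out.append(1 / zero).append('\n');
        } catch (Throwable t) {
            // POP: the exception is discarded, nothing is rethrown
        }
        out.append("Finally...\n");
    }

    // Reports whether the given action returns without an ArithmeticException.
    static boolean completesNormally(Runnable action) {
        try {
            action.run();
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        StringBuilder a = new StringBuilder();
        StringBuilder b = new StringBuilder();
        boolean aOk = completesNormally(() -> rethrowing(a));
        boolean bOk = completesNormally(() -> discarding(b));
        System.out.println("same text written: " + a.toString().equals(b.toString()));
        System.out.println("rethrowing completes normally: " + aOk);
        System.out.println("discarding completes normally: " + bOk);
    }
}
```

Under these assumptions, the two variants write the same text and differ only in whether the ArithmeticException reaches the caller.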

Answer

Inlining Finally Blocks

The question you're asking has been analysed in part at: http://devblog.guidewire.com/2009/10/22/compiling-trycatchfinally-on-the-jvm/

The post shows an interesting example and includes explanations such as (quote):

finally blocks are implemented by inlining the finally code at all possible exits from the try or associated catch blocks, wrapping the whole thing in essentially a “catch(Throwable)” block that rethrows the exception when it finishes, and then adjusting the exception table such that the catch clauses skip over the inlined finally statements. Huh? (Small caveat: prior to the 1.6 compiler, apparently, finally statements used sub-routines instead of full-on code inlining. But we’re only concerned with 1.6 at this point, so that’s what this applies to).
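
As a rough source-level illustration of the quoted description (a sketch, not the compiler's actual output), the simple try/finally case can be written by hand with the finally code duplicated: one copy on the normal exit and one copy inside a catch-all handler that rethrows. The class name, the method names and the log parameter are hypothetical and exist only for this example.

```java
class FinallyInliningSketch {

    // The ordinary form: one finally block in the source.
    static void withFinally(StringBuilder log, boolean fail) {
        try {
            log.append("try;");
            if (fail) throw new ArithmeticException("/ by zero");
        } finally {
            log.append("finally;");
        }
    }

    // Hand-expanded form: the finally code appears once per exit.
    static void handExpanded(StringBuilder log, boolean fail) {
        try {
            log.append("try;");
            if (fail) throw new ArithmeticException("/ by zero");
        } catch (Throwable t) {
            log.append("finally;"); // copy for the exceptional exit
            throw t;                // rethrow, like ALOAD / ATHROW in the listing
        }
        log.append("finally;");     // copy for the normal exit, outside the handler's range
    }

    // Runs one variant and records whether it returned or threw.
    static String trace(boolean expanded, boolean fail) {
        StringBuilder log = new StringBuilder();
        try {
            if (expanded) handExpanded(log, fail); else withFinally(log, fail);
            log.append("returned");
        } catch (ArithmeticException e) {
            log.append("threw");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        for (boolean fail : new boolean[] {false, true}) {
            System.out.println("finally block: " + trace(false, fail));
            System.out.println("hand-expanded: " + trace(true, fail));
        }
    }
}
```

With a catch block added, the same idea would need a third copy for the exit from the catch body, which is consistent with the three copies mentioned in the question.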


The JSR instruction and Inlined Finally

There are differing opinions as to why inlining is used, though I have not yet found a definitive explanation in an official document or source.

There are the following 3 explanations:

No major advantages, more trouble:

Some believe that finally inlining is used because JSR/RET did not offer major advantages, as in this quote from "What Java compilers use the jsr instruction, and what for?":

The JSR/RET mechanism was originally used to implement finally blocks. However, they decided that the code size savings weren't worth the extra complexity and it got gradually phased out.

Problems with verification using stack map tables:

Another possible explanation has been proposed in the comments by @jeffrey-bosboom, whom I quote below:

javac used to use jsr (jump subroutine) to only write finally code once, but there were some problems related to the new verification using stack map tables. I assume they went back to cloning the code just because it was the easiest thing to do.

Having to Maintain Subroutine Dirty Bits:

An interesting exchange in the comments on the question "What Java compilers use the jsr instruction, and what for?" points out that JSR and subroutines "added extra complexity from having to maintain a stack of dirty bits for the local variables".

The exchange is reproduced below:

@paj28: Would the jsr have posed such difficulties if it could only call declared "subroutines", each of which could only be entered at the start, would only be callable from one other subroutine, and could only exit via ret or abrupt completion (return or throw)? Duplicating code in finally blocks seems really ugly, especially since finally-related cleanup may often invoke nested try blocks. – supercat Jan 28 '14 at 23:18

@supercat, Most of that is already true. Subroutines can only be entered from the start, can only return from one place, and can only be called from within a single subroutine. The complexity comes from the fact that you have to maintain a stack of dirty bits for the local variables and when returning, you have to do a three way merge. – Antimony Jan 28 '14 at 23:40