Dici Dici - 5 months ago 15
Java Question

What is the difference between a lambda and a method reference at a runtime level

I ran into a problem that occurs with a method reference but not with the equivalent lambda. The code was the following:

(Comparator<ObjectNode> & Serializable) SOME_COMPARATOR::compare


or, with lambda,

(Comparator<ObjectNode> & Serializable) (a,b) -> SOME_COMPARATOR.compare(a,b)


Semantically the two look strictly the same, but in practice they differ: in the first case I get an exception in one of the Java serialization classes. My question is not about this exception. The actual code runs in a more complicated context that has proved to behave strangely with serialization, so giving more details would only make the question harder to answer.

What I want to understand is the difference between those two ways of creating a lambda expression.

Answer

Getting Started

To investigate this we start with the following class:

import java.io.Serializable;
import java.util.Comparator;

public final class Generic {

    // Bad implementation, only used as an example.
    public static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    public static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    public static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

}

After compilation, we can disassemble it using:

javap -c -p -s -v Generic.class

Removing the irrelevant parts (and some other clutter, such as fully-qualified types and the initialisation of COMPARATOR) we are left with

  public static final Comparator<Integer> COMPARATOR;    

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

  private static int lambda$explicit$d34e1a25$1(Integer, Integer);
     0: getstatic     #2  // Field COMPARATOR:LComparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod Comparator.compare:(LObject;LObject;)I
    10: ireturn

BootstrapMethods:    
  0: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #63 invokeinterface Comparator.compare:(LObject;LObject;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0    

  1: #61 invokestatic invoke/LambdaMetafactory.altMetafactory:(Linvoke/MethodHandles$Lookup;LString;Linvoke/MethodType;[LObject;)Linvoke/CallSite;    
    Method arguments:    
      #62 (LObject;LObject;)I    
      #70 invokestatic Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I    
      #64 (LInteger;LInteger;)I    
      #65 5    
      #66 0

Immediately we see that the bytecode for the reference() method differs from the bytecode for explicit(). Before looking at what that difference means, the bootstrap methods deserve attention.

An invokedynamic call site is linked to a method by means of a bootstrap method, which is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site.

(Java Virtual Machine Support for Non-Java Languages, emphasis theirs)

This is the code responsible for creating the CallSite used by the lambda. The Method arguments listed below each bootstrap method are the values passed as the variadic parameter (i.e. args) of LambdaMetafactory#altMetafactory.

Format of the Method arguments

  1. samMethodType - Signature and return type of method to be implemented by the function object.
  2. implMethod - A direct method handle describing the implementation method which should be called (with suitable adaptation of argument types, return types, and with captured arguments prepended to the invocation arguments) at invocation time.
  3. instantiatedMethodType - The signature and return type that should be enforced dynamically at invocation time. This may be the same as samMethodType, or may be a specialization of it.
  4. flags indicates additional options; this is a bitwise OR of desired flags. Defined flags are FLAG_BRIDGES, FLAG_MARKERS, and FLAG_SERIALIZABLE.
  5. bridgeCount is the number of additional method signatures the function object should implement, and is present if and only if the FLAG_BRIDGES flag is set.

In both cases here bridgeCount is 0, so there is no sixth argument, which would otherwise be bridges: a variable-length list of additional method signatures to implement (given that bridgeCount is 0, I'm not entirely sure why FLAG_BRIDGES is set).

Matching the above up with our arguments, we get:

  1. The function signature and return type (Ljava/lang/Object;Ljava/lang/Object;)I, which is the descriptor of Comparator#compare after generic type erasure.
  2. The method being called when this lambda is invoked (which is different).
  3. The signature and return type of the lambda, which will be checked when the lambda is invoked: (LInteger;LInteger;)I (note that these aren't erased, because this is part of the lambda specification).
  4. The flags, which in both cases are the combination of FLAG_BRIDGES and FLAG_SERIALIZABLE (i.e. 5).
  5. The number of bridge method signatures, 0.

We can see that FLAG_SERIALIZABLE is set for both lambdas, so it's not that.
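To double-check how the value 5 breaks down, here is a minimal sketch that decodes a flags value using the public constants on java.lang.invoke.LambdaMetafactory. The class name FlagDecoder and its decode method are made up for illustration; only the three FLAG_ constants come from the JDK.

```java
import java.lang.invoke.LambdaMetafactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK or of the original answer.
final class FlagDecoder {

    // Returns the names of the LambdaMetafactory flags that are set in the given bit mask.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & LambdaMetafactory.FLAG_SERIALIZABLE) != 0) {
            names.add("FLAG_SERIALIZABLE");
        }
        if ((flags & LambdaMetafactory.FLAG_MARKERS) != 0) {
            names.add("FLAG_MARKERS");
        }
        if ((flags & LambdaMetafactory.FLAG_BRIDGES) != 0) {
            names.add("FLAG_BRIDGES");
        }
        return names;
    }

    public static void main(String[] args) {
        // 5 is the flags value shown for both bootstrap methods.
        System.out.println(decode(5)); // prints [FLAG_SERIALIZABLE, FLAG_BRIDGES]
    }
}
```

Both bootstrap entries carry the same flags, so the difference has to be somewhere else in the arguments.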

Implementation methods

The implementation method for the method reference lambda is Comparator.compare:(LObject;LObject;)I, but for the explicit lambda it's Generic.lambda$explicit$df5d232f$1:(LInteger;LInteger;)I. Looking at the disassembly, we can see that the former is essentially an inlined version of the latter. The only other notable difference is the method parameter types (which, as mentioned earlier, is because of generic type erasure).

When is a lambda actually serializable?

You can serialize a lambda expression if its target type and its captured arguments are serializable.

Lambda Expressions (The Java™ Tutorials)

The important part of that is "captured arguments". Looking back at the disassembled bytecode, the invokedynamic instruction for the method reference certainly looks like it's capturing a Comparator (#0:compare:(LComparator;)LComparator;, in contrast to the explicit lambda, #1:compare:()LComparator;).
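The capture can also be observed without reading bytecode. The sketch below assumes that the class generated for a serializable lambda has a private writeReplace method returning a java.lang.invoke.SerializedLambda, which is how the JDK implements lambda serialization on the versions I know of. The class CaptureInspector and its describe helper are hypothetical names; the two factory methods mirror the ones in Generic.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Comparator;

// Hypothetical demo class mirroring Generic from the answer.
final class CaptureInspector {

    // Deliberately not Serializable, like COMPARATOR in Generic.
    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Calls the synthetic writeReplace method reflectively to get the serialized form.
    // Assumption: the argument is a serializable lambda or method reference.
    static SerializedLambda describe(Object lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("not a serializable lambda", e);
        }
    }

    public static void main(String[] args) {
        SerializedLambda ref = describe(reference());
        SerializedLambda exp = describe(explicit());
        System.out.println("reference captures " + ref.getCapturedArgCount()
                + " argument(s), impl method " + ref.getImplMethodName());
        System.out.println("explicit captures " + exp.getCapturedArgCount()
                + " argument(s), impl method " + exp.getImplMethodName());
    }
}
```

With javac, the method reference should report one captured argument (the Comparator held in COMPARATOR), while the explicit lambda should report none.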

Confirming capturing is the issue

ObjectOutputStream contains an extendedDebugInfo field, which we can set using the -Dsun.io.serialization.extendedDebugInfo=true VM argument:

$ java -Dsun.io.serialization.extendedDebugInfo=true Generic

When we try to serialize the method reference lambda with this flag set, we get a far more informative stack trace:

Exception in thread "main" java.io.NotSerializableException: Generic$$Lambda$1/321001045
        - element of array (index: 0)
        - array (class "[LObject;", size: 1)
/* ! */ - field (class "invoke.SerializedLambda", name: "capturedArgs", type: "class [LObject;") // <--- !!
        - root object (class "invoke.SerializedLambda", SerializedLambda[capturingClass=class Generic, functionalInterfaceMethod=Comparator.compare:(LObject;LObject;)I, implementation=invokeInterface Comparator.compare:(LObject;LObject;)I, instantiatedMethodType=(LInteger;LInteger;)I, numCaptured=1])
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1182)
    /* removed */
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at Generic.main(Generic.java:27)
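The main method that produced this trace is not shown in the answer. A minimal harness along the following lines should reproduce the behaviour. SerializationProbe and canSerialize are hypothetical names, and the class repeats the two factory methods so that it compiles on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Comparator;

// Hypothetical harness, not the original Generic.main.
final class SerializationProbe {

    // Not Serializable, as in the original Generic class.
    static final Comparator<Integer> COMPARATOR = (a, b) -> (a > b) ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    static Comparator<Integer> explicit() {
        return (Comparator<Integer> & Serializable) (a, b) -> COMPARATOR.compare(a, b);
    }

    // Returns true if the value can be written to an ObjectOutputStream,
    // false if writing fails with NotSerializableException.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("explicit serializable:  " + canSerialize(explicit()));
        System.out.println("reference serializable: " + canSerialize(reference()));
    }
}
```

Compiled with javac, the explicit lambda should serialize while the method reference should not. Letting the NotSerializableException propagate and running with -Dsun.io.serialization.extendedDebugInfo=true gives the annotated trace.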

What's actually going on

From the above, we can see that the explicit lambda is not capturing anything, whereas the method reference lambda is. Looking over the bytecode again makes this clear:

  public static Comparator<Integer> explicit();
      0: invokedynamic #7,  0  // InvokeDynamic #1:compare:()LComparator;    
      5: checkcast     #5  // class java/io/Serializable    
      8: checkcast     #6  // class Comparator    
      11: areturn

Which, as seen above, has an implementation method of:

  private static int lambda$explicit$d34e1a25$1(java.lang.Integer, java.lang.Integer);
     0: getstatic     #2  // Field COMPARATOR:Ljava/util/Comparator;
     3: aload_0
     4: aload_1
     5: invokeinterface #44,  3  // InterfaceMethod java/util/Comparator.compare:(Ljava/lang/Object;Ljava/lang/Object;)I
    10: ireturn

The explicit lambda is actually calling lambda$explicit$d34e1a25$1, which in turn calls COMPARATOR#compare. This layer of indirection means it's not capturing anything that isn't Serializable (or anything at all, to be precise), and so is safe to serialize. The method reference expression directly uses COMPARATOR (the value of which is then passed to the bootstrap method):

  public static Comparator<Integer> reference();
      0: getstatic     #2  // Field COMPARATOR:LComparator;    
      3: dup    
      4: invokevirtual #3   // Method Object.getClass:()LClass;    
      7: pop    
      8: invokedynamic #4,  0  // InvokeDynamic #0:compare:(LComparator;)LComparator;    
      13: checkcast     #5  // class java/io/Serializable    
      16: checkcast     #6  // class Comparator    
      19: areturn

The lack of indirection means that COMPARATOR must be serialized along with the lambda. As COMPARATOR does not refer to a Serializable value, this fails.
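The capture reflects a real semantic difference: the receiver of a bound method reference is evaluated once, when the reference is created, whereas a lambda body re-reads the field on every call. The dup / getClass / pop sequence in reference() is consistent with a null check on that receiver. The sketch below makes the timing observable with a mutable field. ReceiverTiming and its members are hypothetical names, and the field is non-final on purpose, unlike COMPARATOR.

```java
import java.util.function.Supplier;

// Hypothetical demo of when the receiver expression is evaluated.
final class ReceiverTiming {

    // Mutable so that the moment of evaluation becomes visible.
    static String current = "first";

    static String viaReference() {
        current = "first";
        // The value of 'current' is read here and captured by the method reference.
        Supplier<String> ref = current::toString;
        current = "second";
        return ref.get();
    }

    static String viaLambda() {
        current = "first";
        // Nothing is captured; 'current' is read each time the lambda runs.
        Supplier<String> lam = () -> current.toString();
        current = "second";
        return lam.get();
    }

    public static void main(String[] args) {
        System.out.println(viaReference()); // prints first
        System.out.println(viaLambda());    // prints second
    }
}
```

For a static final field such as COMPARATOR the two forms return the same results, which is why the difference only surfaces once serialization looks at the captured arguments.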

The fix

I hesitate to call this a compiler bug (I expect the lack of indirection serves as an optimisation), although it is very strange. The fix is trivial, but ugly: add an explicit cast to COMPARATOR at its declaration:

public static final Comparator<Integer> COMPARATOR = (Serializable & Comparator<Integer>) (a, b) -> a > b ? 1 : -1;

This makes everything work correctly on Java 1.8.0_45. It's also worth noting that the Eclipse compiler produces that layer of indirection in the method reference case as well, so with it the original code in this post does not require modification to execute correctly.
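To check the fix end to end, a small round trip through ObjectOutputStream and ObjectInputStream is enough. This is a sketch under the assumption that the class is compiled with javac and deserialized in the same JVM. SerializableFix and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical demo of the fixed declaration.
final class SerializableFix {

    // The cast makes the captured comparator itself serializable.
    static final Comparator<Integer> COMPARATOR =
            (Serializable & Comparator<Integer>) (a, b) -> a > b ? 1 : -1;

    static Comparator<Integer> reference() {
        return (Comparator<Integer> & Serializable) COMPARATOR::compare;
    }

    // Serializes the value to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Comparator<Integer> copy = roundTrip(reference());
        System.out.println(copy.compare(2, 1));
        System.out.println(copy.compare(1, 2));
    }
}
```

The method reference still captures COMPARATOR, but the captured value is now serializable, so writing the outer lambda no longer fails.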