Ceilican Ceilican - 19 days ago 7
Scala Question

Caching Scala Case Class Instances

Suppose we have the following case classes:

abstract sealed class Tree
case class Leaf(i: Int) extends Tree
case class Node(left: Tree, right: Tree) extends Tree


Every time we call a case class constructor, a new object is created in memory. For instance, in the code below:

val a = Leaf(0)
val b = Leaf(0)


a and b point to distinct objects in memory:

a == b // true
a eq b // false


I would like to override the "apply" method of the case classes, to make them return a cached object, in case it already exists, so that, in the minimal example above, "a eq b" would return true.

I found these two related answers in Stackoverflow:



I am planning to implement my overriding "apply" method with caching in a way that combines the two approaches linked above. But I am wondering if there are alternative ways that I should consider. If you know any, could you please share your solution here?

Caching instances of case classes seems to be a very useful and natural thing to do to reduce memory consumption. And yet, the solution I am planning to implement (based on the two answers linked above) seems quite convoluted, requiring a lot of boilerplate code that will compromise the elegance and succinctness of case classes. Does anyone know if future versions of the Scala language might allow us to achieve case class instance caching by writing something simple like this:

abstract sealed class Tree
cached case class Leaf(i: Int) extends Tree
cached case class Node(left: Tree, right: Tree) extends Tree


??

Answer

Caching instances of case classes seems to be a very useful and natural thing to do to reduce memory consumption.

Note that this isn't even remotely an automatic improvement, and very much depends on usage pattern of the case class (not just yours, but anybody who uses your library):

  1. You need to take into account the memory cache needs and inability to garbage collect instances referenced from the cache (note that using a WeakHashMap won't help: it requires "that value objects do not strongly refer to their own keys, either directly or indirectly").

  2. If the keys are primitives (as in Leaf), they need to be boxed before lookup which will often already be a constructor call.

  3. Lookup in a map is significantly slower than a trivial constructor call.

  4. Escape analysis will often ensure the objects aren't actually constructed, while making sure your program works as if they were. Of course, caching will ensure that objects do escape.

But neglecting all that, you can write a macro annotation which will allow you @cached case class Leaf(i: Int) extends Tree and generate the code you want (or at least @cachedcase class; I am not sure if you'll be able to override apply otherwise). Because of the above I just wouldn't expect it to be a part of the language any time soon.

Comments