Don Hatch - 6 months ago
Java Question

Could java.util.ArrayList<T>.toArray() be made friendlier?

I'm surprised by how painful it is to use java.util.ArrayList<T>.toArray().

Suppose I declare my array list as:

java.util.ArrayList<double[]> arrayList = new java.util.ArrayList<double[]>();
... add some items ...


Then to convert it to an array, I have to do one of the following:

double[][] array = (double[][])arrayList.toArray(new double[0][]);


or:

double[][] array = (double[][])arrayList.toArray(new double[arrayList.size()][]);


or:

double[][] array = new double[arrayList.size()][];
arrayList.toArray(array);


None of the above are very readable. Shouldn't I be able to say the following instead?

double[][] array = arrayList.toArray();


But that gives a compile error because Object[] can't be converted to double[][].

Perhaps it's not possible because toArray has to return Object[]
for backwards compatibility with pre-generics days.
But if that's the case, couldn't a friendlier alternative method be added
with a different name? I can't think of a good name, but almost anything
would be better than the existing ways; e.g. the following would be fine:

double[][] array = arrayList.toArrayOfNaturalType();


No such member function exists, but maybe it's possible to write a generic helper function that will do it?

double[][] array = MyToArray(arrayList);


The signature of MyToArray would be something like:

public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList)


Is it possible to implement such a function?
My various attempts at implementing it resulted in compile errors
"error: generic array creation" or "error: cannot select from a type variable".

Here's the closest I was able to get:

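// The unchecked cast is safe here: newInstance creates an array whose component type is 'type'.
@SuppressWarnings("unchecked")
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList, Class<T> type)
{
    T[] array = (T[])java.lang.reflect.Array.newInstance(type, arrayList.size());
    arrayList.toArray(array);
    return array;
}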


It's called like this:

double[][] array = MyToArray(arrayList, double[].class);


I wish the redundant final parameter weren't there, but even so,
I think this is the least-horrible way I've seen so far of converting an array list to an array.

Is it possible to do any better than this?

Answer

Unfortunately you cannot write

double[][] array = arrayList.toArray();

The reason is that toArray() was defined in JDK 1.2 (prior to generics) to return Object[]. This cannot be changed compatibly.

Generics were introduced in Java 5 but were implemented using erasure. This means that the ArrayList instance has no knowledge at runtime of the types of objects it contains; therefore, it cannot create an array of the desired element type. That's why you have to pass a type token of some sort -- in this case an actual array instance -- to tell ArrayList the type of the array to create.
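To make the erasure point concrete, here is a minimal sketch. The class name ErasureDemo and the helper noArgArrayType are made up for this illustration. It checks that two differently parameterized lists share one runtime class, and that the no-arg toArray() on an ArrayList produces a plain Object[].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class ErasureDemo {
    // Returns the runtime class of the array built by the no-arg toArray().
    static Class<?> noArgArrayType(List<?> list) {
        return list.toArray().getClass();
    }

    public static void main(String[] args) {
        List<double[]> doubles = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        doubles.add(new double[] {1.0, 2.0});

        // The type argument is erased, so both lists have the same runtime class.
        System.out.println(doubles.getClass() == strings.getClass());

        // With no element type available at runtime, the list can only build an Object[].
        System.out.println(noArgArrayType(doubles) == Object[].class);
    }
}
```

Both lines print true, which is why a type token has to be supplied from the outside.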

You should be able to write

double[][] array = arrayList.toArray(new double[0][]);

without a cast. The one-arg overload of toArray() is generified, so you'll get the right return type.
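As a small self-contained check of the cast-free form, the sketch below wraps the call in a helper. The names TypedToArrayDemo and toRows are invented for this example. Note that the copy is shallow: the resulting array holds the same double[] objects as the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
class TypedToArrayDemo {
    // Uses the generic one-arg overload; the return type is inferred as double[][].
    static double[][] toRows(List<double[]> list) {
        return list.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        List<double[]> list = new ArrayList<>();
        list.add(new double[] {1.0, 2.0});
        list.add(new double[] {3.0});

        double[][] rows = toRows(list);
        System.out.println(rows.length);            // number of rows copied
        System.out.println(rows[0] == list.get(0)); // same row object, not a clone
    }
}
```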

One might think that it's preferable to pass a pre-sized array instead of a throwaway zero-length array. Aleksey Shipilev wrote an article analyzing this question. The answer is, somewhat counterintuitively, that creating a zero-length array is potentially faster.

Briefly, the reason is that allocation is cheap, a zero-length array is small, and it's probably going to be thrown away and garbage collected quickly, which is also cheap. By contrast, creating a pre-sized array requires it to be allocated and then filled with nulls/zeroes. It's then passed to toArray(), which fills it with values from the list. Thus, every array element is typically written twice. Passing a zero-length array to toArray() lets the array allocation occur in the same code as the array filling, which gives the JIT compiler the opportunity to bypass the initial zero-fill, since it knows that every array element will be filled.
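The two idioms can be put side by side as in the sketch below. The names ToArraySizingDemo, withEmptyArray and withPresizedArray are made up for this example. It only shows that both calls produce the same contents; it does not measure performance, for which the article's benchmarks are the place to look.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
class ToArraySizingDemo {
    // Zero-length idiom: toArray() allocates and fills the result itself.
    static double[][] withEmptyArray(List<double[]> list) {
        return list.toArray(new double[0][]);
    }

    // Pre-sized idiom: the caller allocates (and the JVM null-fills) the array first.
    static double[][] withPresizedArray(List<double[]> list) {
        return list.toArray(new double[list.size()][]);
    }

    public static void main(String[] args) {
        List<double[]> list = new ArrayList<>();
        list.add(new double[] {1.0});
        list.add(new double[] {2.0, 3.0});

        double[][] a = withEmptyArray(list);
        double[][] b = withPresizedArray(list);

        // Both arrays hold the same row references in the same order.
        System.out.println(Arrays.equals(a, b));
    }
}
```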

There is also JDK-8060192 which proposes to add the following:

<A> A[] Collection.toArray(IntFunction<A[]> generator)

This lets you pass a lambda expression that is given the array size and returns a created array of that size. (This is similar to Stream.toArray().) For example,

// NOT YET IMPLEMENTED
double[][] array = arrayList.toArray(n -> new double[n][]);
double[][] array = arrayList.toArray(double[][]::new);

This isn't implemented yet, but I'm still hopeful this can get into JDK 9.

You could rewrite your helper function along these lines:

static <T> T[] myToArray(List<T> list, IntFunction<T[]> generator) {
    return list.toArray(generator.apply(list.size()));
}

(Note that there is some subtlety here with concurrent modification of the list, which I'm ignoring for this example.) This would let you write:

double[][] array = myToArray(arrayList, double[][]::new);

which isn't terribly bad. But it's not actually clear that it's any better than just allocating a zero-length array to pass to toArray().
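For completeness, here is the helper in a runnable form, assuming Java 8 or later for IntFunction and array constructor references. The wrapper class MyToArrayDemo is invented for this example, and like the snippet above it ignores concurrent modification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical demo class, not part of any library.
class MyToArrayDemo {
    // Asks the generator for an array of the list's current size, then fills it.
    static <T> T[] myToArray(List<T> list, IntFunction<T[]> generator) {
        return list.toArray(generator.apply(list.size()));
    }

    public static void main(String[] args) {
        List<double[]> arrayList = new ArrayList<>();
        arrayList.add(new double[] {1.0, 2.0});
        arrayList.add(new double[] {3.0});

        // T is inferred as double[], so the generator must produce a double[][].
        double[][] array = myToArray(arrayList, double[][]::new);
        System.out.println(array.length);
    }
}
```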

Finally, one might ask why toArray() takes an actual array instance instead of a Class object to denote the desired element type. Joshua Bloch (creator of the Java collections framework) said in comments on JDK-5072831 that this is feasible but that he's not sure it's a good idea, though he could live with it.

There is an additional use case here as well: copying the elements into an existing array, like the old Vector.copyInto() method. The array-bearing toArray(T[]) method also supports this use case. In fact, it's better than Vector.copyInto(), because the latter cannot be used safely in the presence of concurrent modification if the collection's size changes. The auto-sizing behavior of toArray(T[]) handles this, and it also handles the case of creating an array of the caller's desired type as described above. Thus, while adding an overload that takes a Class object would certainly work, it doesn't add much over the existing API.
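The copy-into-existing-array behavior can be sketched as follows. The names CopyIntoExistingDemo and copyInto are made up for this example. When the supplied array is large enough it is filled and returned as is, with a null written just past the last element; when it is too small, a new array of the same runtime type is allocated instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
class CopyIntoExistingDemo {
    // Thin wrapper so the behavior of toArray(T[]) is easy to observe.
    static String[] copyInto(List<String> list, String[] target) {
        return list.toArray(target);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));

        // Target with room to spare: reused, and the slot after the last element is nulled.
        String[] big = {"x", "x", "x", "x"};
        String[] result = copyInto(list, big);
        System.out.println(result == big);        // true: same array instance
        System.out.println(Arrays.toString(big)); // [a, b, null, x]

        // Target too small: a fresh, correctly sized String[] is returned instead.
        String[] small = new String[1];
        String[] grown = copyInto(list, small);
        System.out.println(grown != small && grown.length == 2); // true
    }
}
```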