Whimusical Whimusical - 3 months ago 25
Java Question

Shuffle/Random comparator

IS there any way to emulate the behavior of Collections.shuffle without a comparator being vulnerable to the sorting algorithm implementation for the result to be safe?

I mean not breaking the comparable contract etc..

Answer

It is impossible to implement a real “shuffling comparator” without breaking the contract. One fundamental aspect of the Comparator contract is that the results are reproducible so the ordering of a particular Comparator instance must be fixed.

Of course, you could pre-initialize that fixed ordering using a shuffling operation and create a comparator which will establish exactly this ordering. E.g.

List<ElementType> ordering=new ArrayList<>(list);
Collections.shuffle(ordering);

list.sort(Comparator.comparingInt(ordering::indexOf));

though it is a bit pointless. It’s clear that this comparator must not be used for collections containing elements not being in the ordering list.


Alternatively you can use a stable property of the values which hasn’t an ordering in the first place as sort criteria, e.g. the hash code. This can be augmented by a stable but randomizable transformation, e.g.

public static Comparator<String> randomOrder() {
    ThreadLocalRandom r = ThreadLocalRandom.current();
    int x = r.nextInt(), y = r.nextInt();
    boolean b = r.nextBoolean();
    return Comparator.comparingInt((String s)->s.hashCode()^x)
     .thenComparingInt(s->s.length()^y)
     .thenComparing(b? Comparator.naturalOrder(): Comparator.reverseOrder());
}

 

List<String> list=Arrays.asList("hello", "now", "shuffle", "this", "!");
list.sort(randomOrder());
System.out.println(list);
list.sort(randomOrder());
System.out.println(list);

The key point is that each Comparator instance represents a randomly chosen but fixed ordering and we create a new Comparator instance to request a different ordering. Therefore, no Comparator violates the contract.

Note that this Comparator looks a bit complicated as it has to care about possible hash collisions. It will resort to the length property (also randomized) then and for Strings having the same hash code and length it will simply fall back to either natural or reversed order which is unlikely to be noticeable as it only affects the relation of these uncommon pairs.

If you create such a comparator for values without collisions (e.g. Integer instances) or covering all properties of the values which define the equality (e.g. both, x and y, of a Point), the comparator will look much simpler.