SpreeTheGr8 - 4 months ago
Java Question

What determines sort order in Collections.sort where List contains non-alphanumeric characters?

I have code that sorts an ArrayList of elements by one attribute, 'title', which is of type String. The comparator is built on a Collator obtained like this:

Collator collator = Collator.getInstance();


I have two objects: one has the title "@a" and the other has the title "#a".

I pass these objects as a List and call

Collections.sort(list,comparator)


This gives the order as

"@a" "#a"


Why does "#a" appear last even though the ASCII value of '#' is less than that of '@'?

Answer

Why is # appearing last even though its ASCII value is less than that of @?

My clean-room implementation:

final List<String> list = Arrays.asList("@a", "#a");
Collections.sort(list);
System.out.println(list);

Output:

[#a, @a]

This code doesn't reproduce your problem.

For reference:
'#' is 0x23
'@' is 0x40

Everything looks normal.
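
To make the arithmetic explicit: without a comparator, Collections.sort uses String's natural ordering, which compares char (UTF-16 code unit) values from left to right, so the first differing character decides. A minimal sketch of that (the class name CodeUnitOrder and its helper methods are only for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CodeUnitOrder {
    // String.compareTo returns the difference of the first pair of chars that differ.
    static int compareTitles(String a, String b) {
        return a.compareTo(b);
    }

    // Sorts a copy of the list using String's natural (code unit) ordering.
    static List<String> sortNaturally(List<String> titles) {
        List<String> copy = new ArrayList<>(titles);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println((int) '#');                                // 35 (0x23)
        System.out.println((int) '@');                                // 64 (0x40)
        System.out.println(compareTitles("#a", "@a"));                // -29, i.e. 35 - 64
        System.out.println(sortNaturally(Arrays.asList("@a", "#a"))); // [#a, @a]
    }
}
```

A negative result from compareTo means "#a" sorts before "@a", which matches the output above.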


EDIT: new code following your comment "The code uses Collator but its used as Collator collator = Collator.getInstance(); not specific to any locale.":

final List<String> list = Arrays.asList("@a", "#a");
final Collator c = Collator.getInstance();

Collections.sort(list, c);
System.out.println(list);

Output:

[@a, #a]

This reproduces your problem.
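
As for what determines the order: per the Javadoc, the no-argument Collator.getInstance() returns the collator for the JVM's default locale, so the result comes from that locale's collation rules and not from character codes. A rough sketch for inspecting what you actually got on your machine (the class name DefaultCollatorInfo is illustrative, and the RuleBasedCollator check is an assumption about the typical JDK implementation, not a guarantee):

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

class DefaultCollatorInfo {
    // Builds a short report about the given collator; the values vary by JDK and locale.
    static String describe(Collator collator) {
        StringBuilder sb = new StringBuilder();
        sb.append("default locale: ").append(Locale.getDefault()).append('\n');
        sb.append("implementation: ").append(collator.getClass().getName()).append('\n');
        // Negative: "@a" sorts first. Positive: "#a" sorts first.
        sb.append("compare(@a, #a): ").append(collator.compare("@a", "#a")).append('\n');
        if (collator instanceof RuleBasedCollator) {
            // The rule string is what encodes the relative order of characters.
            String rules = ((RuleBasedCollator) collator).getRules();
            sb.append("rule string length: ").append(rules.length());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(Collator.getInstance()));
    }
}
```

I have deliberately not listed an expected output here, since it depends on your JDK version and default locale.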

If I use Collator.getInstance() to sort the ASCII table, this is the output I get:

-, _, ,, ;, :, !, ?, /, ., `, ^, ', ", (, ), [, ], {, }, @, $, *, \, &, #, %, +, <, =, >, |, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, A, b, B, c, C, d, D, e, E, f, F, g, G, h, H, i, I, j, J, k, K, l, L, m, M, n, N, o, O, p, P, q, Q, r, R, s, S, t, T, u, U, v, V, w, W, x, X, y, Y, z, Z

You can see this is quite different from the ASCII collating order:

", #, $, %, &, ', (, ), *, +, ,, -, ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, ;, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, [, \, ], ^, _, `, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, {, |, }

For OP's interest, this is the code used to create this output:

final List<String> list = new ArrayList<String>();
final Collator col = Collator.getInstance();

for (char c = '!'; c < '~'; c++)
{
  list.add(c+"");
}

Collections.sort(list, col);
System.out.println(list);
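
If you want the titles in ASCII order instead, one option is to compare them with String's natural ordering and leave the Collator out. A minimal sketch, assuming your elements look roughly like the hypothetical Item class below (Item, getTitle, and the helper names are my guesses, not your actual code):

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TitleSortSketch {
    // Hypothetical stand-in for the question's element type.
    static class Item {
        private final String title;
        Item(String title) { this.title = title; }
        String getTitle() { return title; }
    }

    // Orders by char (UTF-16 code unit) values, which matches ASCII for these titles.
    static final Comparator<Item> BY_CODE_UNIT = Comparator.comparing(Item::getTitle);

    // Orders by the collator's rules; the result depends on JDK version and locale.
    static Comparator<Item> byCollator(Collator collator) {
        return (a, b) -> collator.compare(a.getTitle(), b.getTitle());
    }

    // Sorts a copy of the items and returns just the titles, for easy printing.
    static List<String> sortedTitles(List<Item> items, Comparator<Item> order) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(order);
        List<String> titles = new ArrayList<>();
        for (Item item : copy) {
            titles.add(item.getTitle());
        }
        return titles;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(new Item("@a"), new Item("#a"));
        System.out.println(sortedTitles(items, BY_CODE_UNIT)); // [#a, @a]
        System.out.println(sortedTitles(items, byCollator(Collator.getInstance())));
    }
}
```

The trade-off is that code unit order is predictable but not linguistically meaningful (for example, all uppercase ASCII letters sort before all lowercase ones), whereas the Collator gives a locale-appropriate order for text at the cost of the punctuation order seen above.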