oschrenk - 2 months ago
Java Question

Lexicographic Order in Java

How is the lexicographic order defined in Java especially in reference to special characters like

and so on?

An example ordering can be found here

But how does Java define its order? I ask because I'm sorting Strings in Java and in Oracle and getting different results, and I can't find the specification for the lexicographic order.


From the docs for String.compareTo:

Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings.


This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string [...]
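To make the quoted rule concrete, here is a minimal sketch that re-implements it by hand. The class name CompareToDemo and the helper lexicographicCompare are made up for illustration and are not part of the JDK. The final length-difference step comes from the part of the Javadoc that is cut off above, which covers the case where one string is a prefix of the other.

```java
public class CompareToDemo {

    // Hypothetical helper: walks both strings and returns the difference of the
    // first pair of differing char values, as the quoted documentation describes.
    static int lexicographicCompare(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int k = 0; k < limit; k++) {
            char ca = a.charAt(k);
            char cb = b.charAt(k);
            if (ca != cb) {
                return ca - cb; // chars are promoted to int, so this is a plain numeric difference
            }
        }
        // No differing index: the shorter string comes first.
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        // Index 2 differs: 'p' (112) vs 'r' (114)
        System.out.println("apple".compareTo("apricot"));            // -2
        // Index 0 differs: 'Z' (90) vs 'a' (97), so upper case sorts before lower case
        System.out.println("Zebra".compareTo("apple"));              // -7
        // "app" is a prefix of "apple": difference of the lengths, 3 - 5
        System.out.println("app".compareTo("apple"));                // -2
        // The hand-written helper gives the same answer
        System.out.println(lexicographicCompare("Zebra", "apple"));  // -7
    }
}
```

Note that nothing here looks at the meaning of a character, only at its numeric value, which is why "Zebra" ends up before "apple".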

So basically, it treats each string as a sequence of 16-bit unsigned integers (the UTF-16 code units behind each char). There is no cultural awareness, no understanding of composite characters, and so on. If you want a more sophisticated kind of sort, you should be looking at java.text.Collator.
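As a rough sketch of the difference, the following sorts the same list once with the natural String order and once with a Collator. The class name CollatorDemo, the helper methods, the sample words, and the choice of Locale.ENGLISH are all assumptions made for illustration. The collated result depends on the locale's rules and on the JDK's collation data, so no output is claimed for it here.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class CollatorDemo {

    // Sorts a copy of the input using String.compareTo (code unit order).
    static List<String> sortByCodeUnit(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Sorts a copy of the input using the collation rules of the given locale.
    static List<String> sortWithCollator(List<String> input, Locale locale) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Collator.getInstance(locale)); // Collator implements Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "Zebra", "_temp", "Banana");

        // Code unit order: 'B' (66) < 'Z' (90) < '_' (95) < 'a' (97)
        System.out.println(sortByCodeUnit(words)); // [Banana, Zebra, _temp, apple]

        // Locale-aware order: letters are compared alphabetically first,
        // and punctuation is positioned by the locale's rules.
        System.out.println(sortWithCollator(words, Locale.ENGLISH));
    }
}
```

If the database side sorts linguistically while Java sorts by code unit (or the other way round), a mismatch like the one in the question is what you would expect to see.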