Daniel Jørgensen Daniel Jørgensen - 1 month ago 11
MySQL Question

MySQL - Order substring digits lowest to highest

I want to order a column by its values, lowest to highest. However the column has a prefix set, so that its contents can look like this

SR1001000
. I've figured that ifi want to order this, i would need to remove the prefix. So for now my Query looks like this:

SELECT a2t_import.*,SUBSTRING(a2t_import.a2t_vare_nr, 3) as partial_vare_nr
FROM a2t_import
ORDER BY partial_vare_nr ASC`


However i also need to only get the rows where the column has a specific prefix which i get by adding a regular expression like so

SELECT a2t_import.*,SUBSTRING(a2t_import.a2t_vare_nr, 3) as partial_vare_nr
FROM a2t_import
WHERE a2t_vare_nr REGEXP '^(SR)+[0-9]+'
ORDER BY partial_vare_nr ASC


This gives me the correct output where the above example looks like this
1001000
, but the sorting is not what I'd expect.

I get the following output

10002000
1001000
...


As you can see, the first row is clearly of a higher number than the second. Why is this?

Answer Source

The reason the sorting is off that currently MySQL is treating your computed column as text, not as numerical data. This has the following unwanted side effect:

10002000
1001000
   ^

The value 10002000 appears first, because it would appear before 1001000 in a dictionary. One trick to workaround this would be to also use the lengths of the strings when sorting. Consider the following comparison:

1000200
1001000

Now 1000200 comes before 1001000, and the lexicographic sort agrees with the numeric sort.

Try the following query:

SELECT a2t_import.*,
       SUBSTRING(a2t_import.a2t_vare_nr, 3) AS partial_vare_nr
FROM a2t_import
WHERE a2t_vare_nr REGEXP '^(SR)+[0-9]+'
ORDER BY CHAR_LENGTH(partial_vare_nr),     -- shortest strings first
         partial_vare_nr                   -- lexigraphical sort among strings
                                           -- of same length - agrees with numeric sort