user1620091 user1620091 - 5 months ago 7
MySQL Question

return numbers from the middle of a string with irregular format

I have a field in a MySQL database that contains item descriptions of purchased products. Some of these are descriptions in plain English, others are part numbers, and others still are part numbers followed by a description. I have removed all spaces and dashes from the strings with a replace().


data looks like this:


1938420985390asdfih
1234812934810dflkasd
asdfasldkjfaasdfjasd
asd;flkjaklsdf
adfsdf1234073927357sdapjfas
1/4sdikhsd



and I would like to return:


1938420985390
1234812934810
(null)
(null)
1234073927357
(null)


What I really need is to write a SQL that will return the 13 digit part numbers, but not the extra letters/characters. I would prefer that it return the actual number, too, rather than a 1 or 0 for match/ no match.

I tried using a REGEXP function (someone suggested
regexp ('\d{13}')
or
regexp ('\p{13}')
but these didn't work. [These returned a 0 or 1, and not the part of the string that matched.] Any suggestions?

Thanks!

Answer

This is a non-trivial task in MySQL, there's no builtin function for returning a regular expression match. But because you are looking for exactly 13 digits, you could do something like this (obviously extend this to the number of positions you need to check...

-- setup test
CREATE TABLE t (foo VARCHAR(30));
INSERT INTO t VALUES 
('1938420985390asdfih')
,('1234812934810dflkasd')
,('asdfasldkjfaasdfjasd')
,('asd;flkjaklsdf')
,('adfsdf1234073927357sdapjfas')
,('1/4sdikhsd')


SELECT CASE
       WHEN SUBSTR(foo,1,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,1,13)
       WHEN SUBSTR(foo,2,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,2,13)
       WHEN SUBSTR(foo,3,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,3,13)
       WHEN SUBSTR(foo,4,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,4,13)
       WHEN SUBSTR(foo,5,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,5,13)
       WHEN SUBSTR(foo,6,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,6,13)
       WHEN SUBSTR(foo,7,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,7,13)
       WHEN SUBSTR(foo,8,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,8,13)
       WHEN SUBSTR(foo,9,13) REGEXP '^[0-9]{13}$' THEN SUBSTR(foo,9,13)
       END AS digits
  FROM t

-------------------
1938420985390
1234812934810
(NULL)
(NULL)
1234073927357
(NULL) 

No, it's not pretty. But you should be able to extend this to effectively "scan" a string of reasonable length.

NOTE: The regular expression is checking that the whole 13 character substring consists of exactly 13 characters, each of the characters is a decimal digit (0 thru 9).

Comments