Mistu4u Mistu4u - 2 months ago 8
SQL Question

Regexp_substr is not working as expected

So, I have a piece of simple SQL like:

select
REGEXP_SUBSTR (addtl_detail_info, '[^|]+', 1, 2)
||'|'|| REGEXP_SUBSTR (addtl_detail_info, '[^|]+', 1, 3)
||'|'|| REGEXP_SUBSTR (addtl_detail_info, '[^|]+', 1, 4)
from tbaadm.uad where ADDTL_DETAIL_INFO is not null and module_key='01-07-2016 00:00:00/2212/ 1';


The idea is to get the pipe separated values present in the addtl_detail_info column where the value present is:

~custom|HELLO1||HELLO3


So I need the values like HELLO1,whitespace (as there is no value between the second pipe and the third pipe) and HELLO3.

But when I ran the above query it returns as:

HELLO1|HELLO3|


and the white space is gone. I need this white space to retain. So what am I doing wrong here?

Answer

Regex of the form '[^|]+' does not work with NULL list elements and should be avoided! See this post for more info: Split comma separated values to columns in Oracle

Use this form instead:

select regexp_substr('1,2,3,,5,6', '(.*?)(,|$)', 1, 5, NULL, 1) from dual;

Which can be read as "get the 1st remembered group of the 5th occurrence of the set of characters ending with a comma or the end of the line".

So for your 4th element, you would use this to preserve the NULL in element 3 (assuming you want to build it by separate elements and not just grab the string from the character after the first separator to the end):

...
REGEXP_SUBSTR (addtl_detail_info, '(.*?)(\||$)', 1, 2) ||
REGEXP_SUBSTR (addtl_detail_info, '(.*?)(\||$)', 1, 3) ||
REGEXP_SUBSTR (addtl_detail_info, '(.*?)(\||$)', 1, 4)
...

You know, this may be easier. Just grab everything after the first pipe:

SQL> select REGEXP_replace('~custom|HELLO1||HELLO3', '^.*?\|(.*)', '\1') result
   from dual;

RESULT
--------------
HELLO1||HELLO3

SQL>

The parenthesis surround what you want to "remember" and the replace string references the 1st remembered group with "\1".