Den - Ben Den - Ben - 3 months ago 7
SQL Question

Oracle. Not valid ascii value of regex result

I'd like to edit a string. Get from 2 standing nearby digits digit and letter (00 -> 0a, 01 - 0b, 23-> 2c etc.)

111324 -> 1b1d2e.
Then my code:

set serveroutput on size unlimited

declare
str varchar2(128);

function convr(num varchar2) return varchar2 is
begin
return chr(ascii(num)+49);
-- return chr(ascii(num)+49)||'<-'||(ascii(num)+49)||','||ascii(num)||','||num||'|';
end;

function replace_dd(str varchar2) return varchar2 is
begin
return regexp_replace(str,'((\d)(\d))','\2'||convr('\3'));
end;
begin
str := '111324';
Dbms_Output.Put_Line(str);
Dbms_Output.Put_Line(replace_dd(str));
end;


But I get the next string: '112'.

When I checked result by commented return string I'v got:
'1<-141,92,1|1<-141,92,3|2<-141,92,4|'.

ascii(num) does not depend on num. It always works like ascii('\'). It is 92, plus 49 we got 141 and it is out of ascii table. But num by itself is printed correctly.
How can I get correct values? Or maybe another way to resolve this issue?

Answer

What is happening is that the replacement string is expanded first, and only after it is fully processed, any remaining backreferences like \2 are replaced by string fragments. So convr('\3') is processed first, and at this stage '\3' is a literal. ascii() returns the ascii code of the FIRST character of whatever string it receives as argument. So the 3 plays no role, you only get ascii('\') as you noticed. Then your user-defined function is evaluated and plugged back into the concatenation... by now there is no \3 left in the replacement string.

Exercise: Try to explain/understand why

regexp_replace('abcdef', '(b).*(e)', '\2' || upper('\1'))

is aebf and not aeBf. (Hint: what is the return from upper('\1') by itself, unrelated to anything else?)

You could split the input string into component characters, apply your transformation on those with even index and combine the string back (all in SQL, no need for loops and such). Something like this (done in plain SQL, you can rewrite it into your function if you like):

with
     inputs ( str ) as (
       select '111324' from dual union all
       select '372'    from dual
     ),
     singletons ( str, idx, ch ) as (
       select str, level, substr(str, level, 1)
       from   inputs
       connect by level <= length(str)
       and prior str = str
       and prior sys_guid() is not null
     )
select str, 
       listagg(case mod(idx, 2) when 1 then ch else chr(ascii(ch)+49) end, '')
                  within group (order by idx)
       as modified_str
from singletons
group by str
;

STR    MODIFIED_STR
------ --------------
111324 1b1d2e
372    3h2
Comments