RMason RMason - 3 months ago 21
SQL Question

Oracle replace a character not followed by another character

I am attempting to replace all of the

&
's in a string with
&amp
unless the
&
is followed by
lt
,
apos
,
gt
or
quot
.

Running this statement

select
regexp_replace('&lt &apos &gt &quot &','&(^lt|^gt|^quot|^apos)','&amp')


however results in no changes to the string.

The output I would be looking for is

'&lt &apos &gt &quot &amp'

Answer

A direct and efficient solution (but difficult to write, read and maintain) is:

set define off

(in case you are using a front-end that uses & to mark substitution variables)

then

with
     inputs ( inp_str ) as (
       select '&lt &apos &gt &quot &' from dual union all
       select 'Hello, World!'         from dual union all
       select ''                      from dual union all
       select '7 &lt 10 &and &&quot'  from dual
     )
select inp_str,
       regexp_replace(inp_str, 
  '&($|[^lagq]|(g|l)([^t]|$)|a($|[^p]|p($|[^o]|o($|[^s])))|q($|[^u]|u($|[^o]|o($|[^t]))))',
         '&amp\1') as new_str
from inputs;

Explanation: (partial...) This will replace every & with &amp, with a few exceptions. The & will be replaced if:

  • It is followed by the end of the string ($), or
  • It is followed by any character other than l, a, g or q; or
  • it is followed by g or l, which is then followed by a character other than t, or by the end of string ($); or
  • It is followed by a, followed by the end of string, by any letter other than p, or by the letter p followed by the end of string, or .........

Output (from my inputs):

INP_STR                      NEW_STR
---------------------------- ----------------------------
&lt &apos &gt &quot &        &lt &apos &gt &quot &amp
Hello, World!                Hello, World!

7 &lt 10 &and &&quot         7 &lt 10 &ampand &amp&quot

4 rows selected.

(Note: I always include an empty string and a string with no ampersands among the inputs, to verify that the query works correctly on them too.)

Comments