Bob Dunakey Bob Dunakey - 4 months ago 39
SQL Question

Teradata substring out of bounds

I'm having issues figuring out the bounds between a substring. For example for the string

063016_shape_tea_cleanse__emshptea1_
I want to substring out
emshptea1
, but it also has to work for the string
063016_shape_tea_cleanse__emshptea1_TESTDATA_HERE
.

Currently I have:

sel SUBSTR('063016_shape_tea_cleanse__emshptea1_',POSITION('__' IN '063016_shape_tea_cleanse__emshptea1_')+2,
POSITION('_' IN SUBSTR('063016_shape_tea_cleanse__emshptea1_',POSITION('__' IN '063016_shape_tea_cleanse__emshptea1_') + 2,CHARACTER_LENGTH('063016_shape_tea_cleanse__emshptea1_') - (POSITION('__' IN '063016_shape_tea_cleanse__emshptea1_') + 2)))-1)


But that is erroring out due to it trying to substring 27 to -1.

Answer

You might use a regular expression, this will extract everything between __ and the following _ or end of string:

REGEXP_SUBSTR(col, '(?<=__).+?(?=(_|$))')

'(?<= )' is a look-behind, i.e search for previous characters without adding it to the result. Here: search for __

'.+' matches any character, one or multiple times. This would match until the end of the string ("greedy"), '?' ("lazy") prevents that.

'(?= )' is a look-ahead, i.e. search for following characters without adding it to the result.

( | ) The pipe splits an expression in multiple alternatives. Here either an underscore character or the end of the string $

Comments