gaffcz gaffcz - 1 year ago 45
SQL Question

REPLACE/REMOVE similar substrings from string

How to remove all the

\trxxx
substrings from following test string?

blabla\tr568\tr1136\tr1704\tr2272-\tr2840\tr3408\tr3976\tr4544\tr5112\tr5680blabla


To get
blabla-blabla
?

I've tried regex, but I failed:

select REPLACE('blabla\tr568\tr1136\tr1704\tr2272-\tr2840\tr3408\tr3976\tr4544\tr5112\tr5680blabla', '\tr[0-9]+', '')

Answer Source

SQL Server lacks Regex. You can use CLR integration (preferred if you have millions of records) or use recursive query. Example of recursive query with some test cases below:

WITH TestCases AS
(
    SELECT * FROM (VALUES
    ('blabla\tr568\tr1136\tr1704\tr2272-\tr2840\tr3408\tr3976\tr4544\tr5112\tr5680blabla'),
    ('\tr23SomeText\tr1'),
    ('bla99bla\tr568\tr1136\tr1704\tr2272-\tr2840\tr3408\tr3976\tr4544\tr5112\tr5680rock'),
    (''),
    (NULL)) T(Expr)
), Cte AS
(
    SELECT 1 R, ROW_NUMBER() OVER (ORDER BY Expr) Rec, CONVERT(varchar(MAX), Expr) Expr
    FROM TestCases
    UNION ALL
    SELECT R+1 R, Rec, CASE
        WHEN PATINDEX('%\tr[0-9][0-9][0-9][0-9]%', Expr)>0 THEN STUFF(Expr, PATINDEX('%\tr[0-9][0-9][0-9][0-9]%', Expr), 7, '')
        WHEN PATINDEX('%\tr[0-9][0-9][0-9]%', Expr)>0 THEN STUFF(Expr, PATINDEX('%\tr[0-9][0-9][0-9]%', Expr), 6, '')
        WHEN PATINDEX('%\tr[0-9][0-9]%', Expr)>0 THEN STUFF(Expr, PATINDEX('%\tr[0-9][0-9]%', Expr), 5, '')
        ELSE STUFF(Expr, PATINDEX('%\tr[0-9]%', Expr), 4, '') END
    FROM Cte
    WHERE PATINDEX('%\tr[0-9]%', Expr)>0
)
SELECT TOP 1 WITH TIES Expr FROM Cte
ORDER BY (ROW_NUMBER() OVER (PARTITION BY Rec ORDER BY R DESC))

It yields:

bla99bla-rock
blabla-blabla
NULL

SomeText