Mark Heckmann Mark Heckmann - 1 month ago 12
SQL Question

retrieve and combine chunks of information from string using regex in postgres

I need to retrieve information from structured text fragments which have the following format:

(AAB) Some name1 here 1234 (BB) More 12-text 99 (XY*) Hello world 12


What I want to get out is the following:
{AAB1234, BB99, XY*12}


Strategy:


  1. get characters inside brackets [e.g. (
    XY*
    )]

  2. get last group of digits which is either followed by brackets or the end of string [e.g.
    1234
    (]



I did not get very far, as my regex skills are fairly limited.

SELECT regexp_matches('(AAB) Some name1 1234 (BB) More text 99 (XY*) Hello world 12',
'\((.*?)\).*?(\d+)', 'g');


Giving

{AAB,1}
{BB,9}
{XY*,1}


Any ideas?

Add-on question:

We have the above text information in a column
information
in table
my_table
and we want to write the results into column 'results'. How can I integrate the above solution into an
UPDATE
statement?
I.E.

UPDATE my_table SET results = ???.

Answer

You may try:

SELECT array_agg(v) FROM (
    SELECT array_to_string(
               regexp_matches(
                   '(AAB) Some name1 1234 (BB) More text 99 (XY*) Hello world 12',
                   '\((.*?)\).*?(\d+)(?=$| \()', 'g'
               ),
               ''
           ) as v
    ) s;

Note that as usual, regexps can be quite fragile if you don't have a very formal definition of the syntax.

Comments