himanshu himanshu - 3 months ago 12
SQL Question

Postgres query to calculate matching strings

I have following table:

id description additional_info

And an array as:

[{It is known to be XYZ},{It is know to be none},{It is know to be XYD}]

I need to map both the content in such a way that for every record of table I'm able to define the number of successful match.
The result of the above example will be:

id RID Matches
1 123 2

Only the content at position 0 and 2 match the record's
is 2 in the result.

I am struggling to transform this to a query in Postgres - dynamic SQL to create a
in a PL/pgSQL function to be precise.


For instance, I'm having the following array :

array_content varchar[]:= '{BBQ,sports,Marathon}'

And now what I need is to work with the following array type:


This is exactly what I'm getting from my query which I need to process. So let's say if any of the name or description matched with any of the tags separated by a comma, we count it as 1. Example

I have following table:

id description additional_info
123 games XYD
124 Festivals sport swim

Array content exactly the same as written down:

array_content varchar[]:= {"Festivals,games","sport,swim"}

The result of the above example will be:

id RID Matches
1 123 1
2 124 2


It's undefined how to deal with array elements that match both description and additional_info at the same time. I'll assume you want to count that as 1 match.

It's also undefined where id = 1 comes from in the result.

One way is to unnest() the array and LEFT JOIN the main table to each element on a match on either of the two columns:

SELECT 1 AS id, t.id AS "RID", count(a.txt) AS "Matches"
FROM   tbl t
LEFT   JOIN unnest(my_arr) AS a(txt) ON a.txt ~ t.description
                                     OR a.txt ~ t.additional_info
GROUP  BY t.id;

I use a regular expression for the match. Special characters like (.\?) etc. in the strings to the right have special meaning. You might have to escape those if possible.

Addressing your comment

You should have mentioned that you are using a plpgsql function with EXECUTE. Probably 2 errors:

  1. The variable array_content is not visible inside EXECUTE, you need to pass the value with a USING clause - or concatenate it as string literal in a CREATE VIEW statement which does not allow parameters.

  2. Missing single quotes around the string 'brand_relevance_calculation_‌​view'. It's still a string literal before you concatenate it as identifier. You did good to use format() with %I there.


   array_content varchar[]:= '{FREE,DAY}'; 

EXECUTE format('
   SELECT id, description, additional_info, name, count(a.text) AS business_objectives
        , multi_city, category IS NOT NULL AS category
   FROM initial_events i
   LEFT JOIN unnest(%L::varchar[]) AS a(text) ON a.text ~ i.description
                                              OR a.text ~ i.additional_info'
 , 'brand_relevance_calculation_‌​view', array_content);