himanshu himanshu - 1 month ago 9
SQL Question

Postgres Complex queries in functions

For reference please see my earlier question thread which was solved but now I need to develop more complex queries.

For instance, I'm having the following array :

array_content varchar[]:= '{BBQ,sports,Marathon}'


And now what I need is to work with the following array type:

{"Festivals,games","sport,swim"}


So let's say if any of the name or description matched with any of the tags separated by a comma, we count it as 1. Example

I have following table:

id description additional_info
123 games XYD
124 Festivals sport swim


Array content exactly the same as written down:

array_content varchar[]:= {"Festivals,games","sport,swim"}


The result of the above example will be:

id RID Matches
1 123 1
2 124 2

Answer

The answer isn't simple, but figuring out what you are asking was harder:

SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest(array_content) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id;

Produces your desired result exactly.
array_content is your array of search terms.

How does this work?

Each array element of the outer array in your search term is a comma-separated list. Decompose the odd construct by unnesting twice (after transforming each element of the outer array into another array). Example:

SELECT *
FROM   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
CROSS  JOIN LATERAL
       unnest(string_to_array(elem, ',')) txt;

Result:

 elem            | ord |  txt
-----------------+-----+------------
 Festivals,games | 1   | Festivals
 Festivals,games | 1   | games
 sport,swim      | 2   | sport
 sport,swim      | 2   | swim

Since you want to count matches for each outer array element once, we generate a unique number on the fly with WITH ORDINALITY. Details:

Now we can LEFT JOIN to this derived table on the condition of a desired match:

   ... ON t.description ~ a.txt
       OR t.additional_info ~ a.txt

.. and get the count with count(DISTINCT a.ord), counting each array only once even if multiple search terms match.

Finally, I added the mysterious id in your result with row_number() OVER (ORDER BY t.id) AS id - assuming it's supposed to be a serial number. Voilá.

The same considerations for regular expression matches (~) as in your previous question apply: