Theofilos Myrillas Theofilos Myrillas - 5 months ago 12
SQL Question

Distinct values in a column

I have a table with the following data:

Row Names
1 George, George, John, Chris
2 Helen, John
3 John, John, John,John


and I want to make a new column with the following output

Row Output
1 George, John, Chris
2 Helen, John
3 John


How can I do this in Redshift SQL

Answer

Since Amazon Redshift does not support stored procedures or table functions, you should create a UDF:

CREATE OR REPLACE FUNCTION f_list_dedup
(cs_list VARCHAR)
RETURNS varchar
IMMUTABLE AS $$
  return ','.join(set([x.strip() for x in cs_list.upper().split(',')]))
$$ LANGUAGE plpythonu;

And then call it in your query:

WITH cte AS 
(   
    SELECT 'George, George, John, Chris' AS Names UNION
    SELECT 'Helen, John' UNION
    SELECT 'John, John, John,John'
)
SELECT * 
, f_list_dedup(Names)
FROM cte