Ian Ian - 3 months ago 6
SQL Question

How to filter duplicate rows and filter rows based on a single column value?

select co.id, udf.string_val, udf.id

from customer_order co left join user_def_fields as udf on co.id = udf.document_id

where status = 'h' and order_date between '1/1/2016' and '12/31/2016'

and co.id <> (select document_id from user_def_fields

where (string_val = 'questions' or string_val = 'credit card' or string_val = 'credit hold'
or string_val = 'design'))

co.id | String_val | udf.id
------|------------|--------
9798 | QUESTIONS |UDF-000054
9798 | RUSH ORDER |UDF-000047
9798 | RUSH ORDER |UDF-000024
9799 | RUSH ORDER |UDF-000047
9799 | RUSH ORDER |UDF-000024
9799 | DESIGN |UDF-000054
9801 | RUSH ORDER |UDF-000047
9801 | RUSH ORDER |UDF-000024
9802 | NULL |NULL
9803 | RUSH ORDER |UDF-000047
9803 | RUSH ORDER |UDF-000024
9803 | CHECKED |UDF-000054


Here is a sample of the query results without any filters. Co.id is the order number, the various string_vals are notes within there respective udf fields, udf.id is the field id, and document_id is the order number reference in the udf table. Customer service enters 'Rush Order' in two fields, which is why it shows up twice. Sometimes no notes are entered and the udf fields are null. I really just need the co.id results returned and to be unique. From the above data sample, I want my query to return

co.id
------
9801
9802
9803


My current query gives me an error because it returns multiple results.

Answer

You can do this to avoid using DISTINCT.

SELECT  co.id
FROM    customer_order co
WHERE   status = 'h'
        AND order_date BETWEEN '1/1/2016' AND '12/31/2016'
        AND NOT EXISTS ( SELECT 1
                         FROM   user_def_fields udf
                         WHERE  udf.document_id = co.id
                                AND string_val IN ('questions','credit card','credit hold','design')  
        )

or you could use left join and check for null

SELECT  co.id
FROM    customer_order co
        LEFT JOIN user_def_fields udf ON udf.document_id = co.id 
            AND udf.string_val IN ('questions','credit card','credit hold','design') 
WHERE   co.status = 'h'
        AND order_date BETWEEN '1/1/2016' AND '12/31/2016'
        AND udf.document_id IS NULL