Ben Ben - 7 days ago 7
MySQL Question

SQL Query Still having duplicates after group by

SELECT *
FROM `eBayorders`
WHERE (`OrderIDAmazon` is null or `OrderIDAmazon` = "null") and `Flag` = "True" and `TYPE` = "GROUP" and (`Carrier` is null or `Carrier` = "null") and LEFT(`SKU`, 1) = "B" and datediff(now(), `TIME`) < 4 and (`TrackingInfo` is null or `TrackingInfo` = "null") and `STATUS` = "PROCESSING"
group by `Name`, `SKU`
order by `TIME` asc
LIMIT 7


I am trying to make sure that none of the names and skus will show up in the same result. I am trying to group by name and then sku, however I ran into the problem where a result showed up that has the same name and different skus, which I dont want to happen. How can I fix this query to make sure that there is always distinct names and skus in the result set?!

For example say I have an Order:

Name: Ben Z, SKU : B000334, oldest
Name: Ben Z, SKU : B000333, second oldest
Name: Will, SKU: B000334, third oldest
Name: John, SKU: B000036, fourth oldest

The query should return only:
Name: Ben Z, SKU : B000334, oldest
Name: John, SKU: B000036, fourth oldest


This is because all of the Names should only have one entry in the set along with SKU.

Answer
select T1.*
from eBayorders T1
join (
    SELECT `Name`, `SKU`, max(`TIME`) as MAX_TIME
    FROM eBayorders
    WHERE (`OrderIDAmazon` is null or `OrderIDAmazon` = "null") and `Flag` = "True" and `TYPE` = "GROUP" and (`Carrier` is null or `Carrier` = "null") and LEFT(`SKU`, 1) = "B"  and datediff(now(), `TIME`) < 4 and (`TrackingInfo` is null or `TrackingInfo` = "null") and `STATUS` = "PROCESSING" 
    group by `Name`, `SKU` 
) as dedupe
 on T1.`Name` = dedupe.`Name`
and T1.`SKU`  = dedupe.`SKU`
and T1.`Time` = dedupe.`MAX_TIME`
order by `TIME` asc 
LIMIT 7

Your database platform should have complained because your original query had items in the select list which were not present in the group by (generally not allowed). The above should resolve it.

An even better option would be the following if your database supported window functions (MySQL doesn't, unfortunately):

select * from
(
    SELECT *,
           row_number() over (partition by `Name`, `SKU`
                              order by `TIME` asc)
           as dedupe_rank
    FROM eBayorders
    WHERE (`OrderIDAmazon` is null or `OrderIDAmazon` = "null") and `Flag` = "True" and `TYPE` = "GROUP" and (`Carrier` is null or `Carrier` = "null") and LEFT(`SKU`, 1) = "B"  and datediff(now(), `TIME`) < 4 and (`TrackingInfo` is null or `TrackingInfo` = "null") and `STATUS` = "PROCESSING" 
) T
where dedupe_rank = 1
order by T.`TIME` asc 
LIMIT 7