Doe Doe - 1 year ago 53
SQL Question

Select several most recent emails as one row

In my project I have a table of user emails EMAILS:

| ID | PROFILE_ID | EMAIL | LAST_UPDATED_TIMESTAMP|
___________________________________________________________________
| 1| 1| [email protected]| 2017-02-02 15:13:46|
| 2| 1| [email protected]| 2017-02-01 15:13:46|
| 3| 2| [email protected]| 2017-02-03 15:13:46|


where PROFILE_ID refers to the user profiles table PROFILES. I'd like to get 2(N in general) most recent distinct emails for each user profile as a one row:

| PROFILE_ID | EMAIL_1 | EMAIL_2 |
__________________________________________________
| 1| [email protected]| [email protected]|
| 2| [email protected]| NULL|


I tried many queries, the last one I stopped on is below:

SELECT
EMAIL1.PROFILE_ID,
EMAIL1.EMAIL AS EMAIL_1,
EMAIL2.EMAIL AS EMAIL_2
FROM EMAILS EMAIL1
LEFT JOIN
EMAILS EMAIL2
ON EMAIL1.PROFILE_ID = EMAIL2.PROFILE_ID AND EMAIL2.LAST_UPD_TMST <= EMAIL1.LAST_UPD_TMST AND EMAIL1.ID <> EMAIL2.ID


but it gives me extra records for profile containing several emails:

| PROFILE_ID | EMAIL_1 | EMAIL_2 |
__________________________________________________
| 1| [email protected]| [email protected]|
| 1| [email protected]| NULL|
| 2| [email protected]| NULL|


I use Oracle 11g but keep in mind MySql compatibility if possible. Could anyone say what is wrong in the query mentioned above?

Answer Source

One method is conditional aggregation:

select profile_id,
       max(case when seqnum = 1 then email end) as email1,
       max(case when seqnum = 2 then email end) as email2
from (select e.*,
             row_number() over (partition by profile_id order by last_updated_timestamp
      from emails e
     ) e
group by profile_id;

If you really do want to remove duplicates, then you can aggregate the data before the row_number(). To get the last occurrence of duplicate emails:

select profile_id,
       max(case when seqnum = 1 then email end) as email1,
       max(case when seqnum = 2 then email end) as email2
from (select e.*,
             row_number() over (partition by profile_id order by last_updated_timestamp
      from (select profile_id, email, max(last_updated_timestamp) as last_updated_timestamp
            from emails e
            group by profile_id, email
           ) e
     ) e
group by profile_id;
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download