Josh Klein Josh Klein - 1 month ago 6
SQL Question

'Merging' multiple rows with NULLs into rows with distinct sums

Imagine the following table

| EEID | CODE1 | VALUE1 | CODE2 | VALUE2 |
|------|-------|--------|-------|--------|
| 001 | ABC | 10 | NULL | 0 |
| 001 | ABC | 5 | NULL | 0 |
| 001 | DEF | 2 | NULL | 0 |
| 001 | NULL | 0 | 123 | 3 |
| 001 | NULL | 0 | 123 | 6 |
| 001 | NULL | 0 | 456 | 4 |
| 001 | NULL | 0 | 789 | 1 |


Trying to avoid vendor specific functions, what is the best way to 'merge' this down using SQL to look like the following

| EEID | CODE1 | VALUE1 | CODE2 | VALUE2 |
|------|-------|--------|-------|--------|
| 001 | ABC | 15 | 123 | 9 |
| 001 | DEF | 2 | 456 | 4 |
| 001 | NULL | 0 | 789 | 1 |


Basically I need to be able to sum the distinct values in the CODEx columns.

Answer

One way would be

SELECT COALESCE(T1.EEID, T2.EEID) AS EEID,
       CODE1,
       VALUE1,
       CODE2,
       VALUE2
FROM   (SELECT EEID,
               CODE1,
               SUM(VALUE1)                        AS VALUE1,
               ROW_NUMBER() OVER (ORDER BY CODE1) AS RN
        FROM   YourTable
        WHERE  CODE1 IS NOT NULL
        GROUP  BY EEID,
                  CODE1) T1
       FULL JOIN (SELECT EEID,
                         CODE2,
                         SUM(VALUE2)                        AS VALUE2,
                         ROW_NUMBER() OVER (ORDER BY CODE2) AS RN
                  FROM   YourTable
                  WHERE  CODE2 IS NOT NULL
                  GROUP  BY EEID,
                            CODE2) T2
         ON  ON T1.RN = T2.RN AND T1.EEID = T2.EEID;

It doesn't use any vendor specific functions though MySQL hasn't got around to implementing window functions (or full join for that matter) yet.

Or another way.

SELECT EEID,
       MAX(CODE1)  AS CODE1,
       SUM(VALUE1) AS VALUE1,
       MAX(CODE2)  AS CODE2,
       SUM(VALUE2) AS VALUE2
FROM   (SELECT EEID,
               CODE1,
               VALUE1,
               CODE2,
               VALUE2,
               DENSE_RANK() OVER (PARTITION BY EEID ORDER BY CASE WHEN CODE1 IS NULL THEN 1 ELSE 0 END, Code1) AS RN1,
               DENSE_RANK() OVER (PARTITION BY EEID ORDER BY CASE WHEN CODE2 IS NULL THEN 1 ELSE 0 END, Code2) AS RN2
        FROM   YourTable) T
GROUP  BY EEID,
          CASE
            WHEN CODE1 IS NULL
              THEN RN2
            ELSE RN1
          END
HAVING COALESCE(MAX(CODE1), MAX(CODE2)) IS NOT NULL;