Hairy Maclary Hairy Maclary - 3 months ago 11
MySQL Question

LEFT OUTER JOIN cant group column and returns duplicates

I have simplified my question and tidied up the example. Sorry for the confusion.

I have two tables

kpi_index and audit_results

the kpi_index contains 8 unique sor_code's for subcategory 'C1'
the audit_results contains records with audits
kpi_index is joined via INDEX field to sor_code kpi_index field

I need the result to show 'All' 8 sor_codes and the number of audits, audits that complied and the percentage pass, per sor_code, however, if there are no results for any sor_code, show a NULL or '0' against the sor_code.

I came up with this query, but for some reason I get duplicate sor_codes if there is a result found i.e. values in the audit_results e.g. C1.TAM and C1.TOM values are returned from the audit_results table and I therefore need to not show a null from the kpi_index matching sor_code/record.

My goal is to show all the sor_codes available from the kpi_index and populate those sor_codes where matches are found in the joined audit_results table and replace the matching null line with the populated 'audited/complied/percentage' values - there should be no duplicate sor_code with a null value if a matching sor_code was found in the audit_results.

Thanks

sor_code Spec. Code audits complied percentage contractor
C1.SNM C1 0 null null null
C1.SSM C1 0 null null null
C1.TAM C1 0 null null null
C1.TAM C1 186 151 81% South
C1.TIM C1 0 null null null
C1.TNM C1 0 null null null
C1.TOM C1 0 null null null
C1.TOM C1 41 40 98% South

SELECT DISTINCT
kpi_index.sor_code,
kpi_index.`Spec. Code`,
COUNT(audit_results.compliance) AS audits,
SUM(audit_results.compliance) AS complied,
CONCAT(ROUND(SUM(audit_results.compliance) / COUNT(kpi_index.sor_code) * 100, 0), '%') AS percentage,
audit_results.contractor
FROM kpi_index
LEFT OUTER JOIN audit_results
ON kpi_index.`INDEX` = audit_results.kpi_index
WHERE audit_results.contractor = 'South'
OR auit_results.contractor IS NULL
GROUP BY kpi_index.sor_code,
audit_results.contractor
HAVING kpi_index.`Spec. Code` = 'C1'
ORDER BY kpi_index.sor_code

Answer
SELECT
    kpi_index.sor_code,
    min(kpi_index.`Spec. Code`) AS spec_code,
    COUNT(audit_results.compliance) AS audits,
    SUM(audit_results.compliance) AS complied,
    CONCAT(ROUND(
        SUM(audit_results.compliance) * 100.00
            / COUNT(kpi_index.sor_code),
        0), '%') AS percentage,
    MIN(audit_results.contractor) AS contractor
FROM
    kpi_index
    LEFT OUTER JOIN audit_results
        ON kpi_index.`INDEX` = audit_results.kpi_index
           AND audit_results.contractor = 'South'
WHERE kpi_index.`Spec. Code` = 'C1'
GROUP BY kpi_index.sor_code
ORDER BY kpi_index.sor_code
  1. DISTINCT is not necessary. When you're including all the GROUP BY columns the results are already distinct.

  2. The main problem you were having was with the attempt to filter your join to only rows with contractor = 'South'. When you use an outer join you have to be careful with WHERE clause conditions because the joins are logically completed first and then the conditions are applied afterward. When your join had matched only on contractors of a different name then none of your rows matched the is null or = 'South' tests.

  3. The condition in the HAVING really belongs in the WHERE clause. It actually doesn't strictly make sense even though MySQL appears to accept it.

  4. I've removed the extra grouping on contractor preferring to use MIN() in the SELECT list instead. Ultimately it should be redundant if my presumption that contractor is not nullable is correct.

Comments