Jaco Jaco - 6 months ago 23
SQL Question

Oracle SQL Query for Stacked Pareto Chart

I have an Oracle Table that contains data similar to the following basic example:

+--------+----------+
| SERIES | CATEGORY |
+--------+----------+
| green | apple |
| green | pear |
| green | pear |
| yellow | apple |
| yellow | apple |
| yellow | pear |
| yellow | pear |
| yellow | pear |
| yellow | banana |
| yellow | banana |
| yellow | banana |
| red | apple |
+--------+----------+


I would like to generate a Pareto-like Graph of this data that should look as like the image in this link: Stacked Pareto Chart

To create this graph I would like to run a SQL query and get the following output:

+----------+--------+-------+
| CATEGORY | SERIES | COUNT |
+----------+--------+-------+
| pear | green | 2 |
| pear | yellow | 3 |
| apple | green | 1 |
| apple | yellow | 2 |
| apple | red | 1 |
| banana | yellow | 3 |
+----------+--------+-------+


The actual table has millions of entries and it currently takes a significant amount of time to query the database as the current procedure I am using is not very efficient:

Order the categories by the amount of entries in each category:

SELECT CATEGORY, COUNT(CATEGORY) FROM FRUIT GROUP BY CATEGORY ORDER BY COUNT(CATEGORY);


Then for each category I list the relevant series in order of the series:

SELECT SERIES, COUNT(SERIES) FROM FRUIT WHERE CATEGORY = [current category] GROUP BY SERIES ORDER BY SERIES;


What would be the most efficient way to query the database (Preferably a single SQL statement) in order to get the desired output?

Answer

You can achieve the desired result by grouping on both CATEGORY and SERIES:

SELECT 
    CATEGORY, SERIES, COUNT(*) 
FROM FRUIT 
GROUP BY CATEGORY, SERIES 
ORDER BY COUNT(*);

UPDATE:

To order by total of CATEGORY first and then green, yellow, red, just like your expected output:

SELECT t1.*
FROM (
    SELECT 
        CATEGORY, SERIES, COUNT(*) AS CNT
    FROM FRUIT 
    GROUP BY CATEGORY, SERIES 
) t1
INNER JOIN (
    SELECT
        CATEGORY, COUNT(*) AS CNT
    FROM FRUIT
    GROUP BY CATEGORY
) t2
    ON t1.CATEGORY = t2.CATEGORY
ORDER BY 
    t2.CNT DESC,
    CASE t1.SERIES
        WHEN 'green' THEN 1
        WHEN 'yellow' THEN 2
        WHEN 'red' THEN 3
    END
Comments