upsideDownPaddy upsideDownPaddy - 2 months ago 6
SQL Question

MySQL - Dataset based on field list, if field not there enter dummy row for that field

Ok,

So I've tried to find a similar question on here about this. Not seeing one that relates exactly.

I have a table that has sales data by state. The data is organised like this;

SELECT * FROM SALES_BY_STATE;

PRODUCTGROUP SALES STATE MONTH YEAR
PRODUCTGROUP1 1000 A JAN 2016
PRODUCTGROUP1 1000 B JAN 2016
PRODUCTGROUP1 1000 F JAN 2016
PRODUCTGROUP1 1000 G JAN 2016
PRODUCTGROUP1 1000 H JAN 2016
PRODUCTGROUP1 1000 I JAN 2016
PRODUCTGROUP1 1000 J JAN 2016
PRODUCTGROUP1 1000 K JAN 2016
PRODUCTGROUP2 1000 A JAN 2016
PRODUCTGROUP2 1000 B JAN 2016
PRODUCTGROUP2 1000 C JAN 2016
PRODUCTGROUP2 1000 D JAN 2016
PRODUCTGROUP2 1000 E JAN 2016
PRODUCTGROUP2 1000 F JAN 2016
PRODUCTGROUP2 1000 G JAN 2016
PRODUCTGROUP2 1000 H JAN 2016
PRODUCTGROUP2 1000 I JAN 2016
PRODUCTGROUP2 1000 J JAN 2016
PRODUCTGROUP2 1000 K JAN 2016


I have 11 states (A - K). I want to create a statement that shows the states regardless of whether there is a related row for that state. As you can see in the above data, PRODUCT2 has record for state A - K. However, PRODUCTGROUP1 is missing some.

I want to extract the data and display it as follows;

PRODUCTGROUP SALES STATE MONTH YEAR
PRODUCTGROUP1 1000 A JAN 2016
PRODUCTGROUP1 1000 B JAN 2016
PRODUCTGROUP1 0 C JAN 2016
PRODUCTGROUP1 0 D JAN 2016
PRODUCTGROUP1 0 E JAN 2016
PRODUCTGROUP1 1000 F JAN 2016
PRODUCTGROUP1 1000 G JAN 2016
PRODUCTGROUP1 1000 H JAN 2016
PRODUCTGROUP1 1000 I JAN 2016
PRODUCTGROUP1 1000 J JAN 2016
PRODUCTGROUP1 1000 K JAN 2016
PRODUCTGROUP2 1000 A JAN 2016
PRODUCTGROUP2 1000 B JAN 2016
PRODUCTGROUP2 1000 C JAN 2016
PRODUCTGROUP2 1000 D JAN 2016
PRODUCTGROUP2 1000 E JAN 2016
PRODUCTGROUP2 1000 F JAN 2016
PRODUCTGROUP2 1000 G JAN 2016
PRODUCTGROUP2 1000 H JAN 2016
PRODUCTGROUP2 1000 I JAN 2016
PRODUCTGROUP2 1000 J JAN 2016
PRODUCTGROUP2 1000 K JAN 2016


Does that make sense? I basically want to show the state in the result regardless of whether there is a related line in the dataset.

Not sure how I would go about this. Any help would be greatly appreciated.

Answer

You can build an overall list using Cartesian product. Note that we are building this from the existing data, so if you decide all of a sudden that you have a STATE "L" then there must be at least one row in the SALES_BY_STATE table with that STATE. Same goes for time period.

Using some temporary tables will prevent the cartesian product from blowing up if you have a large table.

create temporary table if not exists pg AS select distinct PRODUCTGROUP from SALES_BY_STATE;
create temporary table if not exists st AS select distinct STATE from SALES_BY_STATE;
create temporary table if not exists mo AS select distinct MONTH from SALES_BY_STATE;
create temporary table if not exists yr AS select distinct YEAR from SALES_BY_STATE;

select list.PRODUCTGROUP, list.STATE, list.MONTH,
    list.YEAR, IFNULL(s.SALES, 0) 

FROM
    (select distinct PRODUCTGROUP, STATE, MONTH, YEAR 
    FROM pg, st, mo, yr  ) as list

LEFT JOIN
    SALES_BY_STATE as s
ON
    list.PRODUCTGROUP = s.PRODUCTGROUP and list.STATE = s.STATE
   and list.MONTH = s.MONTH AND list.YEAR = s.YEAR
ORDER BY
    list.PRODUCTGROUP, list.STATE, list.MONTH, list.YEAR
Comments