mh3982 mh3982 -4 years ago 96
SQL Question

Join query - avoid "SET SQL_BIG_SELECTS=1" and timeouts

I've got two mysql tables:

Table1 (80K+ products):

_products

product_id status ean category


Table 2: (700K+ product attributes, matched on table one on column "ean")

_productinformation

info_id ean info_group info_type info_value


My challenge:

I want to select 30 products out of the "products" table (based on their status and category) and match those products on the column ean with the other table. Next, I want to filter out of the matches in the "productinformation" table for which the ean is the same as the ones selected in "products" and column "info_type" matches a specific value. The following SQL does almost what I want, except two issues:


  1. SQL BIG QUERIES needs to be used (which makes the query extremely slow up to 5 minutes)

  2. The query returns not just one row with the ean and the selected rows for info_type, but returns 40+ rows with 40+ times exactly the same info per ean (I think that is the total number of rows with that specific ean code in table "productinformation").



The query I constructed:

SELECT _products.ean
, _products.status
, product_brand.info_value as product_brand
, product_type.info_value as product_type
, product_price.info_value as product_price
FROM _products
LEFT JOIN _productinformation ON _products.ean = _productinformation.ean
LEFT JOIN _productinformation as product_brand ON _products.ean = product_brand.ean
LEFT JOIN _productinformation as product_type ON _products.ean = product_type.ean
LEFT JOIN _productinformation as product_price ON _products.ean = product_price.ean
WHERE product_brand.info_type = 'brand'
AND product_type.info_type = 'type'
AND product_price.info_type = 'price'
AND _products.category='1'


This returns something like (40+ rows with the same product):


ean status product_brand product_type product_price
0123456789 1 brand1 type1 0.00
0123456789 1 brand1 type1 0.00
0123456789 1 brand1 type1 0.00
0123456789 1 brand1 type1 0.00
0123456789 1 brand1 type1 0.00
etc.


However, I'd like to see 30 different products:


ean status product_brand product_type product_price
0123456789 1 brand1 type1 0.00
9876543210 3 brand6 type3 15.00
6548214656 45 brand34 type1 99.00
245511411241 4 brand324 type1 98.00
etc.


Is there someone who can tell me if the query I am looking for is possible? And how should it look like? I have already tried 100+ different queries (3 days further...), but all failed. The query above came most close.
Hope someone could help me out! Thnx!

Answer Source

I will us the following tables for examples to explain this answer. They are similar to yours except I left out a few fields that don't seem to come into play in this query.

#Products
+ ---------- + ------ + ------------ +
| Product_Id | Status | EAN          |
+ ---------- + ------ + ------------ +
| 1          | 1      | 0123456789   |
| 2          | 3      | 9876543210   |
| 3          | 45     | 6548214656   |
| 4          | 4      | 245511411241 |
+ ---------- + ------ + ------------ +

#Info
+ ------- + ------------ + --------- + ---------- +
| Info_Id | EAN          | Info_Type | Info_Value |
+ ------- + ------------ + --------- + ---------- +
| 1       | 0123456789   | brand     | brand1     |
| 2       | 0123456789   | type      | type1      |
| 3       | 0123456789   | price     | 0.00       |
| 4       | 9876543210   | brand     | brand6     |
| 5       | 9876543210   | type      | type3      |
| 6       | 9876543210   | price     | 15.00      |
| 7       | 6548214656   | brand     | brand34    |
| 8       | 6548214656   | type      | type1      |
| 9       | 6548214656   | price     | 99.00      |
| 10      | 245511411241 | brand     | brand324   |
| 11      | 245511411241 | type      | type1      |
| 12      | 245511411241 | price     | 98.00      |
+ ------- + ------------ + --------- + ---------- +

Now consider the following query

select    i.EAN
        , p.ProductStatus
        , case info_type when 'brand' then info_value end as brand
        , case info_type when 'type' then info_value end as [type]
        , case info_type when 'price' then info_value end as price
    from #Info i
    inner join #Products p on p.ean = i.ean

which produces a table like

+ ------------ + ------ + -------- + ----- + ----- +
| EAN          | status | brand    | type  | price |
+ ------------ + ------ + -------- + ----- + ----- +
| 0123456789   | 1      | brand1   | NULL  | NULL  |
| 0123456789   | 1      | NULL     | type1 | NULL  |
| 0123456789   | 1      | NULL     | NULL  | 0.00  |
| 9876543210   | 3      | brand6   | NULL  | NULL  |
| 9876543210   | 3      | NULL     | type3 | NULL  |
| 9876543210   | 3      | NULL     | NULL  | 15.00 |
| 6548214656   | 45     | brand34  | NULL  | NULL  |
| 6548214656   | 45     | NULL     | type1 | NULL  |
| 6548214656   | 45     | NULL     | NULL  | 99.00 | 
| 245511411241 | 4      | brand324 | NULL  | NULL  |
| 245511411241 | 4      | NULL     | type1 | NULL  |
| 245511411241 | 4      | NULL     | NULL  | 98.00 |
+ ------------ + ------ + -------- + ----- + ----- +

As you can see, I only use a single join with some case statements in the selection portion to create the columns I want. This has a huge advantage over using multiple joins because none of the records are duplicated.

To finish up, a simple aggregation will give us what we want

select    i.EAN
        , p.ProductStatus
        , max(case info_type when 'brand' then info_value end) as brand
        , max(case info_type when 'type' then info_value end) as [type]
        , max(case info_type when 'price' then info_value end) as price
    from #Info i
    inner join #Products p on p.ean = i.ean
    group by i.EAN, p.ProductStatus

+ ------------ + ------ + -------- + ----- + ----- +
| EAN          | Status | brand    | type  | price |     
+ ------------ + ------ + -------- + ----- + ----- +
| 0123456789   | 1      | brand1   | type1 | 0.00  |
| 245511411241 | 4      | brand324 | type1 | 98.00 |
| 6548214656   | 45     | brand34  | type1 | 99.00 |
| 9876543210   | 3      | brand6   | type3 | 15.00 |
+ ------------ + ------ + -------- + ----- + ----- +

You can then use a CTE to filter specific fields of this table, like so:

Hope this helps!

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download