liebgotts liebgotts - 4 months ago 26
MySQL Question

Slow Query Due to Multiple, JOIN-laden Subqueries

I'm currently developing a LIMS. I need to retrieve information relating to lab samples which need to be prepped using a certain prep method. (I'm simplifying this quite a lot, so apologies for any mistakes.) The columns I need to produce in my query results are:

Sample number|
Container type|
List of ALL chain of custodies this sample appears on|
List of ALL analysis items this sample needs to be prepped for|
List of ALL prep methods this sample needs to go through


For example:

Sample 1|100g|101, 102, 103, 104|BTEX, TPH|Moisture Content, Pentane
Sample 2|200g|101, 103|BTEX, Atterberg Limits|Headspace, Moisture Content, Pentane
Sample 3|200g|101, 102, 104|Atterberg Limits|Headspace, Moisture Content


I have to retrieve the lists for the last three columns from three different tables. I'm currently using three different subqueries to do this:

SELECT
lab_coc_samples.Sample_Number,
lab_coc_samples.Container,
(Subquery 1),
(Subquery 2),
(Subquery 3)
FROM lab_coc_samples
JOIN lab_coc ON lab_coc_samples.CoC_ID = lab_coc.Data_ID
JOIN lab_analysis_items_to_prep_methods ON lab_coc.Analysis_Item = lab_analysis_items_to_prep_methods.Analysis_Code
WHERE lab_analysis_items_to_prep_methods.Prep_Code = 1


Subquery 1: each sample will appear on several different chain of custodies, so my subquery for the first column is:

SELECT GROUP_CONCAT( DISTINCT lab_coc_samples.CoC_ID SEPARATOR ',')
FROM lab_coc_samples

WHERE lab_coc_samples.Sample_Code = s1.Sample_Code
GROUP BY lab_coc_samples.Sample_Code


Subquery 2: each chain of custody has one analysis item, but each sample will be linked to several analysis items because it appears on multiple CoCs:

SELECT GROUP_CONCAT( DISTINCT lab_analysis_items.Name SEPARATOR ',')
FROM lab_coc_samples

JOIN lab_coc ON lab_coc.Data_ID = lab_coc_samples.CoC_ID
JOIN lab_analysis_items ON lab_analysis_items.Code = lab_coc.Analysis_Item

WHERE lab_coc_samples.Sample_Code = s1.Sample_Code
GROUP BY lab_coc_samples.Sample_Code


Subquery 3: Each analysis item is linked to multiple prep methods (so: multiple samples which are related to multiple analysis items which are related to multiple prep methods -- joy):

SELECT GROUP_CONCAT( DISTINCT lab_prep_methods.Name SEPARATOR ', ')
FROM lab_coc_samples

JOIN lab_coc ON lab_coc_samples.CoC_ID = lab_coc.Data_ID
JOIN lab_analysis_items_to_prep_methods ON lab_analysis_items_to_prep_methods.Analysis_Code = lab_coc.Analysis_Item
JOIN lab_prep_methods ON lab_prep_methods.Code = lab_analysis_items_to_prep_methods.Prep_Code

WHERE lab_coc_samples.Sample_Code = s1.Sample_Code
GROUP BY lab_coc_samples.Sample_Code


There are thousands of samples in this database that need prepping, which makes this query very slow. How can I increase the efficiency of this query without modifying the structure of the database? Can I use joins somehow, even though I need information about ALL instances of the sample in each of the tables? Would joins even help?

Answer Source

I improved the query time myself by creating a nested table containing my list fields, and adding indexes to the JOIN fields in my queries. Most notably, adding an index to the VARCHAR 'Sample_Code' field sped things up by a few seconds.