rohits rohits - 7 months ago 14
SQL Question

Scalar subquery taking a lot of time to execute

I have the following tables:


  1. table1
    -
    session_id, company_id, session_start_time

  2. table2
    -
    id, session_id, message_time, message_type, message

  3. table3
    -
    company_id, company_name



table1 stores the sessions done on companies. Each session has a lot of messages, which are stored in table2. For a specific message_type (say message_type = 2), a substring needs to be extracted from message. This substring is a company name. Now I need to find the next session_id from table1, where the company_id matches that of the company name extracted from message and the session_start_time >= message_time.

I am using the following query.

select t1.session_id as session1,
t1.company_id as company1,
@transfer_time := t2.message_time as transfer_time,
@company2 := trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
(select t1.session_id
from table1 as t1
inner join table3 as t3 on t1.company_id = t3.company_id
where t1.session_start_time >= @transfer_time
and t3.company_name = @company2
order by t1.session_start_time
limit 1) as session 2
from table1 as t1
inner join table2 as t2 on t1.session_id = t2.session_id
inner join table3 as t3 on t1.company_id = t3.company_id
where t2.message_type = 2


The original query is slightly more complicated having a few more scalar sub-queries. This query is taking exceptionally long to execute. I checked with the explain function and the scalar sub-query seems to be taking very long to execute. However, I am unable to think of a better way.

Answer

This is the query:

select t1.session_id as session1, t1.company_id as company1,
       @transfer_time := t2.message_time as transfer_time,
       @company2 := trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
       (select t1.session_id
        from table1 t1 inner join
             table3 t3
             on t1.company_id = t3.company_id
        where t1.session_start_time >= @transfer_time and
              t3.company_name = @company2
        order by t1.session_start_time
        limit 1
       ) s session 2
from table1 t1 inner join
     table2 t2
     on t1.session_id = t2.session_id inner join
     table3 t3
     on t1.company_id = t3.company_id
where t2.message_type = 2;

First, the use of variables is not correct. MySQL does not guarantee the order of evaluation of variables in a SELECT. So, you need to put in the original definitions:

select t1.session_id as session1, t1.company_id as company1,
       t2.message_time as transfer_time,
       trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
       (select t1.session_id
        from table1 t1 inner join
             table3 t3
             on t1.company_id = t3.company_id
        where t1.session_start_time >= t2.message_time and
              t3.company_name = trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) 
        order by t1.session_start_time
        limit 1
       ) s session 2
from table1 t1 inner join
     table2 t2
     on t1.session_id = t2.session_id inner join
     table3 t3
     on t1.company_id = t3.company_id
where t2.message_type = 2;

Next, I would try indexes: table3(company_name, company_id) and table1(company_id, session_start_time, session_id).

Comments