Samrat Khan Samrat Khan - 26 days ago 6
MySQL Question

Why is this query taking so long to execute

I wrote this query few months ago. It was running well. But, day by day this query is becoming slower to execute.

This query is checking bill history in multiple tables besides same table execution.

Here is the query -

SELECT * FROM (
SELECT *,
(SELECT username FROM users WHERE id = u_bills.UserId) AS username,
(SELECT first_name FROM users WHERE id = u_bills.UserId) AS first_name,
(SELECT last_name FROM users WHERE id = u_bills.UserId) AS last_name,
(SELECT phone FROM users WHERE id = u_bills.UserId) AS phone,
(SELECT email FROM users WHERE id = u_bills.UserId) AS email,
(SELECT CPRate FROM cpt WHERE UserId = u_bills.UserId ORDER BY AddedDate DESC LIMIT 0,1) AS cprate,
(SELECT (SELECT PopName FROM pops WHERE PopId = p.PopName) AS PopFullName FROM u_setupinfos AS p
WHERE UserId = u_bills.UserId) AS popname,
(SELECT active FROM users WHERE id = u_bills.UserId) AS active,
(SELECT SUM(PaidAmount) AS PaidAmount FROM u_billhistory
WHERE UserId = u_bills.UserId AND MONTH(AddedDate) = MONTH(PaidDate)) AS PaidAmount,
(SELECT PaidDate FROM u_billhistory
WHERE UserId = u_bills.UserId AND MONTH(AddedDate) = MONTH(PaidDate)
GROUP BY MONTH(PaidDate)) AS PaidDate,
(SELECT PaymentMedia FROM u_billhistory
WHERE UserId = u_bills.UserId AND MONTH(AddedDate) = MONTH(PaidDate)
GROUP BY MONTH(PaidDate)) AS PaymentMedia,
(SELECT TransactionId FROM u_billhistory
WHERE UserId = u_bills.UserId AND MONTH(AddedDate) = MONTH(PaidDate) GROUP BY MONTH(PaidDate)) AS TransactionId,
(SELECT GROUP_CONCAT(CONCAT(`BillHisId`,';',`PaidAmount`,';',`PaidDate`) separator '|') AS vals
FROM u_billhistory
WHERE UserId = u_bills.UserId
AND YEAR(PaidDate) = YEAR(CURDATE())) AS TotalPaids
FROM u_bills) AS m
WHERE m.username = 'abc'


Please check the
EXPLAIN
Report at image-

enter image description here

I want to organize this query. I need suggestions to make this query faster like 3 to 5 seconds rather than 1.30 minute or more. What mistakes did I make on this query?

Important note-

I indexed almost each the columns using on the query


Answer

It's slow because you're making it run a separate lookup into an additional table for every field of every record in the results... sometimes more than one. You should use JOINs instead of nested SELECTs. For some of the grouped results, you can join to a SELECT statement that does it's own GROUP BY.


Additionally, separate indexes on individual columns DO NOT HELP!!

Think of indexes like a phone book. A simple phone book uses a clustered index on Last Name and First Name. It then also lists the address and phone number for each item. That's what a clustered index is: just the order in which the data is stored.

Additional indexes are like supplements. An non-clustered index on "Phone Number" would be like a supplement at the end of the book that lists phone numbers in order, and then shows the Last Name and First Name (but not the address). An index on "Address" would do the same: a separate supplement that lists addresses in order, and then gives you the name.

With these index supplements, if you have an phone number and want to know the address, you must first lookup the phone number in the supplement in order to find the name, and then lookup the name in the main book to get the address. In this example the phone supplement does help, because it's better than a full scan of the phone book.

However, the important thing to notice here is the address supplement _does not help this search, even though the address is the exact field you wanted to find. It may be useful for other searches, but it does not help in this case. Additionally, keep in mind that every update to your book requires you to update not only the book, but all of the supplements. It's not just storage space, but the time spent updating each index. You need to consider whether it's worth it. If you have a bunch of indexes that are not used, you're better off without them.


Getting back to your specific problem, about the only indexes that will really help you are u_bills.username, user.UserId (hopefully the cluster for it's table) and a single index for both u_billhistory.UserId and u_billhistory.PaidDate.

Here's an initial attempt at refactoring this query:

SELECT ub.*, u.username, u.first_name, u.last_name, u.phone, u.email, u.active,
    m.PaidAmount, m.PaymentMedia, m.TransactionId, 
    y.TotalPaids, p.PopName As PopFullName, cpt2.CPRate       

FROM u_bills ub
INNER JOIN users u on u.id = ub.UserId
INNER JOIN u_setupinfos usi on usi.UserId = ub.UserID
INNER JOIN pops p ON p.PopId = usi.PopName
INNER JOIN (
    SELECT UserId, MIN(AddedDate) As cptDate
    FROM cpt
    GROUP BY UserID        
) cpt1 ON cp1.UserID = ub.UserID
INNER JOIN (
    SELECT UserID, AddedDate, MIN(CPRate) AS CPRate
    FROM cpt
    GROUP BY UserID, AddedDate
) cpt2 ON cpt2.UserID = ub.UserID AND cpt2.AddedDate = cpt1.ctpDate
INNER JOIN (
    SELECT UserID, MONTH(PaidDate) PaidMonth, SUM(PaidAmount) AS PaidAmount,
        PaymentMedia, TransactionId --these two fields need an aggregate function of some type!
    FROM u_billhistory
    GROUP BY UserId, MONTH(PaidDate)
) m ON m.UserID = ub.UserId AND m.PaidMonth = MONTH(ub.AddedDate)
INNER JOIN (
    SELECT UserID, YEAR(PaidDate) As PaidYear, 
        GROUP_CONCAT(CONCAT(`BillHisId`,';',`PaidAmount`,';',`PaidDate`) separator '|') AS TotalPaids
    FROM u_billhistory
    WHERE YEAR(PaidDate) = YEAR(CurDate())
    GROUP BY UserId, YEAR(PaidDate)
) y ON Y.UserId = ub.UserId 
WHERE username='abc'

Note that it's not quite complete, because we don't have all of the information about your schema and data, and because there was what seemed to be a missing GROUP BY and two missing aggregate functions. MySql is bad this way... a better database would not have allowed your query to run until you fixed those things.

Comments