John John - 4 months ago 8
SQL Question

Tuning a group by thats running slowly

The following SQL takes 5 seconds on a database with the largest table PremiseProviderBillings having 350,000 records. But on same database with 1.5 million records it takes over a minute

SELECT
n.CustomerInvoiceNumberId as InvoiceNo,C.CustomerBillId,c.customerid, S.Volumetric, S.Fixed, S.VAT, S.Discount, C.Debit,c.EffectiveDate,c.TransactionDateTime,s.Consumption,r.CustomerCreditNoteId--,s.Volumetric + s.Fixed + s.Vat - s.discount - c.debit as variance
FROM
CustomerPayments C
INNER JOIN
(SELECT
CustomerBillId, SUM(a.VolumetricCharge) as Volumetric,SUM(a.FixedCharge) as Fixed,
SUM(a.VAT) as VAT,SUM(a.Discount) as Discount,sum(a.EstimatedConsumption) as Consumption
FROM
PremiseProviderBillings a, PremiseProviderBills b
WHERE a.PremiseProviderBillId = b.PremiseProviderBillId
GROUP BY
CustomerBillId) S
ON
C.CustomerBillId = S.CustomerBillId
and debit <> 0 -- hide credit note lines, we mark these results with customerCreditNoteId to show they have been credited
INNER JOIN dbo.CustomerInvoiceNumbers n on c.CustomerBillId = n.CustomerBillId
left OUTER JOIN
dbo.CustomerCreditNotes AS r ON c.CustomerPaymentId = r.CustomerPaymentId
where isnull(c.transactionDateTimeEnd,'')=''


if I then run the inner SQL part that sums the values on the smaller database it takes 2 seconds. On the bigger database it takes 34 seconds, inner SQL below...

SELECT
CustomerBillId, SUM(a.VolumetricCharge) as Volumetric,SUM(a.FixedCharge) as Fixed,
SUM(a.VAT) as VAT,SUM(a.Discount) as Discount,sum(a.EstimatedConsumption) as Consumption
FROM
PremiseProviderBillings a, PremiseProviderBills b
WHERE a.PremiseProviderBillId = b.PremiseProviderBillId
GROUP BY
CustomerBillId


So its clear that this SQL is simply not scale-able. Given that the database will grow what technique should one apply to improve this?

I have checked all joins to ensure that there are no missing indexes, well, to ensure that all joins are key based and ok

I would have thought this approach is ok but should I alter the structure of the SQL, is this non scale-able and inefficient?

Regards

Answer

If you use the query often enough, and depending on how often you write to the tables it might be worth creating an indexed view for this. It is however worth noting that this is speculation, and indexed views do come with a trade off, your reads will be faster but your writes will be slower.

CREATE VIEW dbo.CustomerBillingView
WITH SCHEMABINDING
AS
    SELECT  b.CustomerBillId,
            SUM(a.VolumetricCharge) AS Volumetric,
            SUM(a.FixedCharge) AS Fixed,
            SUM(a.VAT) AS VAT,
            SUM(a.Discount) AS Discount,
            SUM(a.EstimatedConsumption) AS Consumption,
            COUNT_BIG(*) AS Records -- REQUIRED TO CREATE INDEX
    FROM    dbo.PremiseProviderBillings a
            INNER JOIN dbo.PremiseProviderBills b
                ON a.PremiseProviderBillId = b.PremiseProviderBillId
    GROUP BY b.CustomerBillId;
GO

CREATE UNIQUE CLUSTERED INDEX UQ_CustomerBillingView__CustomerBillId
    ON dbo.CustomerBillingView (CustomerBillId);

GO

Then you just need to use your view with the hint NOEXPAND to ensure that the index is used.

SELECT  n.CustomerInvoiceNumberId as InvoiceNo,
        c.CustomerBillId,
        c.customerid, 
        s.Volumetric, 
        s.Fixed, 
        s.VAT, 
        s.Discount, 
        c.Debit,
        c.EffectiveDate,
        c.TransactionDateTime,
        s.Consumption,
        r.CustomerCreditNoteId
        --,s.Volumetric + s.Fixed + s.Vat - s.discount - c.debit as variance
FROM    CustomerPayments AS c 
        INNER JOIN dbo.CustomerBillingView AS s WITH (NOEXPAND)
            ON c.CustomerBillId = s.CustomerBillId 
            AND c.Debit <> 0 
            -- hide credit note lines, we mark these results with customerCreditNoteId to show they have been credited
        INNER JOIN dbo.CustomerInvoiceNumbers n 
            ON c.CustomerBillId = n.CustomerBillId
        LEFT OUTER JOIN dbo.CustomerCreditNotes AS r 
            ON c.CustomerPaymentId = r.CustomerPaymentId
WHERE   ISNULL(c.transactionDateTimeEnd,'') = '';

As with every query tuning question, you are the only one with all the information needed to answer properly. In my experience (which is primarily in billing systems), indexed views like this are generally okay with billing data, because most invoice runs are cyclical, and so writes are in batches rather than continual, and reads also tend to outweigh writes because the data is static, once an invoice is created then it is very rarely updated.

Comments