ealfons1 ealfons1 - 3 months ago 13
SQL Question

How does the following code read and look in T-SQL?

The code originates from the following post:


What SQL can I use to retrieve counts from my payments data?

I am working with a large amount of data, and this code (in MS-ACCESS JET SQL) currently takes about 10-15 minutes to query (when it doesn't crash). I want to increase the speed in which my query runs, which is why I am looking to utilize the pass through query approach. Currently, the back end database I use is SQL Server (2012), which I use an ODBC connection to link to.

I am under the impression that writing your SQL in the back end database's language allows the pass through query to perform more efficiently. I believe that T-SQL would be the language. I have done the research, and there really is no "JET-SQL to T-SQL converter". Therefore, I kindly ask if anyone has any suggestions on how the following code reads and looks in T-SQL.

TRANSFORM Nz(First(CountOfStudents),0) AS n
SELECT
YearNumber,
MonthNumber,
School
FROM
(
SELECT
YearNumber,
MonthNumber,
School,
[School Service Type],
COUNT(*) AS CountOfStudents
FROM
(
SELECT DISTINCT
mtr.YearNumber,
mtr.MonthNumber,
pym.[Student ID],
pym.School,
pym.[School Service Type]
FROM
MonthsToReport AS mtr
INNER JOIN
PaymentsYearMonth AS pym
ON mtr.YYYYMM>=pym.StartYYYYMM
AND mtr.YYYYMM<=pym.EndYYYYMM
) AS distinctQuery
GROUP BY
YearNumber,
MonthNumber,
School,
[School Service Type]
) AS countQuery
GROUP BY
YearNumber,
MonthNumber,
School
PIVOT [School Service Type]


As always, thank you for your time.

Answer

Converting the Access SQL crosstab query to T-SQL is only part of the process. In order to push all of the processing onto the SQL Server we also need to make sure that all of the dependencies are on the SQL Server, too. (For example, a T-SQL query running on the SQL Server won't be able to pull data from a saved query in Access.)

Following the steps from the previous question, we have our table named [Payments] on the SQL Server:

Payment Row  Student ID  School  School Service Type  PaymentStartDate  PaymentEndDate
-----------  ----------  ------  -------------------  ----------------  --------------
          1  001         ABC     ED                   2010-01-02        2012-02-04    
          2  001         ABC     ED                   2010-01-02        2010-01-05    
          3  001         ABC     ED                   2010-04-02        2010-05-05    
          4  001         DEF     EZ                   2010-01-02        2012-02-04    
          5  001                 RR                   2012-02-02        2012-02-03    
          6  002         ABC     ED                   2010-02-02        2011-02-03    
          7  002         ABC     EZ                   2010-02-02        2010-06-03    
          8  002         GHI     ED                   2011-02-04        2012-02-04    
          9  003         ABC     ED                   2011-02-02        2012-02-03    
         10  003         DEF     ED                   2010-01-02        2010-08-03    
         11  003                 RR                   2011-02-02        2011-02-03    
         12  004                 RR                   2011-02-02        2011-02-03    
         13  005         GHI     ED                   2010-08-02        2011-02-04    
         14  006         GHI     ED                   2010-08-02        2010-08-02    

We create the [PaymentsYearMonth] view in SQL Server

CREATE VIEW PaymentsYearMonth AS
SELECT 
    [Student ID], 
    School, 
    [School Service Type], 
    (Year(PaymentStartDate) * 100) + Month(PaymentStartDate) AS StartYYYYMM, 
    (Year(PaymentEndDate) * 100) + Month(PaymentEndDate) AS EndYYYYMM
FROM Payments

The SQL Server also needs to have copies of our [MonthNumbers] table

MonthNumber
-----------
          1
          2
          3
          4
          5
          6
          7
          8
          9
         10
         11
         12

and our [YearNumbers] table

YearNumber
----------
      2009
      2010
      2011
      2012
      2013

So now we can create the [MonthsToReport] view. T-SQL doesn't have DMin() and DMax() functions, so we need to change the query slightly

CREATE VIEW MonthsToReport AS
SELECT
    yn.YearNumber,
    mn.MonthNumber,
    (yn.YearNumber * 100) + mn.MonthNumber AS YYYYMM
FROM
    YearNumbers AS yn,
    MonthNumbers AS mn
WHERE ((yn.YearNumber * 100) + mn.MonthNumber)>=(SELECT MIN(StartYYYYMM) FROM PaymentsYearMonth)
    AND ((yn.YearNumber * 100) + mn.MonthNumber)<=(SELECT MAX(EndYYYYMM) FROM PaymentsYearMonth)

The SQL query to count the distinct rows is exactly the same, so let's create a view for that so when we do the PIVOT in the next step it will be easier to see what's going on

CREATE VIEW DistinctCountsByMonth AS
SELECT
    YearNumber,
    MonthNumber,
    School,
    [School Service Type],
    COUNT(*) AS CountOfStudents
FROM
    (
        SELECT DISTINCT
            mtr.YearNumber,
            mtr.MonthNumber,
            pym.[Student ID],
            pym.School,
            pym.[School Service Type]
        FROM
            MonthsToReport AS mtr
            INNER JOIN
            PaymentsYearMonth AS pym
                ON mtr.YYYYMM>=pym.StartYYYYMM 
                    AND mtr.YYYYMM<=pym.EndYYYYMM
    ) AS distinctQuery
GROUP BY 
    YearNumber,
    MonthNumber,
    School,
    [School Service Type]

Now, if we were doing a crosstab query in Access it would simply be

TRANSFORM First(CountOfStudents) AS n
SELECT YearNumber, MonthNumber, School
FROM DistinctCountsByMonth
GROUP BY YearNumber, MonthNumber, School
PIVOT [School Service Type]

but the PIVOT clause in T-SQL requires that we give it the actual list of column names (as opposed to Access, which can generate the column names automatically). So we'll create a stored procedure on the SQL Server that builds the list of column names, constructs the SQL statement, and executes it:

CREATE PROCEDURE DistinctPaymentsCrosstab
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE 
            @ColumnList AS NVARCHAR(MAX),
            @SQL AS NVARCHAR(MAX)

    -- build the list of column names based on the current contents of the table
    --     e.g., '[ED],[EZ],[RR]'
    --     required by PIVOT ... IN below
    --     ref: http://stackoverflow.com/a/14797796/2144390     
    SET @ColumnList = 
            STUFF(
                (
                    SELECT DISTINCT ',' + QUOTENAME([School Service Type])
                    FROM [DistinctCountsByMonth] 
                    FOR XML PATH(''), TYPE
                ).value('.', 'NVARCHAR(MAX)'), 
                1, 
                1, 
                '')
    SET @SQL = '
            WITH rollup
            AS
            (
                SELECT 
                    [School Service Type], 
                    YearNumber, 
                    MonthNumber, 
                    School, 
                    SUM(CountOfStudents) AS n 
                FROM [DistinctCountsByMonth]
                GROUP BY 
                    [School Service Type],
                    YearNumber, 
                    MonthNumber, 
                    School
            )
            SELECT * FROM rollup
            PIVOT (SUM([n]) FOR [School Service Type] IN (' + @ColumnList + ')) AS Results'
    EXECUTE(@SQL)
END

Now we can call that stored procedure from Access by using a pass-through query

PassThroughQuery.png

returning

Results.png

Comments