BeemerGuy BeemerGuy - 5 months ago 10
SQL Question

How to group-concatenate multiple columns?

Assume this table:

PruchaseID | Customer | Product | Method
-----------|----------|----------|--------
1 | John | Computer | Credit
2 | John | Mouse | Cash
3 | Will | Computer | Credit
4 | Will | Mouse | Cash
5 | Will | Speaker | Cash
6 | Todd | Computer | Credit


I want to generate a report on each customer of what they bought, and their payment methods.

But I want that report to be one row per customer, such as:

Customer | Products | Methods
---------|--------------------------|--------------
John | Computer, Mouse | Credit, Cash
Will | Computer, Mouse, Speaker | Credit, Cash
Todd | Computer | Credit


What I've found so far is to group-concatenate using the
XML PATH
method, such as:

SELECT
p.Customer,
STUFF(
SELECT ', ' + xp.Product
FROM Purchases xp
WHERE xp.Customer = p.Customer
FOR XML PATH('')), 1, 1, '') AS Products,
STUFF(
SELECT ', ' + xp.Method
FROM Purchases xp
WHERE xp.Customer = p.Customer
FOR XML PATH('')), 1, 1, '') AS Methods
FROM Purchases


This gives me the result, but my concern is the speed of this.

At first glance there are three different selects going on here, two would each multiply by the number of rows Purchases has. Eventually this would slow down expenentially.

So, is there a way to do this with better performance?

I want to add even more columns to aggregate, should I do this STUFF() block for every column? That doesn't sound fast enough for me.

Siggestions?

Answer

Just an idea:

DECLARE @t TABLE (
    Customer VARCHAR(50),
    Product VARCHAR(50),
    Method VARCHAR(50),
    INDEX ix CLUSTERED (Customer)
)

INSERT INTO @t (Customer, Product, Method)
VALUES
    ('John', 'Computer', 'Credit'),
    ('John', 'Mouse', 'Cash'),
    ('Will', 'Computer', 'Credit'),
    ('Will', 'Mouse', 'Cash'),
    ('Will', 'Speaker', 'Cash'),
    ('Todd', 'Computer', 'Credit')

SELECT t.Customer
     , STUFF(CAST(x.query('a/text()') AS NVARCHAR(MAX)), 1, 2, '')
     , STUFF(CAST(x.query('b/text()') AS NVARCHAR(MAX)), 1, 2, '')
FROM (
    SELECT DISTINCT Customer
    FROM @t
) t
OUTER APPLY (
    SELECT DISTINCT [a] = CASE WHEN id = 'a' THEN ', ' + val END
                  , [b] = CASE WHEN id = 'b' THEN ', ' + val END
    FROM @t t2
    CROSS APPLY (
        VALUES ('a', t2.Product)
             , ('b', t2.Method)
    ) t3 (id, val)
    WHERE t2.Customer = t.Customer
    FOR XML PATH(''), TYPE
) t2 (x)

Output:

Customer   Product                    Method     
---------- -------------------------- ------------------
John       Computer, Mouse            Cash, Credit
Todd       Computer                   Credit
Will       Computer, Mouse, Speaker   Cash, Credit

Another idea with more performance benefits:

IF OBJECT_ID('tempdb.dbo.#EntityValues') IS NOT NULL
    DROP TABLE #EntityValues

DECLARE @Values1 VARCHAR(MAX)
      , @Values2 VARCHAR(MAX)

SELECT Customer
     , Product
     , Method
     , RowNum = ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY 1/0)
     , Values1 = CAST(NULL AS VARCHAR(MAX))
     , Values2 = CAST(NULL AS VARCHAR(MAX))
INTO #EntityValues
FROM @t

UPDATE #EntityValues
SET 
      @Values1 = Values1 =
        CASE WHEN RowNum = 1 
            THEN Product
            ELSE @Values1 + ', ' + Product 
        END
    , @Values2 = Values2 = 
        CASE WHEN RowNum = 1 
            THEN Method
            ELSE @Values2 + ', ' + Method
        END

SELECT Customer
      , Values1 = MAX(Values1) 
      , Values2 = MAX(Values2)
FROM #EntityValues
GROUP BY Customer

But with some limitations:

Customer      Values1                       Values2
------------- ----------------------------- ----------------------
John          Computer, Mouse               Credit, Cash
Todd          Computer                      Credit
Will          Computer, Mouse, Speaker      Credit, Cash, Cash

Also check my old post about string aggregation:

http://www.codeproject.com/Articles/691102/String-Aggregation-in-the-World-of-SQL-Server

Comments