Alex Alex - 1 month ago 6
SQL Question

How to group values around a dynamic number of groups in SQL Server

I've read many/most of the posts regarding grouping in ranges but none seems to cover my dilemma, to group date according to a dynamic range of values, so I'm hoping someone can help or at least point me in the right direction.

Firstly, I have a set of ranges (but these can change and it would be normal to see between 2 and 9 rows in this set and of course the numbers would change too):

CREATE TABLE [Ranges] ([label] [varchar](160), [score] [int])

INSERT INTO [Ranges] ([label],[score]) VALUES('Red', 33)
INSERT INTO [Ranges] ([label],[score]) VALUES('Amber', 66)
INSERT INTO [Ranges] ([label],[score]) VALUES('Green', 100)


Then there is the actual data itself, which is stored in here:

CREATE TABLE [TableData] ([Name] varchar(160), [Value] int)


with the following values:

INSERT INTO [TableData] ([Name],[value]) VALUES('Cat', 100)
INSERT INTO [TableData] ([Name],[value]) VALUES('Dog', 37)
INSERT INTO [TableData] ([Name],[value]) VALUES('Bat', 16)
INSERT INTO [TableData] ([Name],[value]) VALUES('Mole', 87)
INSERT INTO [TableData] ([Name],[value]) VALUES('Hen', 55)
INSERT INTO [TableData] ([Name],[value]) VALUES('Dove', 28)
INSERT INTO [TableData] ([Name],[value]) VALUES('Rat', 6)


And the idea is to sort the [Ranges] data into a contiguous range, something like:

if value <= 33 then Red
if value > 33 and value <= 66 then Amber
if value > 66 and value <= 100 then Green


And plug that into a grouped query that would return the count of the [TableData] rows, in this case:

Red Amber Green
3 2 2


But as I've said, I've no idea as to how to go about it dynamically, assuming it is possible.

Thanks in advance.




This is what I went with to get the max and min for each level. I haven't yet tested it on more than 3 entries for which it works. It will need to work on up to 9, as mentioned.

SELECT TOP 100 PERCENT cx.label AS label,
-- when this one is the lowest one, go to the lowest in the range
CASE
WHEN (SELECT TOP 1 cy.label
FROM Ranges cy
ORDER BY cy.score ASC) = cx.label
THEN 0

WHEN cx.score > 0 AND cx.score < 100
THEN (SELECT TOP 1 cy.score + 1
FROM Ranges cy
AND cy.score < cx.score
ORDER BY cy.score ASC)

ELSE (SELECT TOP 1 cy.score + 1
FROM Ranges cy
AND cy.score <> cx.score
ORDER BY cy.score DESC) END AS scoreMin,

cx.score AS scoreMax,
cx.color
FROM Ranges cx
GROUP BY cx.label,
cast(cx.score AS int),
cx.color
ORDER BY cast(cx.score AS int) ASC


Using that, I inserted it into the selected option with the pivot and it works for 3 options, but I am still looking for a way of dynamically populating that list of columns.

Answer

solution 1

   DECLARE @Ranges TABLE ([label] [varchar](160), [scoreMin] [int], [scoreMax] [int]);

   INSERT INTO @Ranges ([label],[scoreMin],[scoreMax]) VALUES('Red',  0,  33)
   INSERT INTO @Ranges ([label],[scoreMin],[scoreMax]) VALUES('Amber',34, 66)
   INSERT INTO @Ranges ([label],[scoreMin],[scoreMax]) VALUES('Green',67, 100)


   DECLARE @TableData TABLE ([Name] varchar(160), [Value] int)

   INSERT INTO @TableData ([Name],[value]) VALUES('Cat', 100)
   INSERT INTO @TableData ([Name],[value]) VALUES('Dog', 37)
   INSERT INTO @TableData ([Name],[value]) VALUES('Bat', 16)
   INSERT INTO @TableData ([Name],[value]) VALUES('Mole', 87)
   INSERT INTO @TableData ([Name],[value]) VALUES('Hen', 55)
   INSERT INTO @TableData ([Name],[value]) VALUES('Dove', 28)
   INSERT INTO @TableData ([Name],[value]) VALUES('Rat', 6)

   SELECT * FROM 
   (
   SELECT
       R.label ,COUNT(*) AS CountRows
   FROM
       @TableData AS T
       INNER JOIN @Ranges AS R
           ON  T.Value >= R.scoreMin
           AND T.Value <= R.scoreMax
   GROUP BY R.label

   ) tmp 

   PIVOT (SUM(CountRows) FOR label IN ( Red, Amber, Green ) 
   ) AS pvt
Comments