Mehmet Otkun Mehmet Otkun - 2 months ago 8
SQL Question

SQL Server - Grouping Combination of possibilities by fixed value

enter image description here

I have to create cheapest basket which inculde fixed items.

For example for a basket which have (5) items

1 and 4 = (1 * 50) + (1 * 100) = 150

2 and 3 = (1 * 60) + (1 * 80) = 140 -- this is my guy

2 and 2 and 1 = (1 * 60) + (1 * 60) + (1 * 50) = 170

3 and 3 = (1 * 80) + (1 * 80) = 160 **** this 6 items but total item can exceed min items. The important thing is total cost...
....

Also this is valid for any number of items a basket may have. Also there are lots of stores and each stores have different package may include several items.

How can handle this issue with SQL?

UPDATE



Here is example data generation code. Recursive CTE solutions are more expensive. I should finish the job under 500-600ms over 600-700 stores each time. this is a package search engine. Manual scenario creation by using ´#temp´ tables or ´UNUION´ is 15-20 times cheaper then Recursive CTE.

Also concatenating
Item
or
PackageId
is very expensive. I can found required package id or item after selecting cheapest package with join to source table.

I am expecting a megical solution which can be ultra fast and get the correct option.
Only cheapest basket required for each store. Manual scenario creation is very fast but sometimes fail for correct cheapest basket.

CREATE TABLE #storePackages(
StoreId int not null,
PackageId int not null,
ItemType int not null, -- there are tree item type 0 is normal item, 1 is item has discount 2 is free item
ItemCount int not null,
ItemPrice decimal(18,8) not null,
MaxItemQouta int not null, -- in generaly a package can have between 1 and 6 qouata but in rare can up to 20-25
MaxFullQouta int not null -- sometimes a package can have additional free or discount item qouta. MaxFullQouta will always greater then MaxItemQouta
)

declare @totalStores int
set @totalStores = (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN 200 AND 400 ORDER BY NEWID())

declare @storeId int;
declare @packageId int;
declare @maxPackageForStore int;
declare @itemMinPrice decimal(18,8);
set @storeId = 1;
set @packageId = 1

while(@storeId <= @totalStores)
BEGIN
set @maxPackageForStore = (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN 2 AND 6 ORDER BY NEWID())
set @itemMinPrice = (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN 40 AND 100 ORDER BY NEWID())
BEGIN
INSERT INTO #storePackages
SELECT DISTINCT
StoreId = @storeId
,PackageId = CAST(@packageId + number AS int)
,ItemType = 0
,ItemCount = number
,ItemPrice = @itemMinPrice + (10 * (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN pkgNo.number AND pkgNo.number + 2 ORDER BY NEWID()))
,MaxItemQouta = @maxPackageForStore
,MaxFullQouta = @maxPackageForStore + (CASE WHEN number > 1 AND number < 4 THEN 1 ELSE 0 END)
FROM master..[spt_values] pkgNo
WHERE number BETWEEN 1 AND @maxPackageForStore
UNION ALL
SELECT DISTINCT
StoreId = @storeId
,PackageId = CAST(@packageId + number AS int)
,ItemType = 1
,ItemCount = 1
,ItemPrice = (@itemMinPrice / 2) + (10 * (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN pkgNo.number AND pkgNo.number + 2 ORDER BY NEWID()))
,MaxItemQouta = @maxPackageForStore
,MaxFullQouta = @maxPackageForStore + (SELECT TOP 1 n = number FROM master..[spt_values] WHERE number BETWEEN 0 AND 2 ORDER BY NEWID())
FROM master..[spt_values] pkgNo
WHERE number BETWEEN 2 AND (CASE WHEN @maxPackageForStore > 4 THEN 4 ELSE @maxPackageForStore END)


set @packageId = @packageId + @maxPackageForStore;
END
set @storeId =@storeId + 1;
END

SELECT * FROM #storePackages
drop table #storePackages


MY SOLUTION



First of all I am thankful for everyone who try to help me. However all suggested solutions are based on CTE. As I said before recursive CTEs cause performace problems when hunderds of stores are considered. Also multiple packages are requested for one time. This means, I request can include mutiple baskets. One is 5 items other is 3 items and another one is 7 items...

Last Solution



First of all I generates all possible scenarios in a table by item size... By this way, I have option eleminate unwanted scenarios.

CREATE TABLE ItemScenarios(
Item int,
ScenarioId int,
CalculatedItem int --this will be joined with Store Item
)


Then I generated all possible scenario from 2 item to 25 item and insert to the
ItemScenarios
table. Scenarios can be genereated one time by using WHILE or recursive CTE. The advantage of this way, scenarios generated only for one time.

Resuls are like below.



Item | ScenarioId | CalculatedItem
--------------------------------------------------------
2 1 2
2 2 3
2 3 1
2 3 1
3 4 5
3 5 4
3 6 3
3 7 2
3 7 2
3 8 2
3 8 1
3 9 1
3 9 1
3 9 1
....
.....
......
25 993 10


By this way, I can restrict scenario sizes, Max different store, max different package etc.

Also I can eleminate some scenarios which matematically impossible cheapest then other. For example for 4 items request, some scenario

Scenario 1 : 2+2

Scenario 2: 2+1+1

Scenario 3: 1+1+1+1

Among these scenarios; It is impossible Scenario 2 would be cheapest basket. Because,

If Scenario 2 < Scenario 3 --> Scenario 1 would be lower then Scenario 2. Because the thing decreasing cost is 2 item price and **Scenario 1* have double 2 items

Also If Scenario 2 < Scenario 1 --> Scenario 3 would be lower then Scenario 2

Now, If I delete scenarios like Scenario 2 I would gain some performance advantages.

Now I can chose chepest item prices among stores

DECLARE @requestedItems int;
SET @requestedItems = 5;

CREATE TABLE #JoinedPackageItemWithScenarios(
StoreId int not null,
PackageId int not null,
ItemCount int not null,
ItemPrice decimal(18,8)
ScenarioId int not null,
)
INSERT INTO #JoinedPackageItemWithScenarios
SELECT
SPM.StoreId
,SPM.PackageId
,SPM.ItemCount
,SPM.ItemPrice
,SPM.ScenarioId
FROM (
SELECT
SP.StoreId
,SP.PackageId
,SP.ItemCount
,SP.ItemPrice
,SC.ScenarioId
,RowNumber = ROW_NUMBER() OVER (PARTITION BY SP.StoreId,SC.ScenarioId,SP.ItemCount ORDER BY SP.ItemPrice)
FROM ItemScenarios SC
LEFT JOIN StorePackages AS SP ON SP.ItemCount = SC.CalculatedItem
WHERE SC.Item = @requestedItems
) SPM
WHERE SPM.RowNumber = 1

-- NOW I HAVE CHEAPEST PRICE FOR EACH ITEM, I CAN CREATE BASKET

CREATE TABLE #selectedScenarios(
StoreId int not null,
ScenarioId int not null,
TotalItem int not null,
TotalCost decimal(18,8)
)
INSERT INTO #selectedScenarios
SELECT
StoreId
,ScenarioId
,TotalItem
,TotalCost
FROM (
SELECT
StoreId
,ScenarioId
--,PackageIds = dbo.GROUP_CONCAT(CAST(PackageId AS nvarchar(20))) -- CONCATENING PackageId decreasing performance here. We can joing seleceted scenarios with #JoinedPackageItemWithScenarios after selection complated.
,TotalItem = SUM(ItemCount)
,TotalCost = SUM(ItemPrice)
,RowNumber = ROW_NUMBER() OVER (PARTITION BY StoreId ORDER BY SUM(ItemPrice))
FROM #JoinedPackageItemWithScenarios JPS
GROUP BY StoreId,ScenarioId
HAVING(SUM(ItemCount) >= @requestedItems)
) SLECTED
WHERE RowNumber = 1

-- NOW WE CAN POPULATE PackageIds if needed

SELECT
SS.StoreId
,SS.ScenarioId
,TotalItem = MAX(SS.TotalItem)
,TotalCost = MAX(SS.TotalCost)
,PackageIds = dbo.GROUP_CONCAT(CAST(JPS.PackageId AS nvarchar(20)))
FROM #selectedScenarios SS
JOIN #JoinedPackageItemWithScenarios AS JPS ON JPS.StoreId = SS.StoreId AND JPS.ScenarioId = SS.ScenarioId
GROUP BY SS.StoreId,SS.ScenarioId


SUM



In my test, this way is mimimum 10 times faster then recursive CTE, especially when number of stores and requested items increased. Also It gets 100% correct results. Because recursive CTE tried milions of unrequired JOINs when number of stores and requested items increased.

Answer

MY SOLUTION

First of all I am thankful for everyone who try to help me. However all suggested solutions are based on CTE. As I said before recursive CTEs cause performace problems when hunderds of stores are considered. Also multiple packages are requested for one time. This means, A request can include mutiple baskets. One is 5 items other is 3 items and another one is 7 items...

Last Solution

First of all I generates all possible scenarios in a table by item size... By this way, I have option eleminate unwanted scenarios.

CREATE TABLE ItemScenarios(
   Item int,
   ScenarioId int,
   CalculatedItem int  --this will be joined with Store Item
)

Then I generated all possible scenario from 2 item to 25 item and insert to the ItemScenarios table. Scenarios can be genereated one time by using WHILE or recursive CTE. The advantage of this way, scenarios generated only for one time.

Resuls are like below.

Item          |   ScenarioId       |     CalculatedItem
--------------------------------------------------------
2                   1                     2
2                   2                     3
2                   3                     1
2                   3                     1
3                   4                     5
3                   5                     4
3                   6                     3
3                   7                     2
3                   7                     2
3                   8                     2
3                   8                     1
3                   9                     1
3                   9                     1
3                   9                     1
....
.....
......
25                  993                   10

By this way, I can restrict scenario sizes, Max different store, max different package etc.

Also I can eleminate some scenarios which matematically impossible cheapest then other. For example for 4 items request, some scenario

Scenario 1 : 2+2

Scenario 2: 2+1+1

Scenario 3: 1+1+1+1

Among these scenarios; It is impossible Scenario 2 would be cheapest basket. Because,

If Scenario 2 < Scenario 3 --> Scenario 1 would be lower then Scenario 2. Because the thing decreasing cost is 2 item price and **Scenario 1* have double 2 items

Also If Scenario 2 < Scenario 1 --> Scenario 3 would be lower then Scenario 2

Now, If I delete scenarios like Scenario 2 I would gain some performance advantages.

Now I can chose chepest item prices among stores

DECLARE @requestedItems int;
SET @requestedItems = 5;

CREATE TABLE #JoinedPackageItemWithScenarios(
   StoreId int not null,
   PackageId int not null,
   ItemCount int not null,
   ItemPrice decimal(18,8) 
   ScenarioId int not null,
)
INSERT INTO #JoinedPackageItemWithScenarios
 SELECT
    SPM.StoreId  
   ,SPM.PackageId 
   ,SPM.ItemCount 
   ,SPM.ItemPrice 
   ,SPM.ScenarioId
   FROM (
      SELECT 
            SP.StoreId  
           ,SP.PackageId 
           ,SP.ItemCount 
           ,SP.ItemPrice 
           ,SC.ScenarioId
           ,RowNumber = ROW_NUMBER() OVER (PARTITION BY SP.StoreId,SC.ScenarioId,SP.ItemCount ORDER BY SP.ItemPrice) 
      FROM ItemScenarios SC
      LEFT JOIN StorePackages AS SP ON SP.ItemCount = SC.CalculatedItem
      WHERE SC.Item = @requestedItems
 ) SPM
 WHERE SPM.RowNumber = 1

-- NOW I HAVE CHEAPEST PRICE FOR EACH ITEM, I CAN CREATE BASKET

 CREATE TABLE #selectedScenarios(
   StoreId int not null,
   ScenarioId int not null,
   TotalItem int not null,
   TotalCost decimal(18,8) 
)
 INSERT INTO #selectedScenarios
 SELECT 
      StoreId
     ,ScenarioId
     ,TotalItem 
     ,TotalCost 
  FROM (
     SELECT 
           StoreId
          ,ScenarioId
          --,PackageIds = dbo.GROUP_CONCAT(CAST(PackageId AS nvarchar(20))) -- CONCATENING PackageId decreasing performance here. We can joing seleceted scenarios with #JoinedPackageItemWithScenarios after selection complated.
          ,TotalItem = SUM(ItemCount)
          ,TotalCost = SUM(ItemPrice)
          ,RowNumber = ROW_NUMBER() OVER (PARTITION BY StoreId ORDER BY SUM(ItemPrice))
        FROM #JoinedPackageItemWithScenarios JPS
        GROUP BY StoreId,ScenarioId
        HAVING(SUM(ItemCount) >= @requestedItems)
     ) SLECTED
     WHERE RowNumber = 1

  -- NOW WE CAN POPULATE PackageIds if needed

  SELECT 
      SS.StoreId
     ,SS.ScenarioId
     ,TotalItem = MAX(SS.TotalItem)
     ,TotalCost = MAX(SS.TotalCost)
     ,PackageIds = dbo.GROUP_CONCAT(CAST(JPS.PackageId AS nvarchar(20)))
    FROM #selectedScenarios SS
    JOIN #JoinedPackageItemWithScenarios AS JPS ON JPS.StoreId = SS.StoreId AND JPS.ScenarioId = SS.ScenarioId
    GROUP BY SS.StoreId,SS.ScenarioId

SUM

In my test, this way is mimimum 10 times faster then recursive CTE, especially when number of stores and requested items increased. Also It gets 100% correct results. Because recursive CTE tried milions of unrequired JOINs when number of stores and requested items increased.