picobit - 1 year ago 57
SQL Question

# Determine which POS sale takes precedence. Following logic tree in Sybase SQL Anywhere 10

My small grocery store has partnered with a third party to take online orders. I made an Excel sheet that queries our product database for a bunch of product information, and I feed that data into another macro-enabled worksheet that does the heavy lifting and generates a .CSV that I upload each week.

What I'm trying to accomplish now is to pull in current and future sales information so we can advertise them. Sales are created in our POS by putting them into groupings called "worksheets." The complicating factor is that a product can exist on multiple worksheets at once, and there's a logic tree that is followed to determine which of the worksheets will affect the scan price at any given time.

The two attributes that determine precedence are "priority" and "date committed".

• The three priorities, High, Medium, and Low are represented as 1, 2, and 3 respectively. Higher priority sheets will always override lower priority sheets for the duration of their life.

• When worksheets have equal priorities, the one that is most recently committed takes precedence.

So, given data that looks like this:

``````item_id        worksheet_name   priority   date_committed            sale_start_date           sale_end_date             sale_price
011259904209   A                2          2016-06-22 09:21:09.041   2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   2.0000
074682105322   B                2          2016-06-22 09:49:31.722   2016-07-20 00:00:00.000   2016-08-03 11:00:00.000   2.0000
074682105322   C                2          2016-06-22 08:57:04.641   2016-07-19 00:00:00.000   2016-08-03 16:00:00.000   2.0000
042563013660   A                2          2016-06-22 09:21:09.048   2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   3.9900
042563013660   D                1          2016-06-25 14:03:33.499   2016-06-29 00:00:00.000   2016-07-05 23:59:59.000   2.9900
042563013660   E                2          2016-06-22 08:49:13.515   2016-06-28 00:00:00.000   2016-07-20 16:00:00.000   3.9900
073360772054   A                2          2016-06-22 09:21:09.114   2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   3.9900
073360772054   B                2          2016-06-22 09:49:31.831   2016-07-20 00:00:00.000   2016-08-03 11:00:00.000   3.9900
073360772054   E                2          2016-06-22 08:49:13.520   2016-06-28 00:00:00.000   2016-07-20 16:00:00.000   3.9900
073360772054   C                2          2016-06-22 08:57:04.649   2016-07-19 00:00:00.000   2016-08-03 16:00:00.000   3.9900
012993221010   A                2          2016-06-22 09:21:09.110   2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   3.3900
012993221010   B                2          2016-06-22 09:49:31.828   2016-07-20 00:00:00.000   2016-08-03 11:00:00.000   3.3900
012993221010   D                1          2016-06-25 14:03:33.502   2016-06-29 00:00:00.000   2016-07-05 23:59:59.000   2.9900
012993221010   E                2          2016-06-22 08:49:13.517   2016-06-28 00:00:00.000   2016-07-20 16:00:00.000   3.3900
012993221010   C                2          2016-06-22 08:57:04.646   2016-07-19 00:00:00.000   2016-08-03 16:00:00.000   3.3900
``````

I want to get this:

``````Run on 6/27
item_id        worksheet_name   sale_start_date           sale_end_date             sale_price
011259904209   A                2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   2.0000
074682105322   C                2016-07-19 00:00:00.000   2016-07-20 00:00:00.000   2.0000
042563013660   E                2016-06-28 00:00:00.000   2016-06-29 00:00:00.000   3.9900
073360772054   E                2016-06-28 00:00:00.000   2016-06-29 00:00:00.000   3.9900
012993221010   E                2016-06-28 00:00:00.000   2016-06-29 00:00:00.000   3.9900

Run on 6/29
item_id        worksheet_name   sale_start_date           sale_end_date             sale_price
011259904209   A                2016-06-29 00:00:00.000   2016-07-20 11:00:00.000   2.0000
074682105322   C                2016-07-19 00:00:00.000   2016-07-20 00:00:00.000   2.0000
042563013660   D                2016-06-29 00:00:00.000   2016-07-05 23:59:59.000   2.9900
073360772054   A                2016-06-29 00:00:00.000   2016-07-19 00:00:00.000   3.9900
012993221010   D                2016-06-29 00:00:00.000   2016-07-05 23:59:59.000   2.9900
``````

Bonus points for combining overlapping sale periods to reflect the shopper's perception, but that's not necessary.

How can I get this result using SQL? Our sales run Wednesday through Tuesday, and I'm ideally generating my data file for the coming week on Wednesday afternoon or Thursday morning after we finish our weekly price changes.

We have tens of thousands of products in file.

Here's a graphical representation of the worksheet priorities per day

This is for SQL Anywhere 10, and running
`SELECT @@VERSION`
tells me 12.0.1.3967

I'm mostly familiar with SQL Server, but Sybase's SQL Anywhere is still very similar.

This is the "greatest N per group" problem. The typical solution (in anything other than MySQL) is to use ROW_NUMBER(), which is available on SQL Anywhere 10 and later as far as I can tell from Sybase books online.

``````SELECT a.item_id,
a.worksheet_name,
a.sale_start_date,
a.sale_end_date,
a.sale_price
FROM (
SELECT item_id,
worksheet_name,
sale_start_date,
sale_end_date,
sale_price,
ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY priority, date_committed DESC) AS rn
FROM UnnamedSalesTable
WHERE sale_start_date <= CURRENT DATE
AND sale_end_date > CURRENT DATE) a
WHERE a.rn = 1
``````

Obviously, you can replace `CURRENT DATE` with whatever date value you want to run.

If you happen to have two sales with an identical `priority` and `date_committed`, then you'll still only get one row. In other words, if there are duplicates you won't know. I suspect that's what you want. However, if you need to see duplicates, then instead of ROW_NUMBER() you'd want to use RANK() or DENSE_RANK() (either work in this case). That will allow "ties" to show up. Otherwise the query would be identical. If this duplication happens a lot, then you'll want to add a third column to the ORDER BY portion of the WINDOW clause.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download