Mladen Oršolić Mladen Oršolić - 1 month ago 4
SQL Question

Make value from every second row appear in new 3rd column

Lets assume my data looks like this :
Every second row represents old (previous value) in a table that holds historical data.

table 1 :

id value
------------
1 a
1 b
2 c
2 d
3 a
3 b


and i want to get value of every second row to appear in new 3rd column like this :

table 2:

id new_value old_value
------------------------
1 a b
2 c d
3 a b


EDIT:
For clarity ill post the skeleton of query thats producing data i want to transform (so its clear i am already using
WITH
so cant use additional one due to oracle not yet allowing nesting of
WITH
elements) :

skeleton code that produces data in table 1 :

with candidates as
(
--select list of candidates
)
SELECT * FROM
(
(
--select new values
MINUS
--select old values
)
UNION
(
--select old values
MINUS
--select new values
)
)
ORDER BY id;


The goal is to finally get only a list of ids that changed with their old and new values.

Thanks in advance.

Answer

It is quite possible that overall query can be written in a much simpler way. Just join intermediary results with old and new values together on id to put them in two different columns instead of unioning them into the same column.

WITH
candidates
AS
(
    --select list of candidates
)
,CTE_NewValues
AS
(
    --select new values
    select id, value AS new_value
    FROM candidates
    WHERE ...
    -- assumes id is unique, one row per id
)
,CTE_OldValues
AS
(
    --select old values
    select id, value AS old_value
    FROM candidates
    WHERE ...
    -- assumes id is unique, one row per id
)
SELECT
    CTE_NewValues.id
    ,CTE_NewValues.new_value
    ,CTE_OldValues.old_value
FROM
    CTE_NewValues
    INNER JOIN CTE_OldValues ON CTE_NewValues.id = CTE_OldValues.id
WHERE
    CTE_NewValues.new_value <> CTE_OldValues.old_value
ORDER BY
    CTE_NewValues.id;

If we stick to the skeleton of the query in the question, there are also many ways to do it. Self-join is likely to be less efficient than using analytic functions, like ROW_NUMBER and LEAD.

Sorting just by id is not enough to unambiguously define which value is new or old. You need to have some extra column to resolve it.

You don't "nest" WITH (common-table expressions), you "chain" them. Something like the following. As you do that, make sure to add the sort_order column to be able to distinguish old and new values, if you don't have a similar column already.

WITH
candidates 
AS
(
    --select list of candidates
)
,CTE_YourQuery
AS
(
    SELECT * FROM
    (
        (
        --select new values
        select 1 AS sort_order, id, value
        MINUS
        --select old values
        select 1 AS sort_order, id, value
        )
    UNION ALL
        (
        --select old values
        select 2 AS sort_order, id, value
        MINUS
        --select new values
        select 2 AS sort_order, id, value
        )
    )
)
,CTE_RowNumber
AS
(
    SELECT
        id
        ,value AS new_value
        ,ROW_NUMBER() OVER (PARTITION BY id ORDER BY sort_order) AS rn
        ,LEAD(value) OVER (PARTITION BY id ORDER BY sort_order) AS old_value
    FROM CTE_YourQuery
)
SELECT
    id
    ,new_value
    ,old_value
FROM CTE_RowNumber
WHERE rn = 1
ORDER BY id;
Comments