Oak_3260548 Oak_3260548 - 5 months ago 8
SQL Question

SQL Split line during import - run INSRERT INTO twice for specific lines

let's assume a super-simplified data example:

ID AKey AVal
-----------------
1 AB 94
2 Q 48
3 Z 56
4 AB 12
5 T 77
... ... ...


I would like to split "AB" into separate lines "A" and "B" in my import script, where I typically do:

INSERT INTO MyNewTable
SELECT
SRC.ID as OldIDRef,
SRC.AKey as NewKey,
SRC.AVal as NewVal
FROM OldTable as SRC


So basically, I would like to duplicate the "AB" lines in the select and perform some calculations specific for this line (let's say divide AVal by 2)

The only solution I can think of would be someting like this:

INSERT INTO MyNewTable
SELECT
SRC.ID as OldIDRef,
CASE SRC.AKey = 'BA' THEN SUBSTRING(SRC.AKey,1,1) END as NewKey,
CASE SRC.AKey = 'BA' THEN SRC.AVal / 2 END as NewVal
FROM OldTable as SRC

UNION ALL

SELECT
SRC.ID as OldIDRef,
SUBSTRING(SRC.AKey,2,1) as NewKey,
SRC.AVal / 4 + 10 as NewVal
FROM OldTable as SRC
WHERE SRC.AKey = 'BA'


I will need such process more then few times in my imports, so I wonder, If I am not missing some simpler solution?

Answer

Let's say you are probably looking for an answer that can be scaled to N# of characters in the Key Length and one that aportions the new value to the number of keys that are split from it. I would go with a recursive cte to pull it off. With your sample data add another row with 3 characters in it like 'GHI' and run this code and see the results scale beyond just 2 characters.

;WITH cteRecursive AS (
    SELECT
       Id
       ,AKey
       ,LEFT(AKey,1) AS NewAKey
       ,RIGHT(Akey,LEN(AKey) - 1) AS RemainingKey
       ,AVal
       ,1 AS [Level]
    FROM
       @Table

    UNION ALL

    SELECT
       t.Id
       ,t.AKey
       ,LEFT(c.RemainingKey,1) AS NewAKey
       ,RIGHT(RemainingKey,LEN(RemainingKey) - 1) AS RemainingKey
       ,t.AVal
       ,c.[Level] + 1 AS [Level]
    FROM
       @Table t
       INNER JOIN cteRecursive c
       ON t.Id = c.Id
       AND LEN(c.RemainingKey) > 0
)

SELECT
    Id
    ,AKey AS OriginalAKey
    ,NewAKey
    ,AVal AS OriginalAVal
    ,AVal / 2.00 AS NewVal
    ,AVal / CAST(MAX([Level]) OVER (PARTITION BY Id) AS DECIMAL(4,2)) AS NewValAsPortionOfLevel
    ,AVal / CAST(LEN(AKey) AS DECIMAL(4,2)) AS NewValAsPortionOfKeyLength
FROM
    cteRecursive

Here is the Table variable I used if you want it

DECLARE @Table AS TABLE (Id INT IDENTITY(1,1), AKey VARCHAR(100), AVal INT)
INSERT INTO @Table (AKey, AVal)
VALUES ('AB',94),('Q',48),('Z',56),('AB',12),('T',77),('ghi',100)

If not splitting the key you can actually simplify the recursive cte and go that route. By using Level < LEN(AKey) the recursion will stop at the right spot and you don't need any of the other string manipulations.

;WITH cteRecursive AS (
    SELECT
       Id
       ,AKey
       ,AVal
       ,1 AS [Level]
    FROM
       @Table

    UNION ALL

    SELECT
       t.Id
       ,t.AKey
       ,t.AVal
       ,c.[Level] + 1 AS [Level]
    FROM
       @Table t
       INNER JOIN cteRecursive c
       ON t.Id = c.Id
       AND c.[Level] < LEN(t.Akey)
)

SELECT
    Id
    ,AKey AS OriginalAKey
    ,AVal AS OriginalAVal
    ,AVal / 2.00 AS NewVal
    ,AVal / CAST(MAX([Level]) OVER (PARTITION BY Id) AS DECIMAL(4,2)) AS NewValAsPortionOfLevel
    ,AVal / CAST(LEN(AKey) AS DECIMAL(4,2)) AS NewValAsPortionOfKeyLength
FROM
    cteRecursive

And another technique if you have a very large dataset and don't want to use recursion, is you can build a Tally Table To Join on. I would be curious to know which performs better. I actually have a permanent tally table for use in some of the record manipulation for ETL of data warehouse that I use, but you should probably use a temp table rather than a table variable too. Anyway, here is that method.

DECLARE @TallyTable AS TABLE (I INT)

DECLARE @MaxLen INT
SELECT @MaxLen = MAX(LEN(AKey)) FROM @Table

IF (@MaxLen > 0)
BEGIN
    WHILE @MaxLen > 0
    BEGIN
       INSERT INTO @TallyTable (I) VALUES (@MaxLen)
       SET @MaxLen -= 1
    END
END

SELECT
    *
    ,NewValueApportionedByLengthOfKey = CAST(AVal AS DECIMAL) / ISNULL(NULLIF(LEN(AKey),0),1)
FROM
    @Table t
    INNER JOIN @TallyTable tt
    ON LEN(t.AKey) >= tt.I

Note all of these methods assume AKey will never be NULL or 0 length but all are easily adapted to handle that should it be a need.