Nick Bummara Nick Bummara - 6 months ago 10
SQL Question

SQL separating address into mutiple columns using spaces

I have over 7 million rows, otherwise I would use Excel.

My address column has a varying number of words. Some are as short as '123 bay street', while others can be as long as '1234 west spring hill drive apt 123'.

My goal is to put each word into its own column. I was able to get the first word, using the query below. But I can't create a query efficient enough to do the rest.

update X
set X.Address_number = Y.[address]
from
(SELECT
unique_id,
CASE
WHEN SUBSTRING(phy_addr1, 1, CHARINDEX(' ', phy_addr1)) = ''
THEN phy_addr1 + ' '
ELSE SUBSTRING(phy_addr1, 1, CHARINDEX(' ', phy_addr1))
END 'address'
FROM
[RD_GeoCode].[dbo].[PA_Stg_excel]) as Y
inner join
[RD_GeoCode].[dbo].[rg_ApplicationData_AllForms_20160401_address] as X on X.unique_id = Y.unique_id
where
X.Address_number is null

Answer

you need to have a Numbers table and one of the split strings mentioned here.once you have that ,then its simple..

-----String splitter function

CREATE FUNCTION dbo.SplitStrings_Numbers
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN
   (
       SELECT Item = SUBSTRING(@List, Number, 
         CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
       FROM dbo.Numbers
       WHERE Number <= CONVERT(INT, LEN(@List))
         AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
   );
GO

you can use the above function like below..

select 
*
 from yourtable t
cross apply
dbo.SplitStrings_Numbers(t.address,' ') b

Assuming,your table is of below format..

create table tt
(
address varchar(300)  
)

insert into tt
select 'abc def ghi'

instead of updating values into same table,i suggest create some other table which has links to above table.This requires some schema modification to your existing table

create table addressreferences
(
addresss varchar(300),
delimitedvalue varchar(100)
)

insert into addressreferences
 select 
    address,b.*
     from yourtable t
    cross apply
    dbo.SplitStrings_Numbers(t.address,' ') b

This is just a pseudo code to give an idea,you will have to take care of references...Updating same table will not work ,because you are not aware how many rows an address column can span

Update:
I think a trigger will be better suit for your scenario instead of references ..But you have to do an insert first on references table for existing values .here is some pseudo code..

create trigger trg_test
after insert,update,delete 
on dbo.yourtable
as
begin
---check for inserts
if exists(Select * from inserted)
begin
insert into addressreferences
select address,b.* from inserted i
cross apply
dbo.splitstrings(address,' ') b

--check for deletes
if exists(select 1 from deleted)
begin

delete * from 
deleted d
join
adressreferences a
on a.address=d.address

end

if update(address)
begin
---here i recommend doing delete first since your old address and new one may not have equal rows

delete * from
deleted d
join
addressreferences a
on a.address=d.address

--then do a insert
insert into addressreferences
select address,a.* from
inserted i
join
addressreferences a
on a.address=i.address

end

end

end