Craigoh1 Craigoh1 - 5 months ago 12
SQL Question

My SQL loop takes to long to complete. Is there a better way to do what I'm trying to achieve?

I am trying to de-duplicate a set of data based on certain columns. That is it isn't as easy as just

SELECT DISTINCT
.

I want to select from my set all rows where a column is unique. I have sorted my set in a way that I just want the loop to grab the first occurrence of the "proxy" key column. I'm saying proxy as it isn't the actual primary key of the table.

I'm using a while loop and using a counter variable based on the count of rows in a temp table. I delete each row from my temp table after processing so this should reduce the base table by the one record which was processed and any duplicate rows.

Although my code works it is seems 'cowboyish' and would like your opinion on how to do it 'cleaner' thanks

Here is my code:

declare @cnt int

set @cnt = (select COUNT(*) from #temp)

while @cnt > 0
begin
select top 1 * into #temp2 from #temp

insert into #temp3 (Member_ID, email, meeting_status,member_type,firstname,lastname,address1, Match_Method, Match_Score)
select #temp2.*
from #temp2
left outer join #temp3 on #temp2.Member_ID = #temp3.Member_ID
where #temp3.Member_ID is null

delete #temp
from #temp
inner join #temp2 on #temp.Member_ID = #temp2.Member_ID

drop table #temp2

set @cnt = (select COUNT(*) from #temp)
end

Answer

Use below code:

WITH uniqueRecords AS(
    SELECT  *,ROW_NUMBER()OVER(PARTITION BY T.Member_ID ORDER BY (SELECT 1)) AS RowNum
    FROM #Temp AS T
)
INSERT INTO #temp3(Member_ID, email, meeting_status,member_type,firstname,lastname,address1, Match_Method, Match_Score)
SELECT U.Member_ID, U.email, U.meeting_status,U.member_type,U.firstname,U.lastname,U.address1, U.Match_Method, U.Match_Score
FROM uniqueRecords AS U
LEFT OUTER JOIN #temp3 T3 on U.Member_ID = T3.Member_ID
WHERE T.RowNum=1
AND T3.Member_ID is null;

There might be scope for further improvement.