Pedram Salamati Pedram Salamati - 1 month ago 4x
SQL Question

How to manipulate values and get a new result on in SQL?

I have this query:

SELECT c.ID, c.Firstname, c.lastname, c.BDaY, c.gender, cp.code, cp.Citizenship, r.race, e.ethnicity
From Client AS C (nolock)
Join Citizenship as cp (nolock) on c.ID = cp.client_ID
Join Race as r (nolock) ON c.ID = R.Client_ID Join Ethnicity as E (nolock) ON E.Client_ID = c.ID

this query will give me some of clients name duplicated because they have different race and ethnicity. Example:

ID |FirstName|Lastname| BDay | gender | code |citizenship| race | ethnicity
1 Pedram Salamati 01-20-1998 M 1 US citizen Middle-east Spanish
1 Pedram Salamati 01-20-1998 M 1 US Citizen Middle-east unknown
1 Pedram Salamati 01-20-1998 M 1 US Citizen Middle-east Brazilian
2 Jesse Albert 03-05-1982 F 1 US Citizen African not Spanish
2 Jesse Albert 03-05-1982 F 1 US Citizen American not Spanish

I was wondering if there is any way to say if race is not = than Race should be Multiracial and if ethnicity Is not = to each other for same Id choose the last updated one.

** PS.Ethnicity has time stamp and I can use

I think maybe a sub query can help!

Any help or thought will be much appreciate!


Here is some test data to mimic your environment in the future you should separate the tables involved and test data. including DML statements is appropriate and helpful as well so people can try their solution prior to answering.

DECLARE @Client AS TABLE (ID INT, Firstname VARCHAR(25), LastName VARCHAR(25), BDay DATE, Gender CHAR(1))
INSERT INTO @Client VALUES (1,'Pedram','Salamati','01-20-1998','M')
DECLARE @Citizenship AS TABLE (Client_ID INT, Code INT, Citizenship VARCHAR(100))
INSERT INTO @Citizenship VALUES (1,1,'US citizen'),(2,1,'US citizen')
DECLARE @Ethnicity AS TABLE (Client_ID INT, Ethnicity VARCHAR(50))
INSERT INTO @Ethnicity VALUES (1,'Spanish'),(1,'unknown'),(1,'Brazilian'),(2,'not Spanish')
DECLARE @Race AS TABLE (Client_Id INT, Race VARCHAR(50), LastUpdate DATETIME)
INSERT INTO @Race VALUES (1,'Middle-east',GETDATE()),(2,'African',GETDATE()),(2,'American',GETDATE() -1)

With those variables you can do the following, there of course is more than 1 way this is simply 1 way I am choosing for a few reasons:

;WITH cteEthnicity AS (
       ,CASE WHEN COUNT(DISTINCT e.Ethnicity) > 1 THEN 'Multiracial' ELSE MIN(e.Ethnicity) END as Ethnicity
       @Ethnicity e

, cteRace AS (
       ,ROW_NUMBER() OVER (PARTITION BY r.Client_Id ORDER BY r.LastUpdate DESC) as RowNumber
       @Race r

    @Client AS C --(nolock) 
    Join @Citizenship as cp --(nolock)
    on  c.ID = cp.client_ID
    Join cteRace as r --(nolock)
    ON c.ID = R.Client_ID
    AND r.RowNumber = 1
    Join cteEthnicity as E --(nolock)
    ON E.Client_ID = c.ID

You displayed 2 issue 1 with race and 1 with ethnicity

  • For Ethnicity: you want to use aggregation to determine which ethnicity to assign. this can also be done with a window function but the way I wrote it here it will account for duplicates to exist even in the Ethnicity table.

  • For Race: you simply want the latest row partitioned by client you can use the ROW_NUMBER() function to generate that and then select where it equals 1 in the join statement

A third issue that you didn't point out but could be possible in some countries anyway is DUAL CITIZENSHIP. In that case you could use a method similar to that of Race.

Note even though Common Table Expressions [CTE] are used you can actually nest those as subselect as well.