Crazysasman Crazysasman - 5 months ago 11
SQL Question

Sum and count variables while executing a left join

UPDATED example code below - using SQL within SAS:

proc sql;
create table add_losses as
select *,
sum(bb.gross_loss) as gl format = comma15.2,
count(bb.gross_loss) as n_losses
from add_startend as aa
left join LED as bb
on (aa.process_name = bb.process_name and
aa.group_id = bb.group_code and
aa.start_date le bb.first_loss_posting_date le aa.end_date)

group by aa.process_name, aa.group_id, aa.start_date, aa.end_date
order by aa.process_name, aa.group_id, aa.start_date, aa.end_date;
quit;


Example data and desired output below:

Table AA

variable 1 variable 2 start date end date
AAAA BBB 1/1/2010 6/1/2010


Table BB

variable 1 variable 2 Date losses
AAAA BBB 1/5/2010 100
AAAA BBB 2/1/2010 100
AAAA BBB 3/5/2010 100
AAAA BBB 4/23/2010 100
AAAA BBB 5/11/2010 100
AAAA BBB 5/25/2010 100


Table YY (current output)

variable 1 variable 2 Date gross_loss gl n_losses
AAAA BBB 1/5/2010 100 600 6
AAAA BBB 2/1/2010 100 600 6
AAAA BBB 3/5/2010 100 600 6
AAAA BBB 4/23/2010 100 600 6
AAAA BBB 5/11/2010 100 600 6
AAAA BBB 5/25/2010 100 600 6


Table XX (desired output)

variable 1 variable 2 start date end date gl n_losses
AAAA BBB 1/1/2010 6/1/2010 600 6


The problem is the current code creates additional observations. I would like to keep the same # of rows and all variables in table AA while adding on the columns 'gl' and 'n_losses'.

Answer

You can use Between clause for this type of condition

proc sql;
create table add_losses as 
select aa.*, bb.gl format, bb.n_losses
from
add_startend as aa
left join
(
select aa.process_name, aa.group_id, aa.start_date, aa.end_date, 
sum(bb.gross_loss) as gl format = comma15.2, 
count(bb.gross_loss) as n_losses
from add_startend as aa     
left join LED as bb 

on (aa.process_name = bb.process_name and
aa.group_id = bb.group_code and
bb.first_loss_posting_date between aa.start_date and aa.end_date)

group by aa.process_name, aa.group_id, aa.start_date, aa.end_date
) bb
on  aa.process_name = bb.process_name 
and aa.group_id = bb.group_code
and aa.start_date = aa.start_date
and aa.end_date = bb.end_date
order by aa.process_name, aa.group_id, aa.start_date, aa.end_date;
quit;
Comments