velvetrock velvetrock - 5 months ago 23
SQL Question

SAS proc freq for distinct value of a variable

I have a table which contains the information of course selected in 2 semesters of each student. Those students didn't validate their first semester, so all the

valid_or_not_of_semester='N'
for
semester='1st'
:

student semester course_selected valid_or_not_of_semester
A 1st math N
A 1st english N
A 2nd math Y
A 2nd english Y
B 1st math N
B 2nd math Y
B 2nd english Y
C 1st math N
C 2nd math N


For those students who selected
math
(or
english
) in the first semester, I want to research if they have selected
math
(or
english
) in the second semester, if yes, I'm going to create a cross-tabulation, which counts the number of those students who validated or not their second semester:

--------------------------------------------------------------------------
1st semester \ 2nd semester | Math | English
invalid \ |---------------------|--------------------
students \ | valid | invalid | valid | invalid
--------------------------------------------------------------------------
Math | 2 | 1 | 2 | 0
--------------------------------------------------------------------------
English | 1 | 0 | 1 | 0
--------------------------------------------------------------------------


Each row stands for the number of students who didn't validate the first semester and have selected the course in the first semester. And columns seprate students who selected the course into valid and invalid of their second semester. To be more precise,

--------------------------------------------------------------------------
1st semester \ 2nd semester | Math | English
invalid \ |---------------------|--------------------
students \ | valid | invalid | valid | invalid
--------------------------------------------------------------------------
Math | 2 | 1 | 2 | 0
| | |
\ / \ / \ /
(students A&B) (student C) (students A&B)


I tried proc sql:

data math;
merge have
have (where=(semester='1st') in=these);
by student;
if these then output;
run;

proc sql;
create table result as
select count(distinct student) as nb_student
from math (where=(semester='2nd'))
group by course_selected, valid_or_not_of_semester;
quit;


And do the same thing for
english
.

But is there any way to obtain the result of 2 courses directly? How could I use the proc freq?

Hope to get your answer. Thanks in advance!

Answer

This doesn't give you exactly the table you were looking for, but it does generate the values you were interested in. The idea is to transpose the original dataset and then count up observations afterwards.

You may also want to look into proc tabulate though you may run into problems because you're double-counting students under certain circumstances.

data temp;
   input student $ semester $ course_selected $ valid_or_not_of_semester $;
   datalines;
     A 1st math N
     A 1st english N
     A 2nd math Y
     A 2nd english Y
     B 1st math N
     B 2nd math Y
     B 2nd english Y
     C 1st math N
     C 2nd math N 
    ;
    proc sort; by student;
run;

proc transpose data = temp out = temp2;
    by student;
    id course_selected semester;
    var valid_or_not_of_semester;
run;

proc sql;
    create table temp3 as select distinct
        sum(case when math1st = "N" and math2nd = "Y" then 1 else 0 end) as math_math_valid,
        sum(case when math1st = "N" and math2nd = "N" then 1 else 0 end) as math_math_invalid,
        sum(case when english1st = "N" and math2nd = "Y" then 1 else 0 end) as english_math_valid,
        sum(case when english1st = "N" and math2nd = "N" then 1 else 0 end) as english_math_invalid,
        sum(case when math1st = "N" and english2nd = "Y" then 1 else 0 end) as math_english_valid,
        sum(case when math1st = "N" and english2nd = "N" then 1 else 0 end) as math_english_invalid,
        sum(case when english1st = "N" and english2nd = "Y" then 1 else 0 end) as english_english_valid,
        sum(case when english1st = "N" and english2nd = "N" then 1 else 0 end) as english_english_invalid
        from temp2;
quit;
Comments