These are some newbie questions about statistical programming in R that I haven't been able to answer by searching online. My data frame is named "eitc" in the code below.
1) Once I've loaded in a data frame, I would like to look at summary statistics. I've used the functions:
library(foreign) # provides read.dta
eitc <- read.dta(file="/Users/Documents/eitc.dta")
sapply(eitc,mean,na.rm=TRUE) # sample mean of each column; swap in min, max, etc. for other statistics
summarize if children >= 1
mean work if post93==0 & anykids==1
post93.dummy <- as.numeric(eitc$year>1993)
A lot of your requirements are answered by subset:
summary(subset(eitc, post93 == 0 & anykids == 1, select=work))
nrow(subset(eitc, post93 == 0 & anykids == 1, select=work)) # for number of obs.
The documentation at ?subset has good examples.
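To see the pattern end to end, here is a minimal sketch on a small made-up data frame. The name eitc_toy and all of its values are invented for illustration; only the column names (year, anykids, work, post93) follow the question.

```r
# Toy stand-in for the eitc data; the values are invented for illustration.
eitc_toy <- data.frame(
  year    = c(1991, 1992, 1994, 1995, 1992, 1996),
  anykids = c(1, 1, 1, 0, 0, 1),
  work    = c(1, 0, 1, 1, 0, 1)
)
eitc_toy$post93 <- as.numeric(eitc_toy$year > 1993)

# Keep rows with post93 == 0 and anykids == 1, and only the work column.
pre_kids <- subset(eitc_toy, post93 == 0 & anykids == 1, select = work)

nrow(pre_kids)       # number of observations in the subset
mean(pre_kids$work)  # conditional mean of work
summary(pre_kids)    # min, quartiles, mean, and max of work
```

Storing the subset once and reusing it avoids repeating the condition for every statistic.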
The cbind method of attaching dummy variables is unnecessary. Just do:
eitc$post93.dummy <- as.numeric(eitc$year>1993)
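A quick way to check that the assignment does what you expect is to try it on a tiny data frame first. This is only a sketch: toy and its year values are made up, and only the post93.dummy construction is taken from the line above.

```r
# Made-up years, chosen to straddle the 1993 cutoff.
toy <- data.frame(year = c(1991, 1993, 1994, 1996))

# Assigning to toy$post93.dummy adds the column in place; no cbind is needed.
toy$post93.dummy <- as.numeric(toy$year > 1993)

toy$post93.dummy          # one 0/1 value per row
table(toy$post93.dummy)   # counts of rows at or before 1993 versus after
```

Note that the comparison is strict, so 1993 itself is coded 0.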