Stata Lecture Notes: Preparing for Lab 3
Session Length: ~4 hours
Goal: Students should confidently use if/else, create binary variables, write simple programs, loop over variable lists, and export regression tables.
1. Conditional Logic + Chi-Square
Stata can be taught to speak in complete English sentences when you know how to control the logic.
Concepts Covered:
tabulate var1 var2, chi2
scalar, return list
display, if, else
Example:
use transplants.dta, clear
tabulate rec_work gender, chi2
scalar p = r(p)    // save the chi-square p-value before another command overwrites r()
if p < 0.05 {
    display "Question 1: Working for income is associated with gender, p = " %5.3f p
}
else {
    display "Question 1: Working for income is NOT associated with gender, p = " %5.3f p
}
Skills Gained: How to retrieve and use the p-value from a test using scalar and if/else.
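To see where r(p) comes from, it can help to run return list right after the test. The sketch below assumes Stata's built-in auto dataset (not the course's transplants.dta) and a scalar name, chi2_p, chosen only for illustration.

```stata
* Assumption: built-in auto data stands in for transplants.dta
sysuse auto, clear
tabulate foreign rep78, chi2
return list                      // shows the stored results, including r(chi2) and r(p)
scalar chi2_p = r(p)             // hypothetical scalar name; copies the p-value for later use
display "Chi-square p-value = " %5.3f chi2_p
```

Copying r(p) into a scalar right away matters because the next r-class command replaces everything in r().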
2. Creating Clean Binary Variables
Binary variables are the currency of logistic regression. You must be able to define your own currency.
Goal: Translate real-world conditions into clean 0/1 variables
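* The if !missing() qualifier keeps missing inputs missing instead of coding them 0
gen hypertensive = (dx == 4) if !missing(dx)
gen college = inlist(educ, 4, 5, 6) if !missing(educ)   // any college, associates, bachelors, grad
gen male = (gender == 1) if !missing(gender)            // assuming 1=male, 2=female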
Tip: Use label define and label values to keep your binary variables readable.
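As a rough illustration of the tip, the sketch below builds and labels a 0/1 variable. It assumes the built-in auto dataset; the variable name expensive, the 6000 cutoff, and the label name yesno are all hypothetical.

```stata
* Assumption: built-in auto data; cutoff and names are illustrative only
sysuse auto, clear
gen byte expensive = (price > 6000) if !missing(price)
label variable expensive "Price above 6000"
label define yesno 0 "No" 1 "Yes"     // define the value label once
label values expensive yesno          // attach it to the variable
tabulate expensive, missing           // check the coding, including any missing values
```

The same yesno label can be attached to every binary variable in the dataset, which keeps tables consistent.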
3. Logistic Regression and Interpreting Odds Ratios
logistic hypertensive age bmi college male
Then get readable odds ratios:
estimates store model1
esttab model1, eform    // esttab comes from the user-written estout package: ssc install estout
4. Writing a program to Display Odds Ratios in a Clean Format
You don't want to copy/paste regression output. Make Stata do it for you.
table_loop: Basic Loop Over Fixed Variables
capture program drop table_loop    // allows the do-file to be re-run without "already defined" errors
program define table_loop
    foreach var in age bmi college male {
        quietly logistic hypertensive `var'
        local or = exp(_b[`var'])
        local lb = exp(_b[`var'] - 1.96 * _se[`var'])
        local ub = exp(_b[`var'] + 1.96 * _se[`var'])
        display "`var'" _col(15) %4.2f `or' " (" %4.2f `lb' " - " %4.2f `ub' ")"
    }
end
table_loop
Skills Gained: Using the coefficient and standard error to build a confidence interval manually.
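One way to convince students that the manual interval is right is to compare it with Stata's own output. This is a minimal sketch, assuming the built-in auto dataset and a single illustrative predictor (mpg); it uses invnormal(0.975) in place of the rounded 1.96.

```stata
* Assumption: built-in auto data; foreign and mpg are illustrative choices
sysuse auto, clear
quietly logistic foreign mpg
local b  = _b[mpg]               // coefficient on the log-odds scale
local se = _se[mpg]              // its standard error
local z  = invnormal(0.975)      // normal critical value, approximately 1.96
display "OR = " %5.3f exp(`b')
display "95% CI: " %5.3f exp(`b' - `z'*`se') " to " %5.3f exp(`b' + `z'*`se')
logistic                         // replay the model to compare with the lines above
```

The interval is built on the log-odds scale and then exponentiated, which is why it is not symmetric around the odds ratio.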
5. Generalizing the Program with syntax varlist
Now that they've seen it hardcoded, it's time to teach them how to generalize it with syntax varlist.
table_varlist: Flexible Version
capture program drop table_varlist    // allows the do-file to be re-run without "already defined" errors
program define table_varlist
    syntax varlist                    // stores the variables the caller typed in the local macro varlist
    display "Regression table (Question 3)"
    foreach var of varlist `varlist' {
        quietly logistic hypertensive `var'
        local or = exp(_b[`var'])
        local lb = exp(_b[`var'] - 1.96 * _se[`var'])
        local ub = exp(_b[`var'] + 1.96 * _se[`var'])
        display "`var'" _col(15) %4.2f `or' " (" %4.2f `lb' " - " %4.2f `ub' ")"
    }
end
table_varlist age bmi college male
6. Percentiles, Macros, and Variable Creation
How do you make a variable that depends on the 75th percentile of another?
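summarize rec_hgt_cm, detail
scalar p75 = r(p75)
* Stata treats missing as larger than any number, so exclude missing heights explicitly
gen tall = (rec_hgt_cm >= p75) if !missing(rec_hgt_cm)
label variable tall "Height ≥ 75th percentile"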
Other variables:
gen white = (race == 1) // assuming 1 = white
gen age_10y = age / 10
label variable age_10y "Age (per 10y)"
7. Exporting a Clean Regression Table to Excel
Professional tables come from putexcel.
logistic tall age_10y rec_wgt_kg gender white
putexcel set lab3_output.xlsx, replace
putexcel A1 = "Variable"
putexcel B1 = "OR (95% CI)"
local row = 2    // _n+1 evaluates to 2 on every pass here, so use an explicit row counter
foreach var in age_10y rec_wgt_kg gender white {
    local or = exp(_b[`var'])
    local lb = exp(_b[`var'] - 1.96 * _se[`var'])
    local ub = exp(_b[`var'] + 1.96 * _se[`var'])
    local ci = string(`or', "%4.2f") + " (" + ///
        string(`lb', "%4.2f") + "-" + ///
        string(`ub', "%4.2f") + ")"
    putexcel A`row' = "`var'"
    putexcel B`row' = "`ci'"
    local ++row    // move to the next spreadsheet row
}
End of Lecture: Recap + Stretch
By now, they can:
Use conditional logic to control output
Define 0/1 variables cleanly
Run logistic models and interpret ORs
Write flexible programs that loop
Export clean tables to Excel
Stretch challenge (optional): Try making table_varlist also accept the outcome variable as a parameter.
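One possible approach to the stretch challenge, offered as a sketch rather than the expected answer: pass the outcome as a required option. The program name table_outcome and the option name outcome() are hypothetical, and the call assumes the variables created earlier in these notes exist.

```stata
* Hypothetical program and option names; one of several workable designs
capture program drop table_outcome
program define table_outcome
    syntax varlist, Outcome(varname)      // outcome() is required because it is not in brackets
    display "Odds ratios for outcome: `outcome'"
    foreach var of varlist `varlist' {
        quietly logistic `outcome' `var'
        local or = exp(_b[`var'])
        local lb = exp(_b[`var'] - 1.96 * _se[`var'])
        local ub = exp(_b[`var'] + 1.96 * _se[`var'])
        display "`var'" _col(15) %4.2f `or' " (" %4.2f `lb' " - " %4.2f `ub' ")"
    }
end

table_outcome age bmi college male, outcome(hypertensive)
```

Using an option keeps the predictor list and the outcome visually separate in the call; taking the first variable of the varlist as the outcome would be another reasonable design.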