5. Graphs
We're going to approach Day 5 with a full view of Lab 4. Since Lab 4 (Thursday) was lost to Juneteenth, and Lab 5 (Friday) depends on graphing, our job is to infuse Lab 5 with the missing logic of Lab 4, especially egen, tag, and the survival analysis foundation.
What follows is a refined, conceptually integrated Friday (Lab 5) that fully honors Lab 4 without dumping two labs on you.
Final Day Rework: Lab 5, Enriched by Lab 4
Lecture + Walkthrough Plan (8:30–12:20)
Time | Topic | From… → To… |
---|---|---|
8:30–9:00 | Merge + Tagging Refresher | Lab 4 → Lab 5 graph structure |
9:00–9:45 | EDA & Grouped Summaries | |
9:45–10:30 | Survival Analysis Setup ( | |
10:30–11:15 | Merge + Transform for Graphs | Prepare merged vars for plotting |
11:15–12:00 | Final Graphs + Export | Graphs 1–5 in Lab 5 |
12:00–12:20 | Lab Setup | Students build |
Merge Lab 4 Concepts into Lab 5 with Purpose
1. Use egen, tag to prep variables for graphs
Before you graph, group your data smartly.
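* Center volume: tag one record per patient, then sum the tags within center
* (egen functions cannot be nested, so the tag is built first)
egen pt_tag = tag(fake_id)
egen ctr_volume = total(pt_tag), by(ctr_id)
* Only one record per center:
egen one_ctr = tag(ctr_id)
list ctr_id ctr_volume if one_ctr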
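To see the tag-then-total pattern in isolation, here is a minimal sketch on made-up data. The six rows, and the helper name pt_tag, are hypothetical; only fake_id, ctr_id, ctr_volume, and one_ctr come from the lab.

```stata
* Toy data (hypothetical): 6 records, 4 patients, 2 centers
clear
input fake_id ctr_id
1 10
1 10
2 10
3 20
4 20
4 20
end

* Tag one record per patient, then sum the tags within each center
egen pt_tag = tag(fake_id)
egen ctr_volume = total(pt_tag), by(ctr_id)

* Tag one record per center so each center is listed once
egen one_ctr = tag(ctr_id)
list ctr_id ctr_volume if one_ctr
```

Each center should list once, with a volume that counts patients rather than records. If a patient could appear at more than one center, tagging on both variables, tag(ctr_id fake_id), may be the safer choice.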
2. Unknown ESRD indicator
Used again in Lab 5, Q5, as the unknown variable.
gen unknown = ///
    inlist(extended_dgn, ///
    "ESRD UNKNOWN ETIOLOGY", ///
    "ESRD OF UNCERTAIN ETIOLOGY", "") // exact matches only; trim() or regexm() would catch variants
count
count if unknown == 1
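The "trim/regex" alternative mentioned in the comment could look something like the sketch below. The four strings are invented for illustration, dgn_clean is a hypothetical helper variable, and the pattern is an assumption about which spellings should count as unknown; check it against a tab of extended_dgn before relying on it.

```stata
* Toy data (hypothetical strings, with stray spaces and mixed case)
clear
input str40 extended_dgn
"ESRD UNKNOWN ETIOLOGY"
"  esrd of uncertain etiology "
"DIABETES"
""
end

* Normalize case and whitespace before matching
gen str40 dgn_clean = upper(trim(extended_dgn))

* Flag empty strings, or anything mentioning UNKNOWN or UNCERTAIN
gen byte unknown = missing(dgn_clean) | regexm(dgn_clean, "UNKNOWN|UNCERTAIN")

count
count if unknown == 1
```

The regex version is more forgiving than inlist(), which is both its strength and its risk: it will also flag any other diagnosis that happens to contain those words.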
3. Died within 6 months
Pre-setup for survival analysis in Lab 5
gen died_within_6mo = (died == 1 & end_d - transplant_d <= 180)
count if died_within_6mo
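A quick way to convince yourself the indicator behaves as intended is to run it on three made-up patients. The dates and the string helpers tx_s and end_s are hypothetical; 180 days is the lab's working definition of 6 months.

```stata
* Toy data (hypothetical): early death, late death, alive at last follow-up
clear
input byte died str10 tx_s str10 end_s
1 "2010-01-01" "2010-03-01"
1 "2010-01-01" "2011-01-01"
0 "2010-01-01" "2010-02-01"
end

* Convert the strings to Stata daily dates
gen transplant_d = date(tx_s, "YMD")
gen end_d = date(end_s, "YMD")
format transplant_d end_d %td

* 1 only if the patient died AND the death fell within 180 days
gen byte died_within_6mo = (died == 1 & end_d - transplant_d <= 180)
list died transplant_d end_d died_within_6mo
count if died_within_6mo
```

Note that the third patient is coded 0 even though follow-up ended before 6 months; the indicator says "no death observed within 6 months", not "known to be alive at 6 months".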
4. Wait time by blood type
egen mean_wait = mean(wait_yrs), by(abo)
egen med_wait = median(wait_yrs), by(abo)
egen tag_blood = tag(abo)
list abo mean_wait med_wait if tag_blood
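The same grouped summary, run on five invented rows so the numbers can be checked by hand. Here abo is assumed to be a string variable; in the real data it may be a labeled numeric, which works the same way. The tabstat line is an optional cross-check, not part of the lab.

```stata
* Toy data (hypothetical wait times in years)
clear
input str2 abo wait_yrs
"A" 1
"A" 3
"O" 2
"O" 4
"O" 9
end

* Group-level mean and median, stored on every row of the group
egen mean_wait = mean(wait_yrs), by(abo)
egen med_wait = median(wait_yrs), by(abo)

* One row per blood type for a compact listing
egen tag_blood = tag(abo)
list abo mean_wait med_wait if tag_blood

* Cross-check without creating any variables
tabstat wait_yrs, by(abo) statistics(mean median n)
```

The egen route is the one to use when the summaries feed a graph, because the values live in the dataset; tabstat only prints them.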
Survival Setup for Lab 5 Graphs
These steps prep the Lab 5 graphs, including survival stratified by over50.
* Stata treats missing as larger than any number, so exclude missing ages
gen over50 = age > 50 if !missing(age)
gen f_time = end_d - transplant_d
format transplant_d end_d %td
stset f_time, failure(died)
sts graph, by(over50)
graph export survival_by_over50.png, replace
stcox over50
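Before trusting the curves, it can help to look at what stset actually did. The sketch below uses four invented patients to show two things worth checking: how a missing age is handled, and which records stset keeps. The data values are hypothetical; _st, _t, and _d are the variables stset creates.

```stata
* Toy data (hypothetical): one missing age, one zero-length follow-up
clear
input age died f_time
45 0 400
62 1 120
.  1 300
55 0 0
end

* Guard against missing age, which would otherwise evaluate as > 50
gen over50 = age > 50 if !missing(age)

* Declare survival-time data: f_time is days at risk, died marks failure
stset f_time, failure(died)

* _st = 1 if the record is used, _t = analysis time, _d = failure flag
list age over50 f_time _st _t _d
```

The record with f_time of 0 is dropped from the analysis (_st is 0) because it contributes no time at risk, and the patient with missing age falls out of any by(over50) comparison. Both are easy to miss if you only look at the graph.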
Add to Lab 5: Q6 (Bonus, but essential now)
* Stata treats missing as larger than any number, so exclude missing ages
gen over50 = age > 50 if !missing(age)
stset f_time, failure(died)
sts graph, by(over50)
graph export lastname_q6.png, replace
Add:
stcox over50
Print:
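* stcox stores the log hazard ratio in _b[], so exponentiate it and its CI limits
di "Hazard ratio: " %5.2f exp(_b[over50]) " (95% CI " %5.2f exp(_b[over50] - 1.96*_se[over50]) " to " %5.2f exp(_b[over50] + 1.96*_se[over50]) ")"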
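After stcox, _b[over50] and _se[over50] are on the log-hazard scale, so the hazard ratio and its interval come from exponentiating the coefficient and the coefficient's limits. A minimal sketch on ten invented patients follows; the data and the scalar names hr, lo, and hi are hypothetical.

```stata
* Toy data (hypothetical): 5 patients per age group
clear
input over50 f_time died
0 5 1
0 8 0
0 12 1
0 20 0
0 25 1
1 2 1
1 4 1
1 7 1
1 10 0
1 15 1
end

stset f_time, failure(died)
stcox over50

* Exponentiate the coefficient and the limits of its Wald interval
scalar hr = exp(_b[over50])
scalar lo = exp(_b[over50] - invnormal(0.975)*_se[over50])
scalar hi = exp(_b[over50] + invnormal(0.975)*_se[over50])

di "Hazard ratio: " %5.2f scalar(hr) " (95% CI " %5.2f scalar(lo) " to " %5.2f scalar(hi) ")"
```

The printed values should agree with the hazard ratio and interval in the stcox table, which is a handy check that the formula is right. Note that the interval is built on the log scale and then exponentiated, so it is not symmetric around the hazard ratio.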
Lab 5 .do Scaffold (Lab 4-infused)
clear all
set more off
log using lab5_lastname.log, replace
* Merge for setup
use transplants, clear
merge 1:1 fake_id using donors_recipients.dta
drop if _merge != 3
* Create ctr_volume (tag first; egen functions cannot be nested)
egen pt_tag = tag(fake_id)
egen ctr_volume = total(pt_tag), by(ctr_id)
* Unknown cause
gen unknown = inlist(extended_dgn, "ESRD UNKNOWN ETIOLOGY", "ESRD OF UNCERTAIN ETIOLOGY", "")
* Wait time by blood type
egen mean_wait = mean(wait_yrs), by(abo)
egen med_wait = median(wait_yrs), by(abo)
egen tag_blood = tag(abo)
list abo mean_wait med_wait if tag_blood
* Over50 variable for survival
gen over50 = age > 50 if !missing(age)
gen f_time = end_d - transplant_d
format transplant_d end_d %td
stset f_time, failure(died)
* Q1–Q5 graph code goes here...
log close
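The merge step in the scaffold is worth inspecting before anything is dropped. Here is a small sketch with two invented files; the tempfile name recipients and all data values are hypothetical, and it should be run as a block from a do-file so the local macro survives.

```stata
* Toy "using" file (hypothetical): recipient ages
clear
input fake_id age
1 40
2 55
3 61
end
tempfile recipients
save `recipients'

* Toy "master" file (hypothetical): transplant records
clear
input fake_id ctr_id
1 10
2 10
4 20
end

* 1:1 merge on the patient identifier, then inspect before dropping
merge 1:1 fake_id using `recipients'
tab _merge

* Keep matched records only and confirm the key is still unique
keep if _merge == 3
drop _merge
isid fake_id
count
```

Looking at tab _merge first tells you how many records you are about to lose and from which file, which is the kind of thing a reader of your graphs will eventually ask about.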
Takeaway
You cannot graph well without transforming meaningfully. Graphs are not step 1; they are step 5.