5. 📊 Graphs#

We’re going to approach Day 5 with a full view of Lab 4. Since Lab 4 (Thursday) was lost to Juneteenth, and Lab 5 (Friday) depends on graphing, our job is to infuse Lab 5 with the missing logic of Lab 4, especially the egen, tag, and survival analysis foundation.

What follows is a refined, conceptually integrated Friday (Lab 5) that fully honors Lab 4, but without dumping 2 labs on you.


💥 Final Day Rework: Lab 5, Enriched by Lab 4#

👩🏽‍💻 Lecture + Walkthrough Plan (8:30–12:20)#

| Time | Topic | From… → To… |
| --- | --- | --- |
| 8:30–9:00 | Merge + Tagging Refresher | Lab 4 → Lab 5 graph structure |
| 9:00–9:45 | EDA & Grouped Summaries | egen, tag, total, median() |
| 9:45–10:30 | Survival Analysis Setup (over50) | stset, sts graph, stcox |
| 10:30–11:15 | Merge + Transform for Graphs | Prepare merged vars for plotting |
| 11:15–12:00 | Final Graphs + Export | Graph 1–5 in Lab 5 |
| 12:00–12:20 | Lab Setup | Students build .do for Lab 5 |


Merge Lab 4 Concepts into Lab 5 with Purpose#

🧩 1. Use egen, tag to prep variables for graphs#

Before you graph, group your data smartly.

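* Center volume: tag one row per patient within each center, then sum the tags
* (egen functions cannot be nested, so tag() gets its own step)
egen pt_tag = tag(ctr_id fake_id)
egen ctr_volume = total(pt_tag), by(ctr_id)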

* Only one record per center:
egen one_ctr = tag(ctr_id)
list ctr_id ctr_volume if one_ctr
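
To see what tag() and total() do together, here is a minimal sketch on toy data. The values and the names pt_tag, n_patients, and ctr_tag are made up for illustration and are not part of the lab dataset. Run it in a separate Stata session, because clear drops whatever data is in memory.

```stata
* Toy data (hypothetical values): patient 102 appears twice in center 1
clear
input ctr_id fake_id
1 101
1 102
1 102
2 201
end

* Flag exactly one row per distinct center-patient pair
egen pt_tag = tag(ctr_id fake_id)

* Sum the flags within each center to count distinct patients
egen n_patients = total(pt_tag), by(ctr_id)

* Flag one row per center so each center is listed once
egen ctr_tag = tag(ctr_id)
list ctr_id n_patients if ctr_tag
```

With these four rows, center 1 should report 2 patients and center 2 should report 1, since the duplicate row for patient 102 is tagged only once.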

❓ 2. Unknown ESRD indicator#

This indicator is used again in Lab 5, Q5.

gen unknown = ///
    inlist(extended_dgn, ///
    "ESRD UNKNOWN ETIOLOGY", ///
    "ESRD OF UNCERTAIN ETIOLOGY", "") // or trim/regex

count
count if unknown == 1
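
The trim/regex alternative mentioned in the comment above could look something like the sketch below. It assumes extended_dgn is a string variable and that any diagnosis text containing UNKNOWN or UNCERTAIN should count as unknown, which is broader than the exact-match list. The names dgn_clean and unknown_rx are hypothetical.

```stata
* Normalize case and strip leading/trailing blanks before matching
gen dgn_clean = upper(strtrim(extended_dgn))

* Flag blank diagnoses or any text mentioning UNKNOWN or UNCERTAIN
* (assumption: this broader pattern is what you want; check the tab below)
gen unknown_rx = missing(dgn_clean) | regexm(dgn_clean, "UNKNOWN|UNCERTAIN")

* Cross-tabulate against the exact-match indicator to see where they differ
tab unknown unknown_rx, missing
```

If the off-diagonal cells of the table are empty, the two definitions agree on your data and the simpler inlist() version is enough.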

⏱️ 3. Died within 6 months#

Pre-setup for survival analysis in Lab 5

gen died_within_6mo = (died == 1 & end_d - transplant_d <= 180)
count if died_within_6mo

⏳ 4. Wait time by blood type#

egen mean_wait = mean(wait_yrs), by(abo)
egen med_wait  = median(wait_yrs), by(abo)
egen tag_blood = tag(abo)
list abo mean_wait med_wait if tag_blood
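
As a quick cross-check of the egen results, the same summaries can be produced without creating new variables. This is a sketch of one option, not a required step in the lab.

```stata
* One row per blood type: count, mean, and median of wait time
tabstat wait_yrs, by(abo) statistics(n mean median)
```

The egen version is still the one to keep when the group-level values are needed as variables for plotting.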

⚰️ Survival Setup for Lab 5 Graphs#

These steps prepare the Lab 5 graphs, including survival curves stratified by over50.

* Stata treats missing as larger than any number, so guard against missing age
gen over50 = age > 50 if !missing(age)
gen f_time = end_d - transplant_d
format transplant_d end_d %td

stset f_time, failure(died)

sts graph, by(over50)
graph export survival_by_over50.png, replace

stcox over50
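
Because f_time is measured in days, the survival curve's time axis is in days too. If you would rather read it in years, one option is to rescale the analysis time when declaring the survival data. This is an optional sketch. The axis title is an assumption about how you want to label the plot.

```stata
* Re-declare the survival data with analysis time in years
* (scale() divides f_time by 365.25)
stset f_time, failure(died) scale(365.25)

* Same stratified curve, now with a years-based axis label
sts graph, by(over50) xtitle("Years since transplant")
```

Rescaling time changes the axis, not the comparison between groups, so the hazard ratio from stcox over50 should be unaffected.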

🧪 Add to Lab 5: Q6 (Bonus, but essential now)#

* Skip the first two lines if your .do file already created over50 and ran stset
gen over50 = age > 50 if !missing(age)
stset f_time, failure(died)
sts graph, by(over50)
graph export lastname_q6.png, replace

Add:

stcox over50

Print:

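* _b[over50] is the log hazard ratio, so exponentiate the estimate and its limits
di "Hazard ratio: " %5.2f exp(_b[over50]) " (95% CI " %5.2f exp(_b[over50] - 1.96*_se[over50]) " – " %5.2f exp(_b[over50] + 1.96*_se[over50]) ")"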

💾 Lab 5 .do Scaffold (Lab4-infused)#

clear all
set more off
log using lab5_lastname.log, replace

* Merge for setup
use transplants, clear
merge 1:1 fake_id using donors_recipients.dta
drop if _merge != 3

* Create ctr_volume (egen functions cannot be nested, so tag first, then total)
egen pt_tag = tag(ctr_id fake_id)
egen ctr_volume = total(pt_tag), by(ctr_id)

* Unknown cause
gen unknown = inlist(extended_dgn, "ESRD UNKNOWN ETIOLOGY", "ESRD OF UNCERTAIN ETIOLOGY", "")

* Wait time by blood type
egen mean_wait = mean(wait_yrs), by(abo)
egen med_wait  = median(wait_yrs), by(abo)
egen tag_blood = tag(abo)
list abo mean_wait med_wait if tag_blood

* Over50 variable for survival (left missing when age is missing)
gen over50 = age > 50 if !missing(age)
gen f_time = end_d - transplant_d
format transplant_d end_d %td
stset f_time, failure(died)

* Q1–Q5 graph code goes here...

log close

🧠 Takeaway#

You cannot graph well without transforming meaningfully. Graphs are not step 1; they are step 5.