Lab 4
Part 1
Write a .do file which imports transplants.dta and performs the data management/exploratory data analysis tasks described below, using the slides we have discussed in class. Your .do file (lab4_lastname.do) must create a log file (lab4_lastname.log). This file will contain your answers for both part 1 and part 2 of today’s lab. Your .do file should follow the conventions for .do file structure described in class. Do not submit your log files as part of the assignment.
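A minimal skeleton for a .do file that opens a log, loads the data, and closes the log might look like the following. This is only a sketch: it assumes transplants.dta sits in the current working directory, and it does not reproduce the class conventions for .do file structure, so adapt it to those.

```stata
* lab4_lastname.do (skeleton only; adapt to the conventions from class)
clear all
set more off

* Close any log left open by an earlier run, then start a fresh text log
capture log close
log using lab4_lastname.log, replace text

* Assumption: transplants.dta is in the current working directory
use transplants.dta, clear

* ... Part 1 and Part 2 commands go here ...

log close
```

The capture prefix keeps the file from stopping with an error when no log is open, which makes the .do file safe to rerun from the top.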
1. Create a new variable called ctr_volume that contains the transplant center volume (total number of transplants performed in each transplant center). The transplant center can be identified by ctr_id.
2. Use summarize, detail to show the distribution of ctr_volume. For question 2, you should analyze one observation per center, not one observation per transplant recipient. (Hint: Tagging can help with this.)
3. Generate a variable called unknown which is 1 for all patients whose cause of end-stage renal disease (ESRD) is unknown/uncertain/blank (extended_dgn = “ESRD UNKNOWN ETIOLOGY”, “ESRD OF UNCERTAIN ETIOLOGY”, etc.) and 0 for all other patients. Display the following sentence:
   xxx of 2000 patients (yy.y%) have an unknown cause of ESRD.
   Except fill in the correct numbers.
4. How many patients died within six months of their transplant (died==1 and transplant date falls ≤ 180 days before end date)? Display the following sentence:
   xxx of 2000 patients (yy.y%) died within 6 months of their transplant date.
   Except fill in the correct numbers.
5. For each blood type, what is the mean waiting time for a transplant (variable: wait_yrs)? What is the median waiting time for a transplant? Use egen, tagging, and list to display the blood type, mean wait time, and median wait time of each blood type so that only one record is displayed per blood type.
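The tagging and sentence-display techniques that Part 1 relies on can be sketched on Stata's built-in auto dataset. This is an illustration, not a solution for transplants.dta: the grouping variable (foreign), the new variable names, and the price cutoff are chosen only for this example.

```stata
sysuse auto, clear

* Group size: number of observations sharing the same value of foreign
bysort foreign: generate n_in_group = _N

* Tag exactly one observation per group (1 for the tagged row, 0 otherwise)
egen byte one_per_group = tag(foreign)

* Summarize the group-level variable using one row per group
summarize n_in_group if one_per_group, detail

* Group-level mean and median, then list one row per group
egen mean_price   = mean(price), by(foreign)
egen median_price = median(price), by(foreign)
list foreign mean_price median_price if one_per_group, noobs

* Count rows meeting a condition and report the result in a sentence
quietly count if price > 10000
local n = r(N)
quietly count
local total = r(N)
display "`n' of `total' cars (" %3.1f 100*`n'/`total' "%) have a price above 10000."
```

Without the tag, summarize would weight each group by its number of observations, which is why the one-row-per-group restriction matters when the unit of analysis is the group rather than the individual record.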
Part 2
1. Create a variable called over50 which is 1 for any recipient with age >50 and 0 for everyone else. Draw a survival curve stratified by over50.
2. Is the difference in post-transplant survival by over50 statistically significant (p<0.05)? Write one of the following sentences:
   There is a statistically significant difference in survival by age category (p<0.05)
   or
   There is no statistically significant difference in survival by age category (p=0.x)
3. Run a Cox regression on over50 (command: stcox over50). Print this sentence:
   Hazard ratio: x.xx (95% CI y.yy-z.zz)
   Except fill in the correct values.
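The survival steps above can be sketched on Stata's built-in cancer dataset. This is an illustration under stated assumptions, not the analysis of transplants.dta: the stset variables (studytime, died), the indicator name older, and the age cutoff of 55 all belong to this example only, and the log-rank test is one common choice for comparing survival curves rather than a test the assignment names.

```stata
sysuse cancer, clear

* Declare survival-time data: follow-up time and failure indicator
stset studytime, failure(died)

* Binary age indicator (cutoff chosen only for this illustration)
generate byte older = age > 55 if !missing(age)

* Kaplan-Meier survival curves stratified by the indicator
sts graph, by(older)

* Log-rank test; recover the p-value from the stored chi-squared statistic
sts test older
local p = chi2tail(r(df), r(chi2))
display "Log-rank p = " %5.3f `p'

* Cox regression; build the hazard ratio and 95% CI from the coefficient
stcox older
local hr = exp(_b[older])
local lo = exp(_b[older] - invnormal(0.975)*_se[older])
local hi = exp(_b[older] + invnormal(0.975)*_se[older])
display "Hazard ratio: " %4.2f `hr' " (95% CI " %4.2f `lo' "-" %4.2f `hi' ")"
```

Exponentiating the coefficient and its normal-based confidence limits matches how stcox reports hazard ratios, so the displayed sentence should agree with the regression table.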