HW 5

This homework revisits HW3, but with a few additional challenges. Refer to lab 5 for hints.

global repo https://github.com/jhustata/basic/raw/main/
import delimited "${repo}hw1.txt", clear

Question 1. Now you have all the skills to create your first automated Table 1! Write a program called question1 that prints the following table (including Question 1 in the header). Replace the XX values with the correct values from the dataset, rounded to the nearest whole number for age and to one decimal place for all other variables. Make sure the summary statistics are vertically aligned and left-justified. Run your program so that the table is displayed in your .log file. Also, output these results to Question1.xlsx.

Question 1                Males (N=XX)       Females (N=XX)
Age, median (IQR)         XX (XX-XX)         XX (XX-XX)
Previous transplant, %    XX.X               XX.X
Cause of ESRD:
Glomerular, %             XX.X               XX.X
Diabetes, %               XX.X               XX.X
PKD, %                    XX.X               XX.X
Hypertensive, %           XX.X               XX.X
Renovascular, %           XX.X               XX.X
Congenital, %             XX.X               XX.X
Tubulo, %                 XX.X               XX.X
Neoplasm, %               XX.X               XX.X
Other, %                  XX.X               XX.X

Or, with indentation for the categorical variables:

Question 1                Males (N=XX)       Females (N=XX)
Age, median (IQR)         XX (XX-XX)         XX (XX-XX)
Previous transplant, %    XX.X               XX.X
Cause of ESRD, %
   Glomerular             XX.X               XX.X
   Diabetes               XX.X               XX.X
   PKD                    XX.X               XX.X
   Hypertensive           XX.X               XX.X
   Renovascular           XX.X               XX.X
   Congenital             XX.X               XX.X
   Tubulo                 XX.X               XX.X
   Neoplasm               XX.X               XX.X
   Other                  XX.X               XX.X
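
As a rough illustration of the alignment and rounding requirements, the sketch below builds the first three rows of such a table. It runs on simulated stand-in data so that it works without the course file. The names init_age and female mirror those used in Question 2, while prev, the program name table1_sketch, and the column positions 27 and 46 are assumptions made for this example only.

```stata
* Simulated stand-in data; replace with the imported homework dataset
clear
set seed 340600
set obs 200
generate female = runiform() < 0.4
generate init_age = round(rnormal(50, 12))
generate prev = runiform() < 0.15   // hypothetical name for previous transplant

capture program drop table1_sketch
program define table1_sketch
    * Header row: group sizes come from count
    quietly count if female == 0
    local n_m = r(N)
    quietly count if female == 1
    local n_f = r(N)
    display "Question 1" _col(27) "Males (N=`n_m')" _col(46) "Females (N=`n_f')"

    * Age: median (IQR), formatted as whole numbers
    forvalues g = 0/1 {
        quietly summarize init_age if female == `g', detail
        local age`g' : display %2.0f r(p50) " (" %2.0f r(p25) "-" %2.0f r(p75) ")"
    }
    display "Age, median (IQR)" _col(27) "`age0'" _col(46) "`age1'"

    * Binary variable: the mean of a 0/1 variable times 100 is the percentage
    forvalues g = 0/1 {
        quietly summarize prev if female == `g'
        local prev`g' : display %3.1f 100*r(mean)
    }
    display "Previous transplant, %" _col(27) "`prev0'" _col(46) "`prev1'"
end

table1_sketch
```

Storing each formatted cell in a local macro before printing keeps every display line short, and _col() pins each column to a fixed position regardless of how wide the preceding text is.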

Question 2. Your research group is investigating demographic characteristics associated with receiving a kidney transplant for waitlisted patients. You run a logistic regression using the following command:

logistic received_kt init_age female
lincom init_age
return list

Print a summary table as shown below (this is how it should appear in your .log file), with odds ratios (OR) and 95% confidence intervals (CI). Replace the X.XX values with the actual values from the dataset, displayed to two decimal places. Also, create a properly formatted Question2.xlsx containing this output.

Question 2
Variable         OR    (95% CI)
Age              X.XX  (X.XX-X.XX)
Female           X.XX  (X.XX-X.XX)
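
One possible way to produce both the .log table and the spreadsheet is sketched below. It assumes a recent Stata version in which putexcel accepts the cell = expression syntax, and it fits the model on simulated stand-in data rather than the course file. The column positions 18 and 24 and the display names Age and Female are taken from the layout above; everything else is illustrative.

```stata
* Simulated stand-in data; swap in the homework dataset
clear
set seed 2024
set obs 500
generate female = runiform() < 0.4
generate init_age = round(rnormal(50, 12))
generate received_kt = runiform() < invlogit(-0.5 - 0.02*(init_age - 50) + 0.3*female)

quietly logistic received_kt init_age female

* One workbook, header rows first
putexcel set Question2.xlsx, replace
putexcel A1 = "Question 2"
putexcel A2 = "Variable"
putexcel B2 = "OR"
putexcel C2 = "(95% CI)"

display "Question 2"
display "Variable" _col(18) "OR" _col(24) "(95% CI)"

local row = 3
foreach v in init_age female {
    * Exponentiate the log-odds coefficient and its normal-based limits
    local or : display %4.2f exp(_b[`v'])
    local lb : display %4.2f exp(_b[`v'] - invnormal(0.975)*_se[`v'])
    local ub : display %4.2f exp(_b[`v'] + invnormal(0.975)*_se[`v'])
    local name = cond("`v'" == "init_age", "Age", "Female")

    * Same formatted strings go to the log and to the spreadsheet
    display "`name'" _col(18) "`or'" _col(24) "(`lb'-`ub')"
    putexcel A`row' = "`name'"
    putexcel B`row' = "`or'"
    putexcel C`row' = "(`lb'-`ub')"
    local ++row
}
```

Writing the already formatted strings to Excel, rather than the raw numbers, keeps trailing zeros such as 1.00 intact in the spreadsheet.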

Hint: If you like, you may use the expressions below after the logistic regression to obtain the odds ratio and 95% CI. We will use init_age as an example.



Odds ratio


Lower bound of 95% CI


Upper bound of 95% CI
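
The three quantities named above could be computed along the following lines. This is a minimal sketch, not necessarily the course's own expressions: it uses Stata's bundled auto dataset so that it runs anywhere, with mpg standing in for init_age, and it relies on the fact that after logistic the stored coefficient _b[] is on the log-odds scale with standard error _se[].

```stata
* Stand-in model on Stata's bundled auto data; mpg plays the role of init_age
sysuse auto, clear
quietly logistic foreign mpg weight

* Exponentiate the log-odds coefficient and its normal-based 95% limits
scalar or_mpg = exp(_b[mpg])
scalar lb_mpg = exp(_b[mpg] - invnormal(0.975) * _se[mpg])
scalar ub_mpg = exp(_b[mpg] + invnormal(0.975) * _se[mpg])

display "Odds ratio:  " %4.2f or_mpg
display "95% CI:      (" %4.2f lb_mpg "-" %4.2f ub_mpg ")"
```

For the homework model, the same pattern would apply with init_age or female in place of mpg.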


Additional Credit (Maximum +5%)

This is an optional question, framed in the investigative spirit of Baltimore’s own Edgar Allan Poe.

global repo https://github.com/jhustata/basic/raw/main/
do ${repo}annotate.do


  1. Run this remote script to figure out what it does. Guaranteed 5 points if you can debug it.

  2. Then download it and annotate it to explain to yourself and others what it accomplishes

  3. Use the ExtraCredit DropBox to hand in your version of annotate.do

One quick way to view the contents of the annotate.do script is with a Linux-style cat command:

cat ${repo}annotate.do