HW3#
3.5 Homework#
Overview#
Write a .do file
which imports data from hw1.txt and performs the tasks described below. Name it hw3.lastname.firstname.do
and it should create a log file called hw1.lastname.firstname.log
. Please avoid some of the common mistakes we observed in your HW1. While you didn’t drop points for those mistakes as beginners, you’ll drop some points in HW3 if you repeat these mistakes. Make sure your log file displays only the output of interest. But do not submit your log files as part of the assignment.
Evaluation will be based on the log file produced when we run your script on our machines.
Codebook
Variable |
Description |
Values |
---|---|---|
transplants.dta |
||
|
Patient ID |
Numeric |
|
Body mass index (kg/m2) |
Numeric |
|
Primary diagnosis |
String (see dataset) |
|
Age |
Numeric |
|
History of previous transplant |
0=No / 1=Yes |
|
Peak PRA |
Numeric |
|
Sex |
0=No / 1=Yes |
|
Patient eventually received a kidney transplant? |
0=No / 1=Yes |
This dataset contains simulated data on registrants for the deceased donor kidney waitlist.
Question 1. Create a new numeric variable named htn
, which takes a value of 1 if the variable dx
has a value of 4=Hypertensive
, and 0
otherwise. This variable should have value labels, Yes
for 1 and No
for 0. (Hint: what is the data type of this new variable? Is it a numeric or a string?) Run tab htn
in your assignment do-file so that it would print the following result.
htn | Freq. Percent Cum.
------------+-------------------------------------
No | XXXX XXXX XXXX
Yes | XXXX XXXX XXXX
------------+-------------------------------------
Total | XXXX XXXX
Question 2. Create your first automated Table 1! Write a program called question2
that prints the following table (including Question 2
in the header). The XX
values should be replaced with correct values found in the dataset, and should be rounded to the nearest whole number for age and to one decimal place to the right of the decimal point for other variables. Make sure the summary statistics are vertically aligned and justified along the left margin. Run your program and display the table on your results window, in your .log
file, and in a question2.xlsx
. (The key challenge: reporting continuous and binary variables. Next week we’ll include categorical variables)
Question 2 Males (N=XX) Females (N=XX)
Age, median (IQR) XX (XX-XX) XX (XX-XX)
Previous transplant, % XX.X XX.X
To align output on your Stata results window, and thus in your .log
file, you may use the column _col()
syntax. Use the example below as a guide
local study_arms: di _col(30) "Trial" _col(60) "Control"
di "`study_arms'"