Loops#
Let’s start by reading in the datasets we may need for the demos:
qui {
/*
1. Kidney Transplant Recipient Data
2. NHANES 1999-2000 Demographics Data
3. Homework 1 Textfile Data
*/
if 1 { //Activated
cls
noi di "How many datasets are you going to use?" _request(N)
forvalues i=1/$N {
noi di "What is dataset `i'?" _request(data`i')
}
global repo "https://github.com/jhustata/basic/raw/main/"
}
}
Kinds of loops#
foreach v of varlist {#
qui {
/*
1. Kidney Transplant Recipient Data
2. NHANES 1999-2000 Demographics Data
3. Homework 1 Textfile Data
*/
if 0 { //Deactivated
cls
noi di "How many datasets are you going to use?" _request(N)
forvalues i=1/$N {
noi di "What is dataset `i'?" _request(data`i')
}
global repo "https://github.com/jhustata/basic/raw/main/"
}
if $N { //Import Data
use $repo$data1, clear
ds
//varlist: string of variable names
foreach v of varlist `r(varlist)' {
noi di "`v'"
}
}
}
The r(varlist) can be replaced with a user-defined varlist in a customized program:
capture program drop myvarlist
program define myvarlist
//parse the user's input into the local macro `varlist'
syntax varlist
foreach v of varlist `varlist' {
noi di "`v'"
}
end
Now the user has flexibility:
myvarlist age gender race abo bmi
c(ALPHA)#
What if you wish to loop over a list of strings that aren’t variables?
cls
//di c(ALPHA)
foreach v in `c(ALPHA)' {
di "`v'"
}
tokenize#
You may be working within a forvalues i=1/N loop and want to loop over some other “list” that is a string:
tokenize "`c(ALPHA)'"
forval i = 1/26 {
di "``i''"
}
set timeout1 1000
import sasxport5 "https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DEMO.XPT", clear
Can you write a script that imports NHANES DEMO.XPT (1999-2000) and iteratively appends NHANES DEMO.XPT from the next two survey cycles, 2001-2002 and 2003-2004? Visit the website to see the naming convention for the various years (e.g. NHANES 2001-2002).
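One possible approach to the exercise above is sketched below. It assumes that the later cycles follow the same URL pattern as the import line above and that the files are named DEMO_B.XPT (2001-2002) and DEMO_C.XPT (2003-2004); confirm both on the website before relying on them. The global `nhanes` and the tempfile `demo` are names made up for this illustration.

```stata
// Sketch only: the file suffixes B and C are an assumed naming convention
set timeout1 1000
global nhanes "https://wwwn.cdc.gov/Nchs/Nhanes/"

// Base cycle, saved to a temporary file that accumulates the appended data
import sasxport5 "${nhanes}1999-2000/DEMO.XPT", clear
tempfile demo
save `demo'

// `1' = B and `2' = C after tokenize
tokenize "B C"
local i = 1
forvalues y = 2001(2)2003 {
    local cycle "`y'-`=`y'+1'"   // e.g. 2001-2002
    import sasxport5 "${nhanes}`cycle'/DEMO_``i''.XPT", clear
    append using `demo'
    save `demo', replace
    local ++i
}
```

The loop combines forvalues (for the survey years) with tokenize (for the letter suffixes), the same pairing used in the c(ALPHA) example.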
varlist#
The list of strings can also be stored in a local macro before you loop over it:
cls
local varlist "Egypt Portugal Swaziland Ireland"
foreach v in `varlist' {
di "`v'"
}
foreach n of numlist {#
qui {
//earlier code
if $N {
// earlier code
levelsof dx, local(dxcat)
//numlist: list of numbers
foreach n of numlist `dxcat' {
noi di `n'
}
}
}
variable lab#
qui {
//earlier code
//levelsof is a "numlist"
levelsof dx, local(dxcat)
local varlab: var lab dx
//later code
}
Variable type determines the parameters you report in Table 1:
| Variable Type | Statistic |
|---|---|
| Continuous (Units) | Median (IQR) |
| Binary (One is enough) | % |
| Multicategory (Each reported) | Variable label |
| Specific or collapsed | % |
dx is a collapsed version of extended_dgn. But for the sake of practice, let’s further collapse dx:
tab dx
recode dx (1/4=1 "Prevalent Overall")(5/8=2 "Common in subgroups")(9=3 "Miscellaneous"),gen(dx_cat)
tab dx_cat
h ds
ds, has(type string)
levelsof extended_dgn
return list
ds, not(type string)
ds, has(type int)
ds, has(varl "*TX*")
ds, has(varl *TX* *transplant*)
ds, has(format %t*)
ds, has(format *f)
Here’s a simple script that classifies each variable:
qui {
cls
ds, not(type string) //otherwise, extended diagnosis is continuous!
global threshold 9 //for multicat vs. continuous
foreach v of varlist `r(varlist)' {
levelsof `v', local(numlevels)
if r(r) == 2 {
noi di "`v' binary"
}
else if inrange(`r(r)', 3, $threshold) {
noi di "`v' multicat"
}
else {
noi di "`v' continuous"
}
}
}
value lab#
qui {
//earlier code
//earlier code
local vallab: value lab dx
foreach n of numlist `dxcat' {
//code
}
}
label value lab#
Let’s get familiar with variations on the foreach command:
forvalues#
Here we are dealing with a sequence of numbers:
forvalues i=1/9 {
di `i'
}
foreach#
In this scenario the numbers are arbitrarily arranged:
foreach n of numlist 1 2 3 7 9 {
di `n'
}
numlist#
You can create a macro whose value is the numlist:
local numlist "1 2 3 7 9"
foreach n of numlist `numlist' {
di `n'
}
qui {
//earlier code
foreach n of numlist `dxcat' {
local dxvarlab: lab `vallab' `n'
//later code
}
}
tokenize#
if 2 { //Int 1-26
egen lastname = seq(), f(1) t(26)
tostring lastname, replace
tokenize "`c(ALPHA)'"
}
if 3 { //Tokenize
forval i = 1/26 {
replace lastname = "``i''" if lastname == "`i'"
}
}
putexcel#
Output a varlist, numlist, and some other list into .xlsx
clear
putexcel set lab6, replace
use $repo$data1, clear
qui ds
local vars `r(varlist)' //store now: later commands may overwrite r()
//nested loops
tokenize "`c(ALPHA)'"
forvalues i = 1/2 {
local row=2
foreach v of varlist `vars' {
if `i' == 1 {
qui putexcel ``i''`row' = "`v'"
local row=`row'+ 1
}
if `i' == 2 {
qui sum `v'
local mean: di %3.2f r(mean)
qui putexcel ``i''`row' = "`mean'"
local row=`row' + 1
}
}
}
ls
Review the .xlsx file you’ve just created.
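If you prefer to review the workbook without leaving Stata, a minimal sketch is to read it back in. This assumes lab6.xlsx was written to the current working directory by the putexcel code above.

```stata
// Read the workbook back into Stata to check what was written
// Assumes lab6.xlsx is in the current working directory
import excel using "lab6.xlsx", clear
describe
list
```

Both columns come back as strings, because the script wrote the means as quoted text.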
Macros#
Can you list all the macros generated by the above .do file script? Group them as follows:
System-defined
r()
e()
c()
_rc
etc
User-defined
local
global
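To check your grouping, Stata's built-in listing commands can help. The sketch below assumes you run it in the same session as the script; locals are only visible from within the do-file (or interactive session) that defined them.

```stata
// Inspect what Stata currently holds in each class of macro or result
macro list      // user-defined globals and locals, plus system globals
return list     // r() results left by the latest r-class command (e.g. ds, levelsof)
ereturn list    // e() results left by the latest estimation command, if any
creturn list    // c() system values such as c(ALPHA), c(N), c(k)
display _rc     // return code set by the most recent capture
```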
qui {
if 0 { //Background
1. Loops
2. Macros
3. Values
4. String: `varlist' or $varlist
5. Numeric: `numlist' or $numlist
}
noi di "Do you wish to document this session (yes/no)" _request(log)
if "$log" == "no" {
global yeslog = 0
}
else {
global yeslog = 1
}
if $yeslog { //Methods
/*
1. Tabula Rasa: cls, clear
2. Work Directory: yours, collaborators, teaching team
3. Documentation: log using
*/
//tabula rasa
cls
clear
//workdirectory
global mydir "~/downloads/"
noi di "What is your workdirectory?" _request(yourdir)
noi di "What dataset do you wish to analyze?" _request(data)
if "$mydir" != "" {
cd "$mydir"
}
else {
if "$yourdir" != "" {
cd "$yourdir"
}
else {
noi di as err "You've not provided a workdirectory"
exit 340600
}
}
//documentation
capture log close
log using loops.log, replace //remember: log close
}
//dataset
capture confirm file "$data"
if _rc == 0 {
use "$data", clear
noi di "Thanks, $data has been loaded"
noi di ""
noi di "obs: `c(N)' vars: `c(k)'"
noi di ""
ds
noi di "varlist: `r(varlist)'"
}
else {
noi di as err "Please choose a dataset to analyze"
exit 340700
}
tab dx
levelsof dx, local(diagnosis)
desc
label define dx_lab ///
1 "Glomerular" ///
2 "Diabetes" ///
3 "PKD" ///
4 "Hypertension" ///
5 "Renovascular" ///
6 "Congenital" ///
7 "Tubulo" ///
8 "Neoplasm" ///
9 "Other"
label values dx dx_lab
local numlab: value label dx
foreach num of numlist `diagnosis' {
local dxcat: lab `numlab' `num'
noi di "`dxcat'"
}
capture log close
}