local v: di#
7.1 summary#
Let’s recap the last 7 weeks in the spirit of a coda:
help
the help command followed by any command you wish to further explore
e.g.,
h twoway
tokenize
represent a numerical sequence (e.g. 1,2,3,…,6)
as a sequence of letters (e.g. a,b,c,…,z or A,B,C,…,Z)
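A minimal sketch of the idea above, assuming the letter sequence of interest is the built-in c(alpha) (the lowercase letters separated by spaces). tokenize stores each letter in a numbered macro, so the number i retrieves the i-th letter:

```stata
// split the 26 lowercase letters into numbered macros 1, 2, ..., 26
tokenize "`c(alpha)'"
di "`1' `2' `3'"        // first three letters

// map the numerical sequence 1..6 to letters
forvalues i = 1/6 {
    di "`i' -> ``i''"
}
```

Swapping in c(ALPHA) should give the uppercase sequence instead.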
quietly
how to control what is displayed in the Stata terminal
Stata uses this e.g. with the neat regression output, and with profile.do
may very easily be changed to
noisily
if that suits your needs
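As an illustration only (the auto dataset shipped with Stata is an assumption, not part of these notes), a quietly block can suppress the regression table while noisily lets one chosen line through:

```stata
sysuse auto, clear
quietly {
    regress price mpg                        // regression table suppressed
    noisily di "b[mpg] = " %6.2f _b[mpg]     // only this line is displayed
}
```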
delimit
neat way to mimic SAS in defining the end of one “line” of code
especially useful when dealing with graphical output and numerous graphing options
absolutely necessary if ever you wish to translate .SAS code into .do
may make accessible what otherwise were inaccessible datasets (e.g. NHANES)
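A short sketch of the graphing use case, assuming the auto dataset shipped with Stata; the titles are made up for illustration. Note that #delimit takes effect in do-files, not at the interactive command line:

```stata
sysuse auto, clear
#delimit ;
twoway (scatter price mpg)
       (lfit price mpg),
       legend(off)
       ti("Price vs. mileage")
       yti("Price (USD)") ;
#delimit cr
```

Every command between the two #delimit lines must end with a semicolon, which is what allows one graph command to span several lines without ///.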
rnormal()
simulating parametric distributions
you may develop this idea on your own in the future
but we lay the foundation for this development somewhere below
runiform() & rnormal() are the only distributions demonstrated thus far
rbinomial() & rweibull() are introduced below in the context of time-to-event data
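A minimal sketch of simulating from the two distributions demonstrated so far. The sample size and the variable names u, z and x are arbitrary choices for illustration:

```stata
clear
set seed 340600
set obs 1000
gen u = runiform()            // uniform on (0,1)
gen z = rnormal()             // standard normal
gen x = rnormal(52, 9.25)     // normal with mean 52 and sd 9.25
summarize u z x
```

Setting the seed first makes the draws reproducible from one run to the next.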
pwd
pwd
di c(pwd)
di "text with embedded `c(pwd)' for whatever reason"
global filepath "`c(pwd)'"
r(mean)
creturn list
return list
ereturn list
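One way to see how the three lists differ, sketched with the auto dataset shipped with Stata (an assumption for illustration): r() holds results of general commands, e() holds estimation results, and c() holds system values.

```stata
sysuse auto, clear
summarize price
return list                   // r-class results, e.g. r(mean)
local m: di %6.1f r(mean)
di "Mean price: `m'"

regress price mpg
ereturn list                  // e-class results, e.g. e(N), e(r2)

creturn list                  // system values, e.g. c(pwd), c(os)
global filepath "`c(pwd)'"
di "$filepath"
```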
by
collapse (statistic) varname1, by(varname2)
egen varname1=statistic(varname2)
above commands are equivalent in many regards
one (collapse) replaces the data in memory with group-level summaries; the other (egen) adds a variable and leaves the data intact
useful when creating twoway plots or other graphical output
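The contrast can be sketched as follows, assuming the auto dataset shipped with Stata; mean_price is a hypothetical variable name:

```stata
sysuse auto, clear
// egen adds the group mean as a new column; every row is kept
egen mean_price = mean(price), by(foreign)
count

// collapse replaces the data with one row per group
preserve
collapse (mean) price, by(foreign)
list
restore
```

Wrapping collapse in preserve and restore brings the original data back once the summary has been listed or plotted.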
program define
rigid, specific programs (without syntax varlist or syntax, options)
flexible, generalizable programs (with syntax [varlist] [, options])
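A small sketch of the flexible kind. The program name mymean and its format() option are hypothetical, and the auto dataset is assumed only for the demonstration:

```stata
capture program drop mymean
program define mymean
    // accept one or more numeric variables, an optional if, and an optional format
    syntax varlist(numeric) [if] [, Format(string)]
    if "`format'" == "" local format %9.2f
    foreach v of varlist `varlist' {
        quietly summarize `v' `if'
        di "`v': " `format' r(mean)
    }
end

sysuse auto, clear
mymean price mpg
mymean price if foreign == 1, format(%6.1f)
```

Because syntax parses the input, the same program works for any numeric variable in any dataset, which is the sense in which it generalizes.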
capture
preventing error results and an arrested script
Stata is finicky, pernickety and this little preface to a command may spare you untold misery
we’ve used it most frequently before program define & log using
but, notably, we’ve used it as a parsimonious variant in syntax varlist if
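A minimal sketch of the typical uses. The names mymean and nosuchvar are hypothetical; _rc holds the return code of the captured command, with 0 meaning no error:

```stata
capture log close              // no error if no log is open
capture program drop mymean    // no error if the program does not exist

capture confirm variable nosuchvar
if _rc != 0 {
    di "variable not found (return code " _rc ")"
}
```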
twoway
last week’s class, which we didn’t cover because of hw1-related issues
notes have been updated to give you the best bang for your buck
look out for the following very important issues:
workflow;
open science; and,
ChatGPT’s take on these.
on your own, please walk step-by-step, command-by-command, through the twoway examples offered
think about if macro { conditional code-blocks and your deployment of this artifice in future, very powerful Stata scripts!
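A sketch of such a conditional block, assuming the auto dataset shipped with Stata; the macro name plot is a hypothetical switch:

```stata
sysuse auto, clear
local plot 0                   // set to 1 to run the graph block
if `plot' {
    twoway scatter price mpg
}
else {
    summarize price mpg
}
```

A single macro at the top of a script can in this way turn whole sections on or off.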
7.2 macros#
Today we’ll circle back to the method employed throughout this class:
defining macros
formatting them
embedding them in strings, text, figures, and excel
producing aesthetically pleasing, richly informative output
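The first three steps can be sketched in a few lines, assuming the auto dataset shipped with Stata; the macro name m and the text coordinates are arbitrary:

```stata
sysuse auto, clear
quietly summarize mpg
local m: di %3.1f r(mean)                 // define and format
di "Mean mileage is `m' mpg"              // embed in text
twoway scatter price mpg, ///
    text(14000 30 "Mean mpg = `m'")       // embed in a figure
```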
7.3 science#
But to what end?
the advanced Stata Programming class answers that question thus:
open science
self-publication
collaboration
etc…
import networkx as nx
import matplotlib.pyplot as plt
#import numpy as np
#import sklearn as skl
#
#plt.figure(figsize=[2, 2])
G = nx.DiGraph()
G.add_node("workflow", pos = (0,700) )
G.add_node("jupyter", pos = (-2000, 960) )
G.add_node("dofile", pos = (2000, 950) )
G.add_node("python ", pos = (-3000, 550) )
G.add_node("stata", pos = (3000, 550) )
G.add_node("build", pos = (-1900, 150) )
G.add_node("dyndoc", pos = (1900, 150) )
G.add_node("html", pos = (0,0))
G.add_node("push", pos = (0, -475))
G.add_node("ghp", pos = (0, -950))
G.add_edges_from([ ("jupyter","python "), ("dofile", "stata")])
G.add_edges_from([("python ", "build"), ("stata", "dyndoc") ])
G.add_edges_from([("build", "html"), ("dyndoc", "html")])
G.add_edges_from([("html","push")])
G.add_edges_from([("push","ghp")])
nx.draw(G,
nx.get_node_attributes(G, 'pos'),
with_labels=True,
font_weight='bold',
node_size = 4500,
node_color = "lightblue",
linewidths = 3)
ax= plt.gca()
ax.collections[0].set_edgecolor("#000000")
ax.set_xlim([-5000, 5000])
ax.set_ylim([-1000, 1000])
plt.show()
If at all you’ve enjoyed the conveniences brought forth by this classbook, then it would be remiss of me not to let you know from whence this book comes:
a data science class
python environment
jupyter notebooks
VS Code by Microsoft
dofiles
stata
In the advanced class we merge these ideas and publish our analyses and documentation (roll over, annotation) online via gh-pages!!!
7.4 embed#
Output
local macro: di %3.1f r(mean)*100
Text
"You may embed a `macro' in text"
Figures
line y x, text(50 18 "`macro'")
Excel
putexcel set filename, replace
putexcel B2="`macro'"
Word
putdocx clear
putdocx begin
putdocx paragraph
putdocx text
putdocx save doc1.docx, replace
Dataset
postfile regcoeffs b1 b2 b3 using regoutput, replace
post regcoeffs () () ()
post regcoeffs () () ()
postclose regcoeffs
ls *.dta //look for regoutput.dta in your pwd
Markdown
dyndoc
html
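A hedged sketch of the Excel and Word cases, assuming a recent Stata (putdocx was added in Stata 15) and the auto dataset; the file names embed_demo.xlsx and embed_demo.docx are hypothetical:

```stata
sysuse auto, clear
quietly summarize mpg
local m: di %3.1f r(mean)

putexcel set embed_demo.xlsx, replace     // hypothetical file name
putexcel A1 = "Mean mpg"
putexcel B1 = "`m'"

putdocx clear
putdocx begin
putdocx paragraph
putdocx text ("Mean mileage is `m' mpg")
putdocx save embed_demo.docx, replace     // hypothetical file name
```

Both files should land in the pwd, where ls will show them.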
7.4.5 postfile#
use transplants, clear
sum peak_pra,d
g highpra=peak_pra>r(p90)
sum wait_yrs,d
g longwait=wait_yrs>r(p50)
postutil clear
postfile pp xaxis str80 coef double(result lb ub pvalue) using betas.dta, replace
logistic highpra gender prev_ki bmi age wait_yrs don_ecd
local xaxis=1
qui foreach v of varlist gender prev_ki bmi age wait_yrs don_ecd {
lincom `v'
return list
local est: di %3.2f r(estimate)
local lb: di %3.2f r(lb)
local ub: di %3.2f r(ub)
local pval: di %3.2f r(p)
post pp (`xaxis') ("`v'") (`est') (`lb') (`ub') (`pval')
local xaxis=`xaxis' + 1
}
postclose pp
ls -l
use betas,clear
list
#delimit ;
twoway (scatter result xaxis)
(rcap lb ub xaxis,
yscale(log)
yline(1,
lc(lime)
lp(dash)
)
legend(off)
xlab(
1 "Gender"
2 "Previous Tx"
3 "BMI"
4 "Age"
5 "Wait y"
6 "ECD"
)
ti("Association between High PRA and select risk factors", pos(11))
yti("OR",
orientation(horizontal)
)
xti("")
)
;
#delimit cr
graph export logistic.png,replace
7.4.6 alternative: coefplot#
use transplants, clear
sum peak_pra,d
g highpra=peak_pra>r(p90)
sum wait_yrs,d
g longwait=wait_yrs>r(p50)
logistic highpra prev_ki bmi age wait_yrs don_ecd if gender==0
est store male
logistic highpra prev_ki bmi age wait_yrs don_ecd if gender==1
est store female
if 0 {
lab var highpra "PRA>90th %"
lab var longwait "Wait>50th %"
lab var prev_ki "Previous Tx"
lab var bmi "BMI, kg/m2"
lab var age "Age, y"
lab var don_ecd "ECD"
}
coefplot male female, drop(_cons) xline(0)
7.6 counterfeiting#
A word on simulation:
7.6.1 rigor vs. fraud#
worth developing skills
making up data
imaginary scenarios
and why would one do that?
missing data
bootstrap
cryptography/disclosure risk
sample-size calculation (variant on missing data)
etc.
qui {
clear
cls
if c(N) { //background
inspired by polack et al. nejm 2020
NEJM2020;383:2603-15
lets do some reverse engineering
aka simulate, generate data
from results: reversed process!!
}
if c(os)=="Windows" { //methods
global workdir "`c(pwd)'\"
}
else {
global workdir "`c(pwd)'/"
}
capture log close
log using ${workdir}simulation.log, replace
set seed 340600
set obs 37706
}
if c(N)==37706 { //simulation
#delimit ;
//row1
g bnt=rbinomial(1,.5);
lab define Bnt
0 "Placebo"
1 "BNT162b2" ;
label values bnt Bnt ;
tab bnt ;
//row2
gen female=rbinomial(1, .494);
label define Female
0 "Male"
1 "Female";
label values female Female;
tab female;
//row3
tempvar dem ;
gen `dem'=round(runiform(0,100),.1);
recode `dem'
(0/82.9=0)
(83.0/92.1=1)
(92.2/96.51=2)
(96.52/97.0=3)
(97.1/97.2=4)
(97.3/99.41=5)
(99.42/100=6)
, gen(race);
lab define Race
0 "White"
1 "Black or African American"
2 "Asian"
3 "Native American or Alaska Native"
4 "Native Hawaiian or other Pacific Islander"
5 "Multiracial"
6 "Not reported";
label values race Race;
tab race;
//row4
gen ethnicity=rbinomial(1,0.28);
tostring ethnicity, replace;
replace ethnicity="Latinx" if ethnicity=="1";
replace ethnicity="Other" if ethnicity=="0";
//row5
tempvar country;
gen `country'=round(runiform(0,100), .1);
recode `country'
(0/15.3=0)
(15.4/21.5=1)
(21.6/23.6=2)
(23.7/100=3)
, gen(country) ;
label define Country
0 "Argentina"
1 "Brazil"
2 "South Africa"
3 "United States";
label values country Country;
tab country;
//row7
gen age=(rt(_N)*9.25)+52 ;
replace age=runiform(16,91)
if !inrange(age,16,91);
summ age, d ;
local age_med=r(p50); local age_lb=r(min); local age_ub=r(max);
gen dob = d(27jul2020) -
(age*365.25) ;
gen dor = dob + age*365.25 + runiform(0,4*30.25);
//row6
gen over55=age>55 ; tab over55;
//row8
gen bmi=rbinomial(1, .351); tab bmi;
//figure 3
g days=rweibull(.7,17,0) if bnt==0 ;
g covid=rbinomial(1, 162/21728) if bnt==0 ;
replace days=rweibull(.4,.8,0) if bnt==1 ;
replace covid=rbinomial(1, 14/21772) if bnt==1;
//key dates
gen eft = dor + days;
//date formats
format dob %td; format dor %td; format eft %td;
//kaplan-meier curve
stset days, fail(covid) ;
sts graph,
by(bnt)
fail per(100)
tmax(119)
xlab(0(7)119)
ylab(0(.4)2.4,
angle(360)
format("%3.1f")
)
xti("Days after Dose 1")
legend(off)
text(
2.3 100
"Placebo",
col(navy)
)
text(
.5 100
"BNT162b2",
col(maroon)
) ;
graph export BNT162b2.png, replace ;
stcox bnt ;
drop _* age over55 days ;
g bnt_id=round(runiform(37,37+_N)) ;
compress ;
#delimit cr
//label variables
lab var bnt_id "Participant Identifier"
lab var bnt "Random treatment assignment"
lab var female "Gender at birth"
lab var race "Self-identified race"
lab var ethnicity "Hispanic ethnicity"
lab var country "Country where trial was conducted"
lab var dob "Date of birth"
lab var dor "Date of recruitment into BNT162b2 trial"
lab var eft "Date of exit from BNT162b2 trial"
lab var bmi "Obese"
lab var covid "Covid-19 status on eft date"
//label data
lab data "Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine"
describe
order bnt_id dob female race ethnicity country bmi bnt eft covid
*replace eft=. if eft>d(15dec2020) //some folks lost to followup
save BNT162b2, replace
}
log close
7.8 strings#
Handling strings such as drug names:
We’ll review a script I wrote 4 years ago
Then compare it with an update I wrote yesterday
In brief, there is a fundamental transformation in:
aesthetics, legibility, & brevity
one is 440 lines, repetitious
the other is 164 lines, utilizes macros & loops
7.8.1 lists#
import excel "1-first gen antihisitamines.xlsx", sheet("de-duplicated list") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antihist"
keep *_gnn
gen group=1
save 01_antihistamines, replace
import excel "2-antiparkinsonian agents.xlsx", sheet("De-duplicated list") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antipark"
keep *_gnn
gen group=2
save 02_antiparkinsons,replace
import excel "3-Antispasmodics.xlsx", sheet("De-duplicated_list") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antispasm"
keep *_gnn
keep if !missing(drug)
gen group=3
save 03_antispasmodics,replace
import excel "4-Antithrombotics", sheet("De-duplicated_results") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antithromb"
keep *_gnn
keep if !missing(drug)
gen group=4
save 04_antithrombotics,replace
import excel "5-antiinfective agents", sheet("Sheet2") clear
drop in 1/8
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antiinfect"
keep *_gnn
keep if !missing(drug)
gen group=5
save 05_antiinfective,replace
import excel "6-Peripheral alpha-1 blockers", sheet("De-duplicated_results") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="pera1block"
keep *_gnn
keep if !missing(drug)
gen group=6
save 06_peripheralalpha1blockers,replace
import excel "7-Central alpha-agonists", sheet("de-duplicated_list") clear
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="cena1ag"
keep *_gnn
keep if !missing(drug)
gen group=7
save 07_centralalpha1agonists,replace
import excel "8-Other CNS alpha-agonists", sheet("De-duplicated_list") clear
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="otha1ag"
keep *_gnn
keep if !missing(drug)
gen group=8
save 08_otheralpha1agonists,replace
import excel "9-Antidepressants", sheet("De-duplicated_results") clear
drop in 1/8
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antidep"
keep *_gnn
keep if !missing(drug)
gen group=9
save 09_antidepressants,replace
import excel "10-Antipsychotic agents", sheet("De-duplicated_results") clear
drop in 1/8
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antipsy"
keep *_gnn
keep if !missing(drug)
gen group=10
save 10_antipsychotics,replace
import excel "11-Barbituates", sheet("De-duplicated_results") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="barb"
keep *_gnn
keep if !missing(drug)
gen group=11
save 11_barbiturates,replace
import excel "12-Benzodiazepine lists", sheet("Short-acting") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="benzoshort"
keep *_gnn
keep if !missing(drug)
tempfile benzoshort
save `benzoshort',replace
import excel "12-Benzodiazepine lists", sheet("Long-acting") clear
drop in 1/8
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="benzolong"
keep *_gnn
keep if !missing(drug)
tempfile benzolong
save `benzolong',replace
import excel "12-Benzodiazepine lists", sheet("Unknown") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="benzounk"
keep *_gnn
keep if !missing(drug)
tempfile benzounk
save `benzounk',replace
use `benzoshort',clear
append using `benzolong'
append using `benzounk'
gen group=12
save 12_benzodiazepines,replace
import excel "13-Nonbenzodiazepines - Z-drugs", sheet("De-duplicated_results") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="nonbenzo"
keep *_gnn
keep if !missing(drug)
gen group=13
save 13_nonbenzodiazepines,replace
import excel "14-Ergoloid Mesylates", sheet("De-duplicated_results") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="ergot"
keep *_gnn
keep if !missing(drug)
gen group=14
save 14_ergoloids,replace
import excel "15-Androgens", sheet("De-duplicated_results") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="andro"
keep *_gnn
keep if !missing(drug)
gen group=15
save 15_androgens,replace
import excel "16-Estrogens", sheet("De-duplicated_results") clear
drop in 1/6
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="estro"
keep *_gnn
keep if !missing(drug)
gen group=16
save 16_estrogens,replace
import excel "17-Growth hormone", sheet("De-duplicated_results") clear
drop in 1/7
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="growth"
keep *_gnn
keep if !missing(drug)
gen group=17
save 17_growthhormones,replace
import excel "18-Insulin", sheet("De-duplicated_results") clear
drop in 1/8
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="insulin"
keep *_gnn
keep if !missing(drug)
gen group=18
save 18_insulin,replace
import excel "19-Sulfonylureas", sheet("De-duplicated_results") clear
drop in 1/5
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="urea"
keep *_gnn
keep if !missing(drug)
gen group=19
save 19_sulfonylureas,replace
import excel "20-38", sheet("20.Proton_Pump_Inhibitors") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="ppi"
keep *_gnn
keep if !missing(drug)
gen group=20
save 20_protonpumpinh,replace
import excel "20-38", sheet("21.non-selective NSAIDS") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="nsnsaids"
keep *_gnn
keep if !missing(drug)
gen group=21
save 21_nonselectnsaids,replace
import excel "20-38", sheet("22.Skeletal muscle relaxants") clear
drop in 1/2
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="muscle"
keep *_gnn
keep if !missing(drug)
gen group=22
save 22_musclerelaxants,replace
import excel "20-38", sheet("23.non-DHP CCB") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="ccb"
keep *_gnn
keep if !missing(drug)
gen group=23
save 23_nondhpccb,replace
import excel "20-38", sheet("24.Thiazolidinediones") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="thiazo"
keep *_gnn
keep if !missing(drug)
gen group=24
save 24_thiazolidinediones,replace
import excel "20-38", sheet("25.Acetyl cholinesterase inhib") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="achblock"
keep *_gnn
keep if !missing(drug)
gen group=25
save 25_acetylcholinesteraseinh,replace
import excel "20-38", sheet("26.alpha-1 blockers") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="a1block"
keep *_gnn
keep if !missing(drug)
gen group=26
save 26_alpha1blockers,replace
import excel "20-38", sheet("27.ti-cyclic antidepressants") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="tca"
keep *_gnn
keep if !missing(drug)
gen group=27
save 27_tca,replace
import excel "20-38", sheet("28.corticosteroids") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="steroids"
keep *_gnn
keep if !missing(drug)
gen group=28
save 28_corticosteroids,replace
import excel "20-38", sheet("29.H2 receptor antagonists") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="h2ant"
keep *_gnn
keep if !missing(drug)
gen group=29
save 29_h2rblockers,replace
import excel "20-38", sheet("30.antiepileptics") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antiepi"
keep *_gnn
keep if !missing(drug)
gen group=30
save 30_antiepileptics,replace
import excel "20-38", sheet("31.antiemetics") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antieme"
keep *_gnn
keep if !missing(drug)
gen group=31
save 31_antiemetics,replace
import excel "20-38", sheet("32.NSAIDS") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="nsaid"
keep *_gnn
keep if !missing(drug)
gen group=32
save 32_nsaids,replace
import excel "20-38", sheet("33.Diuretics") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="diuretics"
keep *_gnn
keep if !missing(drug)
gen group=33
save 33_diuretics,replace
import excel "20-38", sheet("34.SNRIs") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="snri"
keep *_gnn
keep if !missing(drug)
gen group=34
save 34_snri,replace
import excel "20-38", sheet("35.SSRIs") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="ssri"
keep *_gnn
keep if !missing(drug)
gen group=35
save 35_ssri,replace
import excel "20-38", sheet("36.RAS Inhibitor") clear
drop in 1/2
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="ras"
keep *_gnn
keep if !missing(drug)
gen group=36
save 36_rasinhibitors,replace
import excel "20-38", sheet("37.Opioids") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="opioids"
keep *_gnn
keep if !missing(drug)
gen group=37
save 37_opioids,replace
import excel "20-38", sheet("38.Anticholinergic") clear
drop in 1/1
replace A = strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
gen class_gnn="antich"
keep *_gnn
keep if !missing(drug)
gen group=38
save 38_anticholinergics,replace
qui {
qui {
clear
cls
if c(N) { //background
we compiled a comprehensive list of medications within
each of the potentially inappropriate medication (pim)
classes in a systematic manner.
first, informaticists used micromedex, the control vocabularies of medline and embase, and medication websites to generate
a trade and generic medication name list.
second, this list was curated to allow medications with multiple
mechanisms of action to be represented in more than one pim class.
we removed pims with topical or ocular routes of administration.
the final list was **imported into stata** to query
medicare part d claims for pims...
}
if c(N)<1 { //methods
global workdir `c(pwd)'
if c(os)=="Windows" {
global workdir "$workdir\"
}
else {
global workdir "$workdir/"
}
#delimit ;
global catalog1
"1-first gen antihisitamines"
"2-antiparkinsonian agents"
"3-Antispasmodics"
"4-Antithrombotics"
"5-antiinfective agents"
"6-Peripheral alpha-1 blockers"
"7-Central alpha-agonists"
"8-Other CNS alpha-agonists"
"9-Antidepressants"
"10-Antipsychotic agents"
"11-Barbituates"
"12-Benzodiazepine lists"
"12-Benzodiazepine lists"
"12-Benzodiazepine lists"
"13-Nonbenzodiazepines - Z-drugs"
"14-Ergoloid Mesylates"
"15-Androgens"
"16-Estrogens"
"17-Growth hormone"
"18-Insulin"
"19-Sulfonylureas"
;
global catalog2
"20.Proton_Pump_Inhibitors"
"21.non-selective NSAIDS"
"22.Skeletal muscle relaxants"
"23.non-DHP CCB"
"24.Thiazolidinediones"
"25.Acetyl cholinesterase inhib"
"26.alpha-1 blockers"
"27.ti-cyclic antidepressants"
"28.corticosteroids"
"29.H2 receptor antagonists"
"30.antiepileptics"
"31.antiemetics"
"32.NSAIDS"
"33.Diuretics"
"34.SNRIs"
"35.SSRIs"
"36.RAS Inhibitor"
"37.Opioids"
"38.Anticholinergic";
global namelist1
01_antihistamines
02_antiparkinsons
03_antispasmodics
04_antithrombotics
05_antiinfective
06_peripheralalpha1blockers
07_centralalpha1agonists
08_otheralpha1agonists
09_antidepressants
10_antipsychotics
11_barbiturates
benzoshort
benzolong
benzounk
13_nonbenzodiazepines
14_ergoloids
15_androgens
16_estrogens
17_growthhormones
18_insulin
19_sulfonylureas;
global namelist2
20_protonpumpinh
21_nonselectnsaids
22_musclerelaxants
23_nondhpccb
24_thiazolidinediones
25_acetylcholinesteraseinh
26_alpha1blockers
27_tca
28_corticosteroids
29_h2rblockers
30_antiepileptics
31_antiemetics
32_nsaids
33_diuretics
34_snri
35_ssri
36_rasinhibitors
37_opioids
38_anticholinergics;
#delimit cr
capture log close
log using "${workdir}01_bc050523.log",replace
set max_memory .
}
if c(N)==0 { //results
local group=1
foreach class in "$catalog1" {
import excel "${workdir}`class'.xlsx", clear
drop in 1/6
replace A=strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
local class_tidy: di word("$namelist1",`group')
g class_gnn="`class_tidy'"
capture split class_gnn,p("_")
capture keep drug_gnn class_gnn2
if inrange(`group',1,11) {
g group=`group'
}
else {
g group=`group'-2
}
keep if !missing(drug)
local filename: di word("$namelist1",`group')
capture save "`filename'",replace
noi di "`filename'"
local group=`group'+1
}
use benzoshort,clear
append using benzolong
append using benzounk //debug
replace group=12
drop class_gnn
rename class_gnn1 class_gnn2
save 12_benzodiazepines,replace
}
clear
if c(N)==0 {
local group=1
foreach class in "$catalog2" {
import excel "${workdir}20-38.xlsx", /*
*/ sheet("`class'") clear
drop in 1/1
replace A=strtrim(A)
replace A=ustrupper(A)
rename A drug_gnn
local class_tidy: di word("$namelist2",`group')
g class_gnn="`class_tidy'"
capture split class_gnn,p("_")
capture keep drug_gnn class_gnn2
g group=`group'+19
keep if !missing(drug)
local filename: di word("$namelist2",`group')
capture save "`filename'",replace
noi di "`filename'"
local group=`group'+1
}
}
cls
rm benzoshort.dta
rm benzolong.dta
rm benzounk.dta
noi ls *.dta
timer list
log close
}
}
7.8.2 excel#
Download these .xlsx files into the pwd
for the above scripts to work:
7.8.3 commands#
strpos
word
strlen
regexm
7.8.2.1 strpos#
di "$S_TIME"
clear
set more off
tempfile pde_2013 pde_2014 pde2013bc pde2014bc
use usrds_id srvc_dt gnn using "/dcs01/igm/segevlab/data/usrds2015/claims/pd/pde2013",clear
save `pde_2013',replace
use usrds_id srvc_dt gnn using "/dcs01/igm/segevlab/data/usrds2016/claims/pd/pde2014",clear
save `pde_2014',replace
local year "y=2013/2014"
forvalues `year' {
capture use `pde_`y'',clear
if _rc==0 {
di "processing pde_`y'.dta ..."
gen antihist16=(strpos(gnn,"BROMPHENIRAMINE")!=0)
gen antihist19=(strpos(gnn,"CARBINOXAMINE")!=0)
gen antihist26=(strpos(gnn,"CHLORPHENIRAMINE")!=0)
gen antihist31=(strpos(gnn,"CLEMASTINE")!=0)
gen antihist36=(strpos(gnn,"CYPROHEPTADINE")!=0)
gen antihist39=(strpos(gnn,"DEXBROMPHENIRAMINE")!=0)
gen antihist40=(strpos(gnn,"DEXCHLORPHENIRAMINE")!=0)
gen antihist43=(strpos(gnn,"DIMENHYDRINATE")!=0)
gen antihist47=(strpos(gnn,"DIPHENHYDRAMINE")!=0)
gen antihist51=(strpos(gnn,"DOXYLAMINE")!=0)
gen antihist65=(strpos(gnn,"HYDROXYZINE")!=0)
gen antihist77=(strpos(gnn,"MECLIZINE")!=0)
gen antihist101=(strpos(gnn,"PROMETHAZINE")!=0)
gen antihist105=(strpos(gnn,"PYRILAMINE")!=0)
gen antihist127=(strpos(gnn,"TRIPROLIDINE")!=0)
gen antihist_rx=( ///
antihist16+ ///
antihist19+ ///
antihist26+ ///
antihist31+ ///
antihist36+ ///
antihist39+ ///
antihist40+ ///
antihist43+ ///
antihist47+ ///
antihist51+ ///
antihist65+ ///
antihist77+ ///
antihist101+ ///
antihist105+ ///
antihist127 ///
>0)
keep usrds_id srvc_dt antihist_rx
duplicates drop
quietly compress
save `pde`y'bc', replace
}
}
//start from empty memory so the 2014 claims are not appended twice
clear
forvalues `year' {
capture append using `pde`y'bc'
}
save 01_antihistamines_expR.dta,replace
di "$S_TIME"
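In the spirit of the macros-and-loops rewrite described earlier, the fifteen near-identical gen lines could be collapsed into a loop. This is only a sketch on toy data: the three gnn values and the shortened drug list are made up for illustration, and the real claims files are not touched.

```stata
clear
input str40 gnn
"DIPHENHYDRAMINE HCL"
"MECLIZINE HCL"
"METFORMIN HCL"
end

// toy subset of the generic names flagged above
local drugs BROMPHENIRAMINE DIPHENHYDRAMINE MECLIZINE

gen byte antihist_rx = 0
foreach d of local drugs {
    // strpos returns 0 when the name does not occur in gnn
    replace antihist_rx = 1 if strpos(gnn, "`d'") > 0
}
list
```

Adding a drug then means adding one word to the local rather than two new lines of code.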
7.9 others#
//explore these commands sequentially, on your own
use transplants, clear
list extended_dgn in 1/5, clean
disp word("Hello, is there anybody in there?",4)
list extended_dgn if word(ext, 5) != "", clean noobs
disp strlen("Same as it ever was")
list extended_dgn if strlen(ext)< 6, clean
assert regexm("Earth", "art")
assert !regexm("team", "I")
tab ext if regexm(ext, "HTN")
list ext if regexm(ext, "^A")
//starts with A
list ext if regexm(ext, "X$")
//ends with X
tab ext if regexm(ext, "HIV.*Y")
//contains "HIV", then some other stuff, then Y
7.10 dates#
disp %td 19400 // 11feb2013
disp %td 366 // 01jan1961
disp %td -5 // 27dec1959
use transplants, clear
gen oneweek = transplant_date+7
list transplant_date oneweek in 1/3
format %td oneweek
list transplant_date oneweek in 1/3
disp td(04jul1976)
disp td(5may2021)
disp mdy(7,4,1976)
disp mdy(5,5,2021)
disp date("August 15, 1969", "MDY")
disp "$S_DATE"
di c(current_date)
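Since Stata stores dates as days since 01jan1960, date arithmetic is ordinary arithmetic. A small sketch using the two dates above; the macro names d1 and d2 are arbitrary:

```stata
local d1 = td(04jul1976)
local d2 = td(5may2021)
di "Days elapsed: " (`d2' - `d1')
di "Years elapsed: " %4.1f (`d2' - `d1')/365.25
di %td (`d1' + 7)              // one week after the first date
```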