Catalog of Commands#

Before running any sample code, run the following command in Stata:

global repo : di "https://raw.githubusercontent.com/jhustata/basic/main"

help/man#

help finds and displays the help file for any officially included package or command. It is the fastest way to learn the syntax, usage, options, and examples of a command or function. Checking the help file is always encouraged after installing a new package or command in Stata.

man serves the same purpose as help. Instead of opening a separate Viewer window to show the help file as help does, man prints the contents of the help file directly to the Results window.

help display
man display

display#

display lets Stata print a value, a string, or the result of an expression to the Results window.

di "This is an example"

quietly#

quietly controls the output displayed in the Results window. Used as a prefix to another command, it hides that command's results. Use {} to create a block for quietly; the results of all code within the block are hidden unless otherwise specified.

// display the result
di "This is an example"
// hide with quietly
qui di "This is an example"
// code chunk
qui {
    di "This is an example"
}

noisily#

noisily also controls the output displayed in the Results window. It does the opposite of quietly: it forces output to be displayed even inside a quietly block.

qui {
    di "This line will be hidden"
    noi di "This line will be displayed"
}

cls#

cls clears the Results window.

noi di "Some random stuff to be removed from your screen"
cls

//, /*, if 0 {}, *#

//, /* and */, if 0 {}, and * are different ways of commenting or annotating code in Stata.

// Works in Do-files
// This is a comment
/* This is a comment chunk, and this is the start of this chunk
This is a comment
This is the end of this chunk */

if 0 {
    if 0 is a special way of commenting in Stata
    if 0 is a condition that Stata always evaluates as False
    therefore, code inside if 0 is never read in as commands and thus never run
    so it can act as a way of commenting
}

* This is a very traditional way of commenting

if#

if tells Stata to execute specific commands only when a condition is met. If a nonzero number is used as the condition, Stata always treats it as True.

if 1 {
    di "This is an example"
}
// if also works as a qualifier that restricts a command to matching observations
import delimited "${repo}/transplants.txt", clear
// summarize age in the overall population
sum age, detail
// summarize age among females only
sum age if gender == 1, detail

else/else if#

else and else if are used with if to run other blocks of code when additional conditions need to be checked beyond the original if condition.

else and else if have to start on a new line and cannot follow } on the same line.

local e1 = 1
local e2 = 0
local e3 = 4
if (`e1' == 0) {
    noi di "in if condition"
} 
else if (`e2' == 0) {
    noi di "in else if condition"
}
else {
    noi di "in else condition"
}

clear#

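clear removes designated contents from memory. By default, it clears the dataset in memory. clear all removes much more (the dataset, value labels, matrices, scalars, programs, and so on), but it does not drop macros, which is why the local macro below can still be displayed after clear all.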

local test : di "This is an example"
import delimited "${repo}/transplants.txt", clear
clear
di "`test'"
import delimited "${repo}/transplants.txt", clear
clear all
di "`test'"

capture, _rc#

capture lets Stata continue execution when an error occurs in the command that starts with capture. _rc is a system value in which Stata stores the return code of the most recently captured command (0 if no error occurred).

capture this is an error
di _rc
this is an error
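
As a minimal sketch of how _rc can steer a script, the block below captures a command that is expected to fail and then branches on the return code. The variable name no_such_variable is hypothetical and was chosen only so that confirm variable fails; the small dataset is simulated for illustration.

```stata
clear
set obs 5
gen x = _n
// no_such_variable is a hypothetical name, so this command fails quietly
capture confirm variable no_such_variable
if _rc != 0 {
    di "variable not found, return code: " _rc
}
else {
    di "variable exists"
}
```

Because the failing command is prefixed with capture, the do-file keeps running and the if block decides what to do next.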

log#

log controls log file settings, such as opening and closing a log file.

// close up currently opened log file for writing
capture log close
// start a new log file to write on
log using "log_file_name.txt", text replace

set#

set changes Stata’s system settings.

// change the number of observations
set obs 100
// allow abbreviated variable names
set varabbrev on
// turn off the more prompt when generating long results
set more off
// change the timeout setting when connecting to internet resources
set timeout1 1000

local/global/macros#

Macros are temporary, user- or command-defined containers that Stata keeps behind the scenes to hold values, strings, file names, and so on.

A local macro only remains available within the do-file or program that defines it and loses its contents once that do-file or program finishes running. To call a local macro, wrap the macro name between ` and '.

// define a local macro called example with a value of 0
local example = 0
// define a local macro called example2 with a string
local example2 : di "This is an example"
// define a local macro with variable names
local example3 var1 var2 var3
// call macros
di `example'
di "`example2'"
di "`example3'"

A global macro works like a local macro but, as its name suggests, remains available until the current Stata session ends. To call a global macro, put a $ sign before the macro name. To avoid potential errors, wrap the macro name in {} after the $ sign.

// defining global macro is the same as local macro, just change from local to global
global example = 0
// call a global macro
di ${example}

use/import/infile/infix#

use, import, infile, and infix are Stata commands that read dataset files into memory for analysis.

use is specifically for .dta files.

import is for other file types, each with its own syntax:

import delimited - for .csv and .txt

import excel - for .xls and .xlsx

See more with the help command.

infile and infix are specific ways to read in text data, optionally with a dictionary.

, clear is an option that clears the dataset in memory before the new one is read in, which avoids errors.

// read in .dta sets
use "dataset.dta", clear
// read in .csv/.txt
import delimited "dataset.csv", clear
// read in text file with dictionary
infile using dictionary, using(dataset) clear
infix var1 1-2 var2 3-7 var3 8-20 using "dataset.raw", clear

generate#

generate creates a new variable in the current dataset.

clear
set obs 100
// generate a variable called example and set all rows to 0
gen example = 0

replace#

replace changes the values of an existing variable, either for all rows or only for the rows that meet a condition.

import delimited "${repo}/transplants.txt", clear
// generate a new variable called example and assign all rows 1
gen example = 1
// change example to 0 for rows with age less than 30
replace example = 0 if age < 30

drop#

drop tells Stata to drop the observations that satisfy a condition.

When used with variable names, drop deletes the listed variables.

import delimited "${repo}/transplants.txt", clear
drop if gender == 1
drop age gender

keep#

keep tells Stata to keep only the observations that satisfy a condition.

When used with variable names, keep deletes all variables that are not in the list.

import delimited "${repo}/transplants.txt", clear
keep if gender == 1
keep age gender

label#

label helps label variables or variable values.

clear
set obs 100
gen example = 0
// to label a variable
label variable example "Example Variable"

To label variable values, a label needs to be defined and then applied to the variable.

clear
set obs 100
gen example = round(runiform(0,1))
// to label variable values
/* label as below:
0 - No
1 - Yes
*/
capture label drop example_lab
label define example_lab 0 "No" 1 "Yes"
label values example example_lab

tabulate#

tabulate quickly tallies the number of observations for each unique value of a variable, or for each unique combination of the values of two variables.

import delimited "${repo}/transplants.txt", clear
tab race
tab gender race

putexcel#

putexcel writes contents from Stata into designated cells of an Excel file.

To use putexcel, first run putexcel set to designate the file to write to. A specific cell can then be referred to by its Excel column-row combination (A1, A2, B1, B2, etc.).

putexcel set example, replace
putexcel A1 = "This is an example"
putexcel B2 = 1

putdocx#

putdocx writes contents from Stata into designated Word documents.

To use putdocx, the process should start with putdocx begin and end with putdocx save.

putdocx begin
putdocx paragraph, style(Title)
putdocx text ("This Is an Example")
putdocx paragraph, style(Heading1)
putdocx text ("1. Heading")
putdocx save example, replace

loops - for/while#

Loops in Stata are blocks of code that are executed repeatedly until a certain condition is met. The for and while commands are different ways of defining such a condition.

There are two ways of defining for loops in Stata.

forvalues#

forvalues is one way of defining a for loop in Stata. It executes a block of code once for each value in a range. By default, the loop runs from the lowest value to the highest value, and each time the block is executed the value increases by 1.

forvalues i = 1/10 {
    noi di "i value: `i'"
}

foreach#

foreach is the other way of defining a for loop in Stata. It executes a block of code once for each element of a list (numbers, variables, strings, and so on). Each execution of the block moves the loop on to the next element, and the loop finishes after the block has run for the last element in the list.

foreach i in sample1 sample2 sample3 {
    noi di "This is `i'"
}

while#

while defines a while loop in Stata. It keeps executing a block of code until the condition specified in the while command is no longer satisfied. The elements that determine the condition are not changed automatically; the code inside the loop has to modify them. If it never does, the condition always remains True and the loop iterates endlessly.

local end = 0
local i = 0
while (`end' == 0) {
    local i = `i' + 1
    noi di "This is iteration: `i'"
    if (`i' == 10) { // this is very important!
        // this means we are changing our element for determining the condition after 10 iterations
        // if we miss this part, the end macro will always equal to 0
        // which means the condition is always satisfied thus the loop will always continue executing
        local end = 1 
    }
}

summarize#

summarize (abbreviated sum) quickly gives summary statistics for continuous variables. By default, it shows the number of observations, mean, standard deviation, minimum, and maximum of each variable. To get more information, such as the median and other percentiles, use its detail option.

import delimited "${repo}/transplants.txt", clear
// summarize age
sum age
// summarize age among females
sum age if gender == 1

pwd#

pwd prints the current working directory.

pwd

cd#

cd changes the working directory of the current Stata session.

cd "your_path_here"

split#

split parses one string variable into several string variables (two in the example below) based on a delimiter within the string.

import delimited "${repo}/transplants.txt", clear
// we can take a look at the dx variable
// it is formatted as x=xxxx
// this means it has a very special delimiter of =
// therefore, we can split it into two variables by using this special delimiter
split dx, p("=")
// now we can see that there are two variables generated, one called dx1 and one called dx2
// dx1 refer to all information before =
// dx2 refer to all information after =

destring#

destring converts a string variable to a numeric type such as int or float. destring can be problematic if the variable contains non-numeric text. Use the replace option to substitute the numeric version for the original variable, or the gen(var_name) option to store the numeric version in a new variable.

import delimited "${repo}/transplants.txt", clear
split dx, p("=")

destring dx1, replace
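
The gen() option and the non-numeric problem mentioned above can be sketched on a small simulated dataset. The variable names score and score_num are hypothetical, and the force option is used here only to show how destring treats entries it cannot convert.

```stata
clear
set obs 3
gen str3 score = "10"
replace score = "20" in 2
replace score = "n/a" in 3
// gen() keeps the original string variable and stores the numbers in a new one
// force turns non-numeric entries such as "n/a" into missing values
destring score, gen(score_num) force
list
```

Without force, destring would refuse to convert score because of the "n/a" entry, so it is worth inspecting such values before forcing the conversion.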

delimit#

#delimit is a special command that changes the symbol Stata treats as the end of a line of code. In short, if we set #delimit to ;, Stata treats everything before a ; as one single long line of code. By default, #delimit is set to cr, which means that every time the user hits enter in a do-file, that line of code ends unless otherwise specified.

#delimit ;
noi 
di 
as error
"This is an example";
#delimit cr

matrix#

matrix is a special way of storing information (in matrices) in Stata. Matrices are special objects in Stata that are defined, displayed, stored, and called in their own ways.

mat X = (1+1, 2*3/4 \ 5/2, 3)
mat list X
// to call a value in matrix
di X[1,2]
mat drop X
// to see more
help mat

count#

count quickly counts the number of observations in the dataset. When used with an if condition, it counts the observations that satisfy the condition.

import delimited "${repo}/transplants.txt", clear
count
count if gender == 0
count if race == 1

preserve#

preserve takes a snapshot of the current dataset and stores it. When other code changes the dataset, this snapshot does not change and can serve as a back-up point if the user wants to recover the dataset.

preserve itself only stores the snapshot and does not recover or update the current dataset. Only one snapshot can exist at a time: once preserve has been called, it cannot be called again until the dataset has been restored.

preserve

restore#

restore tells Stata to recover the dataset from the snapshot taken by preserve.

restore cannot act by itself because it does not create a snapshot; it has to be called after preserve to recover the dataset in memory. To discard the snapshot without affecting the current dataset, use the not option.

// these lines can be executed on one-by-one basis to best understand the command results.
import delimited "${repo}/transplants.txt", clear
preserve
di "`c(N)' observations before modification preserved"
clear
di "`c(N)' observations after modification, dataset changed"
restore
di "`c(N)' observation after restore, dataset recovered"
preserve
di "`c(N)' observations before modification preserved"
clear
di "`c(N)' observations after modification, dataset changed"
restore, not
di "`c(N)' observation after restore, dataset not recovered if not option is used in restore"

program#

program tells Stata that a customized program is being defined or modified.

A program definition has to start with program define and end with end. Stata cannot hold two programs with the same name. To call a defined program, just enter the program name. To drop a program, use program drop. It is a good approach to drop the program before defining it, which avoids many naming issues.

capture program drop example
program define example
    noi di "This is an example program"
end
example

syntax#

syntax is a special command which has to be used within the definition of a program (i.e. between program define and end).

syntax declares the mandatory and optional syntax elements and options that the program accepts. Each element or option is stored in a local macro with the same name. Elements and options wrapped in [] are optional; all others are mandatory. Optional numeric options need a default value, but mandatory ones do not.

capture program drop example
program define example

    syntax [in] [if], opt1(int) [opt2(real 0.1)]
    import delimited "${repo}/transplants.txt", clear
    noi di "opt1=" `opt1'
    noi di "opt2=" `opt2'

    keep in 1/50
    list `in' `if'

end

example, opt1(3) opt2(2.45)
example in 1/10 if gender == 0, opt1(20)

tokenize#

tokenize splits a list into its individual elements and stores them in numbered local macros (`1', `2', and so on) according to their order in the list.

local example_list a1 a2 a3 a4 a5 a6
tokenize `example_list'
macro list
forvalues i = 1/6 {
    noi di "No.`i' element: ``i''"
}

postfile#

postfile declares a new dataset file, together with its variable names, to which results can be posted row by row with post.

Each postfile has to be closed with postclose before the posted dataset can be used or another dataset can be created under the same post name.

postfile your_post_name var_name1 var_name2 var_name3 using "your_file_name", replace
post your_post_name (1) (2) (3)
postclose your_post_name
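
A common use of postfile is collecting one row of results per loop iteration. The block below is a minimal sketch of that pattern; tempname and tempfile are additional Stata commands not covered in this catalog, and the names results, sim_out, iteration, and draw are hypothetical.

```stata
tempname results   // handle that post and postclose refer to
tempfile sim_out   // temporary file that receives the posted rows
postfile `results' iteration draw using `sim_out', replace
forvalues i = 1/5 {
    // each post call appends one row: the iteration number and a random draw
    post `results' (`i') (runiform())
}
postclose `results'
// load the posted rows to inspect them
use `sim_out', clear
list
```

Using a temporary name and file avoids clashes with existing post names or files on disk; a regular file name works the same way when the results need to be kept.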

version#

version is a special command that sets the version of Stata under which code runs. Running under a specific version means Stata interprets all commands as that version would, so later commands or changes are not applied.

version can only be set to the installed version or an earlier one. For example, if version 18.0 is installed, any version up to 18.0 can be set through the version command.

version 12.0

regress#

regress is the command for linear regression. The first variable after regress is treated as the outcome variable, and all following variables are treated as covariates. When a covariate is categorical, put i. before the variable name to tell Stata to treat it as categorical, with the first (lowest) level as the reference level. To select a different reference level, use ib#. instead of i., where # is the chosen level. For example, to select race == 3 as the reference level, use ib3.race.

import delimited "${repo}/transplants.txt", clear
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
label values race1 race
regress bmi age gender i.race1
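
The difference between i. and ib#. can be seen in a short sketch. It uses the auto example dataset that ships with Stata (loaded with sysuse), which is an assumption made for illustration and is not part of the transplants data used elsewhere in this catalog.

```stata
sysuse auto, clear
// default: the lowest level of rep78 (1) is the reference level
regress price mpg i.rep78
// ib3. makes rep78 == 3 the reference level instead
regress price mpg ib3.rep78
```

Both models fit the same data; only the omitted category, and therefore the interpretation of the rep78 coefficients, changes.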

Extraction of Coefficients#

standard result matrix#

Stata always stores the most recent regression results so that the user can refer to coefficients and standard errors directly through _b[varname] and _se[varname].

import delimited "${repo}/transplants.txt", clear
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
label values race1 race
regress bmi age gender i.race1

local age_coef = _b[age]
local race4_coef = _b[4.race1]
macro list

lincom#

lincom estimates linear combinations of coefficients. It is always based on the most recent regression results. When lincom is applied to a single coefficient from the regression, it stores that coefficient with its confidence interval and p-value in the return list for extraction.

import delimited "${repo}/transplants.txt", clear
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
label values race1 race
regress bmi age gender i.race1

lincom age
return list

local coef = `r(estimate)'
local ci_lb = `r(lb)'
local ci_ub = `r(ub)'
local p = `r(p)'
local se = `r(se)'

lincom 2.race1
return list

matrix#

The matrix approach is another way of extracting coefficients from regression results. It is more advanced than the lincom approach and extracts the whole regression result into a matrix, so that any piece of information can be referred to directly through the matrix. It relies on the fact that Stata stores the regression results as a matrix, r(table), in the return list.

import delimited "${repo}/transplants.txt", clear
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
label values race1 race
regress bmi age gender i.race1
return list

matrix results = r(table)'
// the ' transposes the matrix so that each covariate is a row and each statistic is a column, which makes cells easier to refer to

// extracting for age
local age_coef = results["age", "b"]
local age_se = results["age", "se"]
local age_p = results["age", "pvalue"]
local age_lb = results["age", "ll"]
local age_ub = results["age", "ul"]

// extracting for a certain level of race
local race_coef = results["4.race1", "b"]
macro list

describe#

describe quickly shows the type and label information of a variable. When no variable is specified, it shows information for all variables in the dataset.

import delimited "${repo}/transplants.txt", clear
describe age
describe

rename#

rename changes the name of specific variables.

import delimited "${repo}/transplants.txt", clear
rename age age_new

codebook#

codebook shows brief information, such as type and range, about specific variables. If no variable is specified, it shows information for all variables in the dataset.

import delimited "${repo}/transplants.txt", clear
codebook
codebook age

list#

list shows the specified variables for the rows that satisfy certain conditions. If no condition is specified, list shows all rows in the dataset. If no variables are specified, list displays all variables for each row.

cls 
import delimited "${repo}/transplants.txt", clear
list bmi if age > 80
list bmi
list

sort/gsort#

sort sorts the rows by the specified variables in ascending order (A-Z or 1-10). gsort allows sorting in either ascending or descending order (add a - sign before a variable name for descending order).

import delimited "${repo}/transplants.txt", clear

sort age
gsort -age
sort age bmi
gsort -age bmi

recode#

recode regroups certain values of a variable into the specified groups.

import delimited "${repo}/transplants.txt", clear
recode age (1/50 = 1) (51/60 = 2) (61/70 = 3) (71/85 = 4)
tab age
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
recode race1 (1 2 = 1) (4 5 = 2) (6 9 = 3)
tab race1
label values race1 race
tab race1

save#

save saves the current dataset in memory to a .dta file.

import delimited "${repo}/transplants.txt", clear

save new_set, replace

assert#

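assert verifies that a condition is true. If the condition is true, execution continues silently; if it is false for any observation, assert halts the script or program with an error message. A condition always has to be specified.

import delimited "${repo}/transplants.txt", clear
assert _N > 0
// the next assertion is false, so Stata halts with an error
assert 1 == 2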

return/ereturn/creturn#

return and ereturn lists hold the results that Stata automatically stores behind the scenes after a command is run: r() results come from general commands and e() results from estimation commands. These results can be extracted into local macros.

In contrast, the creturn list holds constants and properties of the system or dataset, which do not depend on the command that was run.

creturn list

import delimited "${repo}/transplants.txt", clear
levelsof race, local(race_val)
local lab_helper
foreach i in `race_val' {
    local s_helper : di "`i'"
    local lab_helper : di "`lab_helper'" " " subinstr(substr("`i'",3,.), " ", "-", .)
    local lab_helper : di strtrim("`lab_helper'")
}
split race, p("=")
destring race1, replace
capture label drop race
tokenize "`lab_helper'"
label define race 1 "`1'" 2 "`2'" 4 "`3'" 5 "`4'" 6 "`5'" 9 "`6'"
label values race1 race
regress bmi age gender i.race1
return list
ereturn list
creturn list
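
Extracting stored results into local macros can be sketched as follows. The example uses the auto dataset shipped with Stata (an assumption made for illustration), and the macro names mean_price, n_obs, and r2 are hypothetical.

```stata
sysuse auto, clear
// summarize is a general command, so its results are stored in r()
sum price
local mean_price = r(mean)
// regress is an estimation command, so its results are stored in e()
regress price mpg
local n_obs = e(N)
local r2 = e(r2)
di "mean price: `mean_price'; N: `n_obs'; R-squared: `r2'"
```

Stored results are overwritten by the next command of the same class, so it is safest to copy them into macros right after the command that produced them.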

which#

which shows whether a command is built-in or defined in an .ado file, and where that file is located.

which tab
which chelp

by/bys#

by varname: is a prefix that tells Stata to run the command separately for each level of a variable. by requires the data to be sorted by that variable already; bys (bysort) is the same as by but sorts the data by the grouping variable first.

import delimited "${repo}/transplants.txt", clear
sort gender
by gender: sum age
bys gender: sum age

egen#

egen is similar to gen, but it provides special functions that operate across rows or across columns.

import delimited "${repo}/transplants.txt", clear
egen tag = tag(race gender)
help egen

twoway#

twoway is a graphing command prefix in Stata. twoway simply tells Stata to draw a graph based on two variables. Stata graphing offers many options for modifying a graph; some common ones are:

title()

xlab()

ylab()

xtitle()

ytitle()
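
A short sketch of these options in use is shown below. It relies on the auto dataset shipped with Stata (an assumption made for illustration), and the titles and axis ranges are arbitrary choices.

```stata
sysuse auto, clear
// /// continues a command on the next line inside a do-file
twoway scatter price mpg, ///
    title("Price by mileage") ///
    xtitle("Mileage (mpg)") ytitle("Price (USD)") ///
    xlab(10(10)40) ylab(0(5000)15000)
```

The numlist 10(10)40 places axis labels at 10, 20, 30, and 40; the same options work with other twoway plot types.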

scatter#

scatter is a type of twoway plot that Stata can draw, also known as a point plot.

import delimited "${repo}/transplants.txt", clear
twoway scatter bmi age
// check about the graphing options
help scatter

line#

line is a type of twoway plot that Stata can draw.

import delimited "${repo}/transplants.txt", clear
twoway line bmi age

merge#

merge combines two datasets based on a common key variable between them. The dataset in memory is considered the master dataset, and the dataset being joined to it is considered the matching (using) dataset. Depending on the uniqueness of the key variable in each dataset, merge comes in several types:

1:1 - the key variable is unique for each observation in both the master and matching datasets

1:m - the key variable is unique in the master dataset, but not in the matching dataset

m:1 - the key variable is unique in the matching dataset, but not in the master dataset

m:m - the key variable is unique in neither the master nor the matching dataset

It is better to make sure the key variables together identify every row uniquely so that a 1:1 merge can be used, as it is more stable and reliable.

import delimited "${repo}/transplants.txt", clear
save transplants, replace
import delimited "${repo}/donors_recipients.txt", clear
save dr, replace
use transplants, clear
merge 1:1 fake_id using dr
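
When the key is unique in only one of the two datasets, a many-to-one merge (written m:1 in Stata) can be used. The block below is a minimal sketch on simulated data; the datasets and the names center_id, center_name, patient_id, and centers are hypothetical.

```stata
// lookup dataset: one row per center (key is unique here)
clear
input center_id str10 center_name
1 "North"
2 "South"
end
tempfile centers
save `centers'

// master dataset: several patients can share one center (key is not unique)
clear
input patient_id center_id
1 1
2 1
3 2
end
merge m:1 center_id using `centers'
// _merge reports where each row came from; 3 means matched in both datasets
tab _merge
```

Checking _merge after every merge is a cheap way to confirm that rows matched as expected.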

correlate#

correlate estimates the correlations between the specified variables.

import delimited "${repo}/transplants.txt", clear
corr age bmi

collapse#

collapse creates a new dataset with one record per level of the by variable. Each row holds summary statistics of the specified variables, such as (min), (max), and (mean), computed from the original dataset.

import delimited "${repo}/transplants.txt", clear
collapse (mean) age, by(ctr_id)
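
Several statistics can be requested in one collapse call by naming each result. The sketch below uses the auto dataset shipped with Stata (an assumption made for illustration); the new variable names mean_price, min_mpg, and max_mpg are hypothetical.

```stata
sysuse auto, clear
// newvar = oldvar stores each statistic under its own name
collapse (mean) mean_price = price (min) min_mpg = mpg (max) max_mpg = mpg, by(foreign)
list
```

Because collapse replaces the dataset in memory, it is often paired with preserve and restore when the original data are still needed.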

reshape#

reshape converts a dataset between long shape and wide shape. In a long-shape dataset, one subject has multiple rows of data, each representing a different visit that measured the same information. In a wide-shape dataset, one subject has only 1 row of data that holds all of the visits.

Long Example:
id  varA  varB  varC  visit
1   xxx   xxx   xxx   1
1   xxx   xxx   xxx   2
1   xxx   xxx   xxx   3

Wide Example:
id  varA1  varB1  varC1  varA2  varB2  varC2  varA3  varB3  varC3
1   xxx    xxx    xxx    xxx    xxx    xxx    xxx    xxx    xxx

// simulating a long format dataset
clear
set obs 10
gen id = _n
gen exp = round(runiform(1,5))
expand exp
drop exp
sort id
bys id: gen visit = _n
gen varA = rnormal(50, 4)
gen varB = round(runiform(0,1))
gen varC = runiform(1,10)
// now we have a long form dataset, each person has 1-5 visits, each visit measures varA, varB, and varC
// now we convert it into wide format
reshape wide varA varB varC, i(id) j(visit)
// now it is in wide shape, and we see that the information in visit variable is now being integrated into variable names
// let's try reshape it back to long format
reshape long varA varB varC, i(id) j(visit)

expand#

expand replicates each row a specified number of times.

clear
set obs 10
gen id = _n
gen exp = round(runiform(1,5))
expand exp

histogram#

histogram is a kind of plot that Stata can produce. As a graph command, it also accepts graphing options to modify its output. See the twoway section for more information.

import delimited "${repo}/transplants.txt", clear
hist gender
hist bmi, bin(10)

mathematical functions#

runiform()#

runiform() is a math function that generates a random number from a uniform distribution. runiform() takes 2 numbers as the min and max of the distribution.

di runiform(0,10)

rnormal()#

rnormal() is a math function that generates a random number from a normal distribution. rnormal() takes 2 numbers as the mean and standard deviation of the distribution.

di rnormal(10,1)

floor()#

floor() is a math function that rounds the number it takes in to the greatest integer that is less than or equal to the number.

di floor(3.5)
di floor(3.2)
di floor(3)
di floor(3.8)

ceil()#

ceil() is a math function that rounds the number it takes in to the smallest integer that is greater than or equal to the number.

di ceil(3)
di ceil(3.1)
di ceil(3.5)
di ceil(3.9)

round()#

round() is a math function that rounds the number to the nearest integer.

di round(3)
di round(3.1)
di round(3.5)
di round(3.9)

int()#

int() is a math function that truncates everything after the decimal point.

di int(3)
di int(-3.1)
di int(-3.5)
di int(3.9)

min()#

min() is a math function that returns the minimum number of a list of numbers.

di min(1, 3, 5, 7.5, 9)

max()#

max() is a math function that returns the maximum number of a list of numbers.

di max(1, 3, 5, 7.5, 9)

exp()#

exp() returns e raised to the power of the number it takes in.

di exp(3)

ln()#

ln() takes the natural log of the number it takes in.

di ln(20)

sqrt()#

sqrt() takes the square root of the number it takes in.

di sqrt(4)
di sqrt(9)

abs()#

abs() returns the absolute value of the number it takes in.

di abs(-3)

mod()#

mod() returns the modulus (remainder) of the first number divided by the second.

di mod(529,10)

sin()#

sin() is the sine function.

di sin(`c(pi)' / 2)

string functions#

word()#

word() gets the nth word of a string.

di word("Hello, is there anybody in there?", 4)

strlen()#

strlen() counts the number of characters in a string.

di strlen("testtesttest")

regexm()#

regexm() checks whether a string or regular expression pattern occurs within another string; it returns 1 if so and 0 otherwise.

di regexm("testtesttest", "tt")
di regexm("testtesttest", "tett")

date and time functions#

%td#

%td is a display format that shows a numeric value as a date (days since 1/1/1960).

di %td 19400
di %td 366

%tc#

%tc is a display format that shows a numeric value as a date and time (milliseconds since 1/1/1960).

di %tc 4*365.25*24*60*60*1000

td()#

td() is a function that turns a date written in the form ddmonyyyy into a daily date value.

di td(04jul1976)
di td(05may2024)

mdy()/dmy()#

mdy() and dmy() are functions that take three numbers and convert them to a daily date value. m stands for month, d stands for day, y stands for year.

di mdy(01,02,2024)
di dmy(02,01,2024)

mdyhms()#

mdyhms() extends mdy() by also taking h (hours), m (minutes), and s (seconds), and returns a date-time value.

di mdyhms(1,1,2011,5,15,00)

date()#

date() converts a string into a daily date value based on the specified format.

di date("August 15, 1969", "MDY")
di date("03June2024", "DMY")

clock()#

clock() creates a time stamp from the date-time string given in first place, read according to the format given in second place.

di "$S_DATE"
di "$S_TIME"
local start = clock("$S_DATE $S_TIME", "DMYhms")

timer#

timer controls the timers in Stata.

To start a timer, use timer on followed by a number that represents the timer slot. To stop it, use timer off with the corresponding slot number.

To show all timer results, use timer list. After timer list, use r(tx) to extract the value of timer slot x.

timer on 1
timer off 1
timer on 2
timer off 2
timer list
di `r(t2)'

cond()#

cond() is a function similar to an if-else condition. It takes a condition and checks whether it is satisfied: if yes, it returns the value in the second place; if no, it returns the value in the third place.

local n = runiform(0, 10)
di cond(`n'<5, "small", "large")