Day 1: Introduction to Stata Programming

Graduate-Level Course

Learning Objectives:

  1. Understand Stata’s interface and basic workflow

  2. Learn fundamental Stata commands and syntax

  3. Distinguish between different types of Stata output

  4. Begin developing good programming practices

Pre-Class Survey

Please complete the brief entry survey to help us tailor the course to your needs:

Launch Survey on Zoom

Survey Questions:

  1. How will you use Stata this week?

    • Locally on my laptop

    • Remotely on another desktop or terminal

  2. What operating system will you use?

    • macOS

    • Unix

    • Windows

  3. What is your experience level with statistical software?

    • No Experience

    • Basic Knowledge

    • Novice User

    • Competent User

    • Advanced User

    • Expert User

1. Stata Fundamentals

System vs. User Perspective

graph TD
    A[System] --> B[Native Stata]
    A --> C[Third-party .ado files]
    A --> D[Your .ado files]
    
    E[User] --> F[Known Users]
    E --> G[Unknown Users]

Key points:

  • System: The Stata application and its components (the name is written “Stata”, not “STATA”)

  • User: You, TAs, collaborators, or future users of your code

  • Empathy: Anticipate user needs when writing programs

  • Sharing: Make your code accessible to others

  • Caring: Write user-friendly, well-documented code

2. Stata Interface Components

Key Windows

The shortcuts listed here are for macOS; on Windows, press Ctrl in place of ⌘.

  1. Command (⌘1)

  2. Results (⌘2)

  3. History (⌘3)

  4. Variables (⌘4)

  5. Do-file Editor (⌘9)

3. Commands and Syntax

Anatomy of Stata Code

command [varlist] [if] [in] [, options]

Example:

summarize age if gender == 1, detail
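
To see each piece of this syntax in action, here is a minimal sketch that uses the auto dataset shipped with Stata. The dataset and its variables (price, mpg, foreign) are assumptions chosen for illustration; they are not part of the course data.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* command   varlist     if                in        , options
summarize   price mpg   if foreign == 1   in 1/74   , detail

* The same command with only the command name and a varlist
summarize price
```

Only the command name is always required; the varlist, the if and in qualifiers, and the options can each be left out when a command does not need them.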

Command Types

  1. Native Stata commands: Blue in command window

  2. Third-party commands: White (may need installation)

  3. User-written commands: Your own .ado files

Try this basic command:

display "Hello Stata Programmers!"
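
One way to check where a command comes from is the which command. This sketch is only an illustration: the file path reported for an .ado file depends on your installation.

```stata
* display is compiled into Stata itself, so which reports a built-in command
which display

* histogram is implemented as an .ado file shipped with Stata,
* so which reports the location of histogram.ado on your machine
which histogram
```

Third-party and user-written commands are also .ado files, so which reports their location in the same way once they are installed.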

4. Working with Data

Example Dataset

Let’s explore the life expectancy dataset:

webuse lifeexp, clear
describe

Output Types

graph TD
    Results --> String
    Results --> Numeric
    String --> Text
    String --> URL
    String --> Filepath
    Numeric --> Integer
    Numeric --> Decimal
    Integer --> byte
    Integer --> int
    Integer --> long
    Decimal --> float
    Decimal --> double
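
The distinction between numeric and string results can be sketched with the lifeexp data. The choice of commands here (summarize, ds, and the type extended macro function) is an assumption made for illustration; other commands store results in the same way.

```stata
webuse lifeexp, clear

* Numeric results: summarize stores scalars such as r(mean) and r(N)
summarize lexp
return list
display "Mean life expectancy: " r(mean)

* String result: ds stores the matching variable names in the macro r(varlist)
ds, has(type string)
display "String variables: `r(varlist)'"

* Storage type of a single variable (byte, int, long, float, double, or str#)
local lexptype : type lexp
display "lexp is stored as: `lexptype'"
```

Running return list after a command is a quick way to see which results it left behind and whether each one is a scalar or a macro.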

5. Practical Exercise

Task 1: Basic Analysis

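* Load the example data
webuse lifeexp, clear
* Convert the string variable country into a labeled numeric variable
encode country, gen(Country)
* Plot life expectancy against the encoded country and hide the x-axis
twoway scatter lexp Country, xscale(off)
* Save the graph to the current working directory
graph export lexp_bycountry.png, replace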

Task 2: Simulation

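* Start with an empty dataset of 1,000 observations
clear
set obs 1000
* Draw BMI from a normal distribution with mean 28 and standard deviation 5
generate bmi = rnormal(28, 5)
* Plot the histogram with a normal density overlay, then save it
histogram bmi, normal
graph export bmi.png, replace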
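
Because rnormal() draws random numbers, the simulated values change on every run unless a seed is set. The sketch below shows one way to make the simulation reproducible; the seed value 12345 is arbitrary and the variable bmi2 is introduced only for this illustration.

```stata
clear
* Fix the random-number seed so the draws can be repeated
set seed 12345
set obs 1000
generate bmi = rnormal(28, 5)

* The sample mean and SD should be close to 28 and 5
summarize bmi

* Resetting the same seed reproduces exactly the same draws
set seed 12345
generate bmi2 = rnormal(28, 5)
assert bmi == bmi2
```

The final assert is silent when the two variables match and stops with an error if they differ.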

6. Common Errors

Unrecognized Command

myfirstprogram

Expected output:

command myfirstprogram is unrecognized
r(199);
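
The return code can also be inspected from code. The sketch below uses capture, which suppresses the error and stores the return code in _rc, and then defines a program with that name so the command becomes recognized. The program body is hypothetical and serves only as an illustration.

```stata
* Make sure no program with this name is already in memory
capture program drop myfirstprogram

* capture hides the error message; _rc holds the return code (199 here)
capture myfirstprogram
display "return code: " _rc

* Define a minimal program with that name (hypothetical body)
program define myfirstprogram
    display "Hello from my first program!"
end

* The command is now recognized and runs without error
myfirstprogram
```

A program defined this way lives only in the current session; saving it in a .ado file is what makes it available in later sessions.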

Troubleshooting Tips

  1. Check command spelling

  2. Verify required packages are installed

  3. Review syntax documentation (help commandname)
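
These checks can be sketched in code as follows. The name somepackage is a hypothetical placeholder, not a real package; substitute the command you are troubleshooting.

```stata
* 1. Spelling: a correctly spelled, installed command gives return code 0
capture which histogram
display "histogram: return code = " _rc

* 2. Installation: a nonzero return code means Stata cannot find the command
*    (somepackage is a hypothetical name used for illustration)
capture which somepackage
if _rc != 0 {
    display "somepackage was not found; check the spelling or install it"
}

* 3. Documentation: open the help file for a command
help histogram
```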

7. Key Takeaways

  • Stata has a consistent command syntax structure

  • Distinguish between native and user-written commands

  • Output can be strings or numeric values

  • Always document your code for future users

Next Steps

  1. Install any third-party commands needed for tomorrow

  2. Review basic Stata commands using help

  3. Bring questions about today’s material