📘 Day 1: Introduction to Stata Programming

Graduate-Level Short Course

🎯 Learning Objectives

By the end of today’s session, you will be able to:

  1. Navigate the Stata interface and understand its workflow

  2. Execute and interpret basic Stata commands and syntax

  3. Identify and differentiate types of Stata outputs

  4. Begin writing clear, shareable, and modular code

📝 Pre-Class Survey

Before we begin, please complete the short entry survey to help us tailor the course to your technical setup and prior experience:

👉 Launch Survey on Zoom

Questions:

  1. How will you access Stata this week?

    • Locally on your laptop

    • Remotely (e.g., SSH, RDP, terminal)

  2. Which operating system are you using?

    • macOS

    • Unix/Linux

    • Windows

  3. What’s your experience level with statistical software?

    • 🚫 No Experience

    • 📗 Basic Knowledge

    • 🧪 Novice User

    • 🛠️ Competent User

    • 📊 Advanced User

    • 🧠 Expert User

1. 🔍 Understanding Stata: System vs. User

graph TD
    A[System] --> B[Native Stata]
    A --> C[Third-party .ado files]
    A --> D[Your .ado files]
    E[User] --> F[Known Users]
    E --> G[Unknown Users]

Conceptual Distinctions:

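  • System: The Stata application and the commands it can run: native Stata commands, third-party .ado files, and your own .ado files

  • User: You and your collaborators (known users), plus future analysts who inherit your code (unknown users)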

  • Empathy: Good code anticipates user confusion

  • Sharing is Caring: Write your code like it will be inherited
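One way to act on "write your code like it will be inherited" is to open every do-file with the same short header. The sketch below is only an illustration: the file names `day1_example.do` and `day1_example.log` are hypothetical, and the version number is an assumption that should match the Stata release you actually run.

```stata
* day1_example.do (hypothetical file name)
* Purpose: one or two lines a future reader can scan in seconds

* Assumption: replace 15 with the Stata release you actually run
version 15

* Start from an empty session
clear all

* Close any log left open; capture ignores the error if none is open
capture log close

* Record commands and results in a plain-text log (hypothetical file name)
log using day1_example.log, replace text

* --- analysis goes here ---
display "Hello Stata Programmers!"

log close
```

Pinning the version and logging every run means a collaborator can rerun the file and compare their log against yours.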

2. 🖥️ Stata Interface Walkthrough

Key Windows (shortcuts shown for macOS; on Windows, use Ctrl in place of ⌘)

  1. Command – ⌘1

  2. Results – ⌘2

  3. History – ⌘3

  4. Variables – ⌘4

  5. Do-file Editor – ⌘9

3. 🧠 Stata Syntax 101

Anatomy of a Command

command [varlist] [if] [in] [, options]

Example:

summarize age if gender == 1, detail
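To see each optional piece of the syntax in action, the sketch below uses the `auto` dataset that ships with Stata. That dataset and its variables `price`, `mpg`, and `foreign` are assumptions for illustration; they are not the `age` and `gender` variables in the example above.

```stata
* Example dataset shipped with Stata
sysuse auto, clear

* command   varlist     if                in       options
summarize   price mpg   if foreign == 1   in 1/60, detail

* The same command with the optional parts dropped one at a time
summarize price mpg if foreign == 1
summarize price mpg
summarize
```

With no varlist, `summarize` reports on every variable. With no `if` or `in`, it uses every observation.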

Command Types

  1. Native – Built-in commands (highlighted blue in the Do-file Editor)

  2. Third-party – Installed via SSC or GitHub (not highlighted; shown in the default text color, which is white in a dark theme)

  3. User-written – Your own .ado files

Try it:

display "Hello Stata Programmers!"

4. 📂 Working with Datasets

Life Expectancy Dataset

webuse lifeexp, clear
describe

Output Types Visualized

graph TD
    Results --> String
    Results --> Numeric
    String --> Text
    String --> URL
    String --> Filepath
    Numeric --> Integer
    Numeric --> Decimal
    Integer --> byte
    Integer --> int
    Integer --> long
    Decimal --> float
    Decimal --> double
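The storage types in the diagram can be inspected directly in the life expectancy data. This is a minimal sketch that relies on the variables `country` and `lexp` used elsewhere in this session; the local macro name `first` is made up for illustration.

```stata
webuse lifeexp, clear

* The "storage type" column shows str#, byte, int, long, float, or double
describe

* List the string variables, then the numeric ones
ds, has(type string)
ds, has(type numeric)

* Commands also leave results behind in r()
summarize lexp
return list
display "Mean life expectancy: " r(mean)

* A string result: the country name in the first observation
local first = country[1]
display "`first'"
```

`return list` is a quick way to find out what a command stores and whether each stored result is numeric or string.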

5. 🧪 Hands-On Practice

📊 Task 1: Country vs. Life Expectancy

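* Load the example data; the clear option drops whatever is in memory
webuse lifeexp, clear
* country is a string, so encode creates a labeled numeric copy for plotting
encode country, gen(Country)
twoway scatter lexp Country, xscale(off)
* Save the graph to the current working directory
graph export lexp_bycountry.png, replace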

🎲 Task 2: Simulated BMI Distribution

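clear
* Create 1,000 empty observations, then fill them with normal draws (mean 28, SD 5)
set obs 1000
generate bmi = rnormal(28, 5)
* Overlay a normal density curve on the histogram
histogram bmi, normal
graph export bmi.png, replace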
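Because `rnormal()` draws new values on every run, the histogram will differ slightly each time. If you need the same picture twice, one option is to set a seed first. The sketch below assumes an arbitrary seed of 12345 and uses 30 purely as an illustrative cutoff.

```stata
clear
* Arbitrary seed (an assumption); any fixed value makes the draws repeatable
set seed 12345
set obs 1000
generate bmi = rnormal(28, 5)

* The mean should land near 28 and the standard deviation near 5
summarize bmi

* Share of simulated values at or above 30 (an illustrative cutoff)
count if bmi >= 30
display "Share at or above 30: " r(N)/_N
```

Rerunning the block with the same seed reproduces the same draws. Changing or removing the seed gives a new sample.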

6. 🚨 Debugging Common Errors

❌ Unrecognized Command

myfirstprogram

Output:

command myfirstprogram is unrecognized
r(199);
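Stata returns r(199) here because nothing named `myfirstprogram` exists yet. As a minimal sketch, defining a program with that name in the current session is enough to make the command run. The message text is made up for illustration.

```stata
* Drop an earlier definition, if any, so the block can be rerun
capture program drop myfirstprogram

program define myfirstprogram
    display "Hello from myfirstprogram"
end

* The command is now recognized in this session
myfirstprogram
```

A program defined this way lasts only until Stata closes. Saving the definition in a file named `myfirstprogram.ado` on the ado-path makes it available in every session.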

🛠️ Troubleshooting Checklist

  1. Check spelling (Stata is case-sensitive)

  2. Confirm package installation

  3. Use help commandname to review syntax

  4. Use which commandname to locate command source
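The checklist can be run as a short sequence of commands. In this sketch, `summarize` stands in for a built-in command and `estout` for a third-party package. Both are chosen only for illustration, and the install step assumes an internet connection.

```stata
* Locate the command source; a typo or a missing package fails here
which summarize

* Review the syntax diagram and options in the Viewer
help summarize

* For a third-party command, install it only if which cannot find it
capture which estout
if _rc != 0 {
    ssc install estout
}
```

`capture` suppresses the error from `which` and stores the return code in `_rc`, so the install runs only when the command is missing.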

7. ✅ Key Takeaways

  • Stata has a structured, predictable syntax

  • Learn to recognize command types and their origin

  • Outputs are typed: string or numeric, and numeric values come in finer storage types (byte, int, long, float, double)

  • Thoughtful code helps future-you (and others!)


📌 For Tomorrow

  1. Install any needed third-party packages (we’ll guide you)

  2. Revisit core commands with help and search

  3. Bring at least one question or observation from today