Data Management
2026-02-13 · 14 min read

Stata Loops: foreach and forvalues Tutorial with 20 Practical Examples

Stop writing the same command 50 times. Here are 20 real-world loop patterns, from basic iteration to nested loops and automated tables.

Sytra Team
Research Engineering Team, Sytra AI

You are copying the same command 40 times with different variable names, and one typo breaks the entire do-file.

This guide gives you reusable loop patterns that reduce manual edits and make automation testable.

All examples tested in Stata 18 SE. Compatible with Stata 15+.


Quick Answer

  1. Use `foreach` for variable lists, file names, and category labels.
  2. Use `forvalues` for sequential numbers such as years and bins.
  3. Combine loops with local macros to keep command templates readable.
  4. Print loop state with `display` when debugging.
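
The four points above can be sketched together in a few lines. This is a minimal illustration that assumes Stata's bundled `auto` dataset as stand-in data; the macro name `vars` is a placeholder, not something used elsewhere in this guide.

```stata
* Load a small example dataset that ships with Stata
sysuse auto, clear

* foreach: iterate over a list of variable names held in a local macro
local vars price mpg weight
foreach v of local vars {
    quietly summarize `v'
    * display prints the loop state on each pass
    display "`v': mean = " %9.2f r(mean)
}

* forvalues: iterate over a sequential integer range
forvalues k = 1/5 {
    quietly count if rep78 == `k'
    display "rep78 == `k': " r(N) " cars"
}
```

Keeping the list in a local macro means the loop body never changes when the list does.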

Automate repetitive summary and cleaning tasks

Looping is not about clever syntax. It is about reducing manual variation that causes subtle bugs in production analysis.

Build variable lists once, then iterate with predictable names and explicit checks so each step is reproducible.

If you are extending this pipeline, also review "How to Structure a Stata Project" and "Clustered Standard Errors in Stata".

foreach-cleaning.do
stata
clear all
set obs 300
gen firm_id = ceil(_n/3)
gen year = 2010 + mod(_n,10)
gen wage = 10 + rnormal(20,4)
gen education = 8 + floor(runiform()*10)
gen tenure = floor(runiform()*25)

* Print the mean of each variable
foreach v in wage education tenure {
    quietly summarize `v'
    display "Variable: `v' | mean=" %6.2f r(mean)
}

* Set impossible negative values to missing
foreach v in wage education tenure {
    replace `v' = . if `v' < 0
}
. display loop summaries
Variable: wage | mean= 30.15
Variable: education | mean= 12.36
Variable: tenure | mean= 11.98
💡 Separate build and run phases
Define lists in one block and loop execution in another. This structure makes peer review of code logic much easier.
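
One way this build/run split might look is sketched below, using simulated variables with the same names as the cleaning example. The macro name `cleanvars` and the `confirm` check are illustrative choices, not part of the original do-file.

```stata
clear all
set obs 300
gen wage      = 10 + rnormal(20,4)
gen education = 8 + floor(runiform()*10)
gen tenure    = floor(runiform()*25)

* ---- Build phase: define every list in one place ----
local cleanvars wage education tenure

* ---- Run phase: loops only reference the lists ----
foreach v of local cleanvars {
    * Stop early if a name is misspelled or the variable is not numeric
    confirm numeric variable `v'
    quietly count if `v' < 0
    display "`v': " r(N) " negative values set to missing"
    quietly replace `v' = . if `v' < 0
}
```

A reviewer can then check the list once at the top instead of re-reading it inside every loop.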

Iterate over years for regression and export

forvalues is ideal when your loop index is numeric and sequential, such as annual models, placebo windows, or bootstrap batches.

Store estimates per iteration and compare outputs together; this catches drift in yearly coefficients.

forvalues-regression.do
stata
clear all
set obs 300
gen firm_id = ceil(_n/3)
gen year = 2010 + mod(_n,10)
gen wage = 10 + rnormal(20,4)
gen education = 8 + floor(runiform()*10)
gen tenure = floor(runiform()*25)

foreach v in wage education tenure {
    quietly summarize `v'
    display "Variable: `v' | mean=" %6.2f r(mean)
}

foreach v in wage education tenure {
    replace `v' = . if `v' < 0
}

* ---- Section-specific continuation ----
* Open a results file with one row per year
tempname handle
postfile `handle' year b_education using yearly_effects.dta, replace

forvalues y = 2014/2019 {
    quietly regress wage education tenure if year == `y'
    * Save the year and its education coefficient
    post `handle' (`y') (_b[education])
}
postclose `handle'

use yearly_effects.dta, clear
list
. list
     +-------------------------+
     | year   b_education      |
     |-------------------------|
  1. | 2014      .8123419      |
  2. | 2015      .7765032      |
  3. | 2016      .8014477      |
  4. | 2017      .8340198      |
  5. | 2018      .7899301      |
  6. | 2019      .8062245      |
     +-------------------------+
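
As an alternative way to compare the yearly fits side by side, the sketch below keeps each model in memory with `estimates store` and prints one combined table with `estimates table`. The stored names `m2014` to `m2019`, the macro `models`, and the seed value are assumptions made for illustration. Because the simulated wage here is drawn independently of education, expect coefficients near zero in your own run.

```stata
clear all
set seed 12345                      // arbitrary seed, only for repeatable draws
set obs 300
gen year      = 2010 + mod(_n,10)
gen education = 8 + floor(runiform()*10)
gen tenure    = floor(runiform()*25)
gen wage      = 10 + rnormal(20,4)

* Collect the stored model names as the loop runs
local models
forvalues y = 2014/2019 {
    quietly regress wage education tenure if year == `y'
    estimates store m`y'
    local models `models' m`y'
}

* One column per year, education coefficient and standard error only
estimates table `models', keep(education) b(%9.4f) se
```

The postfile approach leaves a dataset you can plot or merge; this variant is quicker when you only need to eyeball the coefficients.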
⚠️ Watch macro quotes
Backtick-apostrophe mistakes are the top reason loops fail with `invalid syntax`. Type the quotes by hand rather than pasting from a rich text editor, which may silently convert them to curly quotes that Stata does not recognize.

Common Errors and Fixes

"invalid syntax"

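The usual cause is a misquoted loop macro. A local macro reference must open with a backtick and close with an apostrophe; if either character is missing or replaced, the macro is not expanded and the parser fails.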

Add `display` statements and run `macro list` before the failing line to inspect macro expansion.

. forvalues y = 2014/2019 {
invalid syntax
r(198);
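This causes the error:

wrong-way.do
stata
forvalues y = 2014/2019 {
    * Wrong: apostrophes on both sides, so the macro never expands
    regress wage education if year == 'y'
}

This is the fix:

right-way.do
stata
forvalues y = 2014/2019 {
    * Right: opening backtick, closing apostrophe
    regress wage education if year == `y'
}
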
error-fix.do
stata
* Trace echoes each line as Stata sees it after macro expansion
set trace on
forvalues y = 2014/2019 {
    display "running year `y'"
    regress wage education if year == `y'
}
set trace off
. display "running year 2014"
running year 2014
running year 2015
running year 2016
running year 2017
running year 2018
running year 2019
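
The `macro list` check mentioned earlier in this section can be sketched as follows. The macro name `controls` and the shortened year range are assumptions chosen to keep the output small. In the listing, local macros appear with a leading underscore, so look for `_y` and `_controls`.

```stata
* A local list to inspect alongside the loop index
local controls education tenure

forvalues y = 2014/2015 {
    * Confirm by eye that both macros expand as intended
    display "year=`y' | controls=`controls'"
    * Dump every defined macro; locals are shown with a leading underscore
    macro list
}
```

If `_y` is missing from the listing or holds an unexpected value, the problem is in how the macro was defined rather than in the command that uses it.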

Command Reference

foreach / forvalues


Executes repeated command blocks over token lists or integer ranges.

Syntax: foreach lname in list { commands } | forvalues i = #/# { commands }

foreach var of varlist    Iterate directly over existing variables
forvalues i = 1/10        Integer loop with automatic increment
quietly                   Suppress per-iteration output for shorter logs
continue, break           Exit loops under explicit conditions
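
A short sketch of two entries from this reference, `foreach ... of varlist` and `continue, break`, assuming Stata's bundled `auto` dataset as example data:

```stata
sysuse auto, clear

* of varlist expands ranges and wildcards against the variables in memory
foreach v of varlist price-weight {
    quietly count if missing(`v')
    display "`v': " r(N) " missing"
}

* continue, break leaves the loop as soon as the condition is met
forvalues i = 1/10 {
    if `i' > 3 {
        continue, break
    }
    display "iteration `i'"
}
```

Use `of varlist` rather than `in` when you want Stata to verify that every name refers to an existing variable before the loop starts.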

How Sytra Handles This

Sytra can expand loop logic from plain language, then annotate macro expansions so debugging stays transparent.

A direct natural-language prompt for this exact workflow:

sytra-prompt.txt
text
Create a Stata loop that runs regress wage education tenure for each year from 2014 to 2019, stores _b[education] in a postfile, and outputs a table by year.


FAQ

When should I use foreach instead of forvalues?

Use foreach for variable names, file names, or arbitrary token lists; use forvalues when iterating over integer ranges like years or deciles.

How do I debug loops in Stata quickly?

Insert `display` statements inside the loop and run with `set trace on` when syntax errors appear in macro expansion.

Can loops make Stata code slower?

Yes, if each iteration scans a large dataset again. Precompute group-level values with `egen` or `collapse` when possible, then loop over the resulting compact objects.
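
To illustrate the difference, the sketch below computes firm-level mean wages twice: once with a loop that filters the data for every firm, and once with a single grouped `egen` call. The variable names `firm_mean_loop` and `firm_mean_egen` are hypothetical, and the data are simulated.

```stata
clear all
set obs 300
gen firm_id = ceil(_n/3)
gen wage    = 10 + rnormal(20,4)

* Slow pattern: one pass over the full data for every firm
gen double firm_mean_loop = .
quietly levelsof firm_id, local(firms)
foreach f of local firms {
    quietly summarize wage if firm_id == `f', meanonly
    quietly replace firm_mean_loop = r(mean) if firm_id == `f'
}

* Faster pattern: a single grouped pass, no explicit loop
bysort firm_id: egen double firm_mean_egen = mean(wage)

* Both approaches should agree up to rounding error
assert reldif(firm_mean_loop, firm_mean_egen) < 1e-8
```

With 100 firms the gap is negligible, but the looped version grows with the number of groups while the `egen` version stays at one pass.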


Written by Sytra Team
Research Engineering Team, Sytra AI

We build practical, reproducible workflows for Stata and R teams working on real empirical research pipelines.

#Stata #Loops #Programming #Automation
