Building a Replication Package in Stata: The Complete Checklist
AER, QJE, and REStud now require replication packages. Here's a complete checklist for building one in Stata — directory structure, master .do file, data documentation, and automated testing.
AER, QJE, REStud, Econometrica — the top journals now require replication packages. If your code can’t reproduce your results from scratch, your paper doesn’t get published. This isn’t a suggestion; it’s a gate.
Here’s a complete checklist for building a replication package in Stata that passes the data editor’s review on the first try.
Directory Structure
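One workable layout keeps raw data, derived data, code, and output in separate folders under a single root. The sketch below creates such a skeleton from Stata; the folder names and the placeholder root path are assumptions for illustration, not a layout any journal mandates.

```stata
* Create a replication-package skeleton under one root folder.
* All folder names here are illustrative assumptions; rename to suit your project.
global root "/path/to/replication"   // placeholder: set to your project folder

* Parents are listed before children so each mkdir has an existing parent.
foreach d in data data/raw data/derived code output output/tables output/figures output/logs {
    capture mkdir "$root/`d'"        // capture: ignore the error if the folder already exists
}
```

Keeping raw data in its own folder that no script writes to makes it easy to show that every derived file and every output can be regenerated from scratch.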
The Master .do File
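A master .do file is the single entry point: it sets the root path once, derives every other path from it, and runs the numbered scripts in order. What follows is a minimal sketch under stated assumptions. The folder globals and the names `01_clean.do` and `02_build.do` are hypothetical, and the pinned version should be whichever release you actually developed in.

```stata
* master.do: runs the full pipeline from raw data to final tables and figures.
* The root path is the only line a replicator should need to edit.
version 17                            // assumption: pin to the version used for the paper
clear all
set more off

global root "/path/to/replication"    // placeholder: edit this line only

* Every other path is derived from the root (folder names are illustrative).
global data   "$root/data"
global code   "$root/code"
global output "$root/output"

* Keep a plain-text log of the whole run.
capture log close _all
log using "$output/logs/master.log", replace text name(master)

* Run the numbered scripts in order (01 and 02 are hypothetical names).
do "$code/01_clean.do"
do "$code/02_build.do"
do "$code/03_analysis.do"
do "$code/04_tables.do"

log close master
```

Because every script reads `$data`, `$code`, and `$output` rather than a literal path, moving the package to another machine means changing one line.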
The README
The README is the single most important file. Data editors read it first. Include:
- Overview: One paragraph describing the paper and the replication package.
- Data Availability Statement: Where the data comes from. If it’s restricted, explain how to request access.
- Computational Requirements: Stata version, required packages, estimated runtime, hardware requirements.
- Instructions: “Edit the root path in master.do, then run master.do.”
- Output Map: Which script produces which table/figure in the paper.
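Most of the Computational Requirements entry can be read straight from Stata rather than typed from memory. The snippet below is a rough sketch that prints the relevant system values and times a block of work; where you place the timer in your own scripts is up to you.

```stata
* Print the facts a README's "Computational Requirements" section needs.
display "Stata version : `c(stata_version)'"
display "Operating sys : `c(os)' `c(osdtl)'"
display "Machine type  : `c(machine_type)'"
display "Processors    : `c(processors)'"
display "Run date      : `c(current_date)' `c(current_time)'"

* List the community-contributed packages installed on this machine.
ado dir

* Time the run to report an estimated runtime.
* Start the timer after any -clear all-, which resets all timers to zero.
timer clear 1
timer on 1
* ... analysis steps run here ...
timer off 1
timer list 1
```

Copy the displayed values into the README, and report the runtime together with the machine it was measured on.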
Checklist
Common Reasons for Rejection
- Hardcoded paths: "C:\Users\Jane\Desktop\..." appears 47 times across 12 .do files. Solution: use globals set in master.do.
- Missing packages: The code uses reghdfe but doesn’t install it. The replicator gets “command not found.”
- Interactive steps: “Run 03_analysis.do, then manually copy the coefficient from the log and paste it into 04_tables.do.” No.
- Unlabeled output: Table 3 in the paper doesn’t correspond to any named file in the output folder.
- Version sensitivity: Code works in Stata 17 but not Stata 18 because a default changed.
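Several of these failures can be guarded against with a few lines near the top of master.do. The sketch below is one possible approach, not a required pattern: it pins the interpreter version, installs missing packages, and checks that an expected output file exists. The `ftools` entry is included on the assumption that your `reghdfe` install depends on it, and the output file name is hypothetical.

```stata
* Pin interpreter behavior so later releases run the code the same way.
version 17                            // assumption: the version used for the paper

* Install required community-contributed packages only if they are missing.
* reghdfe is the example from the list above; ftools is assumed to be its dependency.
foreach pkg in ftools reghdfe {
    capture which `pkg'
    if _rc {
        ssc install `pkg', replace
    }
}

* After the pipeline has run, fail loudly if an expected output is absent.
* Assumes the global output was set in master.do; the file name is hypothetical.
confirm file "$output/tables/table3.tex"
```

Naming output files after the table or figure they produce also addresses the unlabeled-output problem, because the data editor can match each file to the paper directly.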
How Sytra Automates This
When you build your analysis through Sytra, it automatically generates the replication package structure: numbered scripts, a master .do file with the correct globals, package dependencies, and an execution log that serves as the README’s computational documentation. The output map is built as you create each table and figure.