Reshape in Stata: Wide to Long and Long to Wide with Real Panel Data
reshape is one of the most confusing Stata commands. Here's how i() and j() work, with real panel data examples and error debugging.
You know your panel should be long, but reshape keeps failing and you are not sure whether the problem is in i(), j(), or your stub names.
You will leave with a repeatable reshape workflow, including diagnostics that catch malformed panel keys early.
All examples tested in Stata 18 SE. Compatible with Stata 15+.
Quick Answer
- Use `reshape long` when variables are currently repeated across columns such as `wage2019 wage2020`.
- Use `reshape wide` when each entity should become one row with year-specific columns.
- Run `isid id year` after `reshape long` and `isid id` after `reshape wide` to confirm the keys are unique.
- Keep temporary backups with `preserve` while debugging reshapes.
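The checklist above can be sketched as a small debugging loop. This is a minimal sketch on made-up toy data (the two firms and their wages are assumptions for illustration); it uses `preserve` and `restore` around a captured `reshape`, so a failed attempt never leaves you with half-converted data.

```stata
clear all
input firm_id wage2019 wage2020 wage2021
101 35 37 40
102 28 31 33
end

* Snapshot the data in memory before the risky step
preserve

* capture keeps the do-file running on error; noisily still shows the message
capture noisily reshape long wage, i(firm_id) j(year)

if _rc == 0 {
    * Long data should be unique on firm-year
    isid firm_id year
    list, sepby(firm_id)
}
else {
    display as error "reshape failed with r(" _rc ")"
}

* Bring back the original wide data while you keep debugging
restore
isid firm_id
```

Once the reshape and its `isid` check pass, drop the `preserve`/`restore` pair so the long data stays in memory.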
Treat Reshape as a Data Integrity Operation
From wide payroll data to long panel data
In business and labor datasets, analysts often receive one row per firm with year-specific columns. That format is not estimation-ready for panel methods.
Use clear stub names, then reshape long and verify structure immediately with uniqueness checks and summary diagnostics.
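As a rough sketch of what those checks might look like in practice, the block below runs one check before the reshape and several after it. The toy data and the choice of checks are assumptions for illustration; on real data you would likely add range checks on `year` and on each measure.

```stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

* Before: the i() variable must already be unique in the wide data
isid firm_id

* List the columns each stub will pick up, so naming slips are visible
ds wage*
ds education*

local n_firms = _N
reshape long wage education, i(firm_id) j(year)

* After: one row per firm-year, and three years per firm in this toy data
isid firm_id year
assert _N == 3 * `n_firms'
tabulate year

* xtset reports whether the panel is balanced
xtset firm_id year
```

The row-count `assert` is the cheapest guard here: if a year column was misnamed and skipped, the long data will have fewer rows than firms times years.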
If you are extending this pipeline, also review How to Merge Datasets in Stata and Stata margins: Complete Guide to Marginal Effects.
```stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

reshape long wage education, i(firm_id) j(year)
isid firm_id year
list, sepby(firm_id)
```

```
     +------------------------------------+
     | firm_id   year   wage   education  |
     |------------------------------------|
  1. |     101   2019     35          12  |
  2. |     101   2020     37          12  |
  3. |     101   2021     40          13  |
     |------------------------------------|
  4. |     102   2019     28          10  |
     +------------------------------------+
```

Back to wide for export and reporting
After modeling in long format, teams often need wide outputs for reporting, spreadsheets, or survey dashboards. Reshape back only after key checks pass.
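Before reshaping wide, confirm that `firm_id` and `year` together identify each row. If they do not, Stata cannot tell which value belongs in which year column. Sorting first is optional, but it makes the long data easier to inspect.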
```stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

reshape long wage education, i(firm_id) j(year)
isid firm_id year
list, sepby(firm_id)

* ---- Section-specific continuation ----
sort firm_id year
isid firm_id year

reshape wide wage education, i(firm_id) j(year)
isid firm_id
order firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
list
```

`isid firm_id` returns without output, which means `firm_id` uniquely identifies the observations.
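One further check, offered as a sketch rather than a required step: confirm that a wide-to-long-to-wide round trip reproduces the original data. The block below saves the wide data to a temporary file and compares it with `cf` after the round trip. The macro name `original` is an arbitrary choice for illustration.

```stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

* Save the original wide data to a temporary file for later comparison
sort firm_id
tempfile original
save "`original'"

reshape long wage education, i(firm_id) j(year)
isid firm_id year

reshape wide wage education, i(firm_id) j(year)
isid firm_id
sort firm_id

* cf exits with an error if any value differs from the saved copy
cf _all using "`original'"
```

`cf` compares observations by position, which is why both copies are sorted on `firm_id` before the comparison.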
Common Errors and Fixes
"values of variable year not unique within firm_id"
The i-j pair contains duplicates, so Stata cannot determine which value belongs in the reshaped cell.
Run `duplicates report firm_id year` and resolve repeated rows before reshaping.
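```
values of variable year not unique within firm_id
r(9);
```

```stata
* Fails when a firm_id-year pair appears more than once
reshape wide wage, i(firm_id) j(year)
```

```stata
* Inspect the repeated firm-years, keep one row per pair, then retry
duplicates report firm_id year
bysort firm_id year: gen dup = _N
list firm_id year wage if dup > 1
* Keeps the first row in the current sort order; check that the copies agree first
by firm_id year: keep if _n == 1
* dup can differ across years within a firm, which would break reshape wide
drop dup
reshape wide wage, i(firm_id) j(year)
```

```stata
duplicates report firm_id year
reshape wide wage education, i(firm_id) j(year)
isid firm_id
```

```
Duplicates in terms of firm_id year

--------------------------------------
   Copies | Observations       Surplus
----------+---------------------------
        1 |           12             0
--------------------------------------
```

Command Reference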
reshape
Converts datasets between wide and long representations while preserving identifiers.
- `i(varlist)`: Entity identifiers that stay fixed across reshaping
- `j(varname)`: Index variable such as year, wave, or period
- `string`: Allows string-valued j() indexes when needed
- `@`: Advanced stub placeholder for custom naming patterns
How Sytra Handles This
Sytra can test i-j uniqueness and propose corrected reshape syntax before running the command, preventing silent panel corruption.
A direct natural-language prompt for this exact workflow:
I have wage2019 wage2020 wage2021 and education2019 education2020 education2021 by firm_id. Convert to long with year, check duplicates in firm_id-year, then reshape back to wide and verify firm_id uniqueness.
Sytra catches these errors before you run.
FAQ
When should I reshape wide to long in Stata?
Reshape wide to long before panel regressions, fixed-effects models, or event-study code that needs one row per unit-time observation.
What do i() and j() mean in reshape?
i() identifies the entity ID that stays constant across repeated observations, and j() identifies the time or category index that varies.
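When the j value is not a plain numeric suffix, the `@` placeholder and the `string` option cover the two usual cases. A short sketch follows; the variable names `wage2019_usd` and `wage_pre` are invented for illustration.

```stata
clear all
input firm_id wage2019_usd wage2020_usd
101 35 37
102 28 31
end

* @ marks where the j value sits inside the variable name
reshape long wage@_usd, i(firm_id) j(year)
isid firm_id year
list

clear
input firm_id wage_pre wage_post
101 35 37
102 28 31
end

* string lets j() take non-numeric suffixes such as "pre" and "post"
reshape long wage_, i(firm_id) j(period) string
isid firm_id period
list
```

In the first case the long variable is named `wage_usd`, because the `@` is removed from the stub. In the second, `period` is created as a string variable.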
How do I fix reshape uniqueness errors?
Check duplicates in the i-j pair with duplicates report and make sure each i-j combination appears once before reshaping.
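How you resolve duplicates depends on whether the repeated rows are exact copies or conflicting records. Here is one possible sequence on a toy dataset. The data, the variable name `conflict`, and the decision to average conflicting wages are all assumptions for illustration; use whichever rule your data justifies.

```stata
clear all
input firm_id year wage
101 2019 35
101 2019 35
101 2020 37
102 2019 28
102 2019 29
102 2020 31
end

* Confirm that reshape wide fails on this data
capture reshape wide wage, i(firm_id) j(year)
display "reshape return code: " _rc

* Step 1: drop rows that are identical on every variable
duplicates drop

* Step 2: flag firm-years that still repeat; these carry conflicting values
duplicates tag firm_id year, generate(conflict)
list firm_id year wage if conflict > 0, sepby(firm_id)

* Step 3 (assumption): average the conflicting wages
collapse (mean) wage, by(firm_id year)

isid firm_id year
reshape wide wage, i(firm_id) j(year)
isid firm_id
list
```

Listing the conflicting rows before collapsing matters, because averaging can hide a data-entry error that should be corrected at the source.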
Related Guides
- How to Merge Datasets in Stata: 1:1, m:1, 1:m with Complete Examples
- Panel Data in Stata: xtreg vs. reghdfe vs. areg
- Stata Dates: Formatting, Converting, and Working with Date Variables
- Stata Factor Variables: i., c., ibn., and # Notation Explained