Data Management
2026-02-12 · 13 min read

Reshape in Stata: Wide to Long and Long to Wide with Real Panel Data

reshape is one of the most confusing Stata commands. Here's how i() and j() work, with real panel data examples and error debugging.

Sytra Team
Research Engineering Team, Sytra AI

You know your panel should be long, but reshape keeps failing and you are not sure whether the problem is in i(), j(), or your stub names.

You will leave with a repeatable reshape workflow, including diagnostics that catch malformed panel keys early.

All examples tested in Stata 18 SE. Compatible with Stata 15+.


Quick Answer

  1. Use `reshape long` when variables are currently repeated across columns such as `wage2019 wage2020`.
  2. Use `reshape wide` when each entity should occupy one row with year-specific columns such as `wage2019 wage2020`.
  3. Validate `isid id year` after reshape long and `isid id` after reshape wide.
  4. Keep temporary backups with `preserve` while debugging reshapes.
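The `preserve` habit from the list above can be sketched as follows. This is a minimal illustration on hypothetical toy data (the two firms and their wages are invented for the example), not a prescribed workflow: the reshape runs inside `capture noisily`, so a failure prints its message without stopping the do-file, and `restore` always brings back the wide data.

```stata
* Hypothetical toy data, invented for this sketch
clear
input firm_id wage2019 wage2020
101 35 37
102 28 31
end

* Snapshot the wide data before experimenting
preserve

* capture noisily shows any reshape error but lets the do-file continue
capture noisily reshape long wage, i(firm_id) j(year)
if _rc == 0 {
    * Stops with an error if firm_id-year is not a unique key
    isid firm_id year
    list, sepby(firm_id)
}

* Return to the wide snapshot whether or not the reshape worked
restore
isid firm_id
```

Once the reshape behaves as expected, drop the `preserve`/`restore` pair so the long data stays in memory.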

Treat Reshape as a Data Integrity Operation

From wide payroll data to long panel data

In business and labor datasets, analysts often receive one row per firm with year-specific columns. That format is not estimation-ready for panel methods.

Use clear stub names, then reshape long and verify structure immediately with uniqueness checks and summary diagnostics.

If you are extending this pipeline, also review How to Merge Datasets in Stata and Stata margins: Complete Guide to Marginal Effects.

reshape-wide-to-long.do
stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

reshape long wage education, i(firm_id) j(year)
isid firm_id year
list, sepby(firm_id)
. list, sepby(firm_id)
     +------------------------------------+
     | firm_id   year   wage   education |
     |------------------------------------|
  1. |     101   2019     35          12 |
  2. |     101   2020     37          12 |
  3. |     101   2021     40          13 |
     |------------------------------------|
  4. |     102   2019     28          10 |
     +------------------------------------+
💡 Name stubs consistently
Stub inconsistencies like `wage_2020` mixed with `wage2021` create fragile reshape code. Standardize names before reshaping.
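One way to standardize mixed stub names is a wildcard `rename` before the reshape. The sketch below assumes hypothetical variables `wage_2019`, `wage_2020`, and `wage2021`; your own naming problems may need a different pattern.

```stata
* Hypothetical toy data with inconsistent stub names
clear
input firm_id wage_2019 wage_2020 wage2021
101 35 37 40
102 28 31 33
end

* Remove the underscore so every variable follows the stub + year pattern
* (wage2021 has no underscore, so the pattern leaves it alone)
rename wage_* wage*
describe wage*

* With consistent names, one stub covers all three years
reshape long wage, i(firm_id) j(year)
isid firm_id year
```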

Back to wide for export and reporting

After modeling in long format, teams often need wide outputs for reporting, spreadsheets, or survey dashboards. Reshape back only after key checks pass.

Sort the data and confirm that firm_id and year jointly identify each row before reshaping wide. Duplicate firm_id-year pairs are what make row construction ambiguous.

reshape-long-to-wide.do
stata
clear all
input firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
101 35 37 40 12 12 13
102 28 31 33 10 10 11
103 44 46 49 16 16 16
104 30 29 31 11 11 11
end

reshape long wage education, i(firm_id) j(year)
isid firm_id year
list, sepby(firm_id)

* ---- Section-specific continuation ----
sort firm_id year
isid firm_id year

reshape wide wage education, i(firm_id) j(year)
isid firm_id
order firm_id wage2019 wage2020 wage2021 education2019 education2020 education2021
list
. isid firm_id

isid prints nothing on success: silence means firm_id uniquely identifies the observations. If it did not, the command would stop with an error.
๐Ÿ‘Do not skip sorting
Reshape may run without errors but still produce confusing row order. Explicit sorting keeps reproducibility stable across runs.

Common Errors and Fixes

"values of variable year not unique within firm_id"

The i-j pair contains duplicates, so Stata cannot determine which value belongs in the reshaped cell.

Run `duplicates report firm_id year` and resolve repeated rows before reshaping.

. reshape wide wage, i(firm_id) j(year)
values of variable year not unique within firm_id
r(9);
This causes the error
wrong-way.do
stata
reshape wide wage, i(firm_id) j(year)
This is the fix
right-way.do
stata
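duplicates report firm_id year
* Flag and inspect repeated firm_id-year rows before deleting anything
bysort firm_id year: gen dup = _N
list firm_id year wage if dup>1
* Keeps the first row per pair; choose a rule that fits your data
by firm_id year: keep if _n==1
* Drop the helper: unlisted variables must be constant within firm_id
drop dup
reshape wide wage, i(firm_id) j(year)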
error-fix.do
stata
duplicates report firm_id year
reshape wide wage education, i(firm_id) j(year)
isid firm_id
. duplicates report firm_id year
Duplicates in terms of firm_id year

--------------------------------------
   Copies | Observations       Surplus
----------+---------------------------
        1 |           12             0
--------------------------------------

Command Reference

Converts datasets between wide and long representations while preserving identifiers.

reshape {long | wide} stubnames, i(varlist) j(varname) [string]

Option        Description
i(varlist)    Entity identifiers that stay fixed across reshaping
j(varname)    Index variable such as year, wave, or period
string        Allows string-valued j() indexes when needed
@             Advanced stub placeholder for custom naming patterns
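The `string` option and the `@` placeholder are easier to see in use. The sketch below relies on hypothetical variables (`score_pre`, `score_post`, `wage2019usd`, `wage2020usd`) invented for illustration. In the first case the j() values are text, and in the second the year sits in the middle of the variable name.

```stata
* string: the suffix after the stub is text, not a number
clear
input id score_pre score_post
1 50 58
2 61 66
end
reshape long score_, i(id) j(phase) string
* The long variable keeps the stub name, so tidy it up
rename score_ score
list, sepby(id)

* @: marks where the j() value appears inside the variable name
clear
input firm_id wage2019usd wage2020usd
101 35 37
102 28 31
end
reshape long wage@usd, i(firm_id) j(year)
* The long variable is named wageusd (the @ is removed)
isid firm_id year
list, sepby(firm_id)
```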

How Sytra Handles This

Sytra can test i-j uniqueness and propose corrected reshape syntax before running the command, preventing silent panel corruption.

A direct natural-language prompt for this exact workflow:

sytra-prompt.txt
bash
I have wage2019 wage2020 wage2021 and education2019 education2020 education2021 by firm_id. Convert to long with year, check duplicates in firm_id-year, then reshape back to wide and verify firm_id uniqueness.


FAQ

When should I reshape wide to long in Stata?

Reshape wide to long before panel regressions, fixed-effects models, or event-study code that needs one row per unit-time observation.
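As a rough illustration of why long format matters, the sketch below reshapes hypothetical toy data, declares the panel with `xtset`, and computes a year-over-year change with the lag operator. The firms, wages, and the `wage_change` name are assumptions made for the example.

```stata
* Hypothetical toy data, invented for this sketch
clear
input firm_id wage2019 wage2020 wage2021
101 35 37 40
102 28 31 33
103 44 46 49
end

reshape long wage, i(firm_id) j(year)

* xtset refuses to run if a firm has repeated year values,
* so it doubles as a structural check on the long data
xtset firm_id year

* L.wage is the previous year's wage within the same firm
generate wage_change = wage - L.wage
list, sepby(firm_id)
```

The first year of each firm has no lag, so `wage_change` is missing there.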

What do i() and j() mean in reshape?

i() identifies the entity ID that stays constant across repeated observations, and j() identifies the time or category index that varies.
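When no single variable identifies the entity, i() can take several. A minimal sketch, assuming a hypothetical dataset in which firm_id values repeat across countries:

```stata
* Hypothetical toy data: firm_id 1 exists in both US and DE
clear
input str2 country firm_id wage2019 wage2020
"US" 1 35 37
"US" 2 28 31
"DE" 1 44 46
end

* country and firm_id together identify the entity; year is the index
reshape long wage, i(country firm_id) j(year)
isid country firm_id year
list, sepby(country firm_id)
```

Using `i(firm_id)` alone would fail here, because firm_id 1 appears in two rows of the wide data.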

How do I fix reshape uniqueness errors?

Check duplicates in the i-j pair with duplicates report and make sure each i-j combination appears once before reshaping.

