Methodology
2026-03-20 · 11 min read

Matching Estimators in Stata: PSM, CEM, and Modern Alternatives

A guide to matching methods in Stata — propensity score matching, coarsened exact matching, and why you should probably use teffects instead of psmatch2.

Sytra Team
Research Engineering Team, Sytra AI

Matching aims to make treated and control groups comparable by pairing units with similar observed characteristics. When randomization isn’t possible, matching approximates the covariate balance a randomized experiment would deliver, but only on the covariates you observe and match on.

But matching in Stata has a complicated ecosystem. There’s psmatch2 (the old community-contributed standard), teffects (the official built-in suite), cem (coarsened exact matching), kmatch (kernel and other matching estimators), and nnmatch (the older nearest-neighbor command that teffects nnmatch has largely replaced). This guide tells you which to use, when, and which diagnostic checks to run.

Propensity Score Matching (PSM)

The Old Way: psmatch2

* PSM with psmatch2 (still widely used)
ssc install psmatch2, replace
 
psmatch2 treatment age income education i.race, outcome(y) caliper(0.01) neighbor(1)
 
* Check balance
pstest age income education i.race

psmatch2 has been the workhorse PSM command for roughly two decades. It works. But it has limitations: its reported standard errors treat the propensity score as known rather than estimated, and balance and overlap checks live in separate companion commands (pstest and psgraph) rather than in postestimation tools.
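
As a rough sketch of that workflow, the lines below run psmatch2 together with its companion commands on Stata’s cattaneo2 example dataset. The dataset and its variables (mbsmoke as the treatment, bweight as the outcome, and mage, medu, mmarried, and fbaby as covariates) are stand-ins chosen for illustration, not the variables used elsewhere in this guide.

```stata
* Illustrative only: cattaneo2 is Stata's shipped example data, not this guide's data
webuse cattaneo2, clear

* One-time install of the community package (it includes pstest and psgraph)
ssc install psmatch2, replace

* psmatch2 results can depend on sort order when scores tie, so randomize reproducibly
set seed 12345
generate double u = runiform()
sort u

* 1-nearest-neighbor matching on a logit propensity score, within common support
psmatch2 mbsmoke mage medu mmarried fbaby, outcome(bweight) logit neighbor(1) caliper(0.01) common

* Balance before and after matching
pstest mage medu mmarried fbaby, both

* Propensity score histogram by treatment status
psgraph
```

The common option drops treated units whose propensity score falls outside the range observed among controls, which is psmatch2’s version of a common support restriction.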

The Modern Way: teffects psmatch

* Official Stata PSM with correct standard errors
teffects psmatch (y) (treatment age income education i.race), atet nn(1)
 
* Check overlap (common support)
teoverlap
 
* Balance table
tebalance summarize
 
* Balance density plots
tebalance density age

Advantages of teffects over psmatch2:

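  • Correct standard errors: teffects psmatch accounts for the fact that propensity scores are estimated, not known. psmatch2 doesn’t, so its reported standard errors are generally wrong; depending on the estimand, they can be too large or too small.
  • Built-in diagnostics: teoverlap and tebalance run directly on the stored estimation results. With psmatch2, you rely on the companion commands pstest and psgraph that ship with the package.
  • Multiple estimators: teffects supports PSM, nearest-neighbor matching, regression adjustment, IPW, augmented IPW, and IPWRA in a unified syntax.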
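
To make the unified syntax concrete, here is a minimal sketch that fits three teffects estimators with the same covariates and tabulates the results. It assumes Stata’s cattaneo2 example dataset, so the variable names (bweight, mbsmoke, mage, medu, mmarried, fbaby) are illustrative stand-ins for y, treatment, and the covariates used above.

```stata
* Illustrative only: cattaneo2 is Stata's shipped example data
webuse cattaneo2, clear

* Same covariates throughout; only the subcommand and model blocks change
teffects psmatch (bweight) (mbsmoke mage medu mmarried fbaby), atet nneighbor(1)
estimates store psm

teffects ipw (bweight) (mbsmoke mage medu mmarried fbaby), atet
estimates store ipw

* IPWRA adds an outcome model and supports the atet option
teffects ipwra (bweight mage medu mmarried fbaby) (mbsmoke mage medu mmarried fbaby), atet
estimates store ipwra

* Side-by-side ATET estimates and standard errors
estimates table psm ipw ipwra, keep(ATET:) b(%9.2f) se(%9.2f)
```

Estimates that disagree sharply across the columns are a prompt to revisit the propensity score specification before trusting any one of them.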

Coarsened Exact Matching (CEM)

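CEM takes a different approach: instead of matching on a single propensity score, it coarsens each covariate into bins and matches exactly on the binned values. This guarantees that matched treated and control units fall in the same bin on every covariate, so any remaining imbalance is bounded by the bin widths you chose.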

* CEM matching
ssc install cem, replace
 
cem age (#5) income (#10) education (#3) race (#0), treatment(treatment)
 
* Estimate treatment effect on matched sample
regress y treatment [iweight = cem_weights], vce(robust)

The number after # sets how many equally spaced cutpoints cem uses to coarsen that variable. More cutpoints mean finer bins and closer matches, but also more unmatched units. #0 means no coarsening, that is, exact matching on the variable’s values (useful for categorical variables).

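CEM’s main advantage: balance on the coarsened covariates holds by construction. If units are matched, they have the same binned covariate values, and you set the maximum imbalance in advance through the bin widths. Imbalance can remain within bins, though, so it is still worth looking at the L1 imbalance statistic that cem reports and comparing covariate means in the matched sample.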
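
A quick way to see what the coarsening bought you is to compare covariate means before and after matching. The sketch below is one possible check, not part of the cem package’s own diagnostics; it assumes Stata’s cattaneo2 example dataset (mbsmoke as the treatment, bweight as the outcome) and relies on the cem_matched and cem_weights variables that cem adds to the data.

```stata
* Illustrative only: cattaneo2 is Stata's shipped example data
webuse cattaneo2, clear

* One-time install of the community package
ssc install cem, replace

* Coarsen age and education; match exactly on the two binary covariates
cem mage (#5) medu (#4) mmarried (#0) fbaby (#0), treatment(mbsmoke)

* Raw covariate means by treatment status
tabstat mage medu, by(mbsmoke) statistics(mean sd)

* Weighted means in the matched sample; these should be closer across groups
tabstat mage medu if cem_matched == 1 [aweight = cem_weights], by(mbsmoke) statistics(mean sd)

* Treatment effect on the matched sample, using the CEM weights
regress bweight mbsmoke [iweight = cem_weights]
```

If the matched-sample means still differ noticeably, finer bins on that covariate are the usual remedy, at the cost of discarding more units.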

Inverse Probability Weighting (IPW)

* IPW estimation
teffects ipw (y) (treatment age income education i.race), atet
 
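* Augmented IPW (doubly robust); teffects aipw reports the ATE and does not accept atet
teffects aipw (y age income education) (treatment age income education i.race)
 
* For a doubly robust ATET, use IPWRA instead
teffects ipwra (y age income education) (treatment age income education i.race), atet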

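IPW reweights observations by the inverse of their estimated probability of receiving the treatment they actually received. For the ATE, a treated unit with propensity score p gets weight 1/p and a control unit gets weight 1/(1 - p), so treated units with a high probability of treatment get lower weight (they’re not adding much information) and control units with a high probability of treatment get higher weight (they’re the best counterfactuals). For the ATET, as in the teffects ipw call above, treated units keep weight 1 and control units get weight p/(1 - p).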
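
The weighting logic can be reproduced by hand, which is a useful check on what teffects ipw is doing. The following is a sketch under assumptions: it uses Stata’s cattaneo2 example dataset (mbsmoke as the treatment, bweight as the outcome), and the variable names ps, w_ate, and w_atet are made up for illustration.

```stata
* Illustrative only: cattaneo2 is Stata's shipped example data
webuse cattaneo2, clear

* Step 1: estimate the propensity score with a logit model
logit mbsmoke mage medu mmarried fbaby
predict double ps, pr

* Step 2: build the weights
* ATE:  treated get 1/ps, controls get 1/(1 - ps)
generate double w_ate  = cond(mbsmoke == 1, 1/ps, 1/(1 - ps))
* ATET: treated get 1, controls get ps/(1 - ps)
generate double w_atet = cond(mbsmoke == 1, 1, ps/(1 - ps))

* Step 3: weighted difference in means (use these for point estimates only)
regress bweight i.mbsmoke [pweight = w_ate]
regress bweight i.mbsmoke [pweight = w_atet]

* teffects ipw does steps 1 to 3 jointly, so its standard errors reflect step 1
teffects ipw (bweight) (mbsmoke mage medu mmarried fbaby), ate
teffects ipw (bweight) (mbsmoke mage medu mmarried fbaby), atet
```

The hand-built point estimates should agree closely with the teffects ipw output. The standard errors from the weighted regressions will not, because regress treats the weights as fixed rather than estimated.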

Augmented IPW (AIPW) is doubly robust: it’s consistent if either the propensity score model or the outcome model is correctly specified. That makes it a strong default for observational studies, though it still relies on selection on observables and adequate overlap.
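
Double robustness is easiest to appreciate in a simulation where you control which model is wrong. The sketch below uses made-up data (the names x1, x2, treatment, and y and all coefficients are assumptions for illustration) with a true ATE of 2, then deliberately omits the confounder x1 from one model at a time.

```stata
clear
set seed 2026
set obs 5000

* x1 drives both treatment and outcome (a confounder); x2 affects the outcome only
generate double x1 = rnormal()
generate double x2 = rnormal()
generate byte treatment = runiform() < invlogit(x1)

* True ATE is 2 by construction
generate double y = 2*treatment + 3*x1 + x2 + rnormal()

* Wrong outcome model (omits x1) and no treatment model: regression adjustment is biased
teffects ra (y x2) (treatment)

* Wrong outcome model, correct treatment model: AIPW should land near 2
teffects aipw (y x2) (treatment x1)

* Correct outcome model, wrong treatment model (omits x1): AIPW should again be near 2
teffects aipw (y x1 x2) (treatment x2)
```

If both models omit x1, nothing rescues the estimate: double robustness protects against one misspecified model, not against an unobserved or ignored confounder.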

Diagnostics: What to Check

  • Common support: Run teoverlap after teffects. If the propensity score distributions for treated and control groups barely overlap, the estimator has to extrapolate, and the results are unreliable.
  • Balance: Run tebalance summarize. As rules of thumb, absolute standardized differences should be below 0.1 and variance ratios should be between 0.8 and 1.25 (closer to 1 is better).
  • Sensitivity to unobservables: Matching only works if selection is on observables. Use Rosenbaum bounds (the community-contributed rbounds command, which works on matched-pair outcome differences) to assess how sensitive your results are to unobserved confounders.
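
The first two checks can be run back to back after any teffects weighting estimator. Here is a minimal sketch, again assuming Stata’s cattaneo2 example dataset as a stand-in for your own data; it adds tebalance overid, a formal balance test that is available after ipw, aipw, and ipwra but not after psmatch or nnmatch.

```stata
* Illustrative only: cattaneo2 is Stata's shipped example data
webuse cattaneo2, clear

* Fit an IPW model; the diagnostics below use its stored results
teffects ipw (bweight) (mbsmoke mage medu mmarried fbaby)

* Overlap: estimated densities of the treatment probability by group
teoverlap

* Balance: raw versus weighted standardized differences and variance ratios
tebalance summarize

* Overidentification test; the null hypothesis is that covariates are balanced
tebalance overid
```

A rejection from tebalance overid suggests the treatment model needs more flexible terms, such as interactions or polynomials in the continuous covariates.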

Which Matching Method to Use

  • Small sample, few covariates: CEM
  • Large sample, many continuous covariates: PSM via teffects psmatch
  • Want robustness: AIPW via teffects aipw
  • Legacy code / compatibility: psmatch2

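When in doubt, run AIPW. It’s doubly robust, it works with teoverlap and tebalance, and its standard errors account for the estimated treatment and outcome models. If the results differ meaningfully from PSM or CEM, investigate why: the difference often points to a specification problem or to a different estimand (teffects aipw targets the ATE, while the PSM example above targets the ATET).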

#Matching #Stata #Causal Inference #Political Science