2026-02-08

Singleton Observations in Stata reghdfe: What They Are and What to Do

reghdfe dropped 12,000 observations and you don't know why. They're singletons — fixed effect groups with one observation. Here's what that means for your paper.

Sytra Team
Research Engineering Team, Sytra AI

You ran reghdfe and the output says it dropped 12,000 observations. Your carefully constructed sample just lost a quarter of its observations and you have no idea why. The answer is almost always singletons.

. reghdfe wage education experience, absorb(firm_id year) cluster(firm_id)
(dropped 12,403 singleton observations)
(MWFE estimator converged in 5 iterations)

HDFE Linear regression                          Number of obs     =     38,247
Absorbing 2 HDFE groups                         F(2, 4182)        =     124.56
                                                 Prob > F          =     0.0000
                                                 R-squared         =     0.4231
                                                 Adj R-squared     =     0.3891
                                                 Within R-sq.      =     0.0892
Number of clusters (firm_id) = 4,183             Root MSE          =     8.2341

A singleton observation is the only observation in its fixed effect group. If a firm appears exactly once in your dataset, that observation is a “singleton” for the firm fixed effect. reghdfe drops singletons because they cannot contribute to estimating the slope coefficients, and keeping them can bias standard errors downward.

All examples use the reghdfe package by Sergio Correia. It depends on ftools, so install both: ssc install ftools, replace and then ssc install reghdfe, replace


Quick Answer

Singletons are observations in fixed effect groups with only one member. reghdfe drops them iteratively because:

  1. A single observation perfectly identifies the fixed effect — there is no within-group variation to estimate from
  2. Including them biases standard errors downward (overstates precision)
  3. They do not contribute to the coefficient estimates at all

This is correct behavior. If many singletons are dropped, check your data structure — it may indicate a data problem rather than a normal estimation feature.


What Exactly Is a Singleton?

Consider a firm-year panel. Firm ABC appears in years 2015, 2016, 2017, 2018. Firm XYZ appears only in year 2020. With firm fixed effects, the XYZ observation is a singleton: the firm fixed effect for XYZ is perfectly determined by that single observation, absorbing all its variation.

singleton-example.do
stata
* Check for singletons before running reghdfe
bysort firm_id: gen firm_count = _N
tab firm_count if firm_count == 1
. tab firm_count if firm_count == 1
 firm_count |      Freq.     Percent        Cum.
────────────┼───────────────────────────────────
          1 |     12,403      100.00      100.00
────────────┼───────────────────────────────────
      Total |     12,403      100.00

With two-way fixed effects (firm + year), the singleton problem is more complex. An observation can become a singleton iteratively — after dropping first-round singletons from one dimension, new singletons may appear in the other dimension. reghdfe handles this iterative process automatically.

💡Iterative deletion
Singleton deletion is iterative. After removing firm-level singletons, some year groups may be left with a single observation. reghdfe repeats until no singletons remain, so the number dropped can exceed the count from a simple bysort firm_id: gen count = _N check.
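To see why a one-pass count can understate the total, here is a minimal sketch on a made-up seven-row panel. It imitates the idea of iterative pruning in plain Stata. It is not reghdfe's internal algorithm, and the data and scalar names are invented for illustration.

```stata
* Toy panel (invented data): firm 3 appears once,
* and firm 4 shares year 2017 only with firm 3
clear
input firm_id year
1 2015
1 2016
2 2015
2 2016
3 2017
4 2015
4 2017
end

* One-pass check: rows that are alone in a firm group or a year group
bysort firm_id: gen n_firm = _N
bysort year: gen n_year = _N
count if n_firm == 1 | n_year == 1
scalar n_naive = r(N)
drop n_firm n_year

* Iterative pruning: repeat until a full pass drops nothing
scalar n_start = _N
local dropped = 1
while `dropped' > 0 {
    local n_before = _N
    foreach fe of varlist firm_id year {
        bysort `fe': drop if _N == 1
    }
    local dropped = `n_before' - _N
}
scalar n_iter = n_start - _N

display "One-pass singleton count:  " n_naive
display "Iteratively dropped total: " n_iter
```

On this toy panel the one-pass check finds 1 singleton, but pruning removes 3 rows: dropping firm 3 leaves 2017 with a single row, and dropping that row leaves firm 4 with a single row.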

Why Does reghdfe Drop Singletons?

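The key insight from Correia (2015): including singletons in fixed effect estimation does not affect point estimates but does affect standard errors. Singletons add no information to the residual variation, yet they are still counted in the number of observations (and possibly clusters) that enters the finite-sample adjustment for standard errors. The result is standard errors that are too small and t-statistics that are too large. The problem is most acute with cluster-robust standard errors when the fixed effects are nested within clusters.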

With singletons (biased SEs)
stata
* areg keeps singletons, so SEs may be too small
areg wage education experience, ///
    absorb(firm_id) cluster(firm_id)
* SE on education: 0.0123

Without singletons (correct SEs)
stata
* reghdfe drops singletons, so SEs are correct
reghdfe wage education experience, ///
    absorb(firm_id) cluster(firm_id)
* SE on education: 0.0141

The difference is often small (5-15%) but can matter for borderline significance. In published research, this is exactly the kind of detail that replication teams check.

How to Check for Singletons Before Estimation

Check your data structure before running the regression. This helps you understand how many observations you’ll lose and whether the singleton count is reasonable.

check-singletons.do
stata
* One-way: check groups with only one observation
bysort firm_id: gen n_firm = _N
count if n_firm == 1
drop n_firm

* Two-way: check both dimensions
bysort firm_id: gen n_firm = _N
bysort year: gen n_year = _N
count if n_firm == 1
count if n_year == 1

* For exact count, use reghdfe with verbose option
reghdfe wage education experience, absorb(firm_id year) cluster(firm_id) verbose(1)

The keepsingleton Option

reghdfe provides a keepsingleton option that prevents dropping. This exists primarily for comparability with areg and xtreg,fe — not because keeping singletons is a good idea.

keepsingleton.do
stata
* Keep singletons (NOT recommended for final results)
reghdfe wage education experience, ///
    absorb(firm_id year) cluster(firm_id) keepsingleton

* Compare with default (singletons dropped)
reghdfe wage education experience, ///
    absorb(firm_id year) cluster(firm_id)
⚠️Do not use keepsingleton for published results
Using keepsingleton produces standard errors that are biased downward. Reviewers familiar with Correia (2015) will flag this. Use it only for diagnostic comparison, not for your final tables.
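For the diagnostic comparison, it can help to put both runs in one table. The sketch below is a possible workflow, not a prescribed one: it simulates a small panel as a stand-in for your data (every number in the simulation is made up), stores both sets of results, and prints them side by side. The option is written as keepsingletons here, which is the spelling used in the reghdfe help file; the names drop_s and keep_s are arbitrary.

```stata
* Simulated stand-in for the article's panel (all values are made up)
clear
set seed 12345
set obs 500
gen firm_id    = ceil(runiform() * 300)   // many firms, so some appear once
gen year       = 2015 + floor(runiform() * 5)
gen education  = rnormal(12, 2)
gen experience = rnormal(10, 4)
gen wage       = 5 + 0.8 * education + 0.3 * experience + rnormal()

* Default: singletons dropped
reghdfe wage education experience, absorb(firm_id year) cluster(firm_id)
estimates store drop_s

* Diagnostic only: singletons kept
reghdfe wage education experience, absorb(firm_id year) cluster(firm_id) keepsingletons
estimates store keep_s

* Side-by-side coefficients, standard errors, and sample sizes
estimates table drop_s keep_s, b(%9.4f) se(%9.4f) stats(N)
```

If the article's argument holds for your data, the coefficient rows should agree across columns while the standard errors and N differ.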

Comparison: reghdfe vs areg vs xtreg,fe

reghdfe

High-dimensional fixed effects estimator with automatic singleton deletion and multi-way clustering.

reghdfe depvar indepvars, absorb(fe1 fe2 ...) [cluster(cvar)]
absorb()         Fixed effects to absorb (any number)
cluster()        Cluster variable(s) for robust SEs
keepsingleton    Do not drop singletons (not recommended)
verbose(#)       Show iteration details

areg / xtreg,fe
stata
* Keeps singletons (biased SEs)
* Only one set of fixed effects
areg wage education, absorb(firm_id) robust
xtset firm_id   // xtreg, fe needs the panel variable declared
xtreg wage education, fe cluster(firm_id)

reghdfe
stata
* Drops singletons (correct SEs)
* Multiple fixed effects
reghdfe wage education, ///
    absorb(firm_id year) ///
    cluster(firm_id)

Key differences:

  • Singleton handling: reghdfe drops singletons by default; areg and xtreg,fe keep them
  • Multiple FEs: reghdfe handles any number of fixed effects; areg handles one
  • Speed: reghdfe is dramatically faster for high-dimensional fixed effects
  • Two-way clustering: reghdfe supports cluster(var1 var2)

When Singletons Signal a Data Problem

If reghdfe drops a large fraction of your sample (say >30%), that’s not just a statistical technicality — it likely indicates a structural issue with your data or research design:

  1. Too many fixed effect categories. If you have almost as many firm IDs as observations, most groups will be singletons. Consider whether you need that granularity.
  2. Unbalanced panel. Short panels where most firms appear for only 1-2 years will have massive singleton attrition.
  3. Wrong fixed effect specification. Using zip code × year fixed effects when you should be using state × year.
  4. Sample restriction too aggressive. After subsetting, the remaining data may be too sparse for the fixed effect structure.
diagnose-singletons.do
stata
* Understand your panel structure
xtset firm_id year
xtdescribe

* How many observations per firm?
bysort firm_id: gen T_firm = _N
summarize T_firm, detail

* How many firms per year?
bysort year: gen N_year = _N
summarize N_year, detail
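
The fixed effect granularity point from the list above (zip code × year versus state × year) can be checked before estimation by counting one-observation cells under each candidate definition. This is a rough sketch on invented data; the variable names zip and state and the scalar names are hypothetical, and the count is a first-pass figure, not the iterative total reghdfe would report.

```stata
* Toy data (invented): 8 rows, 6 zip codes in 2 states over 2 years
clear
input zip state year
10001 1 2019
10001 1 2020
10002 1 2019
10003 1 2020
20001 2 2019
20002 2 2019
20002 2 2020
20003 2 2020
end

* Build one group id per candidate fixed effect cell
egen long zip_year   = group(zip year)
egen long state_year = group(state year)

* Count rows that sit alone in their cell
bysort zip_year: gen n_zy = _N
bysort state_year: gen n_sy = _N
count if n_zy == 1
scalar single_zy = r(N)
count if n_sy == 1
scalar single_sy = r(N)

display "Singleton rows with zip x year cells:   " single_zy
display "Singleton rows with state x year cells: " single_sy
```

In this toy example every zip × year cell holds one row, so all 8 rows would be singletons, while the state × year cells each hold two rows and none would be dropped.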
👁Reviewer question
Referees will ask: “You started with N observations and your regression uses N-K. Where did the K observations go?” Be prepared to explain singleton deletion and show it does not systematically exclude a meaningful subgroup.
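
One way to prepare that answer is to flag the estimation sample with e(sample) right after the regression, then account for every observation and compare dropped and retained rows. The sketch below uses a simulated panel as a stand-in for your data (all values are made up), and the names n_complete, n_used, and in_est are illustrative.

```stata
* Simulated stand-in for the article's panel (all values are made up)
clear
set seed 12345
set obs 500
gen firm_id    = ceil(runiform() * 300)   // many firms, so some appear once
gen year       = 2015 + floor(runiform() * 5)
gen education  = rnormal(12, 2)
gen experience = rnormal(10, 4)
gen wage       = 5 + 0.8 * education + 0.3 * experience + rnormal()

* Complete cases before estimation
count if !missing(wage, education, experience, firm_id, year)
scalar n_complete = r(N)

reghdfe wage education experience, absorb(firm_id year) cluster(firm_id)

* Flag the estimation sample and account for every observation
gen byte in_est = e(sample)
scalar n_used = e(N)
display "Complete cases:    " n_complete
display "Estimation sample: " n_used
display "Dropped:           " n_complete - n_used

* Do dropped observations look different from retained ones?
tabstat wage education experience, by(in_est) statistics(mean sd n)
```

If the dropped rows differ sharply from the retained ones on key covariates, that is worth discussing in the paper rather than leaving to a footnote.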

Reporting Singletons in Your Paper

Always report singleton information. A standard approach:

“Our initial sample contains 50,650 firm-year observations. The reghdfe estimator drops 12,403 singleton observations, leaving an estimation sample of 38,247. Results are robust to including singletons (Online Appendix Table A3).”

For the appendix, show the keepsingleton comparison to demonstrate that your point estimates are stable and that the standard error differences are modest.



FAQ

What are singleton observations in reghdfe?

Singleton observations belong to fixed effect groups that contain only one observation. For example, if firm XYZ appears only once in your panel, that observation is a singleton. reghdfe drops them because they cannot contribute to within-group estimation and their inclusion biases standard errors.

Why does reghdfe drop singletons but areg does not?

areg and xtreg,fe were written before the statistical issue was well understood. Correia (2015) showed that including singletons biases standard errors downward. reghdfe implements the correct approach by default.

Should I report singleton drops in my paper?

Yes. Always report the initial sample size, the number of singletons dropped, and the final estimation sample. Include a keepsingleton robustness check in the appendix.

Can I keep singletons in reghdfe?

Yes, use reghdfe y x, absorb(fe) keepsingleton. But this is not recommended for published results — it biases standard errors downward, potentially inflating significance.

