Panel Data in R: fixest vs. plm vs. lfe
Comparing fixest, plm, and lfe for panel data estimation in R. Speed benchmarks, syntax differences, and when to use each.
Panel data in R means choosing between three packages: fixest (the modern standard), plm (the textbook choice), and lfe (archived but still used). The choice matters for speed, syntax, and what diagnostics are available.
fixest: The Modern Standard
fixest is R’s answer to Stata’s reghdfe. It’s the fastest of the three, supports multi-way fixed effects and multi-way clustering, and has the best export tools (etable() produces LaTeX tables directly). The csw() function estimates cumulative stepwise specifications, adding one regressor at a time, in a single call.
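As a minimal sketch of those features, the example below fits a two-way fixed effects model on simulated data. The data frame, the variable names (firm, year, x1, x2, y), and the coefficient values are all made up for illustration; only feols(), csw(), and etable() come from fixest.

```r
library(fixest)

set.seed(1)
# Simulated panel (illustrative only): 200 firms observed over 10 years
n_firms <- 200
n_years <- 10
df <- data.frame(
  firm = rep(seq_len(n_firms), each = n_years),
  year = rep(seq_len(n_years), times = n_firms)
)
df$x1 <- rnorm(nrow(df))
df$x2 <- rnorm(nrow(df))
firm_effect <- rnorm(n_firms)[df$firm]
df$y <- 1.5 * df$x1 - 0.5 * df$x2 + firm_effect + 0.1 * df$year + rnorm(nrow(df))

# Firm and year fixed effects, standard errors clustered by firm
m1 <- feols(y ~ x1 + x2 | firm + year, data = df, cluster = ~firm)

# csw() builds the specifications cumulatively: y ~ x1, then y ~ x1 + x2
m_step <- feols(y ~ csw(x1, x2) | firm + year, data = df, cluster = ~firm)

etable(m_step)              # side-by-side table in the console
etable(m_step, tex = TRUE)  # the same table as LaTeX
```

Fixed effects go after the first `|` in the formula, which is why adding a second dimension is a one-word change.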
plm: The Textbook Package
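plm is what most econometrics textbooks teach. Its advantage: a built-in Hausman test (phtest()), Breusch-Pagan LM test (plmtest()), and other panel diagnostics. Its disadvantages: it is slow on large datasets, its fixed effects are limited to the individual and time dimensions (effect = "twoways") so any further high-dimensional effects need workarounds, and declaring the panel structure through pdata.frame or the index argument is clunky.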
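A short sketch of the diagnostics workflow, again on simulated data with made-up variable names. It assumes a plain data frame passed with the index argument rather than an explicit pdata.frame; the functions used (plm(), phtest(), plmtest()) are plm's own.

```r
library(plm)

set.seed(1)
# Simulated balanced panel (illustrative only): 200 firms, 10 years
n_firms <- 200
n_years <- 10
df <- data.frame(
  firm = rep(seq_len(n_firms), each = n_years),
  year = rep(seq_len(n_years), times = n_firms)
)
df$x1 <- rnorm(nrow(df))
df$x2 <- rnorm(nrow(df))
df$y <- 1.5 * df$x1 - 0.5 * df$x2 + rnorm(n_firms)[df$firm] + rnorm(nrow(df))

idx <- c("firm", "year")  # individual and time identifiers

fe   <- plm(y ~ x1 + x2, data = df, index = idx, model = "within")   # fixed effects
re   <- plm(y ~ x1 + x2, data = df, index = idx, model = "random")   # random effects
pool <- plm(y ~ x1 + x2, data = df, index = idx, model = "pooling")  # pooled OLS

hausman <- phtest(fe, re)            # Hausman test: fixed vs. random effects
bp <- plmtest(pool, type = "bp")     # Breusch-Pagan LM test for individual effects

print(hausman)
print(bp)
```

Adding `effect = "twoways"` to the within model gives firm and year effects together.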
lfe: Archived but Alive
lfe was the pre-fixest standard. Its felm() function uses a pipe-delimited syntax: outcome ~ regressors | FE | IV | cluster. It’s archived on CRAN (no longer actively maintained) but still installs and works. If you’re maintaining legacy code, it’s fine. For new projects, use fixest.
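For anyone reading legacy code, here is a minimal sketch of the four-part felm() formula on simulated data. The variable names are made up, and the `0` in the third slot means no instrumental variables.

```r
library(lfe)

set.seed(1)
# Simulated panel (illustrative only): 200 firms, 10 years
n_firms <- 200
n_years <- 10
df <- data.frame(
  firm = factor(rep(seq_len(n_firms), each = n_years)),
  year = factor(rep(seq_len(n_years), times = n_firms))
)
df$x1 <- rnorm(nrow(df))
df$x2 <- rnorm(nrow(df))
df$y <- 1.5 * df$x1 - 0.5 * df$x2 + rnorm(n_firms)[as.integer(df$firm)] + rnorm(nrow(df))

# outcome ~ regressors | fixed effects | IV (0 = none) | cluster variable
est <- felm(y ~ x1 + x2 | firm + year | 0 | firm, data = df)
summary(est)

# Rough fixest equivalent when migrating (not run here):
# fixest::feols(y ~ x1 + x2 | firm + year, data = df, cluster = ~firm)
```

The main change when migrating is that clustering moves out of the formula and into a separate argument.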
Speed Benchmarks
1M observations, 50K firms, 20 years
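A timing comparison on this kind of design could be sketched as follows. This is a rough harness, not the benchmark itself: make_panel() is a hypothetical helper, the data are simulated, and the default is scaled down to 2,000 firms so it runs quickly. Setting n_firms = 50000 matches the 1M-observation design, at the cost of a much longer plm run.

```r
library(fixest)
library(plm)
library(lfe)

# Hypothetical helper: balanced firm-year panel with one regressor
make_panel <- function(n_firms, n_years) {
  df <- data.frame(
    firm = factor(rep(seq_len(n_firms), each = n_years)),
    year = factor(rep(seq_len(n_years), times = n_firms))
  )
  df$x <- rnorm(nrow(df))
  df$y <- 2 * df$x + rnorm(n_firms)[as.integer(df$firm)] + rnorm(nrow(df))
  df
}

set.seed(1)
df <- make_panel(n_firms = 2000, n_years = 20)  # scaled-down design

# Elapsed seconds for the same two-way fixed effects model in each package
timings <- c(
  fixest = system.time(feols(y ~ x | firm + year, data = df))[["elapsed"]],
  lfe    = system.time(felm(y ~ x | firm + year, data = df))[["elapsed"]],
  plm    = system.time(
    plm(y ~ x, data = df, index = c("firm", "year"),
        model = "within", effect = "twoways")
  )[["elapsed"]]
)
print(timings)
```

A single system.time() call is noisy, so repeat the runs before drawing conclusions from small differences.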
Decision Tree
- New project, any size: fixest
- Need Hausman test: plm
- Legacy code: lfe (but consider migrating)
- Speed critical: fixest (not even close)
- Publication tables: fixest (etable() is best-in-class)