Regression
2026-02-27 · 10 min read

Stata Weights Explained: fweight, pweight, aweight, iweight — When to Use Which

Four weight types, four different purposes. Here's when each weight type is correct, with real examples from survey data and aggregated data.

Sytra Team
Research Engineering Team, Sytra AI

You have a weight variable, but using the wrong weight type could invalidate your entire inference section.

You will map data provenance to the correct Stata weight class and implement it safely.

All examples tested in Stata 18 SE. Compatible with Stata 15+.


Quick Answer

  1. Use fweights for replicated counts, pweights for inverse sampling probabilities, and aweights for inverse-variance contexts such as group means.
  2. Document why each weight type matches the data collection design.
  3. Check that the command supports the chosen weight type before estimation.
  4. For complex surveys, declare the design with svyset and pweights, then estimate with svy commands.

Choose Weight Types Based on Design, Not Habit

Apply pweights and fweights in practical workflows

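The weight type should follow from the data-generating process. Survey sampling weights and aggregated counts are fundamentally different objects: the first says how many population units a sampled row represents, the second says how many identical rows were collapsed into one.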

Coding this distinction directly in scripts improves transparency and reduces model misinterpretation.
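To make the "aggregated counts" case concrete, here is a minimal sketch on a hypothetical four-row dataset (the values of y, x, and f_wt are invented for illustration). It shows that an fweight regression is equivalent to running the same regression on physically replicated rows:

```stata
* Hypothetical toy data: each row stands for f_wt identical observations
clear
input y x f_wt
10 1 3
12 2 1
15 3 2
19 4 4
end

* Frequency-weighted fit: Stata treats the data as sum(f_wt) = 10 observations
regress y x [fweight=f_wt]

* Replicate each row f_wt times and refit without weights
expand f_wt
regress y x
```

Both regressions report 10 observations and the same coefficients and standard errors. That equivalence is what separates fweights from pweights, where the weight describes the sampling design rather than literal duplicates.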

If you are extending this pipeline, also review How to Merge Datasets in Stata and How to Structure a Stata Project.

weights-core.do
stata
clear all
set obs 1500

* Simulated panel: 150 firms, 10 yearly rows per firm
gen firm_id = ceil(_n/10)
gen year = 2014 + mod(_n,10)
gen education = 8 + floor(runiform()*10)
gen experience = 18 + floor(runiform()*20)
gen wage = 12 + 0.8*education + 0.3*experience + rnormal(0,2)

* p_wt stands in for inverse sampling probabilities; f_wt is an integer count
gen p_wt = 0.5 + runiform()*2.5
gen f_wt = ceil(runiform()*4)

regress wage education experience [pweight=p_wt], vce(robust)
regress wage education experience [fweight=f_wt], vce(robust)
. regress wage education experience [pweight=p_wt], vce(robust)
Linear regression                               Number of obs   =      1,500
                                                  F(2,1497)       =     921.14
                                                  Prob > F        =     0.0000
------------------------------------------------------------------------------
        wage | Coefficient  std. err.      t    P>|t|
-------------+----------------------------------------
   education |   .8013475   .0241174    33.23   0.000
  experience |   .2986102   .0092158    32.40   0.000
------------------------------------------------------------------------------
💡 Tie weight choice to documentation
In replication packages, include a short note on why the selected weight type matches sampling or measurement design.

Use svyset when probability sampling is complex

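When the design includes strata and PSUs, a plain pweight regression is incomplete: it reproduces the weighted point estimates, but its standard errors ignore stratification and clustering. svyset records the design information so that svy commands can estimate variances properly.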

Explicit svy commands help avoid understated uncertainty in multistage samples.

weights-svy.do
stata
clear all
set obs 1500

* Simulated panel: 150 firms, 10 yearly rows per firm
gen firm_id = ceil(_n/10)
gen year = 2014 + mod(_n,10)
gen education = 8 + floor(runiform()*10)
gen experience = 18 + floor(runiform()*20)
gen wage = 12 + 0.8*education + 0.3*experience + rnormal(0,2)

* p_wt stands in for inverse sampling probabilities; f_wt is an integer count
gen p_wt = 0.5 + runiform()*2.5
gen f_wt = ceil(runiform()*4)

regress wage education experience [pweight=p_wt], vce(robust)
regress wage education experience [fweight=f_wt], vce(robust)

* ---- Section-specific continuation ----
* Stand-in design variables for strata and primary sampling units
gen strata_id = mod(firm_id, 5) + 1
gen psu_id = ceil(firm_id/2)

* Declare the design once, then prefix estimation commands with svy:
svyset psu_id [pweight=p_wt], strata(strata_id)
svy: regress wage education experience
. svy: regress wage education experience
Survey: Linear regression
Number of strata   =         5
Number of PSUs     =       100
Population size    =    2,279.7
Design df          =        95
------------------------------------------------------------------------------
        wage | Coefficient  std. err.      t    P>|t|
-------------+----------------------------------------
   education |   .7989043   .0312407    25.57   0.000
  experience |   .3012244   .0118870    25.34   0.000
------------------------------------------------------------------------------
⚠️ Do not mix ad hoc and survey VCE
If the dataset is from a complex survey, prefer svy workflows over ad hoc combinations of pweights and cluster choices.
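The difference between the two approaches can be checked side by side. The following is a sketch on simulated data, not a template for a real survey: psu_id, strata_id, and the seed are hypothetical choices made only so the example runs on its own.

```stata
clear all
set seed 4242                        // arbitrary seed so the sketch is repeatable
set obs 1500
gen psu_id    = ceil(_n/20)          // 75 hypothetical PSUs of 20 rows each
gen strata_id = mod(psu_id, 5) + 1   // each PSU sits in exactly one stratum
gen education = 8 + floor(runiform()*10)
gen wage      = 12 + 0.8*education + rnormal(0,2)
gen p_wt      = 0.5 + runiform()*2.5

* Weighted regression that ignores strata and PSUs
regress wage education [pweight=p_wt], vce(robust)
estimates store plain

* Same weights, with the design declared through svyset
svyset psu_id [pweight=p_wt], strata(strata_id)
svy: regress wage education
estimates store design

* Coefficients match; only the standard errors can differ
estimates table plain design, b(%9.4f) se(%9.4f)
```

Because the simulated errors here are independent across PSUs, the two sets of standard errors may turn out close. In real multistage samples, where outcomes are correlated within PSUs, the gap is typically larger.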

Common Errors and Fixes

"weights not allowed"

The selected estimator or option does not support the specified weight type.

Check the command's help file for the weight types it supports, then switch to a compatible estimator or weight type.
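In a longer do-file it can help to test a weighted call without halting the run. This is a minimal sketch using capture and _rc on simulated data. The choice of xtreg with the re option is only an illustration of a call that may reject the weight, and the variable names mirror the earlier examples.

```stata
clear all
set obs 1500
gen firm_id   = ceil(_n/10)
gen year      = 2014 + mod(_n,10)
gen education = 8 + floor(runiform()*10)
gen wage      = 12 + 0.8*education + rnormal(0,2)
gen p_wt      = 0.5 + runiform()*2.5
xtset firm_id year

* capture traps any error; noisily still prints Stata's own message
capture noisily xtreg wage education [pweight=p_wt], re

if _rc != 0 {
    display as error "xtreg, re failed with return code " _rc
    display as text  "See -help xtreg- for the weight types each estimator accepts"
}
else {
    display as text "weight type accepted"
}
```

A nonzero _rc does not by itself prove that the weight was the cause, so read the printed message before changing the estimator.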

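. xtreg wage education [pweight=p_wt], re
weights not allowed
r(101);
This causes the error
wrong-way.do
stata
xtset firm_id year
* The GLS random-effects estimator accepts no weights at all
xtreg wage education [pweight=p_wt], re
This is the fix
right-way.do
stata
xtset firm_id year
xtreg wage education, re vce(cluster firm_id)
* or use supported survey design approach
* Note: xtreg, fe does accept aweights, fweights, and pweights,
* but only when the weight is constant within each panel
error-fix.do
stata
help xtreg
regress wage education experience [pweight=p_wt], vce(robust)
. help xtreg
Allowed weights differ by estimator: fe accepts aweights, fweights, and pweights; re (GLS) accepts none.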

Command Reference

weight syntax


Specifies observation weights with semantics tied to data collection and estimator assumptions.

regress y x [weighttype=var]    (one weight specification per command)
[fweight=var]   Integer frequency counts
[pweight=var]   Inverse probability weights
[aweight=var]   Analytic inverse-variance style weights
[iweight=var]   Importance weights; no formal statistical definition, treatment is command-specific
svyset          Full survey design specification

How Sytra Handles This

Sytra can infer likely weight intent from metadata and suggest compatible estimation commands and diagnostics.

A direct natural-language prompt for this exact workflow:

sytra-prompt.txt
text
Given a wage survey dataset with PSU, strata, and probability weights, decide whether to use pweight with svyset or alternative weighting, run the model, and justify the choice in plain language.


FAQ

Which weight type should I use for survey microdata?

Typically pweights, often with svyset and design variables, because they represent inverse selection probabilities.

Are aweights and pweights interchangeable?

No. They imply different variance assumptions and estimands; choosing the wrong type can invalidate inference.
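A short sketch of the difference, using simulated data collapsed to group means. The group variable, the cell-size variable n, and the seed are hypothetical and exist only to make the example runnable.

```stata
clear all
set seed 2027                          // arbitrary seed, for repeatability only
set obs 600
gen group     = ceil(runiform()*40)    // hypothetical groups of unequal size
gen education = 8 + floor(runiform()*10)
gen wage      = 12 + 0.8*education + rnormal(0,2)

* Collapse to one row per group, keeping the number of rows behind each mean
collapse (mean) wage education (count) n = wage, by(group)

* aweight: each row is a mean of n observations, so its variance is sigma^2/n
regress wage education [aweight = n]

* pweight with the same variable: same coefficients, but robust standard
* errors and a sampling-probability interpretation of the weight
regress wage education [pweight = n]
```

The coefficients agree across the two fits while the reported standard errors do not, which is why the weight type has to be justified by how the data were produced rather than chosen by trial.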

What if I only have frequency counts?

Use fweights when each row represents repeated identical observations counted by an integer frequency variable.

