Stata Errors
2026-02-07 · 11 min read

Stata Error r(2000) and r(2001): 'No Observations' — Why It Happens and How to Fix

You had 50,000 observations a minute ago. Now Stata says zero. Here's why your if condition, merge, or drop wiped your dataset — and how to recover.

Sytra Team
Research Engineering Team, Sytra AI

You had 50,000 observations a minute ago. Now Stata says you have zero. Your data didn’t vanish — something in your code excluded every single observation. Here’s how to find out what happened.

. regress wage education experience if gender == "Female"
no observations
r(2000);

Error r(2000) means the command needs observations but found none: either the dataset is empty or your if/in condition filtered out every observation. Its sibling r(2001), "insufficient observations", appears when some observations survive but too few for the command to run, which happens mostly in estimation. This guide covers the common causes and the fix for each.

All examples tested in Stata 18 SE. Compatible with Stata 15+.


Quick Answer

Your data is probably still fine. The most common causes of r(2000):

  1. Your if condition matches zero observations (usually a case, spelling, or formatting mismatch)
  2. Using = instead of == for comparison
  3. Missing values propagating through conditions
  4. An earlier drop or keep command removed all data
  5. A merge produced zero matched observations

Start debugging by checking your observation count:

quick-check.do
stata
* How many observations do you have?
count

* How many match your condition?
count if gender == "Female"

* What values does the variable actually have?
tab gender

Cause 1: if Condition Matches Zero Observations

The most common cause. Your if condition is syntactically valid but logically excludes every observation. This usually happens because of a mismatch between what you typed and what the data actually contains.

if-mismatch.do
stata
* You expect "Female" but the data has "female" (lowercase)
count if gender == "Female"          // 0
count if gender == "female"          // 24,531

* You expect "USA" but the data has "United States"
count if country == "USA"            // 0
count if country == "United States"  // 8,200

The fix

Always check the actual values before filtering:

check-values.do
stata
* For categorical variables
tab gender
tab country

* For string variables with many values
levelsof country, local(countries)
display `"`countries'"'
💡Partial matching
Use strmatch() or strpos() for flexible string matching when you’re unsure of the exact value: count if strpos(country, "United") > 0.
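
When the mismatch is only a matter of case or stray whitespace, normalizing the variable inside the condition is one option. The sketch below uses a small made-up dataset (the values are illustrative, not taken from the examples above) together with the built-in lower() and strtrim() functions:

```stata
* Hypothetical toy data with inconsistent case and a leading space
clear
input str10 gender
"female"
" Female"
"FEMALE"
"male"
end

count if gender == "Female"                   // 0: no value matches exactly
count if lower(strtrim(gender)) == "female"   // 3: case and padding ignored
```

If the normalized form is what you want everywhere, it is usually cleaner to store it once with replace than to repeat the functions in every condition.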

Cause 2: Using = Instead of ==

Stata uses = for assignment (as in generate x = 1) and == for testing equality. A single = in an if condition does not cause r(2000) by itself: depending on the context, Stata may quietly treat it as an equality test or reject the command. Either way, the condition reads as something other than what you meant, so always write == when comparing.

Wrong: single equals sign
stata
* A single = is not the comparison operator
list if wage = 50000
* Depending on context, Stata may read this as == or reject it
Correct — comparison operator
stata
* Use == for comparison
list if wage == 50000
* Or use relational operators
list if wage >= 50000

Cause 3: String vs Numeric Comparison

If the variable is a string, you must compare with a string value (in quotes). If the variable is numeric, you must compare with a number (no quotes). Mixing the two does not return zero matches: Stata stops with type mismatch, r(109). The r(2000) case is subtler. The types agree, but the stored format differs from what you typed (for example "6" versus "06"), so the condition matches nothing and the next command that needs observations fails.

string-numeric.do
stata
* state_fips is stored as a STRING variable (str2)
count if state_fips == 6      // type mismatch, r(109): string compared to a number
count if state_fips == "6"    // runs, but returns 0 if the codes are zero-padded
count if state_fips == "06"   // matches the stored value with its leading zero

* Check the storage type
describe state_fips
. describe state_fips
              storage   display    value
variable name   type    format    label      variable label
─────────────────────────────────────────────────────────────
state_fips      str2    %2s                  State FIPS code
👁Leading zeros
FIPS codes, ZIP codes, and country codes are often stored as strings precisely because they have leading zeros. "06" (California) is not the same as "6". Always tab or list a few values to see the exact format.
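
If the codes reached you as numbers and lost their padding, the string form can be rebuilt. This is a minimal sketch on made-up data; fips_num is a hypothetical variable name, and the two-digit width is an assumption that fits state FIPS codes:

```stata
* Hypothetical example: FIPS codes that arrived as numbers
clear
input fips_num
6
12
48
end

* Rebuild the two-character string form with a leading zero
gen str2 state_fips = string(fips_num, "%02.0f")

count if state_fips == "6"     // 0: the stored value is "06"
count if state_fips == "06"    // 1
```

For five-digit county codes or ZIP codes the same idea applies with a wider format such as "%05.0f".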

Cause 4: Missing Values in Conditions

Stata treats numeric missing values (.) as larger than any number in comparisons. This means conditions involving > can match missing values, while conditions involving == exclude missing observations you may have expected to include.

missing-values.do
stata
* Suppose wage has missing values for 10,000 observations
count if wage > 50000                    // includes missings! (. > 50000 is true)
count if wage > 50000 & wage < .         // excludes missings (correct)
count if wage > 50000 & !missing(wage)   // same thing, clearer

* This may return 0 if ALL observations have missing wage
count if wage == 50000                   // missings are excluded (. != 50000)
count if wage != .                       // count non-missing values
⚠️Critical
Missing values in Stata are greater than any number. if wage > 0 includes observations where wage is missing. Always add & !missing(wage) when you mean “positive values only.”
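
The ordering is easy to confirm on a tiny dataset. The four values below are made up purely for illustration:

```stata
* Hypothetical toy data: two real wages and two missing values
clear
input wage
30000
60000
.
.
end

count if wage > 50000                    // 3: one real value plus both missings
count if wage > 50000 & !missing(wage)   // 1: only the real value
```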

Cause 5: After a Merge with _merge

After merge, the _merge variable tells you which observations matched. If zero observations matched between the two datasets, any operation on the merged data filtered by match status can produce r(2000).

merge-nomatch.do
stata
merge 1:1 country_id year using "trade_data.dta"
tab _merge
. tab _merge
                 _merge |      Freq.     Percent        Cum.
────────────────────────┼───────────────────────────────────
        Master only (1) |     48,000      100.00      100.00
────────────────────────┼───────────────────────────────────
                  Total |     48,000      100.00
merge-diagnosis.do
stata
* Zero matches: something is wrong with the merge key
* Check the key variables in both datasets
describe country_id            // Is it numeric or string?
tab country_id in 1/10         // What values does it have?

* If one dataset has a numeric country_id and the other a string,
* merge stops with an error instead of matching nothing; convert one side first
tostring country_id, replace   // or: destring country_id, replace

* If the types already agree, compare formatting: case, leading zeros, stray spaces
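
When the key types agree but nothing matches, the values themselves usually differ in formatting. The following is a sketch under assumed data: two made-up datasets keyed on the same countries with different casing, where exports and gdp are hypothetical variables. Normalizing the key the same way on both sides before merging is one way to recover the matches:

```stata
* Hypothetical using dataset: lowercase country codes
clear
input str3 country_id year exports
"usa" 2020 10
"fra" 2020 20
end
tempfile using_data
save `using_data'

* Hypothetical master dataset: uppercase country codes
clear
input str3 country_id year gdp
"USA" 2020 100
"FRA" 2020 200
end

* Normalize the key so both sides use the same form, then merge
replace country_id = lower(strtrim(country_id))
merge 1:1 country_id year using `using_data'
tab _merge
```

Without the replace line, every observation would land in the master-only or using-only category.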

Cause 6: After drop or keep

A drop if that matches every observation, or a keep if that matches none, leaves the dataset empty. This is irreversible in memory, but the original file on disk is unchanged.

drop-all.do
stata
* Meant to drop missing wages, but used wrong condition
drop if wage >= 0        // drops every non-negative wage AND every missing wage (. >= 0 is true)
* Should have been:
* drop if missing(wage)

* Now everything is gone
count                    // 0
regress wage education   // r(2000)

The fix

Reload the dataset from disk: use "mydata.dta", clear. The drop command only affects data in memory, so your .dta file is safe as long as you have not saved over it.

💡Safety pattern
Always count your condition before dropping: count if wage < 0. If the count looks reasonable, then drop if wage < 0. This prevents accidental deletion of your entire dataset.
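
The count-then-drop habit can be turned into a guard inside a do-file. This is a minimal sketch using the auto dataset that ships with Stata; the condition price < 0 and the local name n_drop are only illustrative:

```stata
sysuse auto, clear          // example dataset shipped with Stata

* Count what the condition would remove
count if price < 0
local n_drop = r(N)

* Stop the do-file if the condition would remove every observation
assert `n_drop' < _N

preserve                    // snapshot of the data in memory
drop if price < 0
count
restore                     // bring the snapshot back if the result looks wrong
```

The assert halts execution with an error when the drop would empty the dataset, and preserve/restore gives a way back without reloading the file.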

Cause 7: Subset with in Out of Range

The in qualifier selects observations by position. A range that runs past the end of the dataset is rejected with a different error, observation numbers out of range, r(198). r(2000) appears when a valid in range is combined with an if condition that matches nothing inside that range:

in-range.do
stata
* Dataset has 1,000 observations
count                                              // 1000

* This range is out of bounds
list wage in 2000/3000                             // observation numbers out of range, r(198)

* Valid range, but the if condition matches nothing inside it
regress wage education in 1/100 if wage > 999999   // r(2000) if no wage in rows 1-100 exceeds 999999

Cause 8: Panel Gaps and tsfill

In panel data, gaps in the time variable can cause commands to find zero observations for certain time periods. After tsfill, new observations are created but all their variables (except the panel and time IDs) are missing.

panel-gaps.do
stata
xtset firm_id year
tsfill

* The new observations have missing values for everything
summarize wage if year == 2015   // may return 0 if 2015 was a gap year

* Check what tsfill added
count if missing(wage)           // large number after tsfill

Cause 9: Missing Data Cascading Through Commands

When you generate a new variable from data with missing values, the result is also missing for those observations. Chain enough operations together and you can end up with a variable that is entirely missing — producing r(2000) when you try to analyze it.

cascade.do
stata
* income has 20% missing, hours has 15% missing
gen hourly_wage = income / hours       // missing if EITHER is missing
gen log_hw = ln(hourly_wage)           // also missing where hourly_wage <= 0

* By now, log_hw might be missing for 40% of observations
* Add an if condition and you might hit 0
regress log_hw education if age > 65   // r(2000) if no usable observations remain, r(2001) if too few

* Diagnosis
count if !missing(log_hw) & age > 65   // check before running
💡Prevention
Before running a regression, check the effective sample size: count if !missing(depvar) & !missing(indepvar1) & !missing(indepvar2). Regression commands only use complete cases, so this tells you exactly how many observations will be used.
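
As a sketch of that check, the example below uses the auto dataset that ships with Stata, where rep78 has some missing values. missing() accepts several arguments and is true if any of them is missing; the local name n_complete is only illustrative:

```stata
sysuse auto, clear

* Complete cases for a model of price on mpg and rep78
count if !missing(price, mpg, rep78)
local n_complete = r(N)

regress price mpg rep78
display "complete cases: `n_complete'   used by regress: " e(N)

* e(sample) marks the observations the last estimation actually used
count if e(sample)
```

After any estimation command, e(sample) can also be used in later if conditions to keep summaries on the same sample as the model.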

Debugging Checklist

When you hit r(2000) or r(2001):

  1. Run count. Is the dataset empty? If yes, reload from disk.
  2. Check the if condition. Run count if [your condition] to see how many observations match.
  3. Check string vs numeric. Run describe varname to see if it’s str or float/int.
  4. Check for missing values. Run count if missing(varname) for your key variables.
  5. Check _merge. After a merge, tab _merge to see how many observations matched.
  6. Search for drop/keep. Did an earlier command remove all observations?
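
Parts of this checklist can be automated in a do-file. The following is one possible sketch, not a definitive pattern: it runs an estimation under capture and, if the return code is 2000 or 2001, prints the relevant diagnostics. It uses the auto dataset that ships with Stata, and the condition foreign == 5 is deliberately chosen to match nothing:

```stata
sysuse auto, clear

* Run the command, show its output, and keep going even if it fails
capture noisily regress price mpg if foreign == 5

* _rc holds the return code of the captured command
if inlist(_rc, 2000, 2001) {
    display as error "Estimation sample is empty or too small; checking the condition"
    count
    count if foreign == 5
    tab foreign, missing
}
```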

FAQ

What does r(2000) mean in Stata?

Error r(2000) means your command requires observations but the current dataset has zero. Either the dataset is empty or your if/in condition excluded every observation.

What is the difference between r(2000) and r(2001)?

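They are related but not identical. r(2000) is “no observations”: nothing at all is left for the command to use. r(2001) is “insufficient observations”: some observations remain, but too few for the command, which typically happens in estimation commands (regress, logit, etc.) once missing values and conditions are accounted for.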

How do I recover my data after getting r(2000)?

If the error came from an if condition, your data is still in memory — just change the condition. If you accidentally dropped all observations, your original file on disk is unchanged: use "mydata.dta", clear.

Why does my if condition eliminate all observations?

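The most common reasons: case-sensitivity in string comparisons ("Female" vs "female"), formatting differences such as leading zeros or stray spaces, or missing values that empty the sample. Comparing a string variable with a numeric value does not return zero matches; Stata stops with type mismatch, r(109), instead.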
