Stata Error r(2000) and r(2001): 'No Observations' — Why It Happens and How to Fix
You had 50,000 observations a minute ago. Now Stata says you have zero. Your data didn’t vanish — something in your code excluded every single observation. Here’s how to find out what happened.
no observations
r(2000);
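Error r(2000) means the operation requires data but none is available. Either the dataset is empty or your if/in condition filtered out every observation. Error r(2001), "insufficient observations", is its close relative: some observations remain, but too few for the calculation you requested, which usually happens in estimation commands. This guide covers the most common causes and the fix for each.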
All examples tested in Stata 18 SE. Compatible with Stata 15+.
Quick Answer
Your data is probably still fine. The most common causes of r(2000):
- Your if condition matches zero observations (usually because the value you typed differs from what the data contains)
- Using = instead of == for comparison
- Missing values propagating through conditions
- An earlier drop or keep command removed all data
- A merge produced zero matched observations
Start debugging by checking your observation count:

* How many observations do you have?
count

* How many match your condition?
count if gender == "Female"

* What values does the variable actually have?
tab gender

Cause 1: if Condition Matches Zero Observations
The most common cause. Your if condition is syntactically valid but logically excludes every observation. This usually happens because of a mismatch between what you typed and what the data actually contains.
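As a rough illustration of this kind of mismatch, the sketch below builds a tiny made-up dataset (the variable gender and its values are assumptions, not taken from any real file) and compares an exact match with a normalized one. Trimming spaces and lowercasing before comparing is one possible way to make a filter less brittle:

```stata
* Hypothetical toy data: three spellings of the same category
clear
input str10 gender
"female"
" Female"
"FEMALE"
"male"
end

* Exact comparison is sensitive to case and to stray spaces
count if gender == "Female"

* Normalize before comparing: trim spaces, then lowercase
count if lower(strtrim(gender)) == "female"

* Or clean the variable once and filter normally afterwards
replace gender = lower(strtrim(gender))
tab gender
```

Cleaning the variable once is usually easier to maintain than repeating lower(strtrim()) in every condition, though it does change the stored values.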

* You expect "Female" but the data has "female" (lowercase)
count if gender == "Female"            // 0
count if gender == "female"            // 24,531

* You expect "USA" but the data has "United States"
count if country == "USA"              // 0
count if country == "United States"    // 8,200

The fix
Always check the actual values before filtering:

* For categorical variables
tab gender
tab country

* For string variables with many values
levelsof country, local(countries)
display `"`countries'"'

Use strmatch() or strpos() for flexible string matching when you’re unsure of the exact value: count if strpos(country, "United") > 0.

Cause 2: Using = Instead of ==
In Stata, = is assignment and == is comparison. Using = in an if condition doesn’t cause r(2000) directly; Stata usually catches this as a syntax error. In some contexts, though, a single equals sign may be interpreted in a way you did not intend, and the condition ends up matching nothing. Write == whenever you mean a comparison.

* This is WRONG in an if condition
list if wage = 50000
* Stata may interpret this unexpectedly

* Use == for comparison
list if wage == 50000

* Or use relational operators
list if wage >= 50000

Cause 3: String vs Numeric Comparison
If the variable is a string, you must compare with a string value (in quotes). If the variable is numeric, you must compare with a number (no quotes). Mixing the two outright stops Stata with a type mismatch error, r(109). The quieter problem is a string that only looks like a number: "6" and "06" are different strings, so the comparison silently matches nothing, and the next command that needs data fails with r(2000).
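A short sketch of checking the storage type and handling zero-padded codes, using a hypothetical three-row dataset (the values are made up for illustration). It shows one way to compare numerically or to rebuild a padded string, not the only way:

```stata
* Hypothetical toy data: FIPS codes stored as two-character strings
clear
input str2 state_fips
"06"
"36"
"48"
end

* Confirm the storage type before writing the condition
confirm string variable state_fips

* "6" and "06" are different strings
count if state_fips == "6"
count if state_fips == "06"

* Comparing numerically sidesteps the padding question
count if real(state_fips) == 6

* Or build a zero-padded string from a numeric code
gen fips_num = real(state_fips)
gen str2 fips_pad = string(fips_num, "%02.0f")
assert fips_pad == state_fips
```

The final assert simply confirms that the rebuilt padded string matches the stored one in this toy data.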

* state_fips is stored as a STRING variable
count if state_fips == 6       // type mismatch, r(109): string compared to a number
count if state_fips == "6"     // valid, but returns 0 if the codes are zero-padded
count if state_fips == "06"    // needed when the data keeps the leading zero

* Check the storage type
describe state_fips

              storage   display    value
variable name   type    format     label      variable label
─────────────────────────────────────────────────────────────
state_fips      str2    %2s                   State FIPS code

"06" (California) is not the same as "6". Always tab or list a few values to see the exact format.

Cause 4: Missing Values in Conditions
Stata treats numeric missing values (.) as larger than any number in comparisons. This means conditions involving > can match missing values, while conditions involving == may exclude observations you expected to include.
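To see this ordering on real data, the sketch below uses the auto dataset that ships with Stata (loaded with sysuse auto), where rep78 has some missing values. The choice of dataset and variable is only for illustration:

```stata
sysuse auto, clear

* rep78 has some missing values in this example dataset
count if missing(rep78)

* Missing sorts above every number, so this count includes the missings
count if rep78 > 4

* Two equivalent ways to exclude them
count if rep78 > 4 & rep78 < .
count if rep78 > 4 & !missing(rep78)
```

The gap between the second count and the last two is exactly the number of missing values reported by the first.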

* Suppose wage has missing values for 10,000 observations
count if wage > 50000                     // includes missings! (. > 50000 is true)
count if wage > 50000 & wage < .          // excludes missings (correct)
count if wage > 50000 & !missing(wage)    // same thing, clearer

* This may return 0 if ALL observations have missing wage
count if wage == 50000                    // missings are excluded (. != 50000)
count if wage != .                        // count non-missing values

if wage > 0 includes observations where wage is missing. Always add & !missing(wage) when you mean “positive values only.”

Cause 5: After a Merge with _merge
After merge, the _merge variable tells you which observations matched. If zero observations matched between the two datasets, any operation on the merged data filtered by match status can produce r(2000).
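The sketch below reproduces a zero-match merge on two tiny made-up datasets whose keys differ only in zero padding. All variable names and values are hypothetical; the point is the pattern of counting matches before filtering on _merge:

```stata
* Hypothetical using-side data with zero-padded string keys
clear
input str2 country_id year exports
"01" 2020 10
"02" 2020 20
end
tempfile trade
save `trade'

* Hypothetical master data with unpadded keys
clear
input str2 country_id year gdp
"1" 2020 100
"2" 2020 200
end

merge 1:1 country_id year using `trade'
tab _merge

* Count the matched rows before running anything that filters on them
count if _merge == 3
if r(N) == 0 {
    display as error "merge produced no matches: inspect the key variables"
}
```

Because tempfile creates a local macro, run the block in one go from a do-file rather than line by line.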

merge 1:1 country_id year using "trade_data.dta"
tab _merge

                 _merge |      Freq.     Percent        Cum.
────────────────────────┼───────────────────────────────────
        Master only (1) |     48,000      100.00      100.00
────────────────────────┼───────────────────────────────────
                  Total |     48,000      100.00

* Zero matches: something is wrong with the merge key
* Check the key variables in both datasets
describe country_id             // Is it numeric or string?
tab country_id in 1/10          // What values does it have?

* Common fix: make the key the same type and format in both datasets
* (for example, one file stores country_id as numeric, the other as string)
tostring country_id, replace    // or: destring country_id, replace

Cause 6: After drop or keep
A drop if that matches every observation, or a keep if that matches none, leaves the dataset empty. The change cannot be undone in memory unless you ran preserve first, but the original file on disk is unchanged.

* Meant to drop missing wages, but used wrong condition
drop if wage >= 0         // drops every nonnegative wage AND every missing one (. >= 0 is true)
* Should have been:
* drop if missing(wage)

* Now everything is gone
count                     // 0
regress wage education    // r(2000)

The fix
Reload the dataset from disk: use "mydata.dta", clear. The drop command only affects data in memory; your .dta file is safe.
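One possible way to avoid needing the reload at all is to guard the drop, or to experiment inside preserve and restore. The sketch below uses the auto dataset that ships with Stata; the variables and conditions are arbitrary examples:

```stata
sysuse auto, clear

* Count first: how many rows would the condition remove?
count if missing(rep78)
local n_drop = r(N)

* Only drop if something would be left over
if `n_drop' < _N {
    drop if missing(rep78)
}
else {
    display as error "condition matches every observation; nothing dropped"
}

* For experiments, preserve/restore keeps a snapshot of the data in memory
preserve
drop if price > 0        // deliberately drops everything
count                    // the dataset is empty here
restore
count                    // the snapshot is back
```

Run this from a do-file so the local macro n_drop is still defined when the if block executes.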
Count your condition before dropping: count if wage < 0. If the count looks reasonable, then drop if wage < 0. This prevents accidental deletion of your entire dataset.

Cause 7: Subset with in Out of Range
The in qualifier selects observations by position. A range that runs past the end of the dataset is rejected with its own error, "observation numbers out of range", r(198). A valid range can still leave a command with nothing to work on when it is combined with an if that matches zero observations:

* Dataset has 1,000 observations
count                     // 1000

* This range is out of bounds
list wage in 2000/3000    // r(198): observation numbers out of range

* Combined if/in that produces zero
* (list would just print nothing; an estimation command fails)
regress wage education in 1/100 if wage > 999999    // r(2000) if no wage exceeds 999999

Cause 8: Panel Gaps and tsfill
In panel data, gaps in the time variable can cause commands to find zero observations for certain time periods. After tsfill, new observations are created but all their variables (except the panel and time IDs) are missing.
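A small sketch of what tsfill does, on a made-up five-row panel in which firm 1 has no row for 2015 (firm IDs, years, and wages are all hypothetical):

```stata
* Hypothetical toy panel with a gap for firm 1 in 2015
clear
input firm_id year wage
1 2014 50
1 2016 55
2 2014 40
2 2015 42
2 2016 45
end

xtset firm_id year
count                      // rows before filling
tsfill
count                      // one more row: firm 1 in 2015

* The filled-in row exists but carries no data
list if missing(wage)

* Count usable values, not rows, before analyzing a period
count if year == 2015 & !missing(wage)
```

Counting non-missing values rather than rows gives a more honest picture of how much data a given year actually has.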

xtset firm_id year
tsfill

* The new observations have missing values for everything
summarize wage if year == 2015    // may return 0 if 2015 was a gap year

* Check what tsfill added
count if missing(wage)            // large number after tsfill

Cause 9: Missing Data Cascading Through Commands
When you generate a new variable from data with missing values, the result is also missing for those observations. Chain enough operations together and you can end up with a variable that is entirely missing — producing r(2000) when you try to analyze it.
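A minimal sketch of checking the usable sample before estimating, again on Stata's shipped auto dataset. The model (price on rep78 and mpg) and the foreign == 1 condition are arbitrary choices for illustration:

```stata
sysuse auto, clear

* Complete cases for the planned model: price on rep78 and mpg
count if !missing(price, rep78, mpg)

* The same count under the extra condition you plan to use
count if !missing(price, rep78, mpg) & foreign == 1
local n_use = r(N)

* Only estimate when something is left
if `n_use' > 0 {
    regress price rep78 mpg if foreign == 1
    display "observations used: " e(N)
}
```

missing() with several arguments is true when any of them is missing, so its negation picks out complete cases. The e(N) reported after regress should equal the second count.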

* income has 20% missing, hours has 15% missing
gen hourly_wage = income / hours    // missing if EITHER is missing
gen log_hw = ln(hourly_wage)        // also missing where hourly_wage <= 0

* By now, log_hw might be missing for 40% of observations
* Add an if condition and you might hit 0
regress log_hw education if age > 65    // r(2000) if no complete cases remain, r(2001) if too few

* Diagnosis
count if !missing(log_hw) & age > 65    // check before running

Before running a regression, check count if !missing(depvar) & !missing(indepvar1) & !missing(indepvar2). Regression commands only use complete cases, so this tells you exactly how many observations will be used.
Debugging Checklist
When you hit r(2000) or r(2001):
- Run count. Is the dataset empty? If yes, reload from disk.
- Check the if condition. Run count if [your condition] to see how many observations match.
- Check string vs numeric. Run describe varname to see if it’s str or float/int.
- Check for missing values. Run count if missing(varname) for your key variables.
- Check _merge. After a merge, tab _merge to see how many observations matched.
- Search for drop/keep. Did an earlier command remove all observations?
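The checklist can be sketched as a short do-file. The version below runs against Stata's shipped auto dataset, so the variable names and the condition are placeholders to swap for your own:

```stata
sysuse auto, clear

* 1. Is the dataset empty?
count

* 2. How many observations match the condition?
count if foreign == 1 & rep78 > 3 & !missing(rep78)

* 3. String or numeric?
describe make rep78

* 4. Missing values in key variables
misstable summarize price rep78 mpg

* 5. After a merge only: tab _merge

* 6. In do-files, fail loudly at the right line if the data is gone
assert _N > 0
```

Placing assert _N > 0 after each drop or keep in a longer do-file makes the script stop where the data disappeared rather than at a later estimation command.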
FAQ
What does r(2000) mean in Stata?
Error r(2000) means your command requires observations but the current dataset has zero. Either the dataset is empty or your if/in condition excluded every observation.
What is the difference between r(2000) and r(2001)?
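They are closely related. r(2000) is "no observations": the command found nothing to work with. r(2001) is "insufficient observations": some observations are available, but too few for the requested calculation. It typically appears in estimation commands (regress, logit, etc.) when the estimation sample is nearly empty after accounting for missing values and conditions.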
How do I recover my data after getting r(2000)?
If the error came from an if condition, your data is still in memory — just change the condition. If you accidentally dropped all observations, your original file on disk is unchanged: use "mydata.dta", clear.
Why does my if condition eliminate all observations?
The most common reasons: case-sensitivity in string comparisons ("Female" vs "female"), string codes stored in a different format than you typed ("6" vs "06"), or not accounting for missing values in the condition.