Stata Error r(111): '[variable] not found' — Complete Fix Guide
Your variable exists. You can see it. But Stata says "not found." Here's every reason that happens and how to fix each one in under a minute.
Your variable exists. You created it two lines ago. You can see it in the Variable window. You run the command and Stata says:
    variable eduaction not found
    r(111);
Error r(111) means Stata cannot find a variable you referenced. It does not exist in the current dataset — or at least, not under the name you typed. This guide covers every documented reason this happens and shows the fix for each.
All examples tested in Stata 18 SE. Compatible with Stata 15+.
Quick Answer
Error r(111) means the variable name you typed does not match any variable in memory. The most common causes are:
- Typo in the variable name
- Case-sensitivity mismatch (Income vs income)
- Variable was dropped or never created
- Variable is in a different data frame (Stata 16+)
- Forgot i. prefix for factor variables
- Time-series operator without tsset
The fastest diagnostic: run describe to see every variable in your dataset, then lookfor to search by keyword.
    * See all variables
    describe

    * Search for a variable by keyword
    lookfor wage
    lookfor educ

Cause 1: Typo in the Variable Name
The most common cause, period. One wrong letter and Stata cannot find the variable. Common patterns: transposed letters (edcuation), missing letters (educaton), or extra characters (education1 when the variable is education).
    regress wage eduaction experience
    * "eduaction" instead of "education"

    regress wage education experience

The fix
Run lookfor with part of the variable name. It searches both variable names and labels:
    lookfor educ

                  storage   display    value
    variable name   type    format     label      variable label
    ─────────────────────────────────────────────────────────────
    education       float   %9.0g                 Years of education
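The same diagnosis can be scripted inside a do-file. The following is a minimal sketch, assuming Stata's bundled auto dataset (loaded with sysuse) stands in for your data and that prcie is a deliberate misspelling of price. It relies on capture noisily to print the error without stopping the do-file, and on lookfor leaving its matches in r(varlist).

```stata
* Sketch only: auto.dta and the misspelling "prcie" are illustrative assumptions
sysuse auto, clear

* capture noisily shows the error message but lets the do-file continue
capture noisily summarize prcie
display "return code: " _rc        // 111 when the name matches nothing

* Search names and labels for a fragment; matches are stored in r(varlist)
lookfor pric
display "candidates: `r(varlist)'"
```

If r(varlist) comes back with a single name, that is very likely the variable you meant to type.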
Cause 2: Case Sensitivity
Stata variable names are case-sensitive. Income, income, and INCOME are three entirely different variables. If your dataset has Income and you type income, you get r(111).
    summarize income
    * But the variable is "Income" (capital I)

    summarize Income
    * Match the exact case from describe

The fix
Run describe and check the exact casing. If you want to permanently fix inconsistent casing, rename the variable:
    * See exact variable names
    describe, simple

    * Rename to lowercase (common convention)
    rename Income income
    rename FIPS fips

To convert every variable name to lowercase at once, run rename *, lower. This is a good first step when importing data from Excel or other case-inconsistent sources.

Cause 3: Variable Was Dropped Earlier
If your do-file runs drop or keep before the line that references the variable, the variable no longer exists. This is especially common in long do-files where you clean data in one section and analyze in another.
    * Data cleaning section
    use "raw_data.dta", clear
    drop if missing(wage)
    drop race ethnicity religion    // dropped some variables

    * ... 200 lines later ...

    * Analysis section: oops, "race" was dropped
    regress wage education i.race   // r(111): race not found

The fix
Search your do-file for drop and keep commands. If you need the variable later, either don’t drop it, or reload the data. You can also use preserve/restore to make destructive changes safely:
    preserve
        keep wage education experience
        regress wage education experience, robust
    restore
    * All variables are back

Cause 4: Variable Is in a Different Frame
Stata 16+ supports multiple data frames in memory. If you loaded data into a named frame but your command runs against the default frame, Stata won’t find the variable.
    * Load data into a named frame
    frame create analysis
    frame analysis: use "mydata.dta"

    * This fails: the default frame has no data
    summarize wage    // r(111)

    * This works: specify the frame
    frame analysis: summarize wage

    * Or switch to the frame
    frame change analysis
    summarize wage    // now it works

Run frame dir to see all frames in memory, and frame (no arguments) to see which frame is currently active.

Cause 5: Abbreviation Ambiguity
Stata allows variable name abbreviation: you can type edu instead of education if no other variable starts with edu. But if several variables share that prefix, the abbreviation is ambiguous and Stata stops with r(111). In this case the message reads "edu ambiguous abbreviation" rather than "not found".
    * Two variables starting with "edu"
    describe edu*

                  storage   display    value
    variable name   type    format     label      variable label
    ─────────────────────────────────────────────────────────────
    education       float   %9.0g                 Years of education
    edu_level       byte    %9.0g      edlbl      Education level
    * Ambiguous abbreviation: Stata can't tell which you mean
    summarize edu          // r(111)

    * Be specific
    summarize education    // works
    summarize edu_level    // works

Cause 6: String vs Numeric Confusion
Mixing up string and numeric values usually raises a type mismatch, r(109), not r(111): that is what you get when you compare a string variable with a number. The r(111) variant appears when you leave the quotes off a non-numeric string value in an if condition. In if state == CA, for instance, Stata reads CA as a variable name and reports that CA is not found.

    * If state_id is a string variable, a numeric comparison fails
    summarize wage if state_id == 6

    * Use quotes for string comparison
    summarize wage if state_id == "6"

    * Or if you need numeric: destring first
    destring state_id, replace
    summarize wage if state_id == 6

Cause 7: Time-Series Operators Without tsset
Time-series operators like L. (lag), F. (lead), and D. (difference) require that your data is declared as time series with tsset or xtset. Without this declaration, Stata has no time variable against which to compute the lag, so it cannot resolve L.gdp and stops with r(111).

    * Forgot to tsset
    regress gdp_growth L.gdp investment    // r(111)

    * Fix: declare time structure first
    tsset country_id year
    regress gdp_growth L.gdp investment    // now works

Without the declaration, the first command typically stops with:

    time variable not set
    r(111);

For panel data, use xtset panelvar timevar instead of tsset. Both enable time-series operators; xtset is the conventional choice before panel commands such as xtreg and xtlogit.

Cause 8: Factor Variables Without i. Prefix
If you want to include a categorical variable as a set of dummies in a regression, you need the i. prefix. Without it, Stata treats the variable as continuous. But certain contexts — like interactions — require the prefix, and omitting it can produce r(111) in some command contexts.
    * If "industry" is a numeric categorical variable (1, 2, 3, ...)
    * This may give unexpected results or errors
    regress wage education industry

    * This creates dummies correctly
    regress wage education i.industry

    * Interactions also need the prefix
    regress wage education i.industry##c.experience

Cause 9: Referencing _n and _N Incorrectly
_n (current observation number) and _N (total observations) are system variables, not variables stored in the dataset. They work inside expressions but not where Stata expects a variable name. Under a by prefix they are evaluated within each group: _n restarts at 1 and _N is the group size.
    * _n and _N are not regular variables
    describe _n    // r(111): _n doesn't show up in describe

    * But they work in expressions
    gen obs_number = _n
    gen total_obs = _N

    * With by-groups, _n and _N are within-group
    bysort state: gen state_obs = _n
    bysort state: gen state_total = _N

Cause 10: After clear or use
Running clear removes all data and variables from memory. Running use replaces the current dataset. If you reference a variable from the previous dataset after either command, you get r(111).
    use "wages.dta", clear
    gen log_wage = ln(wage)

    * Now load a different dataset
    use "demographics.dta", clear

    * This variable no longer exists
    summarize log_wage    // r(111): log_wage was in the wages data

The fix
If you need variables from multiple datasets, either merge the datasets first, or use frame (Stata 16+) to keep both datasets in memory simultaneously.
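The frame option can be sketched as follows. This is only an illustration under stated assumptions: it requires Stata 16 or later, the frame name demo is hypothetical, and the bundled auto and nlsw88 datasets stand in for the wages and demographics files. The point is that loading the second dataset into its own frame leaves the generated variable in the first one untouched.

```stata
* Sketch only (Stata 16+): the frame name "demo" and both sysuse datasets are assumptions
sysuse auto, clear
gen log_price = ln(price)

* Load a second dataset into its own frame instead of replacing the first
frame create demo
frame demo: sysuse nlsw88, clear

* Each frame keeps its own variables
frame demo: summarize wage
summarize log_price
```

Running the block a second time in the same session fails at frame create because demo already exists; frame drop demo clears it.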
Cause 11: preserve/restore Scope
Variables created between preserve and restore are discarded when restore executes. If you created a variable inside a preserve block and try to use it after restore, it’s gone.
    preserve
        gen temp_indicator = (wage > 50000)
        tab temp_indicator
    restore

    * temp_indicator no longer exists
    summarize temp_indicator    // r(111)

If you need a result from inside the block, store it in a macro or scalar before restore. Macros and scalars survive restore; variables don’t.

    preserve
        keep if wage > 50000
        count
        local high_wage_n = r(N)
    restore

    display "High-wage observations: `high_wage_n'"
Debugging Checklist
When you hit r(111), work through this list:
- Run describe. See every variable in the current dataset.
- Run lookfor keyword. Search variable names and labels for what you’re looking for.
- Check the spelling. One wrong letter is enough.
- Check the case. Wage and wage are different.
- Check for drop/keep. Did you remove the variable earlier in your do-file?
- Check tsset/xtset. Time-series operators need a declared time structure.
- Check the active frame. Run frame to see which dataset is active.
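Part of this checklist can be automated with a guard at the top of an analysis section. The following is a minimal sketch, assuming the bundled auto dataset and a hypothetical local macro named needed that lists the variables the analysis depends on. It uses confirm variable with the exact option so that abbreviations do not pass the check by accident.

```stata
* Sketch only: auto.dta and the list in "needed" are illustrative assumptions
sysuse auto, clear
local needed price mpg weight

foreach v of local needed {
    * exact rules out abbreviations, so only full names pass
    capture confirm variable `v', exact
    if _rc {
        display as error "variable `v' is not in the data in memory"
        exit 111
    }
}

regress price mpg weight
```

Failing early with a clear message is usually easier to debug than an r(111) several hundred lines into the do-file.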
FAQ
What does r(111) mean in Stata?
Error r(111) means Stata cannot find a variable you referenced. The variable does not exist in the current dataset — either it was never created, was dropped, is misspelled, or lives in a different frame.
Why does Stata say variable not found when I can see it?
The most common reason is a case mismatch. Stata is case-sensitive: Income and income are different variables. Run describe to see the exact names. Another possibility: the variable is in a different frame (Stata 16+).
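To spot case mismatches quickly, you can list every variable whose name is not all lowercase. This is a minimal sketch, assuming the bundled auto dataset, with price renamed to Price purely to simulate a mixed-case import.

```stata
* Sketch only: auto.dta and the renamed variable "Price" are illustrative assumptions
sysuse auto, clear
rename price Price                 // simulate an import with mixed-case names

* Flag every variable whose name is not already all lowercase
foreach v of varlist _all {
    if "`v'" != strlower("`v'") {
        display "`v' contains uppercase letters"
    }
}
```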
How do I check which variables exist in my dataset?
Run describe to list all variables with their types and labels. Run ds for a compact list of names only. Run lookfor keyword to search both names and labels.
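A short sketch of ds, again assuming the bundled auto dataset as a stand-in for your data. The has(type string) option and the wildcard pattern narrow the list, and ds leaves the matching names in r(varlist) for later use.

```stata
* Sketch only: auto.dta is an illustrative assumption
sysuse auto, clear

ds                          // compact list of every variable name
ds m*                       // only names starting with m
ds, has(type string)        // only string variables
display "string variables: `r(varlist)'"
```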
Can r(111) happen after a merge?
Yes. If you used keepusing() during merge, only the specified variables from the using dataset are brought in. If you expected a variable that wasn’t in keepusing(), it won’t be in the merged result and you’ll get r(111).
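The keepusing() behaviour can be reproduced with a small sketch. It assumes the bundled auto dataset split into two pieces, with a hypothetical temporary file named extra playing the role of the using dataset. Because it relies on a tempfile, run it as one do-file rather than line by line.

```stata
* Sketch only: auto.dta and the tempfile "extra" are illustrative assumptions
sysuse auto, clear
keep make mpg weight
tempfile extra
save `extra'

sysuse auto, clear
keep make price

* Only mpg is brought in; weight stays behind in the using file
merge 1:1 make using `extra', keepusing(mpg) nogenerate

capture noisily summarize weight
display "return code: " _rc        // 111: weight was not listed in keepusing()
```

Adding weight to keepusing(), or dropping the option altogether, brings the variable into the merged result.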