Stata 'variable already defined': Why gen Fails and How to Fix It
You ran gen and Stata said the variable already exists. Here's when to use replace, when to drop first, and the safe pattern for do-files.
You’re running your do-file for the second time. The first run worked perfectly. Now you get:
    variable log_wage already defined
    r(110);
This happens because gen can only create new variables. If the variable already exists — because you created it on the first run and didn’t clear the data — Stata refuses to overwrite it. Here’s how to handle it properly.
All examples tested in Stata 18 SE. Compatible with Stata 15+.
Quick Answer
    // Option 1: Replace if variable exists
    replace log_wage = log(wage)

    // Option 2: Drop first, then generate
    capture drop log_wage
    gen log_wage = log(wage)

    // Option 3: Start fresh by reloading the data
    use analysis_data.dta, clear
    gen log_wage = log(wage)

gen vs. replace: The Core Distinction
Stata enforces a strict separation between creating and modifying variables:
| Command | Variable must... | If not... |
|---|---|---|
| gen | NOT exist | r(110) — already defined |
| replace | ALREADY exist | r(111) — not found |
This is intentional. Stata prevents you from accidentally overwriting variables. In interactive use, it’s a safety net. In do-files you run repeatedly, it’s an annoyance you need to handle.
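To see both error codes without stopping a do-file, here is a minimal sketch. It assumes the auto example dataset that ships with Stata (not a dataset used in this guide), and newvar is a hypothetical name. It relies on capture storing the return code in _rc.

```stata
sysuse auto, clear          // assumption: Stata's bundled example data

capture gen price = 1       // price already exists, so gen refuses
display _rc                 // 110

capture replace newvar = 1  // newvar does not exist, so replace refuses
display _rc                 // 111
```

Neither command changes the data: price keeps its original values and newvar is never created.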
Pattern 1: Use replace When Modifying Values
If the variable exists and you want to change its values, use replace:
    // First run: create the variable
    gen treatment_post = treatment * post_period

    // Second run: update values (maybe you changed the definition)
    replace treatment_post = treatment * post_period

    // Replace with conditional
    replace wage = . if wage < 0               // Set negative wages to missing
    replace education = 16 if education > 16   // Top-code education

replace preserves variable labels, value labels, and notes. If you use drop + gen, all metadata is lost.

Pattern 2: capture drop + gen (The Do-File Pattern)
The most common pattern for do-files that run repeatedly:
    // capture suppresses the error if the variable doesn't exist
    // drop removes the variable if it does exist
    // gen creates it fresh

    capture drop log_wage
    gen log_wage = log(wage)

    capture drop age_sq
    gen age_sq = age^2

    capture drop treatment_post
    gen treatment_post = treatment * post_period

The capture prefix tells Stata: “Run this command. If it throws an error, ignore it and continue.” If log_wage doesn’t exist yet, drop log_wage would fail, but capture swallows the error.
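capture does not discard the error entirely: it stores the return code in _rc, which you can inspect on the next line. The sketch below assumes the auto example dataset that ships with Stata and a hypothetical variable log_price. The final check is one possible way to let only "variable not found" pass; it is not the only way to do it.

```stata
sysuse auto, clear              // assumption: Stata's bundled example data

capture drop log_price          // nothing to drop yet: error suppressed
display _rc                     // 111 (variable not found)
gen log_price = log(price)

capture drop log_price          // now it exists: dropped normally
display _rc                     // 0
gen log_price = log(price)      // no r(110) on this "second run"

// Optional: re-raise anything other than success or "not found"
capture drop log_price
if !inlist(_rc, 0, 111) {
    error _rc
}
```

After the last capture drop, log_price is gone again, so a following gen would succeed.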
capture hides ALL errors, not just “variable not found.” Don’t wrap your entire do-file in capture blocks, or you’ll miss real problems. Use it narrowly: capture drop varname.

Pattern 3: Reload the Data
The simplest and safest approach for do-files: reload the original data at the top.
    // Master do-file pattern: always starts clean
    clear all
    set more off

    // Load raw data
    use "data/raw/survey_2024.dta", clear

    // All gen commands work because we started fresh
    gen log_wage = log(wage)
    gen age_sq = age^2
    gen treatment_post = treatment * post_period

    // Save constructed dataset
    save "data/constructed/analysis_sample.dta", replace

No capture drop needed anywhere. This is the most reproducible approach.

Pattern 4: Use tempvar for Intermediate Variables
If you’re creating temporary variables for intermediate calculations, use tempvar. Temporary variables are automatically dropped when the do-file or program ends.
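A minimal sketch of that cleanup behaviour inside a program. It assumes the auto example dataset that ships with Stata, and demean_price is a hypothetical program name. The program can be called repeatedly because the temporary variable is dropped each time it exits.

```stata
sysuse auto, clear              // assumption: Stata's bundled example data

capture program drop demean_price
program define demean_price
    tempvar dev                 // reserves a unique temporary name
    quietly summarize price
    gen `dev' = price - r(mean)
    summarize `dev'
end

demean_price
demean_price                    // second call: no r(110)
describe, short                 // variable count is unchanged
```

The capture program drop line applies the same idea as capture drop, so the program definition itself can also be re-run.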
    // Temporary variables are cleaned up automatically
    tempvar residual predicted
    regress wage education experience
    predict `predicted'
    gen `residual' = wage - `predicted'

    // Use temporary variables for intermediate calculations
    summarize `residual', detail

    // When the do-file ends, these variables disappear
    // No "already defined" errors on next run

Common Mistake: gen with an if Condition
A subtle issue: gen with an if condition creates the variable for ALL observations, setting unmatched observations to missing. Running it twice still fails.
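A short sketch that checks this directly, assuming the auto example dataset that ships with Stata and a hypothetical variable log_price. Even a second gen with a non-overlapping condition fails, because the variable already exists for every observation.

```stata
sysuse auto, clear                           // assumption: Stata's bundled example data

gen log_price = log(price) if foreign == 1

// The variable exists for every row; unmatched rows are missing
assert missing(log_price)  if foreign == 0
assert !missing(log_price) if foreign == 1

// A second gen fails even though the condition does not overlap
capture gen log_price = log(price) if foreign == 0
display _rc                                  // 110

// Filling in the remaining rows is a job for replace
replace log_price = log(price) if foreign == 0
```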
    // Creates log_wage for ALL obs
    // (missing for wage <= 0)
    gen log_wage = log(wage) if wage > 0

    // Second run:
    gen log_wage = log(wage) if wage > 0
    // r(110): already defined!

    // Safe pattern:
    capture drop log_wage
    gen log_wage = log(wage) if wage > 0

    // OR: reload data first
    use mydata.dta, clear
    gen log_wage = log(wage) if wage > 0

Sytra catches these errors before you run.
Sytra tracks which variables exist in your dataset and automatically uses gen for new variables and replace for existing ones. No more r(110) errors.
FAQ
What does “variable already defined” mean in Stata?
It means you used gen to create a variable that already exists in your dataset. gen can only create new variables. To modify an existing variable, use replace.
What is the difference between gen and replace in Stata?
gen creates a new variable that does not yet exist. replace modifies the values of an existing variable. Using gen on an existing variable throws r(110); using replace on a non-existent variable throws r(111).
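If a do-file has to choose between the two at run time, one possible sketch is to test for the variable first with confirm variable. It assumes the auto example dataset that ships with Stata and a hypothetical variable log_price. The exact option is used so that an abbreviation of another variable's name does not count as a match.

```stata
sysuse auto, clear                       // assumption: Stata's bundled example data

capture confirm variable log_price, exact
if _rc == 0 {
    // Variable exists: update it in place
    replace log_price = log(price)
}
else {
    // Variable does not exist: create it
    gen log_price = log(price)
}
```

Running the block a second time takes the replace branch instead of failing with r(110).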
How do I safely overwrite a variable in a do-file?
Use capture drop varname followed by gen varname = expression. The capture prefix suppresses the error if the variable doesn’t exist yet. Or better: reload your data at the top of the do-file with use data.dta, clear.
Should I use capture drop or replace?
Use replace when modifying values of an existing variable (it preserves labels and notes). Use capture drop + gen when recreating a variable from scratch. Best practice: reload data at the top of each do-file so everything starts clean.
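The label difference can be checked with a small sketch, assuming the auto example dataset that ships with Stata. The variable price_k and its label text are hypothetical.

```stata
sysuse auto, clear                        // assumption: Stata's bundled example data

gen price_k = price
label variable price_k "Price in thousands"

// replace changes the values but keeps the label
replace price_k = price / 1000
local lbl : variable label price_k
display "`lbl'"                           // Price in thousands

// drop + gen creates a brand-new variable with no label
capture drop price_k
gen price_k = price / 1000
local lbl : variable label price_k
display "`lbl'"                           // (empty)
```

If you take the drop + gen route, reapply the label variable command (and any notes or value labels) after the gen line.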