Stata Errors
2026-02-08 · 10 min read

Stata 'Convergence Not Achieved': Causes and Solutions for ML Estimation

Your logit, probit, or MLE model won't converge. Here's why — separation, multicollinearity, bad starting values — and the options that fix it.

Sytra Team
Research Engineering Team, Sytra AI

Your logit, probit, or maximum likelihood model starts iterating. The log-likelihood values bounce around. Then Stata gives up:

. logit outcome treatment age income i.state, robust
Iteration 0:   log pseudolikelihood = -24831.204
Iteration 1:   log pseudolikelihood = -18422.891
Iteration 2:   log pseudolikelihood = -17956.332
...
Iteration 25:  log pseudolikelihood = -17901.445
convergence not achieved
r(430);

“Convergence not achieved” means Stata’s optimization algorithm could not find stable parameter estimates. The log-likelihood function never settled at a maximum within the allowed number of iterations. This guide covers every common cause and the specific fix for each.

All examples tested in Stata 18 SE. Compatible with Stata 15+.


Quick Answer

Convergence failure in ML estimation has a few main causes:

  1. Separation (perfect prediction) — by far the most common cause in logit/probit
  2. Multicollinearity — near-identical predictors confuse the optimizer
  3. Too many parameters for the sample size
  4. Bad starting values — the optimizer starts too far from the solution
  5. Flat or irregular likelihood surface

The fastest diagnostic: check for separation first, then try the difficult option, then simplify the model.


What Convergence Means in Maximum Likelihood

Maximum likelihood estimation (MLE) is iterative. Stata starts with initial parameter guesses, evaluates the log-likelihood function, adjusts the parameters to increase the likelihood, and repeats. “Convergence” means the parameters stopped changing — the algorithm found a stable maximum.

When convergence fails, it means:

  • The parameters keep changing between iterations (no stable maximum exists)
  • One or more parameters are drifting toward infinity
  • The algorithm is oscillating between two regions
  • The likelihood surface is too flat for the algorithm to find a direction
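You can watch these failure modes directly in the optimizer's output. A minimal sketch using Stata's standard maximize options (variable names follow the examples in this guide):

watch-optimizer.do
stata
* trace prints the coefficient vector at each iteration;
* gradient prints the gradient vector alongside it
logit outcome treatment age income, trace gradient

* The convergence criteria themselves are maximize options.
* Loosening them hides problems rather than fixing them — use with care.
logit outcome treatment age income, tolerance(1e-4) ltolerance(1e-7)

If `trace` shows one coefficient growing without bound, suspect separation; if coefficients oscillate between two sets of values, suspect a ridge in the likelihood from collinearity.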

Cause 1: Separation (Perfect Prediction) in Logit/Probit

This is the most common cause of convergence failure in binary outcome models. Separation occurs when a predictor (or combination of predictors) perfectly predicts the outcome. The MLE coefficient on the separating variable drifts toward infinity, so the log-likelihood never reaches a finite maximum and the algorithm cannot converge.

separation-example.do
stata
* Example: rare disease with strong predictor
* All patients with biomarker > 100 have the disease
logit disease biomarker age gender    // convergence not achieved

* Diagnose: check for perfect prediction
tab disease if biomarker > 100        // all 1s
tab disease if biomarker <= 100       // mix of 0s and 1s
. tab disease if biomarker > 100
    disease |      Freq.     Percent        Cum.
────────────┼───────────────────────────────────
          1 |        847      100.00      100.00
────────────┼───────────────────────────────────
      Total |        847      100.00

Fixes for separation

separation-fixes.do
stata
* 1. Remove the separating variable
logit disease age gender, robust

* 2. Use Firth's penalized likelihood (reduces bias from separation)
firthlogit disease biomarker age gender

* 3. Categorize the continuous predictor to break separation
gen bio_cat = irecode(biomarker, 25, 50, 75, 100)
logit disease i.bio_cat age gender

* 4. Use Bayesian estimation (puts a prior on the coefficient)
* bayes: logit disease biomarker age gender
💡Detecting separation
Run ssc install firthlogit to get Firth’s penalized likelihood, which handles separation gracefully. You can also fit the model with logit, iterate(100) and watch the iteration log — if one coefficient keeps growing without bound, that variable is the separating one.
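One way to put that tip into practice: cap the iterations, then inspect the non-converged estimates for a runaway coefficient. A sketch, reusing the variable names from the example above:

detect-separation.do
stata
* firthlogit is community-contributed; install it from SSC
ssc install firthlogit

* Cap the run so Stata stops early and still reports estimates
logit disease biomarker age gender, iterate(20)

* A coefficient with huge magnitude flags the separated variable
matrix list e(b)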

Cause 2: Multicollinearity

When two or more predictors are highly correlated, the likelihood surface becomes a ridge with no clear maximum, and the optimizer wanders along the ridge without converging. (Stata automatically omits exactly collinear variables; it is near-perfect collinearity that stalls the optimizer.)

collinearity.do
stata
* income_thousands and income_dollars measure the same quantity
logit outcome income_thousands income_dollars age
// convergence not achieved — near-perfect collinearity

* Diagnose collinearity
correlate income_thousands income_dollars
* r ≈ 1 — effectively collinear

* Fix: drop one of the collinear variables
logit outcome income_thousands age, robust

* For near-collinearity, check VIF after OLS
regress outcome income_thousands income_dollars age
vif
. vif
    Variable |       VIF       1/VIF
─────────────┼──────────────────────
income_tho~s |  99999.99    0.000010
income_dol~s |  99999.99    0.000010
         age |      1.02    0.980392
─────────────┼──────────────────────
    Mean VIF |  66667.00
⚠️Rule of thumb
VIF > 10 suggests problematic multicollinearity. VIF > 100 almost certainly causes convergence issues in MLE. Drop or combine the collinear variables.

Cause 3: Too Many Parameters for the Sample Size

ML estimation requires enough observations per estimated parameter. A logit model with 50 dummy variables and 200 observations is asking the optimizer to find a needle in a very high-dimensional haystack.

too-many-params.do
stata
* 200 observations, 48 state dummies + 3 covariates
logit outcome treatment age income i.state, robust
// convergence not achieved — ~52 parameters from 200 observations

* Fix: reduce the number of parameters
* Option 1: fewer fixed effects
logit outcome treatment age income i.region, robust

* Option 2: penalized likelihood
firthlogit outcome treatment age income i.state

* Option 3: conditional logit (for grouped data)
clogit outcome treatment age income, group(state)
💡Rule of thumb
For logit/probit, you need roughly 10-20 events (cases where outcome = 1) per estimated parameter. With rare outcomes, you run out of events quickly as you add dummies.
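A quick back-of-the-envelope check along these lines (a sketch; outcome is the binary dependent variable from the examples above):

check-epv.do
stata
* Count events and translate the 10-per-parameter rule into a budget
count if outcome == 1
display "Events: " r(N)
display "Rough parameter budget at 10 EPV: " floor(r(N)/10)

If your model estimates more parameters than that budget, thin the specification before fighting the optimizer.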

Cause 4: Bad Starting Values

Stata picks starting values automatically, usually from a simplified version of the model. If the default starting values are far from the true maximum, the optimizer may get stuck in a flat region or diverge.

starting-values.do
stata
* Provide your own starting values with from()
* First, estimate a simpler model
logit outcome treatment age, robust
matrix b0 = e(b)

* Use those estimates as starting values for the full model
logit outcome treatment age income education, from(b0) robust

* Or provide specific values by name
logit outcome treatment age income, ///
    from(treatment=0.5 age=0.02 income=0.001 _cons=-2) robust

Cause 5: The iterate() and difficult Options

Sometimes convergence is simply slow — the algorithm is making progress but needs more iterations than the default maximum (300 in recent versions of Stata, controlled by set maxiter).

ML estimation options

Options that control the convergence behavior of maximum likelihood estimation commands.

logit y x, [iterate(#)] [difficult] [from(matname)] [technique(algo)]
  iterate(#)     Maximum number of iterations (default varies by command)
  difficult      Use a more robust but slower optimization algorithm
  from()         Provide starting values
  technique()    Optimization algorithm: nr, bhhh, dfp, bfgs
convergence-options.do
stata
* Increase max iterations
logit outcome treatment age income, iterate(100) robust

* Use the difficult option (slower but more robust algorithm)
logit outcome treatment age income, difficult robust

* Combine both
logit outcome treatment age income, iterate(200) difficult robust

* Try a different optimization algorithm
logit outcome treatment age income, technique(bfgs) robust

* Hybrid: start with one algorithm, switch to another
logit outcome treatment age income, technique(bfgs 10 nr 20) robust
👁When iterate() won't help
If the model hasn’t converged after 100 iterations and the log-likelihood is barely changing, more iterations won’t help. The problem is structural — go back and check for separation or multicollinearity. iterate() only helps when the algorithm is making steady progress toward convergence.

Cause 6: Small Sample Size

With very small samples, the likelihood surface can be irregular — multiple local maxima, saddle points, or flat regions that confuse the optimizer.

small-sample.do
stata
* With N = 30, logit can struggle
logit outcome treatment age if subsample == 1, robust
// convergence not achieved

* Options for small samples:
* 1. Exact logistic regression
exlogistic outcome treatment age

* 2. Firth's penalized likelihood
firthlogit outcome treatment age

* 3. Linear probability model (OLS — no iterative optimization)
regress outcome treatment age, robust

Simplifying the Model

When nothing else works, simplify. Remove variables one at a time to isolate which predictor is causing the convergence failure. Start with the most complex terms (interactions, polynomials, high-dimensional dummies).

simplify.do
stata
* Full model — doesn't converge
logit outcome treatment age income education ///
    i.state i.year i.industry c.age#c.income, robust

* Step 1: Remove the interaction
logit outcome treatment age income education ///
    i.state i.year i.industry, robust

* Step 2: Remove the smallest FE group
logit outcome treatment age income education ///
    i.state i.year, robust

* Step 3: Continue until it converges
* Then add terms back one at a time to find the culprit

Sytra catches these errors before you run.

Sytra validates your model specification before estimation. It checks for separation, near-collinearity, and events-per-variable ratios — warning you before you hit convergence failure. Describe your analysis and get code that converges on the first try.

Join the Waitlist →

Debugging Checklist

  1. Check for separation. Cross-tabulate the outcome with each predictor. Look for cells with zero counts.
  2. Check for multicollinearity. Run OLS first and check VIF.
  3. Try difficult. Add , difficult to your command.
  4. Increase iterations. Add iterate(100) and watch the iteration log.
  5. Provide starting values. Estimate a simpler model first and use from().
  6. Simplify. Remove variables one at a time until convergence is achieved.
  7. Consider alternatives. Firth logit, exact logistic, or a linear probability model.
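Step 1 of the checklist can be semi-automated. A sketch of a separation scan — the varlist here is hypothetical; substitute the categorical predictors from your own model:

separation-scan.do
stata
* Cross-tab the outcome against each categorical predictor.
* Any cell with zero observations is a candidate for separation.
foreach v of varlist treatment gender region {
    display _newline "=== outcome vs `v' ==="
    tab outcome `v'
}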

FAQ

What does convergence not achieved mean in Stata?

It means Stata’s maximum likelihood optimization algorithm could not find stable parameter estimates. The log-likelihood function did not settle at a maximum within the allowed number of iterations.

How do I fix convergence not achieved in Stata logit?

The most common cause is separation (perfect prediction). Check for variables that perfectly predict the outcome. Other fixes: use difficult, increase iterate(), provide starting values with from(), or simplify the model.

What is separation in logistic regression?

Separation occurs when a predictor perfectly predicts the outcome for a subset of observations. The MLE coefficient for that variable wants to go to infinity, which prevents convergence. Firth’s penalized likelihood (firthlogit) is the standard solution.

Should I just increase iterate() until it converges?

Not blindly. If the log-likelihood is still changing meaningfully between iterations, more iterations may help. If it’s barely moving or oscillating, the problem is structural and more iterations won’t fix it. Check for separation and collinearity first.

Written by Sytra Team
Research Engineering Team, Sytra AI

We build practical, reproducible workflows for Stata and R teams working on real empirical research pipelines.

#Stata #Errors #MLE #Logit #Probit
