Methodology
2026-03-06 · 9 min read

Logistic Regression in Stata: Marginal Effects That Actually Make Sense

Odds ratios are confusing. Marginal effects are what you actually want. Here's how to compute and interpret them correctly in Stata.

Sytra Team
Research Engineering Team, Sytra AI

Ask a room of applied researchers what an odds ratio of 1.3 means, and you’ll get three different answers — all of them wrong. Odds ratios are the default output of logistic regression, and they’re one of the most frequently misinterpreted statistics in the social and health sciences.

The fix is margins. Marginal effects tell you what you actually want to know: how much does the probability of the outcome change when the explanatory variable changes by one unit? This guide shows you how to get from logit to interpretable, publishable results.

The Problem with Odds Ratios

When you add the or option to your logit command (logit y x, or), Stata reports odds ratios. An OR of 1.3 means “the odds of y=1 are 30% higher for a one-unit increase in x.” But here’s the problem:

  • Odds ≠ probability. If the baseline probability is 0.01 (1%), an OR of 2, doubling the odds, increases the probability to about 0.02 (2%). If the baseline is 0.50 (50%), the same OR doubles the odds to give a probability of about 0.67 (67%). The same odds ratio has wildly different probability implications depending on the baseline.
  • Non-collapsibility. Odds ratios change when you add or remove covariates, even if the covariates are independent of the treatment. This doesn’t happen with risk ratios or marginal effects. It means you can’t compare ORs across models with different controls.
  • Nobody thinks in odds. When a clinician says “the treatment doubles the risk,” they mean risk (probability), not odds. When a policy paper says “education reduces the probability of unemployment by 5 percentage points,” that’s a marginal effect, not an odds ratio.
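The first point is easy to verify with a few lines of arithmetic. Here is a quick sketch in Python (outside Stata) that converts a baseline probability to odds, applies an odds ratio, and converts back, using the numbers from the bullets above:

```python
def apply_odds_ratio(p, odds_ratio):
    """Scale the odds of baseline probability p by odds_ratio,
    then convert the new odds back to a probability."""
    odds = p / (1 - p)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)

# The same OR of 2 has very different probability implications:
print(apply_odds_ratio(0.01, 2))  # ~0.0198: a rare outcome barely moves
print(apply_odds_ratio(0.50, 2))  # ~0.667: a common outcome moves a lot
```

Same odds ratio, a 1 percentage point change in one case and a 17 point change in the other.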

Basic Logistic Regression in Stata

* Logit model — log-odds coefficients (default)
logit employed education age i.race, vce(robust)
 
* Same model — report odds ratios
logit employed education age i.race, vce(robust) or

Both report the same model. The or option just exponentiates the coefficients. Neither gives you what you probably need for substantive interpretation.

Average Marginal Effects (AME)

The most common approach: compute the marginal effect for each observation at its actual covariate values, then average across all observations.

* Average marginal effects for all variables
logit employed education age i.race, vce(robust)
margins, dydx(*)
 
* Just for education
margins, dydx(education)

The output is on the probability scale. A marginal effect of 0.05 means: a one-unit increase in education is associated with a 5 percentage point increase in the probability of being employed, averaging over the sample distribution of all other covariates.

This is what you report in your paper. Not odds ratios. Not log-odds. Marginal effects in probability terms.
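Under the hood, for a continuous regressor the AME is just the sample average of each observation's own slope on the probability scale, b·p_i·(1−p_i). A numerical sketch with made-up coefficients (not a fitted model) mirrors what margins, dydx() computes:

```python
import numpy as np

# Hypothetical logit coefficients -- for illustration only, not estimates.
b0, b1 = -0.5, 0.8

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)                  # a continuous regressor

p = 1 / (1 + np.exp(-(b0 + b1 * x)))        # predicted probability per observation
ame = np.mean(b1 * p * (1 - p))             # average marginal effect of x

print(ame)  # each observation's probability-scale slope, averaged
```

Note that the slope b1·p·(1−p) is capped at b1/4, which is why logit marginal effects are always smaller in magnitude than the raw coefficient.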


Marginal Effects at the Mean (MEM)

* Marginal effects evaluated at the mean of covariates
margins, dydx(*) atmeans

MEM computes the marginal effect for a hypothetical “average person” — someone with the mean value of every covariate. It’s conceptually simpler but has a problem: the mean of a binary variable (e.g., female = 0.52) doesn’t correspond to any real person. AME is generally preferred.
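The AME/MEM distinction is not just philosophical: with a skewed covariate, the two summaries can diverge noticeably. A toy illustration, again with invented coefficients:

```python
import numpy as np

# Invented coefficients and a deliberately skewed regressor.
b0, b1 = -2.0, 1.0
x = np.concatenate([np.zeros(900), np.full(100, 5.0)])

# AME: average each observation's own slope.
p = 1 / (1 + np.exp(-(b0 + b1 * x)))
ame = np.mean(b1 * p * (1 - p))

# MEM: slope evaluated at the (not-very-representative) mean of x.
p_bar = 1 / (1 + np.exp(-(b0 + b1 * x.mean())))
mem = b1 * p_bar * (1 - p_bar)

print(ame, mem)  # ~0.099 vs ~0.149 -- same model, different summaries
```

The "average person" at x = 0.5 does not exist in this sample, and the slope at that point is half again as large as the average of the actual slopes.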

Margins for Categorical Variables

* Predicted probability by race category
margins race
 
* Contrast: difference in predicted probability vs. the base category
margins r.race, contrast(nowald)
 
* Margins plot
marginsplot, title("Predicted Probability by Race")

For categorical variables, margins gives you the predicted probability at each level, holding other variables at their observed values (or means, if you specify atmeans). The contrast option tests whether the differences between levels are statistically significant.
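What margins is doing for a categorical variable is counterfactual averaging: set every observation to one level, predict, average, then repeat for each level. A Python sketch with invented coefficients makes the mechanics explicit:

```python
import numpy as np

# Invented coefficients for illustration: intercept, age slope,
# and a log-odds offset for each race category (0 is the base).
b0, b_age = -1.0, 0.02
race_offsets = {0: 0.0, 1: 0.6, 2: -0.4}

rng = np.random.default_rng(1)
age = rng.uniform(20, 60, size=500)          # observed covariate values

avg_p = {}
for level, offset in race_offsets.items():
    # Counterfactually assign everyone to this level; keep age as observed.
    p = 1 / (1 + np.exp(-(b0 + offset + b_age * age)))
    avg_p[level] = p.mean()
    print(level, round(avg_p[level], 3))     # predicted probability at this level
```

Each printed number is a population-averaged predicted probability, which is why the differences between levels are directly interpretable in percentage points.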

Interaction Effects

Interaction effects in logit models are notoriously tricky. The coefficient on an interaction term in a logit model does not have the intuitive interpretation that it has in linear models. Ai and Norton (2003) showed that the true interaction effect depends on the level of all covariates.

* Logit with interaction
logit employed c.education##i.female age, vce(robust)
 
* Correct way to interpret the interaction
margins female, dydx(education)
 
* Contrast: is the education effect different by gender?
margins r.female, dydx(education) contrast(nowald)

This gives you the marginal effect of education separately for males and females, and tests whether the two effects differ significantly. Contrasts of marginal effects, not raw interaction coefficients, are the correct way to interpret interactions in nonlinear models.
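Ai and Norton's point can be seen numerically. In the sketch below (invented coefficients), the log-odds model has no interaction term at all, yet the education effect on the probability still differs by gender, simply because the two groups sit at different baseline probabilities:

```python
import numpy as np

# Invented log-odds coefficients with NO interaction term.
b0, b_edu, b_fem = -3.0, 0.3, 1.5
edu = np.arange(8, 21)                       # years of education, 8..20

ames = {}
for female in (0, 1):
    p = 1 / (1 + np.exp(-(b0 + b_edu * edu + b_fem * female)))
    ames[female] = np.mean(b_edu * p * (1 - p))  # education AME, probability scale
    print(female, round(ames[female], 3))        # ~0.049 vs ~0.024
```

A zero interaction coefficient in log-odds does not mean a zero interaction on the probability scale, and the converse holds too, which is why margins-based contrasts are the safe route.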

Common Mistakes

  • Reporting odds ratios as probabilities: “Education increases the probability of employment by 30%” is wrong if the 30% comes from an odds ratio. It means the odds increase by 30%, not the probability.
  • Using mfx instead of margins: mfx is deprecated syntax that evaluates marginal effects at the means by default. Use margins, dydx(*) for AMEs and margins, dydx(*) atmeans for MEMs.
  • Interpreting interaction coefficients directly: In logit, the interaction coefficient is the difference in the log-odds ratio, not the difference in the marginal effect. Use margins to get the correct interaction effect on the probability scale.
  • Forgetting post: If you want to store margins results for a table, use margins ..., post. This replaces the estimation results with the margins results, allowing you to use esttab to export them.
#Logistic Regression  #Stata  #Public Health
