Is Sytra free for researchers?

Yes. Sytra is free forever for individual researchers. You bring your own API key from OpenAI or Anthropic and pay only for the AI inference costs (typically $0.01-0.10 per query).

Does Sytra upload my data to the cloud?

No. Sytra runs entirely on your local machine. Your .dta files, .csv files, and code never leave your computer. Only the natural language prompt is sent to the AI provider.

What versions of Stata does Sytra support?

Sytra supports Stata 17 and later, including MP, SE, and BE editions.

Data Management

2026-02-1711 min read

Importing Data into Stata: Excel, CSV, Fixed-Width, SAS, and SPSS

How to get data into Stata from every format — with the exact import syntax, encoding options, and gotchas that waste 30 minutes.

Sytra Team

Research Engineering Team, Sytra AI

Your import worked yesterday, but today all IDs are shifted, dates are malformed, and Unicode characters are broken.

You will build a robust import-export block that protects schema, encoding, and key fields.

All examples tested in Stata 18 SE. Compatible with Stata 15+.

Quick Answer

Import raw files once and save clean `.dta` staging datasets.
Specify encoding and string handling options explicitly.
Check key variables and date fields immediately after import.
Export only finalized outputs with controlled formats.

Treat I/O as a Controlled Boundary in Your Pipeline

Import Excel and CSV with explicit typing checks

Raw file ingest is where many silent data-type errors enter applied projects. Stabilize import commands with explicit options and checks.

After import, inspect variable types and key uniqueness before any transformation.

If you are extending this pipeline, also review How to Structure a Stata Project and Clustered Standard Errors in Stata.

import-main.do

stata

1clear all
2
3capture mkdir "raw"
4capture mkdir "build"
5capture mkdir "exports"
6
7* Create a reproducible CSV source file
8set obs 300
9gen firm_id = "F" + string(ceil(_n/3), "%03.0f")
10gen year = 2015 + mod(_n,8)
11gen wage = 14 + rnormal(0,2)
12gen education = 8 + floor(runiform()*10)
13export delimited using "raw/worker_panel.csv", replace
14
15* Create a reproducible Excel source file
16preserve
17clear
18set obs 100
19gen firm_id = "F" + string(_n, "%03.0f")
20gen year = 2015 + mod(_n,8)
21gen firm_size = 50 + floor(runiform()*500)
22export excel using "raw/firm_covariates.xlsx", firstrow(variables) replace
23restore
24
25* Import CSV with explicit settings
26import delimited using "raw/worker_panel.csv", clear varnames(1) encoding("UTF-8")
27
28describe firm_id year wage education
29count if missing(firm_id) | missing(year)
30
31* Save staging dataset
32save "build/worker_panel_stage.dta", replace
33
34* Example Excel import with first row names
35import excel "raw/firm_covariates.xlsx", sheet("Sheet1") firstrow clear
36save "build/firm_covariates_stage.dta", replace

. describe firm_id year wage education

              storage   display    value
variable name   type    format     label
---------------------------------------------
firm_id         str8    %9s
year            int     %8.0g
wage            float   %9.0g
education       byte    %8.0g

💡Stage raw imports

Use a build/staging layer between raw files and analysis. This isolates vendor format changes from modeling scripts.

Export reproducible tables and analysis extracts

Export steps should be deterministic and versioned. Use one script that creates all outbound files from validated staging data.

Keep IDs and date formats explicit before exporting to spreadsheet consumers.

export-main.do

stata

1clear all
2
3capture mkdir "raw"
4capture mkdir "build"
5capture mkdir "exports"
6
7* Create a reproducible CSV source file
8set obs 300
9gen firm_id = "F" + string(ceil(_n/3), "%03.0f")
10gen year = 2015 + mod(_n,8)
11gen wage = 14 + rnormal(0,2)
12gen education = 8 + floor(runiform()*10)
13export delimited using "raw/worker_panel.csv", replace
14
15* Create a reproducible Excel source file
16preserve
17clear
18set obs 100
19gen firm_id = "F" + string(_n, "%03.0f")
20gen year = 2015 + mod(_n,8)
21gen firm_size = 50 + floor(runiform()*500)
22export excel using "raw/firm_covariates.xlsx", firstrow(variables) replace
23restore
24
25* Import CSV with explicit settings
26import delimited using "raw/worker_panel.csv", clear varnames(1) encoding("UTF-8")
27
28describe firm_id year wage education
29count if missing(firm_id) | missing(year)
30
31* Save staging dataset
32save "build/worker_panel_stage.dta", replace
33
34* Example Excel import with first row names
35import excel "raw/firm_covariates.xlsx", sheet("Sheet1") firstrow clear
36save "build/firm_covariates_stage.dta", replace
37
38* ---- Section-specific continuation ----
39use "build/worker_panel_stage.dta", clear
40
41* Keep analysis subset for collaborators
42keep firm_id year wage education
43export delimited using "exports/worker_panel_clean.csv", replace
44
45* Export QA summary to Excel
46collapse (mean) mean_wage=wage mean_edu=education, by(year)
47export excel using "exports/yearly_summary.xlsx", firstrow(variables) replace

. export delimited using "exports/worker_panel_clean.csv", replace

file exports/worker_panel_clean.csv saved

⚠️Avoid manual spreadsheet edits

If exported files are edited manually and then re-imported, you lose reproducibility. Keep source of truth in scripted .do files.

Common Errors and Fixes

"file raw/worker_panel.csv not found"

Stata cannot locate the path relative to current working directory.

Run `pwd`, verify relative path, and standardize project root via `cd` in master do-file.

. import delimited using "worker_panel.csv", clear

file raw/worker_panel.csv not found
r(601);

This causes the error

wrong-way.do

stata

import delimited using "worker_panel.csv", clear

This is the fix

right-way.do

stata

cd "/project/root"
import delimited using "raw/worker_panel.csv", clear

error-fix.do

stata

1pwd
2capture confirm file "raw/worker_panel.csv"
3if _rc {
4    display as error "raw file missing"
5    exit 601
6}
7import delimited using "raw/worker_panel.csv", clear

. capture confirm file "raw/worker_panel.csv"

. capture confirm file "raw/worker_panel.csv"

. import delimited using "raw/worker_panel.csv", clear
(encoding automatically selected: UTF-8)

Command Reference

import delimited / import excel

Stata docs →

Reads external data files into Stata while controlling encoding and variable handling.

import delimited using filename, clear [encoding()] [varnames()]

encoding("UTF-8")Sets character encoding explicitly

varnames(1)Uses first row as variable names

clearReplaces data in memory

allstringImports all columns as strings for strict ID control

How Sytra Handles This

Sytra can generate import blocks with file existence checks, encoding options, and automatic staging-file saves.

A direct natural-language prompt for this exact workflow:

sytra-prompt.txt

bash

Write an import pipeline for worker_panel.csv and firm_covariates.xlsx with UTF-8 encoding checks, key validation for firm_id-year, staging saves to build/, and export of a yearly summary Excel file.

Sytra catches these errors before you run.

Sytra can generate import blocks with file existence checks, encoding options, and automatic staging-file saves.

Join the Waitlist →

FAQ

Why do imported IDs lose leading zeros?

Stata often reads ID columns as numeric. Use `allstring` or convert with display formats before export to preserve leading zeros.

How do I handle encoding problems when importing CSV?

Use the encoding() option in import delimited and verify with sample string checks after import.

Should I import in every do-file run?

Import raw data once into .dta staging files, then load .dta in downstream scripts for faster and reproducible pipelines.

Written by Sytra Team

Research Engineering Team, Sytra AI

We build practical, reproducible workflows for Stata and R teams working on real empirical research pipelines.

#Stata#Import#Export#Data Management

Importing Data into Stata: Excel, CSV, Fixed-Width, SAS, and SPSS

Quick Answer

Treat I/O as a Controlled Boundary in Your Pipeline

Import Excel and CSV with explicit typing checks

Export reproducible tables and analysis extracts

Common Errors and Fixes

"file raw/worker_panel.csv not found"

Command Reference

import delimited / import excel

How Sytra Handles This

Sytra catches these errors before you run.

FAQ

Why do imported IDs lose leading zeros?

How do I handle encoding problems when importing CSV?

Should I import in every do-file run?

Enjoyed this article?

Related Guides

API Data in Stata: Import JSON/CSV Feeds and Build Analysis-Ready Panels

Stata ODBC Connection Guide: Query SQL Databases and Reproducible Extracts

Stata Dates: Formatting, Converting, and Working with Date Variables

Stata String Functions: substr, strpos, regexm, and 30 More with Examples

Stata collapse: How to Aggregate Data with Examples