Understanding Difference-in-Differences: A Practical Guide Using Stata
Exploring quasi-experimental methods through a health policy case study
stata
regression
difference-in-difference
Author
Nadhira A. Hendra
Published
January 15, 2026
Modified
January 19, 2026
Introduction
This analysis started as a class assignment in Economic Development at Columbia. The task was to think through the experimental design of a study evaluating the effect of mobile health clinics on prenatal care utilization in rural Rajasthan, India.
Unlike my previous post on World Bank indicators (which was purely correlational), this one dives into causal inference. DiD lets us estimate treatment effects when we have observational data before and during the period of intervention.
DiD is one of the most widely used quasi-experimental methods in empirical economics. It is particularly valuable in contexts where a randomized controlled trial may be infeasible. It is also often used in combination with randomization, because the key insight of DiD is that we can observe the change over time in untreated units and thereby adjust for any baseline differences between the control and treatment groups.
In this post I’ll walk through the design, the assumptions it requires, and how to implement it in Stata. This post serves as both a personal reference and, hopefully, a useful resource for others learning causal inference methods.
The Problem
Imagine researchers want to evaluate a mobile health clinic program that began operating in treatment villages in January 2023. The intervention involves sending mobile clinics to randomly selected villages once per week, offering free prenatal checkups, basic medications, and health education to pregnant women.
The researchers collected data on prenatal care visits:
Before: 2022 (pre-treatment)
After: 2023 (post-treatment)
Why Randomize at the Village Level?
A natural question: why not randomize at the individual level? Why assign entire villages to treatment or control?
Since we’re including randomization in this design, one of the assumptions we need is SUTVA (the Stable Unit Treatment Value Assumption), also known as non-interference. If we randomized at the individual level, so that within a single village some women were assigned to treatment and others to control, we’d face several problems:
Spillover effects: If a woman in the treatment group shares information about prenatal benefits with her neighbor in the control group, or if the control woman sees treatment women getting checkups and decides to seek care herself, we’d underestimate the true treatment effect.
Correlated outcomes: Women within the same village are more alike than women across villages (similar cultural norms, access to facilities, socioeconomic conditions). This means observations aren’t truly independent.
Understated standard errors: Our regression assumes each observation contributes independent information. When observations are clustered, we have less effective information than the raw sample size suggests.
By randomizing at the village level, we cleanly separate treatment and control groups and avoid these violations.
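To see the standard-error point concretely, here is a minimal sketch comparing naive and cluster-robust standard errors. It uses the simulated dataset built later in this post; only the clustering option changes between the two regressions.

Code
* Sketch: how clustering changes the standard errors (uses the simulated data created below)
use "did_simulation.dta", clear
generate treat_post = treatment * period

* Naive OLS: treats every woman-period observation as independent
regress prenatal_visits treatment period treat_post

* Cluster-robust: allows outcomes to be correlated within a village
regress prenatal_visits treatment period treat_post, cluster(village_id)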
The DiD Framework
The 2x2 Grid
What I like about DiD is its simplicity. The entire calculation comes down to four group means:
|               | Control Villages | Treatment Villages |
|---------------|------------------|--------------------|
| Before (2022) | \(\bar{Y}_{00}\) | \(\bar{Y}_{10}\)   |
| After (2023)  | \(\bar{Y}_{01}\) | \(\bar{Y}_{11}\)   |
Where:
\(\bar{Y}_{00}\) = average prenatal visits before treatment in control group
\(\bar{Y}_{01}\) = average prenatal visits after treatment in control group
\(\bar{Y}_{10}\) = average prenatal visits before treatment in treatment group
\(\bar{Y}_{11}\) = average prenatal visits after treatment in treatment group
These cell means map onto the coefficients of the standard two-period DiD regression,
\[
Y_{it} = \alpha + \beta\,\text{Treat}_i + \gamma\,\text{Post}_t + \tau\,(\text{Treat}_i \times \text{Post}_t) + \varepsilon_{it}
\]
where:
\(\alpha\): baseline mean for the control group in the pre-period
\(\beta\): time-invariant difference between treatment and control (selection effect)
\(\gamma\): time trend common to both groups (what would have happened anyway)
\(\tau\) (the DiD estimate): the causal effect of mobile clinics on prenatal visits
The coefficient \(\tau\) is what we care about. It captures the additional change in the treatment group beyond what the control group experienced.
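In terms of the 2x2 grid, the same estimate is just a difference of differences: the change in the treatment group minus the change in the control group.
\[
\hat{\tau} = (\bar{Y}_{11} - \bar{Y}_{10}) - (\bar{Y}_{01} - \bar{Y}_{00})
\]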
Exploratory Analysis
Setup
For reference on setting up Stata in R, you can refer here.
Simulating the Data
For illustration, I’ll create a dataset that mimics what researchers might have collected.
Code
clear

* Set seed for reproducibility
set seed 12345

* Create village-level data
set obs 100
generate village_id = _n
generate treatment = (village_id <= 50)

* Expand to woman-level (10 women per village)
expand 10
bysort village_id: generate woman_id = _n

* Expand to panel (before and after)
expand 2
bysort village_id woman_id: generate period = _n - 1
label define period_lbl 0 "Before (2022)" 1 "After (2023)"
label values period period_lbl

* Generate outcome with DiD structure
* Base: 2 visits on average
* Treatment villages slightly lower at baseline: -0.2
* Time trend for everyone: +0.3
* Treatment effect: +0.8 visits
generate prenatal_visits = 2 + ///
    (-0.2) * treatment + ///
    (0.3) * period + ///
    (0.8) * treatment * period + ///
    rnormal(0, 0.8)

* Round to integers (can't have fractional visits)
replace prenatal_visits = round(prenatal_visits, 1)
replace prenatal_visits = 0 if prenatal_visits < 0

save "did_simulation.dta", replace
describe
Number of observations (_N) was 0, now 100.
(900 observations created)
(1,000 observations created)
(2,000 real changes made)
(1 real change made)
file did_simulation.dta saved
Contains data from did_simulation.dta
Observations: 2,000
Variables: 5 19 Jan 2026 07:37
-------------------------------------------------------------------------------------------
Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------------------------------------
village_id float %9.0g
treatment float %9.0g
woman_id float %9.0g
period float %13.0g period_lbl
prenatal_visits float %9.0g
-------------------------------------------------------------------------------------------
Sorted by: village_id woman_id
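Before running the regression, it's useful to look at the four group means directly. A command along the following lines (modern Stata table syntax; the exact call is an assumption on my part) produces the cross-tabulation shown below.

Code
* Group means and standard deviations by treatment status and period
table treatment period, statistic(mean prenatal_visits) statistic(sd prenatal_visits)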
| period
| Before (2022) After (2023) Total
-----------------------+-----------------------------------------
treatment |
0 |
Mean | 2.06 2.332 2.196
Standard deviation | .8589969 .8527781 .8662184
1 |
Mean | 1.81 2.946 2.378
Standard deviation | .831434 .8789933 1.026728
Total |
Mean | 1.935 2.639 2.287
Standard deviation | .8541104 .9184348 .9539843
-----------------------------------------------------------------
Running the DiD Regression
Code
use"did_simulation.dta", clear* Generate interaction termgenerate treat_post = treatment * period* DiD regression with clustered standard errorsregress prenatal_visits treatment period treat_post, cluster(village_id)
Linear regression Number of obs = 2,000
F(3, 99) = 133.22
Prob > F = 0.0000
R-squared = 0.1966
Root MSE = .85572
(Std. err. adjusted for 100 clusters in village_id)
------------------------------------------------------------------------------
| Robust
prenatal_v~s | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
treatment | -.25 .0534681 -4.68 0.000 -.3560923 -.1439077
period | .272 .0542785 5.01 0.000 .1642996 .3797004
treat_post | .864 .080151 10.78 0.000 .704963 1.023037
_cons | 2.06 .038061 54.12 0.000 1.984479 2.135521
------------------------------------------------------------------------------
The coefficient on treat_post is our DiD estimate. In this simulated data, mobile clinics increased prenatal visits by about 0.86 visits, close to the true effect of 0.8 that we built into the simulation.
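We can also recover this by hand from the group means tabulated earlier, taking the change in the treatment group minus the change in the control group:
\[
\hat{\tau} = (2.946 - 1.810) - (2.332 - 2.060) = 1.136 - 0.272 = 0.864
\]
which matches the treat_post coefficient exactly.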
Assumptions
Parallel Trends
Warning: The Key Assumption
DiD requires that without treatment, both groups would have followed the same trend. This is called the parallel trends assumption. It’s untestable in the treatment period, but we can check whether trends were parallel before treatment.
In our example, suppose treatment villages had 2.1 prenatal visits in 2022 while control villages had 2.3. This baseline difference doesn’t invalidate DiD—the method accounts for level differences by looking at changes rather than absolute levels.
What would be concerning is if the groups were on different trajectories before treatment. If we had data from 2020 and 2021, we could test this by:
Visual inspection: Plot average prenatal visits over time for both groups. The lines should be roughly parallel before 2023.
Placebo test: Run DiD on the pre-treatment periods. If we find a “treatment effect” before treatment actually happened, something is wrong.
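Neither check is possible with only the two years simulated above, but with a longer panel, say yearly data from 2020 onward stored in a hypothetical did_pretrends.dta with a year variable, both checks could look roughly like the sketch below.

Code
* Hypothetical pre-trends checks (assumes a longer panel in did_pretrends.dta)
use "did_pretrends.dta", clear

* 1) Visual inspection: plot average prenatal visits by group over time
preserve
collapse (mean) prenatal_visits, by(treatment year)
twoway (connected prenatal_visits year if treatment == 0) ///
       (connected prenatal_visits year if treatment == 1), ///
       legend(order(1 "Control" 2 "Treatment")) ///
       ytitle("Mean prenatal visits") xtitle("Year")
restore

* 2) Placebo test: pretend treatment started in 2021, using only pre-treatment years
keep if inlist(year, 2020, 2021)
generate fake_post = (year == 2021)
generate treat_fakepost = treatment * fake_post

* A significant coefficient on treat_fakepost would signal diverging pre-trends
regress prenatal_visits treatment fake_post treat_fakepost, cluster(village_id)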
Additional Assumptions
Specifically for this case study, we also need to make some additional assumptions:
No spillovers: no spillovers between villages (which we address by randomizing at the village level)
No anticipation effects: women didn't change their behavior before the mobile clinics arrived
Conclusion
Difference-in-Differences is a method where we compare changes over time between groups that did and didn't receive treatment. DiD removes both time-invariant differences between groups (differences between control and treatment that existed before the intervention started) and time trends common to both groups.
The method requires careful attention to the parallel trends assumption, which can't be tested directly but can be assessed by examining pre-treatment data.
Alongside cost and logistics, we also need to think about the unit of randomization; clustering at the village level avoids SUTVA violations.
On a practical note, implementing DiD in Stata is straightforward because in essence it’s just a regression with an interaction term. The harder part is thinking through the assumptions and whether they’re plausible for the case at hand.
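For reference, the same regression can be written with Stata's factor-variable notation, which builds the interaction term automatically; this is a sketch equivalent to the specification used above.

Code
* Equivalent DiD regression using factor-variable notation
use "did_simulation.dta", clear
regress prenatal_visits i.treatment##i.period, vce(cluster village_id)
* The coefficient on treatment#period is the DiD estimate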