Commit 896e704

Merge pull request #141 from igerber/imputation-estimator
Add Borusyak-Jaravel-Spiess (2024) Imputation DiD estimator
2 parents d82e0fb + 1377358 commit 896e704

15 files changed: +5509 -13 lines changed

CLAUDE.md

Lines changed: 12 additions & 0 deletions
@@ -97,6 +97,16 @@ cross-platform compilation - no OpenBLAS or Intel MKL installation required.
   - Alternative to Callaway-Sant'Anna with different weighting scheme
   - Useful robustness check when both estimators agree
 
+- **`diff_diff/imputation.py`** - Borusyak-Jaravel-Spiess imputation DiD estimator:
+  - `ImputationDiD` - Borusyak et al. (2024) efficient imputation estimator for staggered DiD
+  - `ImputationDiDResults` - Results with overall ATT, event study, group effects, pre-trend test
+  - `ImputationBootstrapResults` - Multiplier bootstrap inference results
+  - `imputation_did()` - Convenience function
+  - Steps: (1) OLS on untreated obs for unit+time FE, (2) impute counterfactual Y(0), (3) aggregate
+  - Conservative variance (Theorem 3) with `aux_partition` parameter for SE tightness
+  - Pre-trend test (Equation 9) via `results.pretrend_test()`
+  - Proposition 5: NaN for unidentified long-run horizons without never-treated units
+
 - **`diff_diff/triple_diff.py`** - Triple Difference (DDD) estimator:
   - `TripleDifference` - Ortiz-Villavicencio & Sant'Anna (2025) estimator for DDD designs
   - `TripleDifferenceResults` - Results with ATT, SEs, cell means, diagnostics
@@ -255,6 +265,7 @@ cross-platform compilation - no OpenBLAS or Intel MKL installation required.
 Standalone estimators (each has own get_params/set_params):
 ├── CallawaySantAnna
 ├── SunAbraham
+├── ImputationDiD
 ├── TripleDifference
 ├── TROP
 ├── SyntheticDiD
@@ -364,6 +375,7 @@ Tests mirror the source modules:
 - `tests/test_estimators.py` - Tests for DifferenceInDifferences, TWFE, MultiPeriodDiD, SyntheticDiD
 - `tests/test_staggered.py` - Tests for CallawaySantAnna
 - `tests/test_sun_abraham.py` - Tests for SunAbraham interaction-weighted estimator
+- `tests/test_imputation.py` - Tests for ImputationDiD (Borusyak et al. 2024) estimator
 - `tests/test_triple_diff.py` - Tests for Triple Difference (DDD) estimator
 - `tests/test_trop.py` - Tests for Triply Robust Panel (TROP) estimator
 - `tests/test_bacon.py` - Tests for Goodman-Bacon decomposition
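
The three steps listed for `diff_diff/imputation.py` in the CLAUDE.md diff (OLS on untreated observations for unit and time fixed effects, impute Y(0), aggregate) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in `diff_diff/imputation.py`: the helper name `impute_att`, the row layout, and the use of alternating group-mean updates in place of a direct OLS solve are all choices made for this example. It ignores covariates, anticipation, and inference.

```python
def impute_att(rows, n_iter=200):
    """Illustrative imputation ATT (hypothetical helper, not the library API).

    rows: list of dicts with keys "unit", "time", "y", "first_treat",
    where first_treat == 0 marks a never-treated unit.
    """
    def is_treated(r):
        return r["first_treat"] > 0 and r["time"] >= r["first_treat"]

    untreated = [r for r in rows if not is_treated(r)]
    treated = [r for r in rows if is_treated(r)]

    # Step 1: fit y = alpha[unit] + beta[time] on untreated observations only.
    # Alternating group-mean updates converge to the least-squares fit.
    alpha = {r["unit"]: 0.0 for r in untreated}
    beta = {r["time"]: 0.0 for r in untreated}
    for _ in range(n_iter):
        for effects, key, other, other_key in (
            (alpha, "unit", beta, "time"),
            (beta, "time", alpha, "unit"),
        ):
            sums = dict.fromkeys(effects, 0.0)
            counts = dict.fromkeys(effects, 0)
            for r in untreated:
                sums[r[key]] += r["y"] - other[r[other_key]]
                counts[r[key]] += 1
            for k in effects:
                effects[k] = sums[k] / counts[k]

    # Step 2: impute the untreated potential outcome Y(0) for treated rows.
    # A KeyError here means a unit or period has no untreated observations.
    taus = [r["y"] - (alpha[r["unit"]] + beta[r["time"]]) for r in treated]

    # Step 3: aggregate observation-level effects into an overall ATT.
    return sum(taus) / len(taus), taus


# Made-up noiseless panel: additive unit and time effects plus a constant
# treatment effect of 2.0 once a unit is treated.
unit_fe = {1: 1.0, 2: 2.0, 3: 3.0}
first_treat = {1: 3, 2: 4, 3: 0}
rows = [
    {"unit": u, "time": t, "first_treat": g,
     "y": unit_fe[u] + 0.5 * t + (2.0 if g and t >= g else 0.0)}
    for u, g in first_treat.items()
    for t in (1, 2, 3, 4)
]
att, taus = impute_att(rows)
print(round(att, 6))
```

On this synthetic panel the sketch should recover an ATT of 2.0 up to floating-point error, because the untreated outcomes are exactly additive in unit and time effects.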

README.md

Lines changed: 111 additions & 1 deletion
@@ -70,7 +70,7 @@ Signif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1
 - **Wild cluster bootstrap**: Valid inference with few clusters (<50) using Rademacher, Webb, or Mammen weights
 - **Panel data support**: Two-way fixed effects estimator for panel designs
 - **Multi-period analysis**: Event-study style DiD with period-specific treatment effects
-- **Staggered adoption**: Callaway-Sant'Anna (2021) and Sun-Abraham (2021) estimators for heterogeneous treatment timing
+- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), and Borusyak-Jaravel-Spiess (2024) imputation estimators for heterogeneous treatment timing
 - **Triple Difference (DDD)**: Ortiz-Villavicencio & Sant'Anna (2025) estimators with proper covariate handling
 - **Synthetic DiD**: Combined DiD with synthetic control for improved robustness
 - **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 2025)
@@ -879,6 +879,54 @@ print(f"Sun-Abraham ATT: {sa_results.overall_att:.3f}")
 # If results differ substantially, investigate heterogeneity
 ```
 
+### Borusyak-Jaravel-Spiess Imputation Estimator
+
+The Borusyak et al. (2024) imputation estimator is the **efficient** estimator for staggered DiD under parallel trends, producing ~50% shorter confidence intervals than Callaway-Sant'Anna and 2-3.5x shorter than Sun-Abraham under homogeneous treatment effects.
+
+```python
+from diff_diff import ImputationDiD, imputation_did
+
+# Basic usage
+est = ImputationDiD()
+results = est.fit(data, outcome='outcome', unit='unit',
+                  time='period', first_treat='first_treat')
+results.print_summary()
+
+# Event study
+results = est.fit(data, outcome='outcome', unit='unit',
+                  time='period', first_treat='first_treat',
+                  aggregate='event_study')
+
+# Pre-trend test (Equation 9)
+pt = results.pretrend_test(n_leads=3)
+print(f"F-stat: {pt['f_stat']:.3f}, p-value: {pt['p_value']:.4f}")
+
+# Convenience function
+results = imputation_did(data, 'outcome', 'unit', 'period', 'first_treat',
+                         aggregate='all')
+```
+
+```python
+ImputationDiD(
+    anticipation=0,  # Number of anticipation periods
+    alpha=0.05,  # Significance level
+    cluster=None,  # Cluster variable (defaults to unit)
+    n_bootstrap=0,  # Bootstrap iterations (0=analytical inference)
+    seed=None,  # Random seed
+    horizon_max=None,  # Max event-study horizon
+    aux_partition="cohort_horizon",  # Variance partition: "cohort_horizon", "cohort", "horizon"
+)
+```
+
+**When to use Imputation DiD vs Callaway-Sant'Anna:**
+
+| Aspect | Imputation DiD | Callaway-Sant'Anna |
+|--------|---------------|-------------------|
+| Efficiency | Most efficient under homogeneous effects | Less efficient but more robust to heterogeneity |
+| Control group | Always uses all untreated obs | Choice of never-treated or not-yet-treated |
+| Inference | Conservative variance (Theorem 3) | Multiplier bootstrap |
+| Pre-trends | Built-in F-test (Equation 9) | Separate testing |
+
 ### Triple Difference (DDD)
 
 Triple Difference (DDD) is used when treatment requires satisfying two criteria: belonging to a treated **group** AND being in an eligible **partition**. The `TripleDifference` class implements the methodology from Ortiz-Villavicencio & Sant'Anna (2025), which correctly handles covariate adjustment (unlike naive implementations).
@@ -2000,6 +2048,60 @@ SunAbraham(
 | `print_summary(alpha)` | Print summary to stdout |
 | `to_dataframe(level)` | Convert to DataFrame ('event_study' or 'cohort') |
 
+### ImputationDiD
+
+```python
+ImputationDiD(
+    anticipation=0,  # Periods of anticipation effects
+    alpha=0.05,  # Significance level for CIs
+    cluster=None,  # Column for cluster-robust SEs
+    n_bootstrap=0,  # Bootstrap iterations (0 = analytical)
+    seed=None,  # Random seed
+    rank_deficient_action='warn',  # 'warn', 'error', or 'silent'
+    horizon_max=None,  # Max event-study horizon
+    aux_partition='cohort_horizon',  # Variance partition
+)
+```
+
+**fit() Parameters:**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `data` | DataFrame | Panel data |
+| `outcome` | str | Outcome variable column name |
+| `unit` | str | Unit identifier column |
+| `time` | str | Time period column |
+| `first_treat` | str | First treatment period column (0 for never-treated) |
+| `covariates` | list | Covariate column names |
+| `aggregate` | str | Aggregation: None, "event_study", "group", "all" |
+| `balance_e` | int | Balance event study to this many pre-treatment periods |
+
+### ImputationDiDResults
+
+**Attributes:**
+
+| Attribute | Description |
+|-----------|-------------|
+| `overall_att` | Overall average treatment effect on the treated |
+| `overall_se` | Standard error (conservative, Theorem 3) |
+| `overall_t_stat` | T-statistic |
+| `overall_p_value` | P-value for H0: ATT = 0 |
+| `overall_conf_int` | Confidence interval |
+| `event_study_effects` | Dict of relative time -> effect dict (if `aggregate='event_study'` or `'all'`) |
+| `group_effects` | Dict of cohort -> effect dict (if `aggregate='group'` or `'all'`) |
+| `treatment_effects` | DataFrame of unit-level imputed treatment effects |
+| `n_treated_obs` | Number of treated observations |
+| `n_untreated_obs` | Number of untreated observations |
+
+**Methods:**
+
+| Method | Description |
+|--------|-------------|
+| `summary(alpha)` | Get formatted summary string |
+| `print_summary(alpha)` | Print summary to stdout |
+| `to_dataframe(level)` | Convert to DataFrame ('observation', 'event_study', 'group') |
+| `pretrend_test(n_leads)` | Run pre-trend F-test (Equation 9) |
+
 ### TripleDifference
 
 ```python
@@ -2464,6 +2566,14 @@ The `HonestDiD` module implements sensitivity analysis methods for relaxing the
 
 ### Multi-Period and Staggered Adoption
 
+- **Borusyak, K., Jaravel, X., & Spiess, J. (2024).** "Revisiting Event-Study Designs: Robust and Efficient Estimation." *Review of Economic Studies*, 91(6), 3253-3285. [https://doi.org/10.1093/restud/rdae007](https://doi.org/10.1093/restud/rdae007)
+
+  This paper introduces the imputation estimator implemented in our `ImputationDiD` class:
+  - **Efficient imputation**: OLS on untreated observations → impute counterfactuals → aggregate
+  - **Conservative variance**: Theorem 3 clustered variance estimator with auxiliary model
+  - **Pre-trend test**: Independent of treatment effect estimation (Proposition 9)
+  - **Efficiency gains**: ~50% shorter CIs than Callaway-Sant'Anna under homogeneous effects
+
 - **Callaway, B., & Sant'Anna, P. H. C. (2021).** "Difference-in-Differences with Multiple Time Periods." *Journal of Econometrics*, 225(2), 200-230. [https://doi.org/10.1016/j.jeconom.2020.12.001](https://doi.org/10.1016/j.jeconom.2020.12.001)
 
 - **Sant'Anna, P. H. C., & Zhao, J. (2020).** "Doubly Robust Difference-in-Differences Estimators." *Journal of Econometrics*, 219(1), 101-122. [https://doi.org/10.1016/j.jeconom.2020.06.003](https://doi.org/10.1016/j.jeconom.2020.06.003)
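
The README diff describes `event_study_effects` as a mapping from relative time to an effect. A rough sketch of that aggregation step is given here, assuming observation-level effects are available as `(time, first_treat, tau)` tuples. The helper name `aggregate_by_horizon` and its reading of `horizon_max` (drop horizons beyond the cap) are guesses made for illustration; the library's actual return structure and SE computation are not shown in this diff.

```python
from collections import defaultdict


def aggregate_by_horizon(effects, horizon_max=None):
    """Average observation-level effects by relative time (hypothetical helper).

    effects: iterable of (time, first_treat, tau) for treated observations.
    """
    buckets = defaultdict(list)
    for time, first_treat, tau in effects:
        horizon = time - first_treat  # 0 = first treated period
        if horizon_max is not None and horizon > horizon_max:
            continue  # assumed meaning of horizon_max: ignore later horizons
        buckets[horizon].append(tau)
    # Point estimates only; no standard errors in this sketch.
    return {h: sum(v) / len(v) for h, v in sorted(buckets.items())}


# Made-up effects: two cohorts (first treated in periods 3 and 4).
effects = [(3, 3, 1.0), (4, 3, 2.0), (4, 4, 3.0)]
by_horizon = aggregate_by_horizon(effects)
print(by_horizon)  # {0: 2.0, 1: 2.0}
```

Horizon 0 averages the first treated period of both cohorts, which is why cohorts with different first-period effects (1.0 and 3.0) collapse into a single value.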

ROADMAP.md

Lines changed: 3 additions & 9 deletions
@@ -10,7 +10,7 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).
 
 diff-diff v2.1.1 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
 
-- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Synthetic DiD, Triple Difference (DDD), TROP
+- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP
 - **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
 - **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
 - **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
@@ -24,15 +24,9 @@ diff-diff v2.1.1 is a **production-ready** DiD library with feature parity with
 
 High-value additions building on our existing foundation.
 
-### Borusyak-Jaravel-Spiess Imputation Estimator
+### ~~Borusyak-Jaravel-Spiess Imputation Estimator~~ ✅ Implemented (v2.2)
 
-More efficient than Callaway-Sant'Anna when treatment effects are homogeneous across groups/time. Uses imputation rather than aggregation.
-
-- Imputes untreated potential outcomes using pre-treatment data
-- More efficient under homogeneous effects assumption
-- Can handle unbalanced panels more naturally
-
-**Reference**: Borusyak, Jaravel, and Spiess (2024). *Review of Economic Studies*.
+Implemented as `ImputationDiD` — see `diff_diff/imputation.py`. Includes conservative variance (Theorem 3), event study and group aggregation, pre-trend test (Equation 9), multiplier bootstrap, and Proposition 5 handling for no never-treated units.
 
 ### Gardner's Two-Stage DiD (did2s)
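
The ROADMAP entry mentions Proposition 5 handling for panels with no never-treated units, and the CLAUDE.md diff says unidentified long-run horizons are reported as NaN. A minimal sketch of such a mask is given here. It assumes the cutoff is the gap between the latest and earliest treatment cohorts, with every horizon at or beyond that gap treated as unidentified; this cutoff is an assumption for illustration and is not stated anywhere in the diff. `mask_unidentified` is a hypothetical name.

```python
import math


def mask_unidentified(effects_by_horizon, cohorts, has_never_treated):
    """Set long-run horizons to NaN when no never-treated units exist.

    effects_by_horizon: dict of horizon -> point estimate.
    cohorts: first-treatment periods of the treated cohorts.
    """
    if has_never_treated or not cohorts:
        return dict(effects_by_horizon)
    # Assumed cutoff: once the last cohort is treated, no untreated
    # observations remain to pin down later period effects.
    h_bar = max(cohorts) - min(cohorts)
    return {
        h: (est if h < h_bar else math.nan)
        for h, est in effects_by_horizon.items()
    }


# Made-up estimates for cohorts first treated in periods 3 and 5.
estimates = {0: 1.5, 1: 1.8, 2: 2.1}
masked = mask_unidentified(estimates, cohorts=[3, 5], has_never_treated=False)
unmasked = mask_unidentified(estimates, cohorts=[3, 5], has_never_treated=True)
```

With cohorts 3 and 5 the cutoff is 2, so horizons 0 and 1 keep their estimates and horizon 2 becomes NaN; with never-treated units present nothing is masked.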

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
+#!/usr/bin/env Rscript
+# Benchmark: Imputation DiD Estimator (R `didimputation` package)
+#
+# Compares against diff_diff.ImputationDiD (Borusyak, Jaravel & Spiess 2024).
+#
+# Usage:
+#   Rscript benchmark_didimputation.R --data path/to/data.csv --output path/to/results.json
+
+library(didimputation)
+library(fixest)
+library(jsonlite)
+library(data.table)
+
+# Parse command line arguments
+args <- commandArgs(trailingOnly = TRUE)
+
+parse_args <- function(args) {
+  result <- list(
+    data = NULL,
+    output = NULL
+  )
+
+  i <- 1
+  while (i <= length(args)) {
+    if (args[i] == "--data") {
+      result$data <- args[i + 1]
+      i <- i + 2
+    } else if (args[i] == "--output") {
+      result$output <- args[i + 1]
+      i <- i + 2
+    } else {
+      i <- i + 1
+    }
+  }
+
+  if (is.null(result$data) || is.null(result$output)) {
+    stop("Usage: Rscript benchmark_didimputation.R --data <path> --output <path>")
+  }
+
+  return(result)
+}
+
+config <- parse_args(args)
+
+# Load data
+message(sprintf("Loading data from: %s", config$data))
+data <- fread(config$data)
+
+# Ensure proper column types
+data[, unit := as.integer(unit)]
+data[, time := as.integer(time)]
+
+# R's didimputation package expects first_treat=0 or NA for never-treated units
+# Our Python implementation uses first_treat=0 for never-treated, which matches
+data[, first_treat := as.integer(first_treat)]
+message(sprintf("Never-treated units (first_treat=0): %d", uniqueN(data[first_treat == 0, unit])))
+
+# Determine event study horizons from the data
+# Compute relative time for treated units
+treated_data <- data[first_treat > 0]
+if (nrow(treated_data) > 0) {
+  treated_data[, rel_time := time - first_treat]
+  min_horizon <- min(treated_data$rel_time)
+  max_horizon <- max(treated_data$rel_time)
+  # Post-treatment horizons only (for event study)
+  post_horizons <- sort(unique(treated_data$rel_time[treated_data$rel_time >= 0]))
+  all_horizons <- sort(unique(treated_data$rel_time))
+  message(sprintf("Horizon range: [%d, %d]", min_horizon, max_horizon))
+  message(sprintf("Post-treatment horizons: %s", paste(post_horizons, collapse = ", ")))
+}
+
+# Run benchmark - Overall ATT (static)
+message("Running did_imputation (static)...")
+start_time <- Sys.time()
+
+static_result <- did_imputation(
+  data = data,
+  yname = "outcome",
+  gname = "first_treat",
+  tname = "time",
+  idname = "unit",
+  cluster_var = "unit"
+)
+
+static_time <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))
+message(sprintf("Static estimation completed in %.3f seconds", static_time))
+
+# Extract overall ATT
+overall_att <- static_result$estimate[1]
+overall_se <- static_result$std.error[1]
+message(sprintf("Overall ATT: %.6f (SE: %.6f)", overall_att, overall_se))
+
+# Run benchmark - Event study
+message("Running did_imputation (event study)...")
+es_start_time <- Sys.time()
+
+es_result <- did_imputation(
+  data = data,
+  yname = "outcome",
+  gname = "first_treat",
+  tname = "time",
+  idname = "unit",
+  horizon = TRUE,
+  cluster_var = "unit"
+)
+
+es_time <- as.numeric(difftime(Sys.time(), es_start_time, units = "secs"))
+message(sprintf("Event study estimation completed in %.3f seconds", es_time))
+
+total_time <- static_time + es_time
+
+# Format event study results
+event_study <- data.frame(
+  event_time = as.integer(gsub("tau", "", es_result$term)),
+  att = es_result$estimate,
+  se = es_result$std.error
+)
+
+message("Event study effects:")
+for (i in seq_len(nrow(event_study))) {
+  message(sprintf(" h=%d: ATT=%.4f (SE=%.4f)",
+                  event_study$event_time[i],
+                  event_study$att[i],
+                  event_study$se[i]))
+}
+
+# Format output
+results <- list(
+  estimator = "didimputation::did_imputation",
+
+  # Overall ATT
+  overall_att = overall_att,
+  overall_se = overall_se,
+
+  # Event study
+  event_study = event_study,
+
+  # Timing
+  timing = list(
+    static_seconds = static_time,
+    event_study_seconds = es_time,
+    total_seconds = total_time
+  ),
+
+  # Metadata
+  metadata = list(
+    r_version = R.version.string,
+    didimputation_version = as.character(packageVersion("didimputation")),
+    n_units = length(unique(data$unit)),
+    n_periods = length(unique(data$time)),
+    n_obs = nrow(data)
+  )
+)
+
+# Write output
+message(sprintf("Writing results to: %s", config$output))
+dir.create(dirname(config$output), recursive = TRUE, showWarnings = FALSE)
+write_json(results, config$output, auto_unbox = TRUE, pretty = TRUE, digits = 10)
+
+message(sprintf("Completed in %.3f seconds", total_time))
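
The R script writes `overall_att`, `overall_se`, and an `event_study` data frame to JSON. One way the Python side might consume that file is sketched below. The helper `compare_with_r` is hypothetical, the numbers are made up, and the layout of `event_study` assumes jsonlite's default row-wise encoding of data frames (a list of objects, one per row).

```python
import json


def compare_with_r(r_json_text, py_att, py_se):
    """Relative differences between Python estimates and the R benchmark output.

    Hypothetical helper; assumes the R reference values are non-zero.
    """
    r = json.loads(r_json_text)

    def rel_diff(py_value, r_value):
        return abs(py_value - r_value) / abs(r_value)

    return {
        "att_rel_diff": rel_diff(py_att, r["overall_att"]),
        "se_rel_diff": rel_diff(py_se, r["overall_se"]),
        "r_event_times": [row["event_time"] for row in r["event_study"]],
    }


# Made-up benchmark output in the shape the R script is expected to write.
example = """
{
  "estimator": "didimputation::did_imputation",
  "overall_att": 2.0,
  "overall_se": 0.1,
  "event_study": [
    {"event_time": 0, "att": 1.9, "se": 0.12},
    {"event_time": 1, "att": 2.1, "se": 0.13}
  ]
}
"""
report = compare_with_r(example, py_att=2.002, py_se=0.101)
print(report["r_event_times"])  # [0, 1]
```

A benchmark harness would typically assert that both relative differences fall below a chosen tolerance; the tolerance itself is a project decision not specified in this commit.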

benchmarks/R/requirements.R

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@
 required_packages <- c(
   # Core DiD packages
   "did", # Callaway-Sant'Anna (2021) staggered DiD
+  "didimputation", # Borusyak, Jaravel & Spiess (2024) imputation DiD
   "HonestDiD", # Rambachan & Roth (2023) sensitivity analysis
   "fixest", # Fast TWFE and basic DiD
 