11 changes: 11 additions & 0 deletions CLAUDE.md
@@ -111,6 +111,15 @@ cross-platform compilation - no OpenBLAS or Intel MKL installation required.
  - Pre-trend test (Equation 9) via `results.pretrend_test()`
  - Proposition 5: NaN for unidentified long-run horizons without never-treated units

- **`diff_diff/two_stage.py`** - Gardner (2022) Two-Stage DiD estimator:
  - `TwoStageDiD` - Two-stage estimator: (1) estimate unit+time FE on untreated obs, (2) regress residualized outcomes on treatment indicators
  - `TwoStageDiDResults` - Results with overall ATT, event study, group effects, per-observation treatment effects
  - `TwoStageBootstrapResults` - Multiplier bootstrap inference on GMM influence function
  - `two_stage_did()` - Convenience function
  - Point estimates identical to ImputationDiD; different variance estimator (GMM sandwich vs. conservative)
  - Custom `_compute_gmm_variance()`: cannot reuse `compute_robust_vcov()` because correction term uses GLOBAL cross-moment
  - No finite-sample adjustments (raw asymptotic sandwich, matching R `did2s`)

- **`diff_diff/triple_diff.py`** - Triple Difference (DDD) estimator:
  - `TripleDifference` - Ortiz-Villavicencio & Sant'Anna (2025) estimator for DDD designs
  - `TripleDifferenceResults` - Results with ATT, SEs, cell means, diagnostics
@@ -270,6 +279,7 @@ cross-platform compilation - no OpenBLAS or Intel MKL installation required.
├── CallawaySantAnna
├── SunAbraham
├── ImputationDiD
├── TwoStageDiD
├── TripleDifference
├── TROP
├── SyntheticDiD
@@ -381,6 +391,7 @@ Tests mirror the source modules:
- `tests/test_staggered.py` - Tests for CallawaySantAnna
- `tests/test_sun_abraham.py` - Tests for SunAbraham interaction-weighted estimator
- `tests/test_imputation.py` - Tests for ImputationDiD (Borusyak et al. 2024) estimator
- `tests/test_two_stage.py` - Tests for TwoStageDiD (Gardner 2022) estimator, including equivalence tests with ImputationDiD
- `tests/test_triple_diff.py` - Tests for Triple Difference (DDD) estimator
- `tests/test_trop.py` - Tests for Triply Robust Panel (TROP) estimator
- `tests/test_bacon.py` - Tests for Goodman-Bacon decomposition
105 changes: 104 additions & 1 deletion README.md
@@ -70,7 +70,7 @@ Signif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1
- **Wild cluster bootstrap**: Valid inference with few clusters (<50) using Rademacher, Webb, or Mammen weights
- **Panel data support**: Two-way fixed effects estimator for panel designs
- **Multi-period analysis**: Event-study style DiD with period-specific treatment effects
- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), and Borusyak-Jaravel-Spiess (2024) imputation estimators for heterogeneous treatment timing
- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, and Two-Stage DiD (Gardner 2022) estimators for heterogeneous treatment timing
- **Triple Difference (DDD)**: Ortiz-Villavicencio & Sant'Anna (2025) estimators with proper covariate handling
- **Synthetic DiD**: Combined DiD with synthetic control for improved robustness
- **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 2025)
@@ -927,6 +927,53 @@ ImputationDiD(
| Inference | Conservative variance (Theorem 3) | Multiplier bootstrap |
| Pre-trends | Built-in F-test (Equation 9) | Separate testing |

### Two-Stage DiD (Gardner 2022)

Two-Stage DiD addresses TWFE bias in staggered adoption designs by estimating unit and time fixed effects on untreated observations only, then regressing the residualized outcomes on treatment indicators. Point estimates match the Imputation DiD estimator (Borusyak et al. 2024); the key difference is that Two-Stage DiD uses a GMM sandwich variance estimator that accounts for first-stage estimation error, while Imputation DiD uses a conservative variance (Theorem 3).

```python
from diff_diff import TwoStageDiD

# Basic usage
est = TwoStageDiD()
results = est.fit(data, outcome='outcome', unit='unit', time='period', first_treat='first_treat')
results.print_summary()
```

**Event study:**

```python
from diff_diff import plot_event_study

# Event study aggregation with visualization
results = est.fit(data, outcome='outcome', unit='unit', time='period',
                  first_treat='first_treat', aggregate='event_study')
plot_event_study(results)
```

**Parameters:**

```python
TwoStageDiD(
    anticipation=0,                # Periods of anticipation effects
    alpha=0.05,                    # Significance level for CIs
    cluster=None,                  # Column for cluster-robust SEs (defaults to unit)
    n_bootstrap=0,                 # Bootstrap iterations (0 = analytical GMM SEs)
    seed=None,                     # Random seed
    rank_deficient_action='warn',  # 'warn', 'error', or 'silent'
    horizon_max=None,              # Max event-study horizon
)
```
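
For intuition about the `n_bootstrap` option above, the following is a minimal standard-library sketch of a multiplier bootstrap over cluster-level influence contributions. The function name `multiplier_bootstrap_se`, the choice of Rademacher weights, and the example `scores` are illustrative assumptions, not the library's internals; `TwoStageDiD` may use different weights or scaling.

```python
import math
import random


def multiplier_bootstrap_se(cluster_scores, n_bootstrap=2000, seed=0):
    """Bootstrap SE of an estimator from cluster-level influence contributions.

    Hypothetical helper for illustration only; not part of diff_diff.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_bootstrap):
        # One Rademacher weight (+1 or -1) per cluster, redrawn each iteration.
        draws.append(sum(rng.choice((-1.0, 1.0)) * s for s in cluster_scores))
    mean = sum(draws) / n_bootstrap
    var = sum((d - mean) ** 2 for d in draws) / (n_bootstrap - 1)
    return math.sqrt(var)


# Made-up cluster-level influence contributions, assumed to be scaled so that
# their sum of squares approximates the estimator's variance.
scores = [0.12, -0.05, 0.08, -0.10, 0.03, -0.07, 0.09, -0.02]

se_boot = multiplier_bootstrap_se(scores)
se_analytic = math.sqrt(sum(s * s for s in scores))
print(f"bootstrap SE: {se_boot:.4f}, analytic SE: {se_analytic:.4f}")
```

With Rademacher weights the variance of the perturbed sum equals the sum of squared contributions, so the bootstrap and analytic values should agree up to simulation noise.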

**When to use Two-Stage DiD vs Imputation DiD:**

| Aspect | Two-Stage DiD | Imputation DiD |
|--------|--------------|---------------|
| Point estimates | Identical | Identical |
| Variance | GMM sandwich (accounts for first-stage error) | Conservative (Theorem 3, may overcover) |
| Intuition | Residualize then regress | Impute counterfactuals then aggregate |
| Reference impl. | R `did2s` package | R `didimputation` package |

Both estimators are efficient under homogeneous treatment effects and typically produce shorter confidence intervals than Callaway-Sant'Anna or Sun-Abraham.
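
As a rough illustration of the two stages described in this section, here is a minimal pure-Python sketch on a noiseless synthetic panel. The names `fit_untreated_fe` and `two_stage_sketch`, the alternating-means solver, and the synthetic data are assumptions made for this example; this is not the implementation in `diff_diff/two_stage.py`, and it computes point estimates only (no GMM standard errors).

```python
from collections import defaultdict


def fit_untreated_fe(untreated, tol=1e-10, max_iter=10_000):
    """Stage 1: fit y = alpha_u + gamma_t on untreated rows by alternating means."""
    alpha, gamma = {}, {}
    for _ in range(max_iter):
        # Update unit effects given the current time effects.
        sums, counts = defaultdict(float), defaultdict(int)
        for u, t, y in untreated:
            sums[u] += y - gamma.get(t, 0.0)
            counts[u] += 1
        new_alpha = {u: sums[u] / counts[u] for u in sums}
        # Update time effects given the new unit effects.
        sums, counts = defaultdict(float), defaultdict(int)
        for u, t, y in untreated:
            sums[t] += y - new_alpha[u]
            counts[t] += 1
        new_gamma = {t: sums[t] / counts[t] for t in sums}
        change = max(
            [abs(v - alpha.get(u, 0.0)) for u, v in new_alpha.items()]
            + [abs(v - gamma.get(t, 0.0)) for t, v in new_gamma.items()]
        )
        alpha, gamma = new_alpha, new_gamma
        if change < tol:
            break
    return alpha, gamma


def two_stage_sketch(rows):
    """rows: (unit, time, first_treat, y) tuples; first_treat == 0 means never treated."""
    untreated = [(u, t, y) for u, t, g, y in rows if g == 0 or t < g]
    alpha, gamma = fit_untreated_fe(untreated)
    # Stage 2: residualize treated rows. Regressing the residualized outcome on
    # treatment (or horizon) dummies without an intercept reduces to averaging
    # the residuals within each dummy's cell.
    by_horizon = defaultdict(list)
    for u, t, g, y in rows:
        if g != 0 and t >= g:
            by_horizon[t - g].append(y - alpha[u] - gamma[t])
    resid = [r for rs in by_horizon.values() for r in rs]
    att = sum(resid) / len(resid)
    event_study = {h: sum(rs) / len(rs) for h, rs in sorted(by_horizon.items())}
    return att, event_study


# Synthetic panel: 6 units, periods 1..6, two treated cohorts and two
# never-treated units; the true effect at horizon h is 1.0 + 0.5 * h.
first_treat = {0: 3, 1: 3, 2: 5, 3: 5, 4: 0, 5: 0}
rows = []
for u, g in first_treat.items():
    for t in range(1, 7):
        treated = g != 0 and t >= g
        effect = 1.0 + 0.5 * (t - g) if treated else 0.0
        rows.append((u, t, g, 0.5 * u + 0.3 * t + effect))

att, event_study = two_stage_sketch(rows)
print(round(att, 4), {h: round(v, 4) for h, v in event_study.items()})
```

Because stage 2 here averages `y` minus the fitted untreated outcome, the same numbers can be read as imputed-counterfactual differences, which is one way to see why the point estimates coincide with the imputation approach.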

### Triple Difference (DDD)

Triple Difference (DDD) is used when treatment requires satisfying two criteria: belonging to a treated **group** AND being in an eligible **partition**. The `TripleDifference` class implements the methodology from Ortiz-Villavicencio & Sant'Anna (2025), which correctly handles covariate adjustment (unlike naive implementations).
@@ -2104,6 +2151,58 @@ ImputationDiD(
| `to_dataframe(level)` | Convert to DataFrame ('observation', 'event_study', 'group') |
| `pretrend_test(n_leads)` | Run pre-trend F-test (Equation 9) |

### TwoStageDiD

```python
TwoStageDiD(
    anticipation=0,                # Periods of anticipation effects
    alpha=0.05,                    # Significance level for CIs
    cluster=None,                  # Column for cluster-robust SEs (defaults to unit)
    n_bootstrap=0,                 # Bootstrap iterations (0 = analytical GMM SEs)
    seed=None,                     # Random seed
    rank_deficient_action='warn',  # 'warn', 'error', or 'silent'
    horizon_max=None,              # Max event-study horizon
)
```

**fit() Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `data` | DataFrame | Panel data |
| `outcome` | str | Outcome variable column name |
| `unit` | str | Unit identifier column |
| `time` | str | Time period column |
| `first_treat` | str | First treatment period column (0 for never-treated) |
| `covariates` | list | Covariate column names |
| `aggregate` | str | Aggregation: None, "event_study", "group", "all" |
| `balance_e` | int | Balance event study to this many pre-treatment periods |

### TwoStageDiDResults

**Attributes:**

| Attribute | Description |
|-----------|-------------|
| `overall_att` | Overall average treatment effect on the treated |
| `overall_se` | Standard error (GMM sandwich variance) |
| `overall_t_stat` | T-statistic |
| `overall_p_value` | P-value for H0: ATT = 0 |
| `overall_conf_int` | Confidence interval |
| `event_study_effects` | Dict of relative time -> effect dict (if `aggregate='event_study'` or `'all'`) |
| `group_effects` | Dict of cohort -> effect dict (if `aggregate='group'` or `'all'`) |
| `treatment_effects` | DataFrame of per-observation treatment effects |
| `n_treated_obs` | Number of treated observations |
| `n_untreated_obs` | Number of untreated observations |

**Methods:**

| Method | Description |
|--------|-------------|
| `summary(alpha)` | Get formatted summary string |
| `print_summary(alpha)` | Print summary to stdout |
| `to_dataframe(level)` | Convert to DataFrame ('observation', 'event_study', 'group') |

### TripleDifference

```python
@@ -2582,6 +2681,10 @@ The `HonestDiD` module implements sensitivity analysis methods for relaxing the

- **Sun, L., & Abraham, S. (2021).** "Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects." *Journal of Econometrics*, 225(2), 175-199. [https://doi.org/10.1016/j.jeconom.2020.09.006](https://doi.org/10.1016/j.jeconom.2020.09.006)

- **Gardner, J. (2022).** "Two-stage differences in differences." *arXiv preprint arXiv:2207.05943*. [https://arxiv.org/abs/2207.05943](https://arxiv.org/abs/2207.05943)

- **Butts, K., & Gardner, J. (2022).** "did2s: Two-Stage Difference-in-Differences." *The R Journal*, 14(3), 162-173. [https://doi.org/10.32614/RJ-2022-048](https://doi.org/10.32614/RJ-2022-048)

- **de Chaisemartin, C., & D'Haultfœuille, X. (2020).** "Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects." *American Economic Review*, 110(9), 2964-2996. [https://doi.org/10.1257/aer.20181169](https://doi.org/10.1257/aer.20181169)

- **Goodman-Bacon, A. (2021).** "Difference-in-Differences with Variation in Treatment Timing." *Journal of Econometrics*, 225(2), 254-277. [https://doi.org/10.1016/j.jeconom.2021.03.014](https://doi.org/10.1016/j.jeconom.2021.03.014)
4 changes: 2 additions & 2 deletions ROADMAP.md
@@ -10,7 +10,7 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).

diff-diff v2.3.0 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:

- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP
- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022)
- **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
- **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
- **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
@@ -24,7 +24,7 @@ diff-diff v2.3.0 is a **production-ready** DiD library with feature parity with

High-value additions building on our existing foundation.

### Gardner's Two-Stage DiD (did2s)
### Gardner's Two-Stage DiD (did2s) -- IMPLEMENTED (v2.4)

Two-stage approach gaining traction in applied work. First residualizes outcomes, then estimates effects.

10 changes: 10 additions & 0 deletions diff_diff/__init__.py
@@ -101,6 +101,12 @@
    ImputationDiDResults,
    imputation_did,
)
from diff_diff.two_stage import (
    TwoStageBootstrapResults,
    TwoStageDiD,
    TwoStageDiDResults,
    two_stage_did,
)
from diff_diff.sun_abraham import (
    SABootstrapResults,
    SunAbraham,
@@ -152,6 +158,7 @@
    "CallawaySantAnna",
    "SunAbraham",
    "ImputationDiD",
    "TwoStageDiD",
    "TripleDifference",
    "TROP",
    # Bacon Decomposition
@@ -173,6 +180,9 @@
    "ImputationDiDResults",
    "ImputationBootstrapResults",
    "imputation_did",
    "TwoStageDiDResults",
    "TwoStageBootstrapResults",
    "two_stage_did",
    "TripleDifferenceResults",
    "triple_difference",
    "TROPResults",