@@ -24,7 +24,7 @@ Each estimator in diff-diff should be periodically reviewed to ensure:
 | MultiPeriodDiD | `estimators.py` | `fixest::feols()` | **Complete** | 2026-02-02 |
 | TwoWayFixedEffects | `twfe.py` | `fixest::feols()` | **Complete** | 2026-02-08 |
 | CallawaySantAnna | `staggered.py` | `did::att_gt()` | **Complete** | 2026-01-24 |
-| SunAbraham | `sun_abraham.py` | `fixest::sunab()` | Not Started | - |
+| SunAbraham | `sun_abraham.py` | `fixest::sunab()` | **Complete** | 2026-02-15 |
 | SyntheticDiD | `synthetic_did.py` | `synthdid::synthdid_estimate()` | **Complete** | 2026-02-10 |
 | TripleDifference | `triple_diff.py` | (forthcoming) | Not Started | - |
 | TROP | `trop.py` | (forthcoming) | Not Started | - |
@@ -294,14 +294,88 @@ variables appear to the left of the `|` separator.
 | Module | `sun_abraham.py` |
 | Primary Reference | Sun & Abraham (2021) |
 | R Reference | `fixest::sunab()` |
-| Status | Not Started |
-| Last Review | - |
+| Status | **Complete** |
+| Last Review | 2026-02-15 |
+
+**Verified Components:**
+- [x] Saturated TWFE regression with cohort × relative-time interactions
+- [x] Within-transformation for unit and time fixed effects
+- [x] Interaction-weighted event study effects (δ̂_e = Σ_g ŵ_{g,e} × δ̂_{g,e})
+- [x] IW weights match event-time sample shares (n_{g,e} / Σ_g n_{g,e})
+- [x] Overall ATT as weighted average of post-treatment effects
+- [x] Delta method SE for aggregated effects (Var = w' Σ w)
+- [x] Cluster-robust SEs at unit level
+- [x] Reference period normalized to zero (e=-1 excluded from design matrix)
+- [x] R comparison: ATT matches `fixest::sunab()` within machine precision (<1e-11)
+- [x] R comparison: SE matches within 0.3% (small scale) / 0.1% (1k scale)
+- [x] R comparison: Event study effects correlation = 1.000000
+- [x] R comparison: Event study max diff < 1e-11
+- [x] Bootstrap inference (pairs bootstrap)
+- [x] Rank deficiency handling (warn/error/silent)
+- [x] All REGISTRY.md edge cases tested
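
The interaction-weighted aggregation and delta-method SE listed above can be sketched as follows. This is a minimal illustration, not the code in `sun_abraham.py`: the function name `aggregate_iw` and the input numbers are hypothetical, and the weights are treated as fixed when computing the variance.

```python
import numpy as np

def aggregate_iw(delta_ge, n_ge, vcov):
    """Aggregate cohort-specific effects at one event time e.

    delta_ge : cohort-specific estimates δ̂_{g,e}
    n_ge     : observation counts n_{g,e} per cohort at event time e
    vcov     : covariance matrix Σ of the δ̂_{g,e}
    """
    delta_ge = np.asarray(delta_ge, dtype=float)
    n_ge = np.asarray(n_ge, dtype=float)
    w = n_ge / n_ge.sum()              # ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e}
    effect = w @ delta_ge              # δ̂_e = Σ_g ŵ_{g,e} × δ̂_{g,e}
    se = np.sqrt(w @ vcov @ w)         # delta method: Var = w' Σ w
    return effect, se

# Hypothetical inputs for three cohorts (not taken from the benchmarks).
effect, se = aggregate_iw(
    delta_ge=[1.0, 2.0, 4.0],
    n_ge=[50, 30, 20],
    vcov=np.diag([0.04, 0.09, 0.16]),
)
print(effect, se)
```

With a non-diagonal `vcov`, the same `w @ vcov @ w` expression picks up the covariances between cohort estimates, which is what distinguishes the full delta-method variance from a sum of weighted variances.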
+
+**Test Coverage:**
+- 43 tests in `tests/test_sun_abraham.py` (36 existing + 7 methodology verification)
+- R benchmark tests via `benchmarks/run_benchmarks.py --estimator sunab`
+
+**R Comparison Results:**
+- Overall ATT matches within machine precision (diff < 1e-11 at both scales)
+- Cluster-robust SE matches within 0.3% at the small scale and 0.1% at the 1k scale (well within the 1% threshold)
+- Event study effects match to machine precision (correlation 1.0, max diff < 1e-11)
+- Validated at small (200 units) and 1k (1000 units) scales
 
 **Corrections Made:**
-- (None yet)
+1. **DF adjustment for absorbed FE** (`sun_abraham.py`, `_fit_saturated_regression()`):
+   Added `df_adjustment = n_units + n_times - 1` to `LinearRegression.fit()` to account
+   for absorbed unit and time fixed effects in degrees of freedom. Unlike TWFE (which uses
+   `-2` plus an explicit intercept column), SunAbraham's saturated regression has no
+   intercept, so all absorbed df must come from the adjustment. Affects t-distribution DoF
+   for cohort-level p-values/CIs (slightly larger p-values, slightly wider CIs) but does
+   NOT change VCV or SE values.
+
+2. **NaN return for no post-treatment effects** (`sun_abraham.py`, `_compute_overall_att()`):
+   Changed return from `(0.0, 0.0)` to `(np.nan, np.nan)` when no post-treatment effects
+   exist. All downstream inference fields (t_stat, p_value, conf_int) correctly propagate
+   NaN via existing guards in `fit()`.
+
+3. **Deprecation warnings for unused parameters** (`sun_abraham.py`, `fit()`):
+   Added `FutureWarning` for `min_pre_periods` and `min_post_periods` parameters that
+   are accepted but never used (no-op). These will be removed in a future version.
+
+4. **Removed event-time truncation at [-20, 20]** (`sun_abraham.py`):
+   Removed the hardcoded cap `max(min(...), -20)` / `min(max(...), 20)` to match
+   R's `fixest::sunab()`, which has no such limit. All available relative times are
+   now estimated.
+
+5. **Warning for variance fallback path** (`sun_abraham.py`, `_compute_overall_att()`):
+   Added `UserWarning` when the full weight vector cannot be constructed and a
+   simplified variance (ignoring covariances between periods) is used as fallback.
+
+6. **IW weights use event-time sample shares** (`sun_abraham.py`, `_compute_iw_effects()`):
+   Changed IW weights from `n_g / Σ_g n_g` (cohort sizes) to `n_{g,e} / Σ_g n_{g,e}`
+   (per-event-time observation counts) to match the REGISTRY.md formula. For balanced
+   panels these are identical; for unbalanced panels the new formula correctly reflects
+   actual sample composition at each event-time. Added unbalanced panel test.
+
+7. **Normalize `np.inf` never-treated encoding** (`sun_abraham.py`, `fit()`):
+   `first_treat=np.inf` (documented as valid for never-treated) was included in
+   `treatment_groups` and `_rel_time` via `> 0` checks, producing `-inf` event times.
+   Fixed by normalizing `np.inf` to `0` immediately after computing `_never_treated`.
+   Same fix applied to `staggered.py` (`CallawaySantAnna`).
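
Corrections 6 and 7 can be illustrated together on a small unbalanced panel. The sketch below is an assumption-laden illustration rather than the code in `_compute_iw_effects()` or `fit()`: the helper names `normalize_never_treated` and `event_time_shares` are hypothetical, and the data are made up.

```python
import numpy as np

def normalize_never_treated(first_treat):
    """Map np.inf (never-treated) to 0 so that `first_treat > 0` selects treated units only."""
    first_treat = np.asarray(first_treat, dtype=float)
    return np.where(np.isinf(first_treat), 0.0, first_treat)

def event_time_shares(first_treat, period, e):
    """Return {cohort: n_{g,e} / Σ_g n_{g,e}} from observation-level arrays."""
    g = normalize_never_treated(first_treat)
    period = np.asarray(period, dtype=float)
    # Treated observations whose relative time (period - first_treat) equals e.
    at_e = (g > 0) & (period - g == e)
    cohorts, counts = np.unique(g[at_e], return_counts=True)
    return dict(zip(cohorts.tolist(), (counts / counts.sum()).tolist()))

# Hypothetical panel: cohort 3 has two units (one drops out after period 3),
# cohort 5 has one unit, and one unit is never treated (np.inf).
first_treat = [3, 3, 3,  3, 3,  5, 5, 5,  np.inf, np.inf]
period      = [2, 3, 4,  2, 3,  4, 5, 6,  3,      4]

print(event_time_shares(first_treat, period, e=0))  # both cohort-3 units observed
print(event_time_shares(first_treat, period, e=1))  # only one cohort-3 unit observed
```

At `e=0` the shares coincide with the cohort-size weights; at `e=1` they shift toward cohort 5 because one cohort-3 unit is no longer observed, which is the unbalanced-panel case the correction targets.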
 
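
The degrees-of-freedom accounting in correction 1 can be sketched as simple arithmetic. How `LinearRegression.fit()` consumes `df_adjustment` internally is not shown in this review, so the subtraction below is an assumption, and the function name `residual_df` and the example sizes are hypothetical.

```python
def residual_df(n_obs, n_params, n_units, n_times):
    """Residual degrees of freedom after absorbing unit and time fixed effects.

    n_params counts only the columns left in the design matrix (the
    cohort x relative-time interactions); there is no intercept column.
    """
    # Absorbed fixed effects, as stated in correction 1.
    df_adjustment = n_units + n_times - 1
    # Assumed use of the adjustment: subtract it alongside the estimated columns.
    return n_obs - n_params - df_adjustment

# Hypothetical balanced panel: 200 units x 10 periods, 24 interaction columns.
print(residual_df(n_obs=2000, n_params=24, n_units=200, n_times=10))
```

A smaller residual df gives slightly heavier t-distribution tails, which is why the correction widens cohort-level confidence intervals without touching the standard errors themselves.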
 **Outstanding Concerns:**
-- (None yet)
+- **Inference distribution**: Cohort-level p-values use t-distribution (via
+  `LinearRegression.get_inference()`), while aggregated event study and overall ATT
+  p-values use normal distribution (via `compute_p_value()`). This is asymptotically
+  equivalent and standard for delta-method-aggregated quantities. R's fixest uses
+  t-distribution at all levels, so aggregated p-values may differ slightly for small
+  samples. This is a documented deviation.
+
+**Deviations from R's fixest::sunab():**
+1. **NaN for no post-treatment effects**: Python returns `(NaN, NaN)` for overall ATT/SE
+   when no post-treatment effects exist. R would raise an error.
+2. **Normal distribution for aggregated inference**: Aggregated p-values use normal
+   distribution (asymptotically equivalent). R uses t-distribution.
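
Both deviations can be seen in a small standard-library sketch of a normal-based p-value with a NaN guard. This is not the library's `compute_p_value()`; the function name `normal_p_value` and its guard conditions are assumptions made for illustration.

```python
import math

def normal_p_value(estimate, se):
    """Two-sided p-value under the standard normal; NaN inputs yield NaN."""
    if math.isnan(estimate) or math.isnan(se) or se <= 0:
        return math.nan
    z = abs(estimate / se)
    # erfc(z / sqrt(2)) equals 2 * (1 - Φ(z)), the two-sided normal tail area.
    return math.erfc(z / math.sqrt(2))

# Regular case: an aggregated effect with its delta-method SE (hypothetical values).
print(normal_p_value(0.5, 0.2))

# No post-treatment effects: overall ATT and SE are (NaN, NaN), so the p-value is NaN.
print(normal_p_value(math.nan, math.nan))
```

For the same test statistic, a t-based p-value is larger than the normal one because the t-distribution has heavier tails, and the gap shrinks as the degrees of freedom grow.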
 
 ---
 