SDID methodology review: rewrite to match R synthdid + Rust parallel variance#145
SDID methodology review: rewrite to match R synthdid + Rust parallel variance#145
Conversation
…variance Comprehensive review and rewrite of the Synthetic DiD implementation to match R's synthdid package behavior (Arkhangelsky et al. 2021): Methodology (Python): - Frank-Wolfe solver matching R's sc.weight.fw() for unit and time weights - Two-pass sparsification: 100 iters → sparsify → 10000 iters (matching R) - Auto-computed regularization from data noise level (zeta_omega, zeta_lambda) - Bootstrap SE uses fixed weights (matching R's bootstrap_sample) - Placebo SE re-estimates weights per permutation (matching R's Algorithm 4) - ATT estimator matches R's synthdid_estimate formula Rust acceleration: - New rust/src/sdid_variance.rs: parallel placebo and bootstrap SE via rayon - placebo_variance_sdid(): ~8x speedup (200 permutations in parallel) - bootstrap_variance_sdid(): ~6x speedup (200 bootstrap iterations in parallel) - Frank-Wolfe solver in Rust (weights.rs) for unit and time weight computation - Fix stale PyO3 defaults on sc_weight_fw and compute_sdid_unit_weights - Make internal weight functions pub(crate) for cross-module access Testing: - 41 new methodology tests (test_methodology_sdid.py) - 9 new Rust variance backend tests (test_rust_backend.py) - Updated estimator and utility tests for new API - All 31 SDID estimator tests pass in both Rust and pure Python modes Documentation: - Updated METHODOLOGY_REVIEW.md, REGISTRY.md, README.md, CLAUDE.md - Updated tutorial notebook and benchmark scripts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Overall assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…edback - Wrap Unicode formula in Rust doc comment in ```text fence (fixes doctest CI) - Guard noise_level for single first-diff element (size<=1 → 0.0, not NaN) - Remove **kwargs from SyntheticDiD.__init__ (reject unknown params) - Update Registry: placebo re-estimates weights (not fixed), convergence=1e-5 - Add tests: noise_level edge cases, placebo re-estimation pin, kwargs rejection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: ⛔ BlockerExecutive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…round 2 Address 5 issues from PR #145 AI review: - P0: Validate treatment is constant within unit (reject staggered designs) - P1: Enforce balanced panel (all units must have all periods) - P1: Warn when pre-treatment fit RMSE exceeds treated outcome SD - P1: Fix Registry FW iteration count (1000 → 10000, matching R/code) - P2: Fix misleading placebo docstring (weights use fresh start, not warm start) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…trol Address AI review round 3: (1) Document the 1e-5 min_decrease floor when noise_level == 0 as an intentional deviation from R in REGISTRY.md and inline comment — enables early stopping with equivalent results. (2) Add ValueError guard in compute_time_weights for empty Y_post_control before both Python and Rust dispatch paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…eview round 4 Deprecate the legacy compute_placebo_effects utility (uses projected-gradient weights with fixed time weights, not matching SDID Algorithm 4). Fix _placebo_variance_se docstring to say "uniform initialization" instead of "original weights as initialization", matching the actual implementation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…te SE - Add np.isfinite(tau) guard in Python bootstrap and placebo loops to match Rust backend's tau.is_finite() filtering (sdid_variance.rs:220,391) - Change `se > 0` to `np.isfinite(se) and se > 0` so se=inf produces NaN inference fields instead of misleading t_stat≈0 and spurious p-values - Add regression tests for non-finite tau filtering and inf SE gating Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
…ap=0 Troubleshooting docs now use zeta_omega/zeta_lambda instead of deprecated lambda_reg, and variance_method="placebo" instead of invalid n_bootstrap=0. API autosummary removes lambda_reg and adds missing diagnostic fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
- Add test_bootstrap_with_zeta_overrides to cover bootstrap SE path with user-specified zeta_omega/zeta_lambda overrides (P1 coverage gap) - Remove unused unit_weights/time_weights params from _placebo_variance_se signature, docstring, and call site (P3 cleanup) - Update existing test that called _placebo_variance_se directly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall assessment: ✅ Looks good Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
Summary
synthdidpackage (Arkhangelsky et al. 2021). Frank-Wolfe solver, two-pass sparsification, auto-computed regularization, and proper ATT formula all match R's reference implementation.rust/src/sdid_variance.rsparallelizes placebo SE (~8x speedup) and bootstrap SE (~6x speedup) via rayon, following the establishedtrop.rspattern. Placebo SE — the default and slowest variance method — drops from ~240s to ~30s on benchmark data.Changes
Methodology (Python)
diff_diff/synthetic_did.py— Rewritten SDID estimator with Frank-Wolfe weights, two-pass sparsification, R-matched ATT formuladiff_diff/utils.py— New weight computation functions matching R'ssc.weight.fw(), noise level estimation, SDID estimatordiff_diff/results.py— Updated SyntheticDiDResults for new weight/diagnostic fieldsRust Acceleration
rust/src/sdid_variance.rs— NEW: Parallel placebo and bootstrap variance estimationrust/src/weights.rs— Frank-Wolfe solver in Rust,pub(crate)visibility for internal functions, fixed stale PyO3 defaultsrust/src/lib.rs— Register newsdid_variancemodulediff_diff/_backend.py— Import/fallback for new Rust functionsTests
tests/test_methodology_sdid.py— NEW: 41 methodology tests (solver convergence, sparsification, regularization, edge cases)tests/test_rust_backend.py— 9 new tests for Rust variance backend (reproducibility, statistical equivalence, edge cases)tests/test_estimators.py— Updated SDID estimator teststests/test_utils.py— Updated utility testsDocumentation
METHODOLOGY_REVIEW.md— SDID review status and findingsdocs/methodology/REGISTRY.md— Updated SDID methodology entryREADME.md— Updated SDID feature descriptionCLAUDE.md— Added SDID variance to Rust backend sectiondocs/tutorials/03_synthetic_did.ipynb— Updated tutorialbenchmarks/— Updated R and Python benchmark scriptsTest plan
maturin develop && pytest tests/test_rust_backend.py::TestSDIDVarianceRustBackend -v— 9/9 new Rust variance tests passpytest tests/test_methodology_sdid.py -v— 41/41 methodology tests passDIFF_DIFF_BACKEND=rust pytest tests/test_estimators.py::TestSyntheticDiD -v— 31/31 with Rust backendDIFF_DIFF_BACKEND=python pytest tests/test_estimators.py::TestSyntheticDiD -v— 31/31 pure Python fallbackpytest tests/test_rust_backend.py::TestSDIDRustBackend -v— 12/12 existing SDID Rust testspytest tests/test_utils.py -v— 73/73 utility tests🤖 Generated with Claude Code