Remove Rust outer-loop SDID variance to fix SE mismatch and perf regression by igerber · Pull Request #147 · igerber/diff-diff

igerber · 2026-02-15T16:46:41Z

Summary

Remove Rust parallel placebo/bootstrap variance estimation outer loops from synthetic_did.py
Delete rust/src/sdid_variance.rs (498 lines) and all associated exports/registrations
Keep Rust-accelerated inner Frank-Wolfe weight computation (18x faster than R)
Add backend SE consistency tests confirming Rust and Python backends produce identical SEs
Delete TestSDIDVarianceRustBackend test class (9 tests that directly imported deleted Rust functions)

Root causes fixed

SE mismatch: Rust used Xoshiro256PlusPlus RNG producing different permutation sequences than Python's default_rng, causing SE divergence between backends (Python=0.1048, Rust=0.0987 at small scale)
Performance regression: Rayon par_iter across all replications saturated memory bandwidth at 1k+ scale (3.2x slower at 1k, 9.7x slower at 5k vs pure Python)
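
The RNG mismatch can be illustrated with a small sketch. It does not use the project's code: `Lcg` is a toy generator standing in for "a second RNG family" (the real pair was Xoshiro256PlusPlus vs default_rng), and `shuffle_with` is a hypothetical helper. The point is only that the same seed fed to two different generators yields two different permutations, so finite-replication SEs need not agree.

```python
import random


class Lcg:
    """Toy 64-bit linear congruential generator (illustrative stand-in only)."""

    def __init__(self, seed):
        self.state = seed % 2**64

    def randbelow(self, n):
        # Knuth MMIX constants; take the high bits, then reduce modulo n.
        # The modulo step is slightly biased, which is fine for a demo.
        self.state = (6364136223846793005 * self.state + 1442695040888963407) % 2**64
        return (self.state >> 33) % n


def shuffle_with(randbelow, items):
    """Fisher-Yates shuffle driven by a caller-supplied randbelow(n) function."""
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = randbelow(i + 1)
        out[i], out[j] = out[j], out[i]
    return out


seed = 42
units = list(range(20))

# Same seed, two generator families.
perm_mt = shuffle_with(random.Random(seed).randrange, units)
perm_lcg = shuffle_with(Lcg(seed).randbelow, units)

# Re-running with the same generator and seed reproduces the permutation.
perm_mt_again = shuffle_with(random.Random(seed).randrange, units)

print(perm_mt == perm_mt_again)
print(perm_mt == perm_lcg)
```

Each generator is reproducible on its own; the divergence only appears when results from the two are compared replication by replication.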

Architecture after fix

Python sequential loop is the only orchestration path for variance estimation. When Rust is available, inner Frank-Wolfe weight calls dispatch to Rust via utils.py → _backend.py → _rust_backend. The DIFF_DIFF_BACKEND env var selects which backend handles those inner calls.
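
A rough sketch of this layering, under stated assumptions: none of the function names below come from the repository, the weight solvers are uniform-weight stand-ins rather than Frank-Wolfe, time weights are omitted, and only the `python` value of DIFF_DIFF_BACKEND is taken from this PR. What it tries to show is that one Python loop owns the RNG and the permutations, while only the inner solver is swapped.

```python
import os
import random
import statistics


def solve_weights_python(control_rows):
    """Stand-in for the pure-Python inner weight solver (uniform weights here)."""
    return [1.0 / len(control_rows)] * len(control_rows)


def solve_weights_rust(control_rows):
    """Stand-in for the Rust-accelerated solver; assumed to return the same weights."""
    return [1.0 / len(control_rows)] * len(control_rows)


def pick_solver(env=None, has_rust=True):
    """Choose the inner solver; DIFF_DIFF_BACKEND=python forces the Python path."""
    env = os.environ if env is None else env
    if env.get("DIFF_DIFF_BACKEND") == "python" or not has_rust:
        return solve_weights_python
    return solve_weights_rust


def placebo_se(control_units, n_treated, n_reps, seed, solver):
    """Sequential placebo loop: a single RNG, permutations drawn in Python."""
    rng = random.Random(seed)
    gains = [post - pre for pre, post in control_units]
    estimates = []
    for _ in range(n_reps):
        order = list(range(len(gains)))
        rng.shuffle(order)
        treated, controls = order[:n_treated], order[n_treated:]
        weights = solver([control_units[i] for i in controls])
        treated_gain = sum(gains[i] for i in treated) / n_treated
        control_gain = sum(w * gains[i] for w, i in zip(weights, controls))
        estimates.append(treated_gain - control_gain)
    return statistics.pstdev(estimates)


# Synthetic (pre, post) outcomes for 30 control units, for illustration only.
data_rng = random.Random(0)
control_units = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(30)]

se_py = placebo_se(control_units, 5, 200, seed=123, solver=solve_weights_python)
se_rs = placebo_se(control_units, 5, 200, seed=123, solver=solve_weights_rust)
print(se_py == se_rs)
```

Because the permutation sequence no longer depends on the backend, any remaining SE difference can only come from numerical differences in the inner solver.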

SE convergence validation (Python vs R at increasing iteration counts)

n_reps	Python SE	R SE	Relative Diff
50	0.104772	0.112034	6.5%
200	0.113138	0.109023	3.8%
1000	0.109822	0.104476	5.1%
2000	0.105956	0.106015	0.1%

At 2000 reps both estimates are ~0.106; the gaps at lower rep counts are consistent with Monte Carlo noise.
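
For reference, the Relative Diff column in the table above can be reproduced as |Python SE - R SE| / R SE. The sketch below also prints a rough yardstick for Monte Carlo noise. That yardstick is an assumption, not part of the PR: it treats the replicate estimates as approximately normal, in which case a standard deviation estimated from B draws has relative noise of about 1/sqrt(2(B-1)), and it treats the Python and R runs as independent.

```python
import math

# (n_reps, Python SE, R SE) copied from the table above.
rows = [
    (50, 0.104772, 0.112034),
    (200, 0.113138, 0.109023),
    (1000, 0.109822, 0.104476),
    (2000, 0.105956, 0.106015),
]


def relative_diff(python_se, r_se):
    """Relative difference, measured against the R value."""
    return abs(python_se - r_se) / r_se


def se_noise_scale(n_reps):
    """Approximate relative sd of a sample sd from n_reps normal draws."""
    return 1.0 / math.sqrt(2 * (n_reps - 1))


for n_reps, python_se, r_se in rows:
    observed = relative_diff(python_se, r_se)
    # The difference of two independent estimates carries sqrt(2) times
    # the noise of a single estimate.
    expected_scale = math.sqrt(2) * se_noise_scale(n_reps)
    print(f"{n_reps:>5}  observed {observed:.1%}  noise scale {expected_scale:.1%}")
```

The noise scale shrinks like 1/sqrt(B), so individual rows can still move in either direction from one rep count to the next.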

Performance (small scale, 2000 reps)

Backend	Time	vs R
Python + Rust inner-loop	4.65s	18x faster
R synthdid	81.87s	baseline
Python pure	168.89s	2x slower

Methodology references (required if estimator / math changes)

Method name(s): Synthetic Difference-in-Differences placebo variance (Algorithm 4)
Paper / source link(s): Arkhangelsky et al. (2021). American Economic Review, 111(12), 4088-4118
Any intentional deviations from the source (and why): None. This change removes an implementation artifact (a parallel Rust outer loop with a different RNG) that deviated from the sequential permutation approach.

Validation

Tests added/updated:
- tests/test_methodology_sdid.py: Added TestBackendSEConsistency (2 tests: placebo + bootstrap SE matching across backends, rtol=1e-4)
- tests/test_rust_backend.py: Removed TestSDIDVarianceRustBackend (9 tests for deleted Rust functions)
277 tests pass across test_methodology_sdid.py, test_rust_backend.py, test_estimators.py
60 tests pass in DIFF_DIFF_BACKEND=python mode
maturin develop --release builds cleanly
grep -r "placebo_variance_sdid\|bootstrap_variance_sdid" returns zero hits
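
To give a sense of what rtol=1e-4 means for the backend consistency tests, here is a minimal sketch. `ses_match` is a hypothetical helper, not the test code in this PR; it follows the usual convention of measuring the tolerance relative to the reference value. The first pair of numbers is the pre-fix mismatch quoted under "Root causes fixed".

```python
def ses_match(actual, desired, rtol=1e-4):
    """True when |actual - desired| <= rtol * |desired|."""
    return abs(actual - desired) <= rtol * abs(desired)


# Pre-fix small-scale SEs quoted in this PR: a gap of roughly 6%.
old_python_se, old_rust_se = 0.1048, 0.0987
print(ses_match(old_rust_se, old_python_se))  # False

# A difference of 5e-5 in relative terms sits inside the tolerance.
reference = 0.105956
print(ses_match(reference * (1 + 5e-5), reference))  # True
```

A tolerance this tight only makes sense when both backends consume the same permutation sequence; it would not be satisfiable across two different RNG streams at a few hundred replications.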

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

…d perf regression

The Rust placebo/bootstrap variance paths used a different RNG (Xoshiro256PlusPlus) producing different permutation sequences than Python, causing SE divergence between backends. Rayon parallelism across all replications also saturated memory bandwidth at 1k+ scale (3-10x slower than pure Python). Inner Frank-Wolfe weight calls still dispatch to Rust for ~18x speedup over R.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-15T16:51:43Z

Overall Assessment ✅ Looks good

Executive Summary

SyntheticDiD variance estimation now runs through the Python outer loop for both backends; the algorithm remains aligned with the Methodology Registry (Algorithm 4 placebo, fixed‑weight bootstrap).
Rust SDID variance exports are fully removed from _backend and Rust module registration with no remaining references.
New backend SE consistency tests cover the intended behavior change.
Minor test‑robustness suggestion noted below.

Methodology

No findings.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

P3 | Impact: The “pure Python backend” comparison only patches diff_diff.utils.HAS_RUST_BACKEND, so if SDID starts using other Rust‑accelerated modules in the future (e.g., diff_diff.linalg), the test could silently include Rust and no longer validate the intended comparison. | Fix: Force the backend at the source (e.g., set DIFF_DIFF_BACKEND=python and importlib.reload() relevant modules), or patch _backend.HAS_RUST_BACKEND/_rust_* in addition to diff_diff.utils.HAS_RUST_BACKEND. | Location: tests/test_methodology_sdid.py:L1042-L1127
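
One possible shape for the second suggested fix (patching the flag in several places at once) is sketched below. This is an assumption-laden illustration: `force_python_backend` is a hypothetical helper, and the `SimpleNamespace` objects merely stand in for modules such as diff_diff.utils and _backend so the snippet runs without the package installed.

```python
from contextlib import ExitStack, contextmanager
from types import SimpleNamespace
from unittest import mock


@contextmanager
def force_python_backend(*modules):
    """Temporarily set HAS_RUST_BACKEND=False on every object that carries the flag."""
    with ExitStack() as stack:
        for mod in modules:
            if hasattr(mod, "HAS_RUST_BACKEND"):
                stack.enter_context(
                    mock.patch.object(mod, "HAS_RUST_BACKEND", False)
                )
        yield


# Stand-ins for real modules (hypothetical; not imported from diff_diff).
fake_backend = SimpleNamespace(HAS_RUST_BACKEND=True)
fake_utils = SimpleNamespace(HAS_RUST_BACKEND=True)
fake_linalg = SimpleNamespace(HAS_RUST_BACKEND=True)
all_modules = (fake_backend, fake_utils, fake_linalg)

with force_python_backend(*all_modules):
    inside = [m.HAS_RUST_BACKEND for m in all_modules]

after = [m.HAS_RUST_BACKEND for m in all_modules]
print(inside, after)
```

In a real test the stand-ins would be replaced by the imported modules. Patching every module that holds a copy of the flag keeps the "pure Python" comparison honest if more Rust-accelerated modules are added later.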

igerber merged commit a8eadb3 into main Feb 15, 2026
8 checks passed

igerber deleted the sdid-benchmarks branch February 15, 2026 17:33

igerber merged 1 commit into main from sdid-benchmarks
