Skip to content

Add conditional platform BLAS linking for Rust backend#167

Merged
igerber merged 2 commits intomainfrom
review-sdid-implementation
Feb 18, 2026
Merged

Add conditional platform BLAS linking for Rust backend#167
igerber merged 2 commits intomainfrom
review-sdid-implementation

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Feb 18, 2026

Summary

  • Add conditional BLAS linking: Apple Accelerate on macOS, OpenBLAS on Linux via Cargo features
  • Convert manual GEMV loop in sc_weight_fw_standard to ndarray::linalg::general_mat_vec_mul (dispatches to BLAS dgemv when enabled, falls back to ndarray's matrixmultiply kernel)
  • Add rust_backend_info() diagnostic function reporting compile-time BLAS feature status
  • Update CI workflows (publish.yml, rust-test.yml) with platform-specific --features flags and openblas-devel/libopenblas-dev install steps
  • Add extern crate blas_src for linker flag propagation (required for linker-only crates)
  • Windows builds unchanged: continue using pure Rust with no external dependencies

Methodology references (required if estimator / math changes)

  • Method name(s): N/A — no methodology changes. This is purely a build/performance optimization.
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None

Validation

  • Tests added/updated:
    • tests/test_rust_backend.py: Added test_rust_backend_info test
    • rust/src/weights.rs: Added test_gemv_produces_correct_half_grad Rust unit test
  • All 78 Rust backend equivalence tests pass
  • All 60 SDID methodology tests pass
  • All 37 SyntheticDiD estimator tests pass
  • Local build verified with maturin develop --release --features accelerate on macOS ARM64
  • rust_backend_info() correctly reports {'blas': True, 'accelerate': True, 'openblas': False} on macOS

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Link Apple Accelerate on macOS and OpenBLAS on Linux via Cargo features
for matrix-vector and matrix-matrix products across all Rust-accelerated
code paths (weights, OLS, TROP). Windows continues using pure Rust.

- Add accelerate/openblas Cargo features with blas-src 0.10 dependency
- Convert manual GEMV loop to ndarray general_mat_vec_mul in weights.rs
  (dispatches to BLAS dgemv when enabled, matrixmultiply fallback otherwise)
- Add rust_backend_info() diagnostic function for BLAS feature reporting
- Update CI workflows for platform-specific BLAS features
- Add extern crate blas_src for linker flag propagation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Feb 18, 2026

/ai-review

The openblas-src -> openblas-build -> ureq -> native-tls dependency chain
fails to compile on Ubuntu CI because native-tls 0.2.17 is incompatible
with the latest Rust stable (missing Protocol::Tlsv13 match arm).

Replace blas-src/openblas with a build.rs that emits
`cargo:rustc-link-lib=openblas` directly, linking the system library
(libopenblas-dev) without any download infrastructure. The accelerate
feature (macOS) continues using blas-src as before.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 1dabaeab6a8a2999555ff11693717dd31978c14f


Overall assessment: ✅ Looks good

Executive Summary

  • Affected method is SyntheticDiD/SC Frank–Wolfe weights (sc_weight_fw); the new Gram/standard loops preserve the registry-specified objective, gradient, and step formula, so methodology alignment looks intact. (docs/methodology/REGISTRY.md:L663-L711, rust/src/weights.rs:L220-L518)
  • No changes to SE/variance, identification assumptions, or defaults; BLAS linking and rust_backend_info() are build/diagnostic only.
  • Added tests cover Gram vs standard equivalence and GEMV correctness, reducing regression risk.

Methodology
Affected methods: SyntheticDiD unit/time weights (Frank–Wolfe sc.weight.fw). Cross-check against the Methodology Registry’s FW step and objective shows the updated loops implement the same half_grad, step size, and convergence rule; no deviations found. (docs/methodology/REGISTRY.md:L663-L711, rust/src/weights.rs:L220-L518)
Findings: None.

Code Quality

  • P3: Impact: The comment “Zero heap allocations per iteration” is inaccurate because the periodic refresh reallocates ata_x every GRAM_REFRESH_INTERVAL, which can mislead performance expectations. Fix: either adjust the comment to note periodic allocations or refresh into a preallocated buffer (e.g., general_mat_vec_mul(1.0, &ata, &lam, 0.0, &mut ata_x)) to make the claim true. (rust/src/weights.rs:L220-L321)

Performance
No findings.

Maintainability
No findings.

Tech Debt
No findings.

Security
No findings.

Documentation/Tests
No findings.

Tests not run (not requested).

@igerber igerber merged commit b78d2fe into main Feb 18, 2026
10 checks passed
@igerber igerber deleted the review-sdid-implementation branch February 18, 2026 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant