Skip to content

Fix rank-deficient matrix handling in OLS solver#91

Merged
igerber merged 19 commits intomainfrom
fix/rank-deficient-matrix-handling
Jan 20, 2026
Merged

Fix rank-deficient matrix handling in OLS solver#91
igerber merged 19 commits intomainfrom
fix/rank-deficient-matrix-handling

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented Jan 20, 2026

Summary

  • Fix MultiPeriodDiD producing astronomically wrong estimates (~252 trillion instead of ~2-5) due to rank-deficient design matrices
  • Switch Python backend from gelsy to gelsd LAPACK driver (SVD-based with proper singular value truncation)
  • Replace Rust least_squares() with explicit SVD + truncated pseudoinverse matching scipy's behavior
  • Add comprehensive tests for rank-deficient matrices in both Python and Rust backends
  • Add Rust vs NumPy equivalence tests ensuring both backends produce consistent results
  • Document known limitation: NaN standard errors for rank-deficient matrices (vcov computation issue)

Methodology references (required if estimator / math changes)

  • Method name(s): OLS solver using SVD-based least squares with singular value truncation
  • Paper / source link(s): Standard numerical linear algebra - scipy.linalg.lstsq with 'gelsd' driver uses divide-and-conquer SVD (LAPACK's DGELSD)
  • Any intentional deviations from the source (and why): None - this fix aligns both backends with scipy's default robust behavior for rank-deficient systems

Validation

  • Tests added/updated:
    • tests/test_linalg.py: Updated test_rank_deficient_still_solves, added test_multiperiod_like_rank_deficiency
    • tests/test_rust_backend.py: Added test_rank_deficient_matrix_produces_valid_coefficients, test_multiperiod_did_like_design_matrix, test_rank_deficient_ols_match, test_multiperiod_did_design_equivalence
  • Backtest / simulation / notebook evidence: Verified tutorial 02 notebook use case produces reasonable estimates (single digits) instead of trillions for both Python and Rust backends

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

MultiPeriodDiD was producing astronomically wrong estimates (~252 trillion
instead of ~2-5) due to rank-deficient design matrices being solved
incorrectly by the gelsy LAPACK driver.

Changes:
- Python: Switch from gelsy to gelsd driver (SVD-based with truncation)
- Rust: Replace least_squares() with explicit SVD + truncated pseudoinverse
- Add comprehensive tests for rank-deficient matrices in both backends
- Add Rust vs NumPy equivalence tests for rank-deficient cases
- Document NaN standard errors limitation in TODO.md

The gelsd driver properly handles rank-deficient matrices by truncating
small singular values below rcond * max(S), producing valid minimum-norm
solutions instead of garbage coefficients.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Overall assessment: ⚠️ Needs changes
Executive summary:

  • Rank-deficient handling now returns coefficients but conflicts with registry requirements to error on perfect collinearity; this is an undocumented methodology change.
  • Analytical SEs still rely on (X'X)^{-1} and k-based df, so rank-deficient cases can yield NaN/invalid inference for MultiPeriodDiD and other estimators.
  • Rust backend switches to full SVD; if it materializes full U/V, this is a potential memory/time regression on large panels.
  • New rank-deficient tests do not assert vcov behavior and one “rank-deficient” design may be full rank, so inference regressions can slip.

Methodology

  • Severity: P1 | Impact: The new SVD path explicitly tolerates rank-deficient designs and tests assert success, which contradicts the Methodology Registry requirement that perfect collinearity raises an error for DiD/TWFE; this changes identification checks and default behavior for estimators using solve_ols. | Concrete fix: add an explicit rank check (use rank from scipy.linalg.lstsq and singular values in Rust) and raise LinAlgError/warning when rank < k, or update the registry + estimator docs to allow minimum-norm solutions and document consequences. (docs/methodology/REGISTRY.md:L58-L62, docs/methodology/REGISTRY.md:L121-L125, diff_diff/linalg.py:L217-L235, tests/test_linalg.py:L210-L235, tests/test_rust_backend.py:L344-L368)
  • Severity: P1 | Impact: Vcov still computes (X'X)^{-1} with k-based df; for rank-deficient X this is undefined, leading to NaN/incorrect SEs once the SVD solver accepts those designs (directly affecting MultiPeriodDiD’s default inference). | Concrete fix: use a Moore–Penrose pseudoinverse for the bread and adjust df based on rank, or explicitly disable analytic SEs and surface a clear error/warning; document the chosen behavior. (diff_diff/linalg.py:L332-L380, rust/src/linalg.rs:L145-L218, TODO.md:L19-L31)

Code Quality

  • No findings.

Performance

  • Severity: P2 | Impact: Rust now computes SVD with full U/V; if ndarray-linalg returns full matrices, this is O(n^2) memory/time and can regress large-panel workloads. | Concrete fix: use economy/thin SVD or call a LAPACK least-squares routine that avoids materializing full U, or compute U^T y without full U. (rust/src/linalg.rs:L57-L89)

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • Severity: P2 | Impact: Rank-deficient tests only validate coefficients/residuals and ignore vcov behavior even though the PR notes NaN SEs; inference regressions can pass silently. | Concrete fix: add explicit assertions for vcov behavior (expected error or NaN/None) and move the limitation into user-facing docs, not just TODO. (tests/test_linalg.py:L210-L300, tests/test_rust_backend.py:L344-L533, TODO.md:L19-L31)
  • Severity: P3 | Impact: The “MultiPeriodDiD-like” tests assert rank deficiency in comments but do not enforce it; the matrix can be full rank depending on random treatment assignment, which weakens the test’s intent. | Concrete fix: explicitly construct a redundant column or assert matrix_rank < k before fitting. (tests/test_linalg.py:L237-L273, tests/test_rust_backend.py:L382-L414)

…cation

Replace silent SVD truncation with R's lm() approach for rank-deficient matrices:

- Detect rank deficiency using pivoted QR decomposition
- Warn users with clear message listing dropped columns
- Set NaN for coefficients of linearly dependent columns
- Compute valid SEs for identified coefficients only
- Expand vcov matrix with NaN for dropped rows/columns

Add rank_deficient_action parameter ("warn", "error", "silent") to control behavior.

Hybrid Rust/Python routing:
- Full-rank matrices use fast Rust backend (when available)
- Rank-deficient matrices use Python backend for proper NA handling
  (ndarray-linalg doesn't support QR with pivoting)

Also fixes tutorial notebook 02 to avoid rank deficiency by including
both treated cohorts in the MultiPeriodDiD example.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 31371fa57bebe4c4e30f183a859a5f535b87d884


Overall assessment: ⚠️ Needs changes

Executive Summary

  • Registry mismatch: DiD now warns/drops collinear covariates instead of raising, changing default behavior.
  • Rust backend SVD solver still computes vcov via (X'X)^{-1}, so rank-deficient SEs are invalid or error-prone.
  • solve_ols now does QR+SVD for all inputs, likely slowing large designs.
  • Tutorial example uses MultiPeriodDiD on staggered cohorts despite the stated assumption.

Methodology

  • Severity: P1 | Impact: Default solve_ols now warns and drops collinear covariates (NA coefficients) instead of raising, which changes DifferenceInDifferences behavior and conflicts with the registry requirement that perfect collinearity raises LinAlgError; this affects any estimator using solve_ols. | Concrete fix: Pass rank_deficient_action="error" from DifferenceInDifferences.fit (and any other estimators requiring strict behavior) or update docs/methodology/REGISTRY.md and estimator docs to explicitly permit R-style dropping and document inference implications. | References: docs/methodology/REGISTRY.md:L58-L61, diff_diff/linalg.py:L451-L555, tests/test_estimators.py:L706-L735
  • Severity: P1 | Impact: Rust solve_ols claims SVD truncation for rank-deficient matrices but still computes vcov using (X'X)^{-1} on the original design; for rank-deficient X this is undefined, so SEs will error or be invalid (tests call this path with return_vcov=True). | Concrete fix: Detect rank deficiency from singular values and either (a) compute vcov on a reduced full-rank design (matching Python), (b) use a pseudoinverse-based sandwich, or (c) return None/raise with a clear warning when rank-deficient. | References: rust/src/linalg.rs:L52-L99, rust/src/linalg.rs:L145-L168, tests/test_rust_backend.py:L344-L380

Code Quality

  • No findings. Residual risk: none beyond the methodology/performance items.

Performance

  • Severity: P3 | Impact: solve_ols now always performs pivoted QR for rank detection and then SVD (gelsd) for solving, doubling decomposition work for full-rank matrices and likely slowing large designs across all estimators. | Concrete fix: Use the rank returned by scipy.linalg.lstsq (or numpy.linalg.matrix_rank) to skip QR in full-rank cases and only run QR when rank < k to identify dropped columns. | References: diff_diff/linalg.py:L420-L557

Maintainability

  • Severity: P3 | Impact: Rank-deficiency handling diverges between Python (drop columns/NaN + warnings) and Rust (minimum-norm coefficients), while Python bypasses Rust for rank-deficient inputs; this split behavior complicates backend parity and future changes. | Concrete fix: Align policies (e.g., implement drop/NA in Rust or explicitly document Rust as full-rank-only and adjust tests/docs accordingly). | References: diff_diff/linalg.py:L420-L437, rust/src/linalg.rs:L20-L23, tests/test_rust_backend.py:L344-L380

Tech Debt

  • Severity: P3 | Impact: The TODO notes NaN SEs for rank-deficient matrices because compute_robust_vcov uses (X'X)^{-1}; with new rank-deficiency behavior, users calling compute_robust_vcov directly still face this limitation. | Concrete fix: Implement pseudoinverse-based bread or reuse the reduced-design logic in compute_robust_vcov and update/remove the TODO once fixed. | References: diff_diff/linalg.py:L574-L672, TODO.md:L15-L31

Security

  • No findings. Residual risk: none observed.

Documentation/Tests

  • Severity: P2 | Impact: The tutorial now fits MultiPeriodDiD on multiple treatment cohorts despite stating it assumes simultaneous treatment timing; this conflicts with the registry warning about varying treatment timing and could mislead users. | Concrete fix: Revert to a single-cohort example or explicitly show/handle the warning and recommend staggered estimators for multi-cohort data. | References: docs/methodology/REGISTRY.md:L82-L86, docs/tutorials/02_staggered_did.ipynb:L674

igerber and others added 2 commits January 20, 2026 09:31
Review feedback addressed:

P1 - Methodology alignment:
- Update REGISTRY.md to document R-style rank deficiency handling
  (warn + set NA instead of error)
- Update TODO.md to mark NaN SE issue as resolved

P2 - Tutorial fix:
- Revert to single-cohort filter (cohort 3) for MultiPeriodDiD example
- Add clear documentation about rank deficiency warning behavior
- Explain that MultiPeriodDiD assumes simultaneous treatment timing

P3 - Performance documentation:
- Document QR decomposition overhead trade-off in code comments
- Explain that ~2x cost is necessary for proper rank detection
- Document why Python/Rust backends differ (pivoted QR not in ndarray-linalg)

Note: vcov validation was already present in test_linalg.py (lines 242-251)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of explaining rank deficiency behavior, create a clean dataset
using generate_did_data() with simultaneous treatment timing. This shows
MultiPeriodDiD working as intended without warnings or edge cases.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 768ceea53950c9e10b3d925eceb781bea835dd97


Overall Assessment
⚠️ Needs changes

Executive Summary

  • Rank-deficiency handling now applies globally, but Methodology Registry still expects TWFE collinearity to error and doesn’t document NA-dropping for other OLS-based estimators.
  • Rust backend now assumes integer cluster_ids, which breaks string/categorical clusters when Rust is enabled.
  • Classical OLS inference (robust=False) doesn’t propagate NA handling for rank-deficient designs, yielding inconsistent SEs/df.
  • Unconditional QR rank checks add measurable overhead on full-rank, large FE designs.

Methodology

  • Severity: P1 | Impact: TwoWayFixedEffects registry requires an error when treatment is perfectly collinear with FE, but the new rank-deficiency path warns/drops instead, so TWFE no longer matches documented assumptions. | Concrete fix: add an explicit collinearity check in TWFE to raise, or update the registry to the new warn+NA behavior. docs/methodology/REGISTRY.md:140 diff_diff/linalg.py:527
  • Severity: P1 | Impact: Rank-deficiency handling is now the default for all estimators via the unified solver, but MultiPeriodDiD (and other OLS-based sections) don’t document NA-dropping/warnings, making this a default-behavior change without registry coverage. | Concrete fix: add a shared registry note or update each affected estimator section. diff_diff/linalg.py:311 docs/methodology/REGISTRY.md:97 diff_diff/linalg.py:356
  • Severity: P1 | Impact: In LinearRegression with robust=False, rank-deficient designs return NaN coefficients but vcov/df are still computed from the full X (n‑k), producing incorrect SEs for dropped columns and conflicting with “valid SEs for identified coefficients only.” | Concrete fix: detect rank deficiency in this branch and compute vcov on the reduced matrix with NA expansion (or raise). diff_diff/linalg.py:1008 diff_diff/linalg.py:557

Code Quality

  • Severity: P1 | Impact: Rust backend path now coerces cluster_ids to int64 without factorization, so string/categorical cluster IDs will error when Rust is enabled, breaking cluster-robust SEs. | Concrete fix: restore factorization (e.g., pd.factorize) before the Rust call or map non-numeric IDs to contiguous ints. diff_diff/linalg.py:271

Performance

  • Severity: P2 | Impact: solve_ols always runs pivoted QR rank detection before routing to Rust/NumPy, adding O(nk²) overhead even for full-rank designs and potentially regressing large FE workloads. | Concrete fix: add an opt-out/flag for rank checks or only run QR when rank-deficient handling is explicitly requested. diff_diff/linalg.py:428

Maintainability

  • Severity: P3 | Impact: solve_ols docs still state fitted = X @ coefficients, which is false when coefficients include NaN; this can mislead users about fitted values in rank-deficient cases. | Concrete fix: update the docstring to describe fitted values as computed from identified columns. diff_diff/linalg.py:347 diff_diff/linalg.py:561

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • Severity: P3 | Impact: No tests cover non-numeric cluster IDs with the Rust backend or LinearRegression(robust=False) under rank deficiency, so these regressions could slip. | Concrete fix: add targeted tests in the Rust backend suite and linalg tests. tests/test_rust_backend.py:330 tests/test_linalg.py:210

P1 Fixes:
- Add _factorize_cluster_ids() helper to handle string/categorical cluster IDs
  before passing to Rust backend (prevents type mismatch errors)
- Add NaN check for treatment coefficient in TWFE to error early when treatment
  is perfectly collinear with fixed effects (clearer error message)
- Fix LinearRegression.fit() to compute valid vcov when robust=False and matrix
  is rank-deficient (use reduced X matrix instead of full X)

P2 Fixes:
- Add skip_rank_check parameter to solve_ols() for performance-critical code
  paths where user knows matrix is full rank (avoids O(nk²) QR overhead)

P3 Fixes:
- Update fitted values and residuals documentation to clarify that for
  rank-deficient matrices, only identified coefficients are used

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: c94e8b2069628c3b9c324d82c98da3d84af6a62a


Overall assessment: ⚠️ Needs changes

Executive Summary

  • Rank-deficiency handling aligns with the registry, but analytical inference uses full k even when columns are dropped, so df/p-values/CIs are wrong for OLS-based estimators (DiD, TWFE, MultiPeriodDiD).
  • Rust backend now computes SVD coefficients but still inverts X'X for vcov, so rank-deficient cases can error or return invalid SEs despite new tests and docs.
  • TWFE now raises on NaN ATT (good), but the error message can misattribute the collinearity source.
  • Pivoted-QR rank checks now run on every solve before Rust, adding O(nk²) overhead in full-rank workloads.

Methodology

  • Severity: P1; Impact: Degrees of freedom for t/p-values/CIs remain n - k - df_adjustment even when rank-deficient columns are dropped, which violates the registry’s “valid SEs for identified coefficients” expectation and yields incorrect inference for identified coefficients (docs/methodology/REGISTRY.md:L58-L62, diff_diff/linalg.py:L1071-L1111). Fix: compute k_effective (rank or non-NaN coefficients) and set self.n_params_/self.df_ accordingly; use the same k_effective for adjusted R².
  • Severity: P1; Impact: Rust solve_ols uses SVD for coefficients but still computes vcov via X'X inversion; rank-deficient matrices will fail or produce unstable SEs, contradicting the new rank-deficiency policy and tests (rust/src/linalg.rs:L95-L218, rust/src/linalg.rs:L224-L274). Fix: detect rank from SVD and either compute vcov via truncated pseudoinverse/reduced system with NA expansion, or return NaN/None with an explicit warning for rank-deficient cases.

Code Quality

  • Severity: P3; Impact: TWFE’s error message attributes NaN ATT solely to FE collinearity, but the dropped column could be a covariate, which can mislead users (diff_diff/twfe.py:L143-L153). Fix: broaden the message to include covariate collinearity or surface dropped column names from solve_ols.

Performance

  • Severity: P3; Impact: solve_ols always does a pivoted-QR rank check before calling Rust, adding O(nk²) overhead even when matrices are full rank (diff_diff/linalg.py:L485-L518). Fix: consider a cheaper precheck, caching rank info when X is reused, or letting internal callers opt into skip_rank_check=True when full rank is guaranteed.

Maintainability

  • None.

Tech Debt

  • None.

Security

  • None.

Documentation/Tests

  • Severity: P2; Impact: Rank-deficiency tests assert NaN coefficients/vcov but do not cover df/p-value correctness, so inference regressions can slip (tests/test_linalg.py:L210-L303, diff_diff/linalg.py:L1071-L1111). Fix: add tests validating df_, p-values, and CIs for identified coefficients when columns are dropped.

P1 Fixes:
- Fix degrees of freedom calculation to use effective rank (n_params_effective_)
  instead of total columns when design matrix is rank-deficient. This ensures
  correct t-statistics, p-values, and confidence intervals for identified
  coefficients.
- Improve Rust backend error message for rank-deficient X'X to suggest using
  solve_ols without skip_rank_check for R-style handling.
- Improve TWFE collinearity error message to surface actual dropped column names
  and distinguish between treatment collinearity (error) vs covariate
  collinearity (warning).

P2 Fixes:
- Add inference validation tests that verify degrees of freedom, p-values, and
  confidence intervals are computed correctly when columns are dropped due to
  rank deficiency.

Tests:
- test_rank_deficient_degrees_of_freedom: Verifies n_params_effective_ and df_
- test_rank_deficient_inference_uses_correct_df: Verifies p-value and CI
  use correct df (n - rank)
- test_rank_deficient_inference_nan_for_dropped_coef: Verifies NaN inference
  for dropped coefficients

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: afcc21f0049360d0e43871015c48a4f6b75f2e4d


Overall assessment: ⚠️ Needs changes

Executive Summary

  • The OLS solver now does pivoted‑QR rank detection with R‑style NA coefficients, affecting all OLS‑based estimators (DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects).
  • Methodology Registry updates are incomplete for MultiPeriodDiD/TWFE; the new rank‑deficiency behavior is not documented there. (P1)
  • Rust backend still inverts X'X for vcov and will error on rank‑deficient matrices with return_vcov=True, conflicting with the new tests and stated fix. (P1)
  • Unconditional QR rank checks add O(nk²) overhead for full‑rank solves; consider exposing a fast‑path to callers. (P3)

Methodology

  • Severity: P1. Impact: Default behavior for rank‑deficient designs (drop columns, NA coefficients) now applies to MultiPeriodDiD and TWFE via the shared OLS path, but the Methodology Registry only documents this for DifferenceInDifferences, leaving method docs out of sync with implementation and cited fixest behavior. Concrete fix: Add explicit rank‑deficiency edge‑case notes to the MultiPeriodDiD (and TWFE, if desired) sections in docs/methodology/REGISTRY.md:L75-L140, aligned with diff_diff/linalg.py:L485-L519.

Code Quality

  • Severity: P1. Impact: Rust solve_ols computes coefficients via SVD but still calls compute_robust_vcov_internal, which inverts X'X and will raise on rank‑deficient matrices when return_vcov=True; this breaks the new rank‑deficient tests and undermines the fix. Concrete fix: Detect rank deficiency from singular values and either (a) bypass vcov with a NaN matrix/None, or (b) compute vcov via a pseudoinverse‑based sandwich to avoid invert_symmetric on singular X'X; update the Rust path to mirror the Python NA handling. References: rust/src/linalg.rs:L95-L149, rust/src/linalg.rs:L145-L218, tests/test_rust_backend.py:L344-L380.

Performance

  • Severity: P3. Impact: solve_ols now always performs a pivoted‑QR rank check, adding O(nk²) cost even for full‑rank matrices, which can slow repeated estimation (bootstrap/event‑study loops). Concrete fix: Expose skip_rank_check (or a fast‑path heuristic) in estimator APIs so callers can bypass QR when they know the design is full rank. Reference: diff_diff/linalg.py:L485-L519.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • No issues found.

P1 Fixes:
- Update docs/methodology/REGISTRY.md to document rank-deficiency behavior
  for MultiPeriodDiD and TwoWayFixedEffects (edge cases section):
  - MultiPeriodDiD: warns and sets NA for dropped coefficients (R-style)
  - TWFE: treatment collinearity raises error, covariate collinearity warns

- Fix Rust backend vcov for rank-deficient matrices:
  - Detect rank from SVD singular values (count values above threshold)
  - If rank < k, return NaN-filled vcov matrix instead of attempting X'X inversion
  - Updated docstring to document this behavior
  - Full R-style handling (QR pivoting) still requires Python backend

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 3e480a92049e754bfe968b8f8e3deaf288575e83


Overall assessment: ⚠️ Needs changes

Executive Summary

  • Methodology check: rank-deficiency handling in solve_ols aligns with REGISTRY updates for DifferenceInDifferences, MultiPeriodDiD, and TwoWayFixedEffects; no P0/P1 mismatches found.
  • LinearRegression.predict now returns NaN for rank-deficient fits because coefficients contain NaN, which is a user-facing regression for prediction workflows.
  • Adjusted R² still uses the total parameter count instead of the effective rank, so adjusted R² is understated when columns are dropped.
  • Tests cover rank-deficient solves/inference but not prediction or adjusted R² behavior for dropped columns.

Methodology

  • None (checked docs/methodology/REGISTRY.md:L41-L146 against diff_diff/linalg.py:L321-L642 and diff_diff/twfe.py:L100-L175).

Code Quality

  • Severity: P2; Impact: LinearRegression.predict multiplies by NaN coefficients, so predictions become all-NaN after rank-deficient fits; Fix: treat dropped coefficients as 0 or store kept_cols/coef_reduced from solve_ols and compute X_reduced @ coef_reduced, or raise a clear error when coefficients contain NaN. diff_diff/linalg.py:L1332-L1353.
  • Severity: P3; Impact: Adjusted R² uses n_params_ instead of the effective rank, making adjusted R² inconsistent with the new df logic for rank-deficient fits; Fix: pass n_params_effective_ to compute_r_squared when available. diff_diff/linalg.py:L1313-L1330, diff_diff/linalg.py:L1118-L1119.

Performance

  • None.

Maintainability

  • None.

Tech Debt

  • None.

Security

  • None.

Documentation/Tests

  • Severity: P3; Impact: No tests exercise predict or adjusted R² under rank deficiency, so regressions can slip; Fix: add tests covering prediction and adjusted R² for dropped columns. tests/test_linalg.py.

P2 Fixes:
- Fix predict() to handle NaN coefficients by treating them as zero
  instead of producing all-NaN predictions. This matches the behavior
  of fitted values computed during fit().

P3 Fixes:
- Fix adjusted R² to use n_params_effective_ (effective rank) instead
  of n_params_ (total columns) for consistency with df correction.

Tests:
- test_rank_deficient_predict_uses_identified_coefficients: Verifies
  predictions are finite and match fitted values for rank-deficient fits
- test_rank_deficient_adjusted_r_squared_uses_effective_params: Verifies
  adjusted R² uses effective params, not total params

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: cdcb7651b264586b670aef16d22c6acee079f55a


Overall Assessment: ✅ Looks good

Executive Summary

  • Registry updates match the new R-style rank-deficiency handling in the OLS backend; no methodology mismatches found for DiD, MultiPeriodDiD, or TWFE.
  • skip_rank_check does not skip QR in the Python path when Rust is unavailable, so the advertised fast path is not reachable there.
  • TWFE now emits a second collinearity warning on top of the solver warning, which may duplicate messages.
  • Full-rank solves now do pivoted QR and SVD, which may double compute cost for large panels.

Methodology

  • No findings.

Code Quality

  • Severity: P3 | Impact: skip_rank_check=True still runs the QR rank check when Rust is not available, so the parameter does not actually skip rank detection in the Python backend and contradicts the docstring. | Concrete fix: clarify that the skip only applies with Rust, or add a true no-rank-check Python branch that calls scipy_lstsq directly. (diff_diff/linalg.py:367, diff_diff/linalg.py:465)

Performance

  • Severity: P3 | Impact: Full-rank solves do pivoted QR (including Q) and then SVD (gelsd), adding O(n*k^2) overhead for every regression. | Concrete fix: use qr(..., mode='r') to avoid forming Q, and/or only run QR when SVD indicates rank deficiency, reusing one decomposition for the solve. (diff_diff/linalg.py:119, diff_diff/linalg.py:493, diff_diff/linalg.py:628)

Maintainability

  • Severity: P3 | Impact: TWFE emits its own collinearity warning after solve_ols already warns, so users can see duplicated messages for the same issue. | Concrete fix: allow LinearRegression.fit to pass rank_deficient_action="silent" (or filter warnings) when TWFE will handle messaging itself. (diff_diff/linalg.py:594, diff_diff/twfe.py:169)

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • Severity: P3 | Impact: The new skip_rank_check path and the TWFE "treatment coefficient dropped" error path are not explicitly covered, so regressions could slip by. | Concrete fix: add unit tests that exercise solve_ols(..., skip_rank_check=True) without Rust and TWFE when the ATT column is dropped. (diff_diff/linalg.py:465, diff_diff/twfe.py:156)

- Add _skip_rank_check parameter to _solve_ols_numpy() for Python backend
- Suppress duplicate collinearity warnings in TWFE (handles own messaging)
- Add test_skip_rank_check_bypasses_qr_decomposition test
- Add test_twfe_treatment_collinearity_raises_error test
- Document QR+SVD redundancy rationale in TODO.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 8c18cacd9a1f6761241e2ee58acf85ebd481810f


Overall Assessment

  • ⚠️ Needs changes

Executive Summary

  • MultiPeriodDiD still computes vcov/df on the full design matrix, so rank‑deficient cases can still error or produce invalid SEs despite the new R‑style solver changes.
  • The new rank‑deficiency handling in solve_ols/LinearRegression and TWFE’s collinearity messaging align with the updated registry for those code paths.
  • check_finite=False no longer fully skips finite checks because QR still validates inputs.
  • Estimator‑level coverage for MultiPeriodDiD rank‑deficient inference is still missing.

Methodology

  • P1 | Impact: MultiPeriodDiD calculates df = n - X.shape[1] - n_absorbed_effects and calls compute_robust_vcov(X, residuals, ...) on the full (possibly rank‑deficient) matrix, so collinear designs will still raise during inversion or yield incorrect SE/df, contradicting the registry’s “warn + NA for dropped coefficients” behavior. | Fix: route MultiPeriodDiD through solve_ols(..., return_vcov=True, cluster_ids=...) or compute vcov on the reduced X (then expand with NaNs), and set df using k_effective = np.sum(~np.isnan(coefficients)) before inference/averaging. | Location: diff_diff/estimators.py:L875-L936, docs/methodology/REGISTRY.md:L75-L102

Code Quality

  • P3 | Impact: check_finite=False still triggers SciPy’s internal finite check in _detect_rank_deficiency, so callers can’t fully skip validation and may see NaN/Inf errors despite opting out. | Fix: pass check_finite=check_finite into scipy.linalg.qr or skip rank detection when check_finite=False and the caller opts into skip_rank_check. | Location: diff_diff/linalg.py:L117-L123, diff_diff/linalg.py:L452-L486

Performance

  • P3 | Impact: Default solve_ols now does QR rank detection on every fit, roughly doubling work for large full‑rank designs (estimators don’t expose skip_rank_check). | Fix: expose skip_rank_check in estimator APIs or add heuristics to bypass rank checks when rank deficiency is impossible. | Location: diff_diff/linalg.py:L321-L372, diff_diff/linalg.py:L486-L519

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • P2 | Impact: Tests validate rank‑deficient handling at the linalg/Rust layer but not at the MultiPeriodDiD estimator level, so the regression that motivated this PR could persist unnoticed. | Fix: add a MultiPeriodDiD test with a deliberately collinear design matrix that asserts warning + NA coefficients and verifies df/SE behavior. | Location: tests/test_estimators.py

igerber and others added 2 commits January 20, 2026 12:32
- Route MultiPeriodDiD through solve_ols with return_vcov=True and cluster_ids
- Calculate degrees of freedom using effective rank (non-NaN coefficients)
- Handle homoskedastic vcov case for rank-deficient matrices
- Add test_rank_deficient_design_warns_and_sets_nan test

This ensures MultiPeriodDiD properly handles rank-deficient design matrices
by warning users, setting NaN for dropped coefficients, and computing valid
SEs for identified coefficients only.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added section explaining that check_finite=False doesn't fully bypass
  scipy's internal QR validation
- Marked as low priority since it's an edge case optimization
- QR+SVD redundancy was already documented in previous commit

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: d6fc08b209aaa4c738a539ecf2730a1410483b56


Overall Assessment
⚠️ Needs changes

Executive Summary

  • P1: Rank-deficiency tolerance does not match R’s lm() default despite registry claims, so near-collinear columns may be kept and inference can diverge from documented behavior.
  • P2: Rust vs Python rank detection can disagree in edge cases, yielding silent NaN VCoV without rank_deficient_action warnings/errors.
  • P3: New rank_deficient_action/skip_rank_check knobs are only documented internally and in docstrings; user-facing docs do not surface them.
  • Tests are significantly expanded for rank-deficient scenarios and backend equivalence, which is a solid improvement.

Methodology

  • Severity: P1 | Impact: _detect_rank_deficiency uses rcond = max(n, k) * eps, which is materially tighter than R’s default qr tolerance (~1e-07). This can keep near-collinear columns that R lm() would drop, contradicting the “matches lm()” claim in the Methodology Registry and altering identification/SEs for estimators that rely on solve_ols (DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects). | Concrete fix: Align the default tolerance to R’s (tol=1e-07 relative to r_diag[0]) or expose a user-facing tolerance parameter and update the registry to document the deviation. | Refs: diff_diff/linalg.py:L121-L135, docs/methodology/REGISTRY.md:L58-L61, docs/methodology/REGISTRY.md:L97-L101

Code Quality

  • Severity: P2 | Impact: solve_ols routes “full-rank” matrices to Rust after QR-based rank detection, but Rust independently truncates singular values and can return all-NaN VCoV when its SVD-based rank is < k. If QR and SVD thresholds disagree on near-collinear designs, this yields silent NaN SEs without honoring rank_deficient_action. | Concrete fix: After Rust solve, detect rank deficiency via NaNs in VCoV or propagate the precomputed rank into Rust and apply the same rank_deficient_action (warn/error) in Python. | Refs: diff_diff/linalg.py:L486-L507, rust/src/linalg.rs:L75-L116

Performance

  • No issues found.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • Severity: P3 | Impact: The new rank_deficient_action and skip_rank_check API is only described in internal docs and docstrings, so users won’t discover or understand the new behavior/control knobs. | Concrete fix: Add these options to public-facing documentation (e.g., API reference or user guide) and note their effect on rank-deficient handling. | Refs: diff_diff/linalg.py:L330-L373, CLAUDE.md:L121-L135

igerber and others added 2 commits January 20, 2026 13:00
- Change rcond from max(n,k)*eps to 1e-07 (matches R's qr() default)
- Update both Python (_detect_rank_deficiency) and Rust (solve_ols) backends
- Update REGISTRY.md to document the tolerance value
- Fix test_near_singular_matrix_stability to use noise above new tolerance

The previous tolerance (max(n,k)*eps ≈ 2.2e-14 for n=100) was stricter than
R's default (1e-07), meaning near-collinear columns could be retained when
R would drop them. This ensures consistent behavior with R's lm().

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update noise from 1e-8 to 1e-5 (above the 1e-07 rank detection threshold)
to ensure the covariates are not treated as linearly dependent.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 9db9e6ca8fb62569f41ca3fd4336da3215d64924


Overall assessment: ⚠️ Needs changes

Executive summary

  • P1: MultiPeriodDiD average ATT inference can report t_stat=0 and p_value=1 when rank deficiency yields NaNs, which masks unidentified effects.
  • P3: _detect_rank_deficiency docstring contradicts the implemented default tolerance.
  • P3: No test exercises average-ATT inference when an interaction term is dropped.

Methodology

  • Severity: P1 | Impact: When rank deficiency drops a period interaction, avg_att/avg_se become NaN but avg_t_stat is forced to 0 and avg_p_value to 1, which conflicts with the registry’s NA handling and can mislead inference. | Concrete fix: Guard avg_att/avg_se with np.isnan(...) and propagate NaNs to avg_t_stat, avg_p_value, and avg_conf_int (or explicitly compute averages over identified effects and document the deviation). | Location: diff_diff/estimators.py:L937-L950

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • Severity: P3 | Impact: _detect_rank_deficiency docstring says the default tolerance uses max(n,k)*eps, but the implementation hard-codes 1e-07, so the documented rank rule is incorrect. | Concrete fix: Update the docstring to reflect the 1e-07 default and QR-diagonal-based threshold (or change code to match the docstring). | Location: diff_diff/linalg.py:L100-L126
  • Severity: P3 | Impact: Tests cover NaN coefficients for collinear covariates but not the average-ATT inference path when an interaction term is dropped, so the P1 bug can regress unnoticed. | Concrete fix: Add a test in the MultiPeriodDiD rank-def suite that forces a dropped interaction and asserts avg_att, avg_se, and avg_p_value are NaN (or the documented behavior). | Location: tests/test_estimators.py:L1627-L1668

…ng (P3)

P1: Average ATT now properly propagates NaN when:
- All period effects are NaN (cannot compute average)
- Vcov has NaN entries (dropped columns)
- Only identified (non-NaN) effects are included in the average

Previously, the code would set avg_t_stat=0 and avg_p_value from that,
which could mislead inference when interaction terms were dropped.

P3: Fix _detect_rank_deficiency docstring to document actual tolerance
(1e-07 matching R's qr()) instead of the old max(n,k)*eps.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 4c62776c2d60ac63488ad3bb758225a74058d3be


Overall assessment: ⚠️ Needs changes

Executive summary

  • P1: MultiPeriodDiD now averages only identified post-period effects when rank-deficient, which changes the estimand without an explicit policy or warning.
  • P2: Rust SVD truncation uses a fixed rcond that can diverge from SciPy’s gelsd default, risking backend inconsistencies for ill-conditioned matrices.
  • P3: check_finite=False no longer fully bypasses finite checks because QR rank detection still validates inputs.

Methodology

  • P1: Avg ATT drops NaN period effects instead of treating the overall ATT as unidentified when any post-period coefficient is dropped; Impact: the reported avg_att/avg_se can correspond to a different estimand than “mean over specified post_periods,” potentially masking identification failures; Fix: if any post-period effect is NaN, either set avg_att/avg_se/avg_* to NaN and warn, or explicitly redefine and document the estimator as averaging only identified periods (and surface the dropped periods in results). (diff_diff/estimators.py:L937-L974)

Code Quality

  • P2: Rust SVD truncation uses rcond=1e-07 while SciPy gelsd is called with default cond, so ill-conditioned but full-rank matrices can yield different coefficients/SEs across backends (and Rust can emit NaN vcov even when QR said full rank); Impact: backend-dependent results and inference; Fix: plumb a shared rcond into both backends (e.g., pass cond=1e-07 to scipy_lstsq) or align the Rust threshold to SciPy’s default. (rust/src/linalg.rs:L75-L111, diff_diff/linalg.py:L621-L642)

Performance

  • P3 (info): No issues found. Impact: None. Fix: None.

Maintainability

  • P3 (info): No issues found. Impact: None. Fix: None.

Tech Debt

  • P3: check_finite=False no longer fully bypasses finite checks because _detect_rank_deficiency() calls QR with default check_finite=True; Impact: API contract/perf regression for callers expecting a full bypass; Fix: pass check_finite through to qr (SciPy ≥1.9) or skip rank checks when check_finite=False and skip_rank_check=True is set. (diff_diff/linalg.py:L358-L366, diff_diff/linalg.py:L117-L125, diff_diff/linalg.py:L488-L497)

Security

  • P3 (info): No issues found. Impact: None. Fix: None.

Documentation/Tests

  • P3 (info): No issues found. Impact: None. Fix: None.

Open questions / assumptions

  • Should avg_att be NA when any post-period effect is dropped, or is the intent to redefine it as “mean over identified periods only”? If the latter, do you want dropped periods surfaced explicitly in results?

- avg_att is now NaN if ANY post-period effect is unidentified, matching
  R's default NA propagation semantics (mean(c(1,2,NA)) returns NA)
- Add explicit cond=1e-07 to scipy_lstsq calls for consistency with
  Rust backend and QR rank tolerance
- Document avg_att NA behavior in REGISTRY.md
- Add test for avg_att NaN propagation when period effect is unidentified

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: c49edf6c8685197f45d47dbb8e7098d7dc32e780


Overall assessment: ⚠️ Needs changes

Executive summary

  • Methodology: rank‑deficiency handling aligns with the updated registry and R/fixest behavior for OLS‑based estimators; no formula/SE deviations found.
  • Docs: the registry says rank_deficient_action is controllable for estimators but the estimator APIs do not expose it.
  • Performance: always‑on pivoted‑QR rank checks add O(nk²) overhead for full‑rank designs without an estimator‑level bypass.
  • Tech debt: check_finite=False still triggers QR finite checks, so “skip validation” is not fully honored.

Methodology

  • No issues found. Affected methods (DifferenceInDifferences, MultiPeriodDiD, TwoWayFixedEffects, plus other OLS‑based paths) match the registry’s R‑style rank‑deficiency handling and cited references.

Code Quality

  • No issues found.

Performance

  • P3: solve_ols now always runs a pivoted‑QR rank check before solving, adding O(nk²) overhead for full‑rank designs; estimators don’t expose skip_rank_check, so the cost is unavoidable in typical usage. Impact: potential ~2× runtime on large designs. Fix: expose skip_rank_check (or a higher‑level flag) on estimator APIs or add a heuristic/option to skip QR when users know the design is full‑rank. Location: diff_diff/linalg.py:L488-L496.

Maintainability

  • No issues found.

Tech Debt

  • P3: check_finite=False does not fully bypass finite checks because _detect_rank_deficiency calls SciPy qr with default check_finite=True. Impact: behavior/doc mismatch for users relying on check_finite=False. Fix: pass check_finite through to qr (SciPy ≥1.9), or skip rank checking when check_finite=False + skip_rank_check=True, and update the docstring. Location: diff_diff/linalg.py:L115-L120, diff_diff/linalg.py:L358-L360.

Security

  • No issues found.

Documentation/Tests

  • P2: Registry claims rank_deficient_action is controllable for estimators, but estimator APIs do not expose it. Impact: users can’t choose warn/error/silent as documented for estimator workflows. Fix: plumb rank_deficient_action through estimator constructors/fit methods, or clarify the registry to scope this control to solve_ols only. Location: docs/methodology/REGISTRY.md:L58-L63.

Add rank_deficient_action parameter to DifferenceInDifferences and
LinearRegression classes, allowing users to control behavior when
design matrices are rank-deficient:
- "warn" (default): Issue warning and drop linearly dependent columns
- "error": Raise ValueError
- "silent": Drop columns silently without warning

This fixes a documentation gap where REGISTRY.md claimed this parameter
was controllable at the estimator level, but it was only available in
the low-level solve_ols() function.

Changes:
- linalg.py: Add rank_deficient_action to LinearRegression.__init__
  and pass through to solve_ols() in fit()
- estimators.py: Add rank_deficient_action to DifferenceInDifferences
  and pass through to LinearRegression/solve_ols in fit()
- MultiPeriodDiD inherits parameter and passes to solve_ols()
- Add 8 new tests for the parameter in estimators and LinearRegression
- Update CLAUDE.md with estimator-level usage examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: f35e57c178012722fcf3930c38a89c3e8f5f6c36


Overall Assessment

  • Overall assessment: ⚠️ Needs changes

Executive Summary

  • Methodology updates for R-style rank-deficiency handling and avg_att NA propagation appear aligned with docs/methodology/REGISTRY.md.
  • TwoWayFixedEffects ignores the new rank_deficient_action parameter, so "silent"/"error" are not respected (P2).
  • New unconditional QR rank check adds per-fit overhead; consider exposing the fast-path to estimators for large designs (P3).

Methodology

  • No findings.

Code Quality

  • Severity: P2. Impact: TwoWayFixedEffects always suppresses solver warnings and then emits its own warnings/errors, ignoring user-configured rank_deficient_action, so "silent"/"error" can’t be honored and behavior diverges from other estimators. Fix: pass rank_deficient_action=self.rank_deficient_action into LinearRegression and gate the warning/error block on that flag (e.g., skip warnings when "silent", raise when "error"). diff_diff/twfe.py:L131-L179

Performance

  • Severity: P3. Impact: solve_ols always runs pivoted QR to detect rank deficiency, adding O(nk²) overhead even for full-rank matrices; estimators don’t expose skip_rank_check, so the fast path is unreachable in typical use. Fix: surface skip_rank_check (or auto-disable rank checking when rank_deficient_action="silent" and column names aren’t needed) at estimator level. diff_diff/linalg.py:L467-L496

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • No findings.

Complete parameter propagation for rank_deficient_action across all
estimators that use OLS internally:

- TwoWayFixedEffects: Pass to LinearRegression, conditional warning
  suppression based on setting
- CallawaySantAnna: Add parameter to __init__, pass to _linear_regression
  helper in _outcome_regression and _doubly_robust methods
- SunAbraham: Add parameter to __init__, pass to LinearRegression in
  _fit_saturated_regression
- TripleDifference: Add parameter to __init__, pass to solve_ols and
  LinearRegression in regression methods, also update convenience function

All estimators now support:
- "warn" (default): Issue warning and drop linearly dependent columns
- "error": Raise ValueError on rank deficiency
- "silent": Drop columns silently without warning

Tests added for each estimator verifying error/silent behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: df5d62ab4cd3932286cba57e9f2a8896204f2ab5


Overall assessment

⚠️ Needs changes

Executive summary

  • P1 methodology doc gap: rank‑deficient handling is now part of CallawaySantAnna/SunAbraham estimation paths but isn’t reflected in their Methodology Registry sections.
  • P2 code quality risk: Python QR rank check and Rust SVD rank check can disagree, potentially yielding silent NaN vcov/SEs when the Rust path is selected.
  • Coverage looks strong for the new rank‑deficiency behavior (many targeted tests added).

Methodology

  • Severity: P1 — Impact: CallawaySantAnna and SunAbraham now drop collinear columns and can produce NaN coefficients/SEs via rank_deficient_action, but the Methodology Registry doesn’t document this edge case or the new parameter. That’s a default behavior change relative to the source-of-truth spec, which can mislead users about identification/NA propagation. Fix: add explicit rank‑deficiency handling notes and rank_deficient_action semantics to the CallawaySantAnna and SunAbraham sections (edge cases + assumptions), mirroring the DiD/TWFE updates. Refs: diff_diff/staggered.py:L109-L144, diff_diff/staggered.py:L197-L211, diff_diff/sun_abraham.py:L395-L773, docs/methodology/REGISTRY.md:L165-L244

Code Quality

  • Severity: P2 — Impact: solve_ols routes to Rust when QR deems the matrix full‑rank, but Rust independently computes rank via SVD and returns an all‑NaN vcov if it decides rank < k. If QR/SVD tolerances disagree on ill‑conditioned matrices, you can get silent NaN SEs without a rank‑deficiency warning. Fix: if Rust returns NaN vcov (or exposes rank < k), fall back to the Python R‑style path or emit a warning; alternatively, align the routing criterion to SVD‑based rank to match Rust. Refs: diff_diff/linalg.py:L488-L510, rust/src/linalg.rs:L75-L116

Performance

  • No issues found.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • No issues found beyond the Methodology Registry gap noted above.

P1 - Methodology Documentation:
- Document rank-deficiency handling for CallawaySantAnna in REGISTRY.md
- Document rank-deficiency handling for SunAbraham in REGISTRY.md
- Match format used for DifferenceInDifferences section

P2 - Rust NaN vcov Fallback:
- Add check in solve_ols() for NaN in vcov after Rust backend call
- Fall back to Python R-style handling when QR/SVD disagree on rank
- Prevents silent NaN SEs from reaching users
- Add test_nan_vcov_fallback_to_python() test case

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: e054cc3b120c4bd25848397b79ac194c700b5f1a


Overall assessment: ⚠️ Needs changes

Executive Summary

  • Methodology alignment looks OK: rank-deficiency handling and NA propagation match the updated registry for DiD/MultiPeriodDiD/TWFE/CallawaySantAnna/SunAbraham/TripleDifference.
  • rank_deficient_action isn’t surfaced in get_params, so sklearn-style cloning/grid search can silently drop the setting.
  • Rust fallback warning claims “R‑style rank handling” but reuses cached full‑rank QR info, which can mislead and still leave unstable vcov in the ill‑conditioned edge case.

Methodology

  • No issues found.

Code Quality

  • Severity: P2; Impact: DifferenceInDifferences.get_params omits rank_deficient_action, so param round‑trips (clone/grid‑search) for DiD/TWFE/MultiPeriodDiD can lose the user setting and change estimator behavior silently; Concrete fix: include rank_deficient_action in the returned dict in diff_diff/estimators.py:L590-L607, and consider a small regression test in estimator tests.
  • Severity: P3; Impact: when Rust returns NaN vcov, the warning says “R‑style rank handling” but the fallback uses cached full‑rank QR info and won’t drop columns, which can mislead and still yield unstable vcov for borderline rank cases; Concrete fix: in diff_diff/linalg.py:L511-L537, either bypass _precomputed_rank_info (force a fresh rank check / treat as rank‑deficient) or soften the warning to describe a solver fallback rather than R‑style handling.

Performance

  • No issues found.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • No issues found.

- Add rank_deficient_action to DifferenceInDifferences.get_params()
  (fixes sklearn-style cloning/serialization for DiD, TWFE, MultiPeriodDiD)
- Fix NaN vcov fallback to force fresh QR rank detection instead of
  passing cached rank info that may have incorrectly assessed full-rank
- Strengthen test_nan_vcov_fallback_to_python to verify R-style NaN
  handling actually occurs when fallback is triggered

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented Jan 20, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 41120ae4d614dddbf09b2b5f90125c44c57d8390


Overall assessment: ✅ Looks good

Executive Summary

  • Rank-deficiency handling and avg_att NA propagation align with docs/methodology/REGISTRY.md for DiD/MultiPeriodDiD/TWFE/CallawaySantAnna/SunAbraham; no P0/P1 methodology mismatches found.
  • check_finite=False no longer fully bypasses finite validation because QR rank detection still checks inputs.
  • Full-rank solves now always pay a pivoted-QR cost before Rust/SVD, reducing performance upside on large designs.
  • Tests and docs are expanded for rank-deficient matrices across Python/Rust backends.

Methodology

  • Severity: P3 (info). Impact: Affected methods (OLS backend + DiD, MultiPeriodDiD, TWFE, CallawaySantAnna, SunAbraham, TripleDifference) match the updated registry on rank-deficiency handling and NA propagation; no P0/P1 deviations found. Concrete fix: None. docs/methodology/REGISTRY.md:L58-L149 diff_diff/linalg.py:L623-L670 diff_diff/estimators.py:L947-L975

Code Quality

  • Severity: P3. Impact: check_finite=False is not a true bypass because _detect_rank_deficiency still runs qr with default finite checks, so NaN/Inf can raise despite opting out. Concrete fix: thread check_finite into _detect_rank_deficiency/qr (and gate rank checking when check_finite=False if needed) to preserve the documented contract. diff_diff/linalg.py:L85-L126 diff_diff/linalg.py:L454-L496

Performance

  • Severity: P3. Impact: Every solve_ols call now performs pivoted QR before Rust/SVD, adding O(nk²) overhead and shrinking Rust backend gains on large full-rank designs. Concrete fix: add a fast-path in estimators to pass skip_rank_check=True when users opt out of rank diagnostics, or expose a config toggle for known full-rank data. diff_diff/linalg.py:L488-L496

Maintainability

  • Severity: P3 (info). Impact: No issues found. Concrete fix: None.

Tech Debt

  • Severity: P3 (info). Impact: No issues found. Concrete fix: None.

Security

  • Severity: P3 (info). Impact: No issues found. Concrete fix: None.

Documentation/Tests

  • Severity: P3 (info). Impact: No issues found. Concrete fix: None.

@igerber igerber merged commit 50a4912 into main Jan 20, 2026
4 checks passed
@igerber igerber deleted the fix/rank-deficient-matrix-handling branch January 20, 2026 22:58
igerber added a commit that referenced this pull request Jan 20, 2026
Add estimator inheritance map, test writing guidelines, and development
checklists based on lessons learned from PR #91 review process. These
additions help prevent common issues:

- Estimator inheritance map: Shows which classes inherit get_params/set_params
  to prevent missing parameters in subclasses
- Test writing guidelines: Emphasizes testing behavior (not just no-crash)
  for fallback paths, new parameters, and warnings
- Development checklists: Actionable steps for adding parameters and
  warning/error handling with verification steps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant