diff --git a/CLAUDE.md b/CLAUDE.md index 38c87a73..34b02522 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -119,13 +119,20 @@ pytest tests/test_rust_backend.py -v - Integrated with `TwoWayFixedEffects.decompose()` method - **`diff_diff/linalg.py`** - Unified linear algebra backend (v1.4.0+): - - `solve_ols()` - OLS solver using scipy's gelsy LAPACK driver (QR-based, faster than SVD) + - `solve_ols()` - OLS solver with R-style rank deficiency handling + - `_detect_rank_deficiency()` - Detect linearly dependent columns via pivoted QR - `compute_robust_vcov()` - Vectorized HC1 and cluster-robust variance-covariance estimation - `compute_r_squared()` - R-squared and adjusted R-squared computation - `LinearRegression` - High-level OLS helper class with unified coefficient extraction and inference - `InferenceResult` - Dataclass container for coefficient-level inference (SE, t-stat, p-value, CI) - Single optimization point for all estimators (reduces code duplication) - Cluster-robust SEs use pandas groupby instead of O(n × clusters) loop + - **Rank deficiency handling** (R-style): + - Detects rank-deficient matrices using pivoted QR decomposition + - `rank_deficient_action` parameter: "warn" (default), "error", or "silent" + - Dropped columns have NaN coefficients (like R's `lm()`) + - VCoV matrix has NaN for rows/cols of dropped coefficients + - Warnings include column names when provided - **`diff_diff/_backend.py`** - Backend detection and configuration (v2.0.0): - Detects optional Rust backend availability @@ -240,7 +247,7 @@ diff-diff achieved significant performance improvements in v1.4.0, now **faster All estimators use a single optimized OLS/SE implementation: -- **scipy.linalg.lstsq with 'gelsy' driver**: QR-based solving, faster than NumPy's default SVD-based solver +- **R-style rank deficiency handling**: Uses pivoted QR to detect linearly dependent columns, drops them, sets NaN for their coefficients, and emits informative warnings (following R's `lm()` approach) 
- **Vectorized cluster-robust SE**: Uses pandas groupby aggregation instead of O(n × clusters) Python loop - **Single optimization point**: Changes to `linalg.py` benefit all estimators @@ -248,8 +255,17 @@ All estimators use a single optimized OLS/SE implementation: # All estimators import from linalg.py from diff_diff.linalg import solve_ols, compute_robust_vcov -# Example usage +# Example usage (warns on rank deficiency, sets NaN for dropped coefficients) coefficients, residuals, vcov = solve_ols(X, y, cluster_ids=cluster_ids) + +# Suppress warning or raise error: +coefficients, residuals, vcov = solve_ols(X, y, rank_deficient_action="silent") # no warning +coefficients, residuals, vcov = solve_ols(X, y, rank_deficient_action="error") # raises ValueError + +# At estimator level (DifferenceInDifferences, MultiPeriodDiD): +from diff_diff import DifferenceInDifferences +did = DifferenceInDifferences(rank_deficient_action="error") # raises on collinear data +did = DifferenceInDifferences(rank_deficient_action="silent") # no warning ``` #### CallawaySantAnna Optimizations (`staggered.py`) diff --git a/TODO.md b/TODO.md index 96ebf679..fd6316a3 100644 --- a/TODO.md +++ b/TODO.md @@ -15,6 +15,21 @@ Current limitations that may affect users: | MultiPeriodDiD wild bootstrap not supported | `estimators.py:1068-1074` | Low | Edge case | | `predict()` raises NotImplementedError | `estimators.py:532-554` | Low | Rarely needed | +### ~~NaN Standard Errors for Rank-Deficient Matrices~~ (RESOLVED) + +**Status**: Resolved in v2.2.0 with R-style rank deficiency handling. 
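The resolved behavior can be sketched in standalone NumPy (an illustrative sketch, not the package internals): solve the reduced full-rank system, then expand the coefficient vector and vcov matrix with NaN for the dropped column, mirroring R's `lm()`. The choice of which column is "dropped" here is hard-coded for illustration; in the library it comes from pivoted QR.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((200, 3))
X[:, 2] = 2.0 * X[:, 1]              # column 2 is linearly dependent on column 1
y = 1.0 + 0.5 * X[:, 0] + rng.standard_normal(200)

kept = np.array([0, 1])              # pretend rank detection dropped column 2
X_r = X[:, kept]

# Solve the reduced (full-rank) system and compute homoskedastic vcov on it
coef_r, *_ = np.linalg.lstsq(X_r, y, rcond=None)
resid = y - X_r @ coef_r
mse = resid @ resid / (len(y) - X_r.shape[1])
vcov_r = mse * np.linalg.inv(X_r.T @ X_r)

# Expand to full size with NaN for the dropped column (R-style NA)
coef = np.full(3, np.nan)
coef[kept] = coef_r
vcov = np.full((3, 3), np.nan)
vcov[np.ix_(kept, kept)] = vcov_r
```

The identified coefficients keep valid finite SEs (`vcov[0, 0]`, `vcov[1, 1]`), while every row/column touching the dropped coefficient is NaN.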
+ +**Solution**: The OLS solver now detects rank-deficient design matrices using pivoted QR decomposition and handles them following R's `lm()` approach: +- Warns users about dropped columns +- Sets NaN for coefficients of linearly dependent columns +- Computes valid SEs for identified (non-dropped) coefficients only +- Expands vcov matrix with NaN for dropped rows/columns + +This is controlled by the `rank_deficient_action` parameter in `solve_ols()`: +- `"warn"` (default): Emit warning, set NA for dropped coefficients +- `"error"`: Raise ValueError +- `"silent"`: No warning, but still set NA for dropped coefficients + --- ## Code Quality @@ -143,3 +158,37 @@ Potential future optimizations: - [ ] JIT compilation for bootstrap loops (numba) - [ ] Sparse matrix handling for large fixed effects +### QR+SVD Redundancy in Rank Detection + +**Background**: The current `solve_ols()` implementation performs both QR (for rank detection) and SVD (for solving) decompositions on rank-deficient matrices. This is technically redundant since SVD can determine rank directly. + +**Current approach** (R-style, chosen for robustness): +1. QR with pivoting for rank detection (`_detect_rank_deficiency()`) +2. scipy's `lstsq` with 'gelsd' driver (SVD-based) for solving + +**Why we use QR for rank detection**: +- QR with pivoting provides the canonical ordering of linearly dependent columns +- R's `lm()` uses this approach for consistent dropped-column reporting +- Ensures consistent column dropping across runs (SVD column selection can vary) + +**Potential optimization** (future work): +- Skip QR when `rank_deficient_action="silent"` since we don't need column names +- Use SVD rank directly in the Rust backend (already implemented) +- Add `skip_rank_check` parameter for hot paths where matrix is known to be full-rank (implemented in v2.2.0) + +**Priority**: Low - the QR overhead is minimal compared to SVD solve, and correctness is more important than micro-optimization. 
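The two decompositions discussed above can be compared in a short standalone sketch (illustrative code, not `_detect_rank_deficiency()` itself): pivoted QR yields both the numerical rank and a deterministic list of dependent columns, while SVD-based `np.linalg.matrix_rank` confirms the rank but cannot say which columns to drop.

```python
import numpy as np
from scipy.linalg import qr

def qr_rank_and_dropped(X, rcond=1e-7):
    # Pivoted QR: X[:, pivot] = Q @ R, with |diag(R)| non-increasing
    _, R, pivot = qr(X, mode='economic', pivoting=True)
    r_diag = np.abs(np.diag(R))
    rank = int(np.sum(r_diag > rcond * r_diag[0])) if r_diag[0] > 0 else 0
    # Pivot positions beyond the rank are the linearly dependent columns
    dropped = np.sort(pivot[rank:])
    return rank, dropped

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
X[:, 2] = X[:, 0] + X[:, 1]          # introduce perfect collinearity

rank, dropped = qr_rank_and_dropped(X)
svd_rank = np.linalg.matrix_rank(X)  # SVD finds the rank but not the columns
```

Both approaches agree the rank is 2; only the pivoted QR identifies a specific column to drop (note the pivot may prefer the large-norm combined column, so which of the three is reported as dependent is data-dependent).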
+ +### Incomplete `check_finite` Bypass + +**Background**: The `solve_ols()` function accepts a `check_finite=False` parameter intended to skip NaN/Inf validation for performance in hot paths where data is known to be clean. + +**Current limitation**: When `check_finite=False`, our explicit validation is skipped, but scipy's internal QR decomposition in `_detect_rank_deficiency()` still validates finite values. This means callers cannot fully bypass all finite checks. + +**Impact**: Minimal - the scipy check is fast and only affects edge cases where users explicitly pass `check_finite=False` with non-finite data (which would be a bug in their code anyway). + +**Potential fix** (future work): +- Pass `check_finite=False` through to scipy's QR call (requires scipy >= 1.9.0) +- Or skip `_detect_rank_deficiency()` entirely when `check_finite=False` and `_skip_rank_check=True` + +**Priority**: Low - this is an edge case optimization that doesn't affect correctness. + diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py index 7fa4c805..83fc72b0 100644 --- a/diff_diff/estimators.py +++ b/diff_diff/estimators.py @@ -64,6 +64,11 @@ class DifferenceInDifferences: seed : int, optional Random seed for reproducibility when using bootstrap inference. If None (default), results will vary between runs. 
+ rank_deficient_action : str, default "warn" + Action when design matrix is rank-deficient (linearly dependent columns): + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Attributes ---------- @@ -120,7 +125,8 @@ def __init__( inference: str = "analytical", n_bootstrap: int = 999, bootstrap_weights: str = "rademacher", - seed: Optional[int] = None + seed: Optional[int] = None, + rank_deficient_action: str = "warn", ): self.robust = robust self.cluster = cluster @@ -129,6 +135,7 @@ def __init__( self.n_bootstrap = n_bootstrap self.bootstrap_weights = bootstrap_weights self.seed = seed + self.rank_deficient_action = rank_deficient_action self.is_fitted_ = False self.results_ = None @@ -283,6 +290,7 @@ def fit( robust=self.robust, cluster_ids=cluster_ids if self.inference != "wild_bootstrap" else None, alpha=self.alpha, + rank_deficient_action=self.rank_deficient_action, ).fit(X, y, df_adjustment=n_absorbed_effects) coefficients = reg.coefficients_ @@ -596,6 +604,7 @@ def get_params(self) -> Dict[str, Any]: "n_bootstrap": self.n_bootstrap, "bootstrap_weights": self.bootstrap_weights, "seed": self.seed, + "rank_deficient_action": self.rank_deficient_action, } def set_params(self, **params) -> "DifferenceInDifferences": @@ -873,29 +882,43 @@ def fit( # type: ignore[override] var_names.append(col) # Fit OLS using unified backend - coefficients, residuals, fitted, _ = solve_ols( - X, y, return_fitted=True, return_vcov=False + # Pass cluster_ids to solve_ols for proper vcov computation + # This handles rank-deficient matrices by returning NaN for dropped columns + cluster_ids = data[self.cluster].values if self.cluster is not None else None + + # Note: Wild bootstrap for multi-period effects is complex (multiple coefficients) + # For now, we use analytical inference even if inference="wild_bootstrap" + coefficients, residuals, fitted, vcov = solve_ols( + X, y, + 
return_fitted=True, + return_vcov=True, + cluster_ids=cluster_ids, + column_names=var_names, + rank_deficient_action=self.rank_deficient_action, ) r_squared = compute_r_squared(y, residuals) - # Degrees of freedom - df = len(y) - X.shape[1] - n_absorbed_effects + # Degrees of freedom using effective rank (non-NaN coefficients) + k_effective = int(np.sum(~np.isnan(coefficients))) + df = len(y) - k_effective - n_absorbed_effects - # Compute standard errors - # Note: Wild bootstrap for multi-period effects is complex (multiple coefficients) - # For now, we use analytical inference even if inference="wild_bootstrap" - if self.cluster is not None: - cluster_ids = data[self.cluster].values - vcov = compute_robust_vcov(X, residuals, cluster_ids) - elif self.robust: - vcov = compute_robust_vcov(X, residuals) - else: + # For non-robust, non-clustered case, we need homoskedastic vcov + # solve_ols returns HC1 by default, so compute homoskedastic if needed + if not self.robust and self.cluster is None: n = len(y) - k = X.shape[1] - mse = np.sum(residuals**2) / (n - k) + mse = np.sum(residuals**2) / (n - k_effective) # Use solve() instead of inv() for numerical stability - # solve(A, B) computes X where AX=B, so this yields (X'X)^{-1} * mse - vcov = np.linalg.solve(X.T @ X, mse * np.eye(k)) + # Only compute for identified columns (non-NaN coefficients) + identified_mask = ~np.isnan(coefficients) + if np.all(identified_mask): + vcov = np.linalg.solve(X.T @ X, mse * np.eye(X.shape[1])) + else: + # For rank-deficient case, compute vcov on reduced matrix then expand + X_reduced = X[:, identified_mask] + vcov_reduced = np.linalg.solve(X_reduced.T @ X_reduced, mse * np.eye(X_reduced.shape[1])) + # Expand to full size with NaN for dropped columns + vcov = np.full((X.shape[1], X.shape[1]), np.nan) + vcov[np.ix_(identified_mask, identified_mask)] = vcov_reduced # Extract period-specific treatment effects period_effects = {} @@ -922,19 +945,43 @@ def fit( # type: ignore[override] 
effect_indices.append(idx) # Compute average treatment effect - # Average ATT = mean of period-specific effects - avg_att = np.mean(effect_values) - - # Standard error of average: need to account for covariance - # Var(avg) = (1/n^2) * sum of all elements in the sub-covariance matrix - n_post = len(post_periods) - sub_vcov = vcov[np.ix_(effect_indices, effect_indices)] - avg_var = np.sum(sub_vcov) / (n_post ** 2) - avg_se = np.sqrt(avg_var) - - avg_t_stat = avg_att / avg_se if avg_se > 0 else 0.0 - avg_p_value = compute_p_value(avg_t_stat, df=df) - avg_conf_int = compute_confidence_interval(avg_att, avg_se, self.alpha, df=df) + # R-style NA propagation: if ANY period effect is NaN, average is undefined + effect_arr = np.array(effect_values) + + if np.any(np.isnan(effect_arr)): + # Some period effects are NaN (unidentified) - cannot compute valid average + # This follows R's default behavior where mean(c(1, 2, NA)) returns NA + avg_att = np.nan + avg_se = np.nan + avg_t_stat = np.nan + avg_p_value = np.nan + avg_conf_int = (np.nan, np.nan) + else: + # All effects identified - compute average normally + avg_att = float(np.mean(effect_arr)) + + # Standard error of average: need to account for covariance + n_post = len(post_periods) + sub_vcov = vcov[np.ix_(effect_indices, effect_indices)] + avg_var = np.sum(sub_vcov) / (n_post ** 2) + + if np.isnan(avg_var) or avg_var < 0: + # Vcov has NaN (dropped columns) - propagate NaN + avg_se = np.nan + avg_t_stat = np.nan + avg_p_value = np.nan + avg_conf_int = (np.nan, np.nan) + else: + avg_se = float(np.sqrt(avg_var)) + if avg_se > 0: + avg_t_stat = avg_att / avg_se + avg_p_value = compute_p_value(avg_t_stat, df=df) + avg_conf_int = compute_confidence_interval(avg_att, avg_se, self.alpha, df=df) + else: + # Zero SE (degenerate case) + avg_t_stat = np.nan + avg_p_value = np.nan + avg_conf_int = (np.nan, np.nan) # Count observations n_treated = int(np.sum(d)) diff --git a/diff_diff/linalg.py b/diff_diff/linalg.py index 
0caf4190..c833a885 100644 --- a/diff_diff/linalg.py +++ b/diff_diff/linalg.py @@ -5,15 +5,34 @@ Rust backend for maximum performance. The key optimizations are: -1. scipy.linalg.lstsq with 'gelsy' driver (QR-based, faster than SVD) +1. scipy.linalg.lstsq with 'gelsd' driver (SVD-based, handles rank-deficient matrices) 2. Vectorized cluster-robust SE via groupby (eliminates O(n*clusters) loop) 3. Single interface for all estimators (reduces code duplication) 4. Optional Rust backend for additional speedup (when available) +5. R-style rank deficiency handling: detect, warn, and set NA for dropped columns The Rust backend is automatically used when available, with transparent fallback to NumPy/SciPy implementations. + +Rank Deficiency Handling +------------------------ +When a design matrix is rank-deficient (has linearly dependent columns), the OLS +solution is not unique. This module follows R's `lm()` approach: + +1. Detect rank deficiency using pivoted QR decomposition +2. Identify which columns are linearly dependent +3. Drop redundant columns from the solve +4. Set NA (NaN) for coefficients of dropped columns +5. Warn with clear message listing dropped columns +6. 
Compute valid SEs for remaining (identified) coefficients + +This is controlled by the `rank_deficient_action` parameter: +- "warn" (default): Emit warning, set NA for dropped coefficients +- "error": Raise ValueError with dropped column information +- "silent": No warning, but still set NA for dropped coefficients """ +import warnings from dataclasses import dataclass from typing import Dict, List, Optional, Tuple, Union @@ -21,6 +40,7 @@ import pandas as pd from scipy import stats from scipy.linalg import lstsq as scipy_lstsq +from scipy.linalg import qr # Import Rust backend if available (from _backend to avoid circular imports) from diff_diff._backend import ( @@ -30,6 +50,276 @@ ) +# ============================================================================= +# Utility Functions +# ============================================================================= + + +def _factorize_cluster_ids(cluster_ids: np.ndarray) -> np.ndarray: + """ + Convert cluster IDs to contiguous integer codes for Rust backend. + + Handles string, categorical, or non-contiguous integer cluster IDs by + mapping them to contiguous integers starting from 0. + + Parameters + ---------- + cluster_ids : np.ndarray + Cluster identifiers (can be strings, integers, or categorical). + + Returns + ------- + np.ndarray + Integer cluster codes (dtype int64) suitable for Rust backend. + """ + # Use pandas factorize for efficient conversion of any dtype + codes, _ = pd.factorize(cluster_ids) + return codes.astype(np.int64) + + +# ============================================================================= +# Rank Deficiency Detection and Handling +# ============================================================================= + + +def _detect_rank_deficiency( + X: np.ndarray, + rcond: Optional[float] = None, +) -> Tuple[int, np.ndarray, np.ndarray]: + """ + Detect rank deficiency using pivoted QR decomposition. 
+
+    This follows R's lm() approach of using pivoted QR to detect which columns
+    are linearly dependent. The pivoting ensures we drop the "least important"
+    columns (those with smallest contribution to the column space).
+
+    Parameters
+    ----------
+    X : ndarray of shape (n, k)
+        Design matrix.
+    rcond : float, optional
+        Relative condition number threshold for determining rank.
+        Diagonal elements of R smaller than rcond * max(|R_ii|) are treated
+        as zero. If None, uses 1e-07 to match R's qr() default tolerance.
+
+    Returns
+    -------
+    rank : int
+        Numerical rank of the matrix.
+    dropped_cols : ndarray of int
+        Indices of columns that are linearly dependent (should be dropped).
+        Empty if matrix is full rank.
+    pivot : ndarray of int
+        Column permutation from QR decomposition.
+    """
+    n, k = X.shape
+
+    # Compute pivoted QR decomposition: X @ P = Q @ R
+    # P is a permutation matrix, represented as pivot indices
+    Q, R, pivot = qr(X, mode='economic', pivoting=True)
+
+    # Determine rank tolerance
+    # R's qr() defaults to tol = 1e-07 (a larger threshold than sqrt(eps) ≈ 1.49e-08)
+    # We use 1e-07 to match R's lm() behavior for consistency
+    if rcond is None:
+        rcond = 1e-07
+
+    # The diagonal of R contains information about linear independence
+    # After pivoting, |R[i,i]| is decreasing
+    r_diag = np.abs(np.diag(R))
+
+    # Find numerical rank: count diagonal elements of R above the threshold,
+    # which is relative to the largest diagonal element
+    if r_diag[0] == 0:
+        rank = 0
+    else:
+        tol = rcond * r_diag[0]
+        rank = int(np.sum(r_diag > tol))
+
+    # Columns after the rank position (in pivot order) are linearly dependent;
+    # map them back to original column indices
+    if rank < k:
+        dropped_cols = np.sort(pivot[rank:])
+    else:
+        dropped_cols = np.array([], dtype=int)
+
+    return rank, dropped_cols, pivot
+
+
+def _format_dropped_columns(
+    dropped_cols: np.ndarray,
+    column_names: Optional[List[str]] = None,
+) -> str:
+    """
+    Format dropped column information for error/warning
messages. + + Parameters + ---------- + dropped_cols : ndarray of int + Indices of dropped columns. + column_names : list of str, optional + Names for the columns. If None, uses indices. + + Returns + ------- + str + Formatted string describing dropped columns. + """ + if len(dropped_cols) == 0: + return "" + + if column_names is not None: + names = [column_names[i] if i < len(column_names) else f"column {i}" + for i in dropped_cols] + if len(names) == 1: + return f"'{names[0]}'" + elif len(names) <= 5: + return ", ".join(f"'{n}'" for n in names) + else: + shown = ", ".join(f"'{n}'" for n in names[:5]) + return f"{shown}, ... and {len(names) - 5} more" + else: + if len(dropped_cols) == 1: + return f"column {dropped_cols[0]}" + elif len(dropped_cols) <= 5: + return ", ".join(f"column {i}" for i in dropped_cols) + else: + shown = ", ".join(f"column {i}" for i in dropped_cols[:5]) + return f"{shown}, ... and {len(dropped_cols) - 5} more" + + +def _expand_coefficients_with_nan( + coef_reduced: np.ndarray, + k_full: int, + kept_cols: np.ndarray, +) -> np.ndarray: + """ + Expand reduced coefficients to full size, filling dropped columns with NaN. + + Parameters + ---------- + coef_reduced : ndarray of shape (rank,) + Coefficients for kept columns only. + k_full : int + Total number of columns in original design matrix. + kept_cols : ndarray of int + Indices of columns that were kept. + + Returns + ------- + ndarray of shape (k_full,) + Full coefficient vector with NaN for dropped columns. + """ + coef_full = np.full(k_full, np.nan) + coef_full[kept_cols] = coef_reduced + return coef_full + + +def _expand_vcov_with_nan( + vcov_reduced: np.ndarray, + k_full: int, + kept_cols: np.ndarray, +) -> np.ndarray: + """ + Expand reduced vcov matrix to full size, filling dropped entries with NaN. + + Parameters + ---------- + vcov_reduced : ndarray of shape (rank, rank) + Variance-covariance matrix for kept columns only. 
+ k_full : int + Total number of columns in original design matrix. + kept_cols : ndarray of int + Indices of columns that were kept. + + Returns + ------- + ndarray of shape (k_full, k_full) + Full vcov matrix with NaN for dropped rows/columns. + """ + vcov_full = np.full((k_full, k_full), np.nan) + # Use advanced indexing to fill in the kept entries + ix = np.ix_(kept_cols, kept_cols) + vcov_full[ix] = vcov_reduced + return vcov_full + + +def _solve_ols_rust( + X: np.ndarray, + y: np.ndarray, + *, + cluster_ids: Optional[np.ndarray] = None, + return_vcov: bool = True, + return_fitted: bool = False, +) -> Union[ + Tuple[np.ndarray, np.ndarray, Optional[np.ndarray]], + Tuple[np.ndarray, np.ndarray, np.ndarray, Optional[np.ndarray]], +]: + """ + Rust backend implementation of solve_ols for full-rank matrices. + + This is only called when: + 1. The Rust backend is available + 2. The design matrix is full rank (no rank deficiency handling needed) + + For rank-deficient matrices, the Python backend is used instead to + properly handle R-style NA coefficients for dropped columns. + + Why the backends differ (by design): + - Rust uses SVD-based solve (minimum-norm solution for rank-deficient) + - Python uses pivoted QR to identify and drop linearly dependent columns + - ndarray-linalg doesn't support QR with pivoting, so Rust can't identify + which specific columns to drop + - For full-rank matrices, both approaches give identical results + - For rank-deficient matrices, only Python can provide R-style NA handling + + Parameters + ---------- + X : np.ndarray + Design matrix of shape (n, k), must be full rank. + y : np.ndarray + Response vector of shape (n,). + cluster_ids : np.ndarray, optional + Cluster identifiers for cluster-robust SEs. + return_vcov : bool + Whether to compute variance-covariance matrix. + return_fitted : bool + Whether to return fitted values. + + Returns + ------- + coefficients : np.ndarray + OLS coefficients of shape (k,). 
+ residuals : np.ndarray + Residuals of shape (n,). + fitted : np.ndarray, optional + Fitted values if return_fitted=True. + vcov : np.ndarray, optional + Variance-covariance matrix if return_vcov=True. + """ + # Convert cluster_ids to int64 for Rust (handles string/categorical IDs) + if cluster_ids is not None: + cluster_ids = _factorize_cluster_ids(cluster_ids) + + # Call Rust backend + coefficients, residuals, vcov = _rust_solve_ols( + X, y, cluster_ids=cluster_ids, return_vcov=return_vcov + ) + + # Convert to numpy arrays + coefficients = np.asarray(coefficients) + residuals = np.asarray(residuals) + if vcov is not None: + vcov = np.asarray(vcov) + + # Return with optional fitted values + if return_fitted: + fitted = X @ coefficients + return coefficients, residuals, fitted, vcov + else: + return coefficients, residuals, vcov + + def solve_ols( X: np.ndarray, y: np.ndarray, @@ -38,6 +328,9 @@ def solve_ols( return_vcov: bool = True, return_fitted: bool = False, check_finite: bool = True, + rank_deficient_action: str = "warn", + column_names: Optional[List[str]] = None, + skip_rank_check: bool = False, ) -> Union[ Tuple[np.ndarray, np.ndarray, Optional[np.ndarray]], Tuple[np.ndarray, np.ndarray, np.ndarray, Optional[np.ndarray]], @@ -65,24 +358,48 @@ def solve_ols( check_finite : bool, default True Whether to check that X and y contain only finite values (no NaN/Inf). Set to False for faster computation if you are certain your data is clean. + rank_deficient_action : str, default "warn" + How to handle rank-deficient design matrices: + - "warn": Emit warning and set NaN for dropped coefficients (R-style) + - "error": Raise ValueError with dropped column information + - "silent": No warning, but still set NaN for dropped coefficients + column_names : list of str, optional + Names for the columns (used in warning/error messages). + If None, columns are referred to by their indices. 
+ skip_rank_check : bool, default False + If True, skip the pivoted QR rank check and use Rust backend directly + (when available). This saves O(nk²) computation but will not detect + rank-deficient matrices. Use only when you know the design matrix is + full rank. If the matrix is actually rank-deficient, results may be + incorrect (minimum-norm solution instead of R-style NA handling). Returns ------- coefficients : ndarray of shape (k,) - OLS coefficient estimates. + OLS coefficient estimates. For rank-deficient matrices, coefficients + of linearly dependent columns are set to NaN. residuals : ndarray of shape (n,) - Residuals (y - X @ coefficients). + Residuals (y - fitted). For rank-deficient matrices, uses only + identified coefficients to compute fitted values. fitted : ndarray of shape (n,), optional - Fitted values (X @ coefficients). Only returned if return_fitted=True. + Fitted values. For full-rank matrices, this is X @ coefficients. + For rank-deficient matrices, uses only identified coefficients + (X_reduced @ coefficients_reduced). Only returned if return_fitted=True. vcov : ndarray of shape (k, k) or None Variance-covariance matrix (HC1 or cluster-robust). - None if return_vcov=False. + For rank-deficient matrices, rows/columns for dropped coefficients + are filled with NaN. None if return_vcov=False. Notes ----- - This function uses scipy.linalg.lstsq with the 'gelsy' driver, which is - QR-based and typically faster than NumPy's default SVD-based solver for - well-conditioned matrices. + This function detects rank-deficient matrices using pivoted QR decomposition + and handles them following R's lm() approach: + + 1. Detect linearly dependent columns via pivoted QR + 2. Drop redundant columns and solve the reduced system + 3. Set NaN for coefficients of dropped columns + 4. Compute valid SEs for identified coefficients only + 5. 
Expand vcov matrix with NaN for dropped rows/columns The cluster-robust standard errors use the sandwich estimator with the standard small-sample adjustment: (G/(G-1)) * ((n-1)/(n-k)). @@ -95,6 +412,15 @@ def solve_ols( >>> y = 2 + 3 * X[:, 1] + np.random.randn(100) >>> coef, resid, vcov = solve_ols(X, y) >>> print(f"Intercept: {coef[0]:.2f}, Slope: {coef[1]:.2f}") + + For rank-deficient matrices with collinear columns: + + >>> X = np.random.randn(100, 3) + >>> X[:, 2] = X[:, 0] + X[:, 1] # Perfect collinearity + >>> y = np.random.randn(100) + >>> coef, resid, vcov = solve_ols(X, y) # Emits warning + >>> print(np.isnan(coef[2])) # Dropped column has NaN coefficient + True """ # Validate inputs X = np.asarray(X, dtype=np.float64) @@ -117,6 +443,14 @@ def solve_ols( "Cannot solve underdetermined system." ) + # Validate rank_deficient_action + valid_actions = {"warn", "error", "silent"} + if rank_deficient_action not in valid_actions: + raise ValueError( + f"rank_deficient_action must be one of {valid_actions}, " + f"got '{rank_deficient_action}'" + ) + # Check for NaN/Inf values if requested if check_finite: if not np.isfinite(X).all(): @@ -130,43 +464,87 @@ def solve_ols( "Clean your data or set check_finite=False to skip this check." 
) - # Use Rust backend if available - # Note: Fall back to NumPy if check_finite=False since Rust's LAPACK - # doesn't support non-finite values - if HAS_RUST_BACKEND and check_finite: - # Ensure contiguous arrays for Rust - X = np.ascontiguousarray(X, dtype=np.float64) - y = np.ascontiguousarray(y, dtype=np.float64) + # Fast path: skip rank check and use Rust directly when requested + # This saves O(nk²) QR overhead but won't detect rank-deficient matrices + if skip_rank_check: + if HAS_RUST_BACKEND and _rust_solve_ols is not None: + return _solve_ols_rust( + X, y, + cluster_ids=cluster_ids, + return_vcov=return_vcov, + return_fitted=return_fitted, + ) + # Fall through to Python without rank check (user guarantees full rank) + return _solve_ols_numpy( + X, y, + cluster_ids=cluster_ids, + return_vcov=return_vcov, + return_fitted=return_fitted, + rank_deficient_action=rank_deficient_action, + column_names=column_names, + _skip_rank_check=True, + ) - # Convert cluster_ids to int64 for Rust (if provided) - cluster_ids_int = None - if cluster_ids is not None: - cluster_ids_int = pd.factorize(cluster_ids)[0].astype(np.int64) + # Check for rank deficiency using fast pivoted QR decomposition. + # This adds O(nk²) overhead but is necessary for: + # 1. Detecting which columns to drop (R-style NA handling) + # 2. Routing rank-deficient cases to Python (Rust doesn't support pivoted QR) + # + # Trade-off: ~2x compute cost for full-rank matrices in exchange for proper + # rank deficiency handling. For maximum performance on known full-rank data, + # set skip_rank_check=True. 
+ rank, dropped_cols, pivot = _detect_rank_deficiency(X) + is_rank_deficient = len(dropped_cols) > 0 + + # Routing strategy: + # - Full-rank + Rust available → fast Rust backend (SVD-based solve) + # - Rank-deficient → Python backend (proper NA handling, valid SEs) + # - No Rust → Python backend (works for all cases) + if HAS_RUST_BACKEND and _rust_solve_ols is not None and not is_rank_deficient: + result = _solve_ols_rust( + X, y, + cluster_ids=cluster_ids, + return_vcov=return_vcov, + return_fitted=return_fitted, + ) - try: - coefficients, residuals, vcov = _rust_solve_ols( - X, y, cluster_ids_int, return_vcov + # Check for NaN vcov: Rust SVD may detect rank-deficiency that QR missed + # for ill-conditioned matrices (QR and SVD have different numerical properties). + # When this happens, fall back to Python's R-style handling. + vcov = result[-1] # vcov is always the last element + if return_vcov and vcov is not None and np.any(np.isnan(vcov)): + warnings.warn( + "Rust backend detected ill-conditioned matrix (NaN in variance-covariance). " + "Re-running with Python backend for proper rank detection.", + UserWarning, + stacklevel=2, + ) + # Force fresh rank detection - don't pass cached info since QR + # and SVD disagreed about rank. Python's QR will re-detect and + # apply R-style NaN handling for dropped columns. + return _solve_ols_numpy( + X, y, + cluster_ids=cluster_ids, + return_vcov=return_vcov, + return_fitted=return_fitted, + rank_deficient_action=rank_deficient_action, + column_names=column_names, + _precomputed_rank_info=None, # Force re-detection ) - except ValueError as e: - # Translate Rust LAPACK errors to consistent Python error messages - error_msg = str(e) - if "Matrix inversion failed" in error_msg or "Least squares failed" in error_msg: - raise ValueError( - "Design matrix is rank-deficient (singular X'X matrix). " - "This indicates perfect multicollinearity. Check your fixed effects " - "and covariates for linear dependencies." 
- ) from e - raise - - if return_fitted: - fitted = X @ coefficients - return coefficients, residuals, fitted, vcov else: - return coefficients, residuals, vcov + return result - # Fallback to NumPy/SciPy implementation + # Use NumPy implementation for rank-deficient cases (R-style NA handling) + # or when Rust backend is not available return _solve_ols_numpy( - X, y, cluster_ids=cluster_ids, return_vcov=return_vcov, return_fitted=return_fitted + X, y, + cluster_ids=cluster_ids, + return_vcov=return_vcov, + return_fitted=return_fitted, + rank_deficient_action=rank_deficient_action, + column_names=column_names, + # Pass pre-computed rank info to avoid redundant computation + _precomputed_rank_info=(rank, dropped_cols, pivot), ) @@ -177,18 +555,20 @@ def _solve_ols_numpy( cluster_ids: Optional[np.ndarray] = None, return_vcov: bool = True, return_fitted: bool = False, + rank_deficient_action: str = "warn", + column_names: Optional[List[str]] = None, + _precomputed_rank_info: Optional[Tuple[int, np.ndarray, np.ndarray]] = None, + _skip_rank_check: bool = False, ) -> Union[ Tuple[np.ndarray, np.ndarray, Optional[np.ndarray]], Tuple[np.ndarray, np.ndarray, np.ndarray, Optional[np.ndarray]], ]: """ - NumPy/SciPy fallback implementation of solve_ols. + NumPy/SciPy implementation of solve_ols with R-style rank deficiency handling. - Uses scipy.linalg.lstsq with 'gelsy' driver (QR with column pivoting) - for numerically stable least squares solving. QR decomposition is preferred - over normal equations because it doesn't square the condition number of X, - making it more robust for ill-conditioned matrices common in DiD designs - (e.g., many unit/time fixed effects). + Detects rank-deficient matrices using pivoted QR decomposition and handles + them following R's lm() approach: drop redundant columns, set NA (NaN) for + their coefficients, and compute valid SEs for identified coefficients only. 
Parameters ---------- @@ -202,32 +582,100 @@ def _solve_ols_numpy( Whether to compute variance-covariance matrix. return_fitted : bool Whether to return fitted values. + rank_deficient_action : str + How to handle rank deficiency: "warn", "error", or "silent". + column_names : list of str, optional + Names for the columns (used in warning/error messages). + _precomputed_rank_info : tuple, optional + Pre-computed (rank, dropped_cols, pivot) from _detect_rank_deficiency. + Used internally to avoid redundant computation when called from solve_ols. + _skip_rank_check : bool, default False + If True, skip rank detection entirely and assume full rank. + Used when caller has already determined matrix is full rank. Returns ------- coefficients : np.ndarray - OLS coefficients of shape (k,). + OLS coefficients of shape (k,). NaN for dropped columns. residuals : np.ndarray Residuals of shape (n,). fitted : np.ndarray, optional Fitted values if return_fitted=True. vcov : np.ndarray, optional - Variance-covariance matrix if return_vcov=True. + Variance-covariance matrix if return_vcov=True. NaN for dropped rows/cols. 
""" - # Solve OLS using QR decomposition via scipy's optimized LAPACK routines - # 'gelsy' uses QR with column pivoting, which is numerically stable even - # for ill-conditioned matrices (doesn't square the condition number like - # normal equations would) - coefficients = scipy_lstsq(X, y, lapack_driver="gelsy", check_finite=False)[0] + n, k = X.shape + + # Determine rank deficiency status + if _skip_rank_check: + # Caller guarantees full rank - skip expensive QR decomposition + is_rank_deficient = False + dropped_cols = np.array([], dtype=int) + elif _precomputed_rank_info is not None: + # Use pre-computed rank info + rank, dropped_cols, pivot = _precomputed_rank_info + is_rank_deficient = len(dropped_cols) > 0 + else: + # Compute rank via pivoted QR + rank, dropped_cols, pivot = _detect_rank_deficiency(X) + is_rank_deficient = len(dropped_cols) > 0 + + if is_rank_deficient: + # Format dropped column information for messages + dropped_str = _format_dropped_columns(dropped_cols, column_names) + + if rank_deficient_action == "error": + raise ValueError( + f"Design matrix is rank-deficient. {k - rank} of {k} columns are " + f"linearly dependent and cannot be uniquely estimated: {dropped_str}. " + "This indicates multicollinearity in your model specification." + ) + elif rank_deficient_action == "warn": + warnings.warn( + f"Rank-deficient design matrix: dropping {k - rank} of {k} columns " + f"({dropped_str}). Coefficients for these columns are set to NA. 
" + "This may indicate multicollinearity in your model specification.", + UserWarning, + stacklevel=3, # Point to user code that called solve_ols + ) + # else: "silent" - no warning + + # Extract kept columns for the reduced solve + kept_cols = np.array([i for i in range(k) if i not in dropped_cols]) + X_reduced = X[:, kept_cols] + + # Solve the reduced system (now full-rank) + # Use cond=1e-07 for consistency with Rust backend and QR rank tolerance + coefficients_reduced = scipy_lstsq( + X_reduced, y, lapack_driver="gelsd", check_finite=False, cond=1e-07 + )[0] + + # Expand coefficients to full size with NaN for dropped columns + coefficients = _expand_coefficients_with_nan(coefficients_reduced, k, kept_cols) + + # Compute residuals using only the identified coefficients + # Note: Dropped coefficients are NaN, so we use the reduced form + fitted = X_reduced @ coefficients_reduced + residuals = y - fitted + + # Compute variance-covariance matrix for reduced system, then expand + vcov = None + if return_vcov: + vcov_reduced = _compute_robust_vcov_numpy(X_reduced, residuals, cluster_ids) + vcov = _expand_vcov_with_nan(vcov_reduced, k, kept_cols) + else: + # Full-rank case: proceed normally + # Use cond=1e-07 for consistency with Rust backend and QR rank tolerance + coefficients = scipy_lstsq(X, y, lapack_driver="gelsd", check_finite=False, cond=1e-07)[0] - # Compute residuals and fitted values - fitted = X @ coefficients - residuals = y - fitted + # Compute residuals and fitted values + fitted = X @ coefficients + residuals = y - fitted - # Compute variance-covariance matrix if requested - vcov = None - if return_vcov: - vcov = _compute_robust_vcov_numpy(X, residuals, cluster_ids) + # Compute variance-covariance matrix if requested + vcov = None + if return_vcov: + vcov = _compute_robust_vcov_numpy(X, residuals, cluster_ids) if return_fitted: return coefficients, residuals, fitted, vcov @@ -475,12 +923,22 @@ class InferenceResult: alpha: float = 0.05 def 
is_significant(self, alpha: Optional[float] = None) -> bool: - """Check if the coefficient is statistically significant.""" + """Check if the coefficient is statistically significant. + + Returns False for NaN p-values (unidentified coefficients). + """ + if np.isnan(self.p_value): + return False threshold = alpha if alpha is not None else self.alpha return self.p_value < threshold def significance_stars(self) -> str: - """Return significance stars based on p-value.""" + """Return significance stars based on p-value. + + Returns empty string for NaN p-values (unidentified coefficients). + """ + if np.isnan(self.p_value): + return "" if self.p_value < 0.001: return "***" elif self.p_value < 0.01: @@ -526,6 +984,11 @@ class LinearRegression: Overrides the `robust` parameter if provided. alpha : float, default 0.05 Significance level for confidence intervals. + rank_deficient_action : str, default "warn" + Action when design matrix is rank-deficient (linearly dependent columns): + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Attributes ---------- @@ -541,8 +1004,11 @@ class LinearRegression: Number of observations (available after fit). n_params_ : int Number of parameters including intercept (available after fit). + n_params_effective_ : int + Effective number of parameters after dropping linearly dependent columns. + Equals n_params_ for full-rank matrices (available after fit). df_ : int - Degrees of freedom (n - k) (available after fit). + Degrees of freedom (n - n_params_effective) (available after fit). 
Examples -------- @@ -577,11 +1043,13 @@ def __init__( robust: bool = True, cluster_ids: Optional[np.ndarray] = None, alpha: float = 0.05, + rank_deficient_action: str = "warn", ): self.include_intercept = include_intercept self.robust = robust self.cluster_ids = cluster_ids self.alpha = alpha + self.rank_deficient_action = rank_deficient_action # Fitted attributes (set by fit()) self.coefficients_: Optional[np.ndarray] = None @@ -592,6 +1060,7 @@ def __init__( self._X: Optional[np.ndarray] = None self.n_obs_: Optional[int] = None self.n_params_: Optional[int] = None + self.n_params_effective_: Optional[int] = None self.df_: Optional[int] = None def fit( @@ -643,6 +1112,7 @@ def fit( cluster_ids=effective_cluster_ids, return_fitted=True, return_vcov=compute_vcov, + rank_deficient_action=self.rank_deficient_action, ) else: # Classical OLS - compute vcov separately @@ -650,15 +1120,37 @@ def fit( X, y, return_fitted=True, return_vcov=False, + rank_deficient_action=self.rank_deficient_action, ) # Compute classical OLS variance-covariance matrix + # Handle rank-deficient case: use effective rank for df n, k = X.shape - mse = np.sum(residuals**2) / (n - k) - try: - vcov = np.linalg.solve(X.T @ X, mse * np.eye(k)) - except np.linalg.LinAlgError: - # Fall back to pseudo-inverse for singular matrices - vcov = np.linalg.pinv(X.T @ X) * mse + nan_mask = np.isnan(coefficients) + k_effective = k - np.sum(nan_mask) # Number of identified coefficients + + if k_effective == 0: + # All coefficients dropped - no valid inference + vcov = np.full((k, k), np.nan) + elif np.any(nan_mask): + # Rank-deficient: compute vcov for identified coefficients only + kept_cols = np.where(~nan_mask)[0] + X_reduced = X[:, kept_cols] + mse = np.sum(residuals**2) / (n - k_effective) + try: + vcov_reduced = np.linalg.solve( + X_reduced.T @ X_reduced, mse * np.eye(k_effective) + ) + except np.linalg.LinAlgError: + vcov_reduced = np.linalg.pinv(X_reduced.T @ X_reduced) * mse + # Expand to full size with 
NaN for dropped columns + vcov = _expand_vcov_with_nan(vcov_reduced, k, kept_cols) + else: + # Full rank: standard computation + mse = np.sum(residuals**2) / (n - k) + try: + vcov = np.linalg.solve(X.T @ X, mse * np.eye(k)) + except np.linalg.LinAlgError: + vcov = np.linalg.pinv(X.T @ X) * mse # Store fitted attributes self.coefficients_ = coefficients @@ -669,7 +1161,12 @@ def fit( self._X = X self.n_obs_ = X.shape[0] self.n_params_ = X.shape[1] - self.df_ = self.n_obs_ - self.n_params_ - df_adjustment + + # Compute effective number of parameters (excluding dropped columns) + # This is needed for correct degrees of freedom in inference + nan_mask = np.isnan(coefficients) + self.n_params_effective_ = int(self.n_params_ - np.sum(nan_mask)) + self.df_ = self.n_obs_ - self.n_params_effective_ - df_adjustment return self @@ -876,10 +1373,18 @@ def r_squared(self, adjusted: bool = False) -> float: ------- float R-squared value. + + Notes + ----- + For rank-deficient fits, adjusted R² uses the effective number of + parameters (excluding dropped columns) for consistency with the + corrected degrees of freedom. """ self._check_fitted() + # Use effective params for adjusted R² to match df correction + n_params = self.n_params_effective_ if adjusted else self.n_params_ return compute_r_squared( - self._y, self.residuals_, adjusted=adjusted, n_params=self.n_params_ + self._y, self.residuals_, adjusted=adjusted, n_params=n_params ) def predict(self, X: np.ndarray) -> np.ndarray: @@ -896,6 +1401,12 @@ def predict(self, X: np.ndarray) -> np.ndarray: ------- ndarray Predicted values. + + Notes + ----- + For rank-deficient fits where some coefficients are NaN, predictions + use only the identified (non-NaN) coefficients. This is equivalent to + treating dropped columns as having zero coefficients. 
""" self._check_fitted() X = np.asarray(X, dtype=np.float64) @@ -903,7 +1414,12 @@ def predict(self, X: np.ndarray) -> np.ndarray: if self.include_intercept: X = np.column_stack([np.ones(X.shape[0]), X]) - return X @ self.coefficients_ + # Handle rank-deficient case: use only identified coefficients + # Replace NaN with 0 so they don't contribute to prediction + coef = self.coefficients_.copy() + coef[np.isnan(coef)] = 0.0 + + return X @ coef # ============================================================================= diff --git a/diff_diff/results.py b/diff_diff/results.py index b99c5170..213d09fd 100644 --- a/diff_diff/results.py +++ b/diff_diff/results.py @@ -181,7 +181,14 @@ def significance_stars(self) -> str: def _get_significance_stars(p_value: float) -> str: - """Return significance stars based on p-value.""" + """Return significance stars based on p-value. + + Returns empty string for NaN p-values (unidentified coefficients from + rank-deficient matrices). + """ + import numpy as np + if np.isnan(p_value): + return "" if p_value < 0.001: return "***" elif p_value < 0.01: diff --git a/diff_diff/staggered.py b/diff_diff/staggered.py index aa1fce1e..3b1bcb41 100644 --- a/diff_diff/staggered.py +++ b/diff_diff/staggered.py @@ -109,6 +109,7 @@ def gradient(beta: np.ndarray) -> np.ndarray: def _linear_regression( X: np.ndarray, y: np.ndarray, + rank_deficient_action: str = "warn", ) -> Tuple[np.ndarray, np.ndarray]: """ Fit OLS regression. @@ -119,6 +120,11 @@ def _linear_regression( Feature matrix (n_samples, n_features). Intercept added automatically. y : np.ndarray Outcome variable. 
+ rank_deficient_action : str, default "warn" + Action when design matrix is rank-deficient: + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Returns ------- @@ -132,7 +138,10 @@ def _linear_regression( X_with_intercept = np.column_stack([np.ones(n), X]) # Use unified OLS backend (no vcov needed) - beta, residuals, _ = solve_ols(X_with_intercept, y, return_vcov=False) + beta, residuals, _ = solve_ols( + X_with_intercept, y, return_vcov=False, + rank_deficient_action=rank_deficient_action, + ) return beta, residuals @@ -195,6 +204,11 @@ class CallawaySantAnna( Use ``bootstrap_weights`` instead. Will be removed in v2.0. seed : int, optional Random seed for reproducibility. + rank_deficient_action : str, default="warn" + Action when design matrix is rank-deficient (linearly dependent columns): + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Attributes ---------- @@ -277,6 +291,7 @@ def __init__( bootstrap_weights: Optional[str] = None, bootstrap_weight_type: Optional[str] = None, seed: Optional[int] = None, + rank_deficient_action: str = "warn", ): import warnings @@ -312,6 +327,12 @@ def __init__( f"got '{bootstrap_weights}'" ) + if rank_deficient_action not in ["warn", "error", "silent"]: + raise ValueError( + f"rank_deficient_action must be 'warn', 'error', or 'silent', " + f"got '{rank_deficient_action}'" + ) + self.control_group = control_group self.anticipation = anticipation self.estimation_method = estimation_method @@ -322,6 +343,7 @@ def __init__( # Keep bootstrap_weight_type for backward compatibility self.bootstrap_weight_type = bootstrap_weights self.seed = seed + self.rank_deficient_action = rank_deficient_action self.is_fitted_ = False self.results_: Optional[CallawaySantAnnaResults] = None @@ -778,7 +800,10 @@ def _outcome_regression( if 
X_treated is not None and X_control is not None and X_treated.shape[1] > 0: # Covariate-adjusted outcome regression # Fit regression on control units: E[Delta Y | X, D=0] - beta, residuals = _linear_regression(X_control, control_change) + beta, residuals = _linear_regression( + X_control, control_change, + rank_deficient_action=self.rank_deficient_action, + ) # Predict counterfactual for treated units X_treated_with_intercept = np.column_stack([np.ones(n_t), X_treated]) @@ -938,7 +963,10 @@ def _doubly_robust( if X_treated is not None and X_control is not None and X_treated.shape[1] > 0: # Doubly robust estimation with covariates # Step 1: Outcome regression - fit E[Delta Y | X] on control - beta, _ = _linear_regression(X_control, control_change) + beta, _ = _linear_regression( + X_control, control_change, + rank_deficient_action=self.rank_deficient_action, + ) # Predict counterfactual for both treated and control X_treated_with_intercept = np.column_stack([np.ones(n_t), X_treated]) @@ -1014,6 +1042,7 @@ def get_params(self) -> Dict[str, Any]: # Deprecated but kept for backward compatibility "bootstrap_weight_type": self.bootstrap_weight_type, "seed": self.seed, + "rank_deficient_action": self.rank_deficient_action, } def set_params(self, **params) -> "CallawaySantAnna": diff --git a/diff_diff/sun_abraham.py b/diff_diff/sun_abraham.py index 018b618d..cb7313f9 100644 --- a/diff_diff/sun_abraham.py +++ b/diff_diff/sun_abraham.py @@ -319,6 +319,11 @@ class SunAbraham: If 0, uses analytical cluster-robust standard errors. seed : int, optional Random seed for reproducibility. 
+ rank_deficient_action : str, default="warn" + Action when design matrix is rank-deficient (linearly dependent columns): + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Attributes ---------- @@ -395,6 +400,7 @@ def __init__( cluster: Optional[str] = None, n_bootstrap: int = 0, seed: Optional[int] = None, + rank_deficient_action: str = "warn", ): if control_group not in ["never_treated", "not_yet_treated"]: raise ValueError( @@ -402,12 +408,19 @@ def __init__( f"got '{control_group}'" ) + if rank_deficient_action not in ["warn", "error", "silent"]: + raise ValueError( + f"rank_deficient_action must be 'warn', 'error', or 'silent', " + f"got '{rank_deficient_action}'" + ) + self.control_group = control_group self.anticipation = anticipation self.alpha = alpha self.cluster = cluster self.n_bootstrap = n_bootstrap self.seed = seed + self.rank_deficient_action = rank_deficient_action self.is_fitted_ = False self.results_: Optional[SunAbrahamResults] = None @@ -756,6 +769,7 @@ def _fit_saturated_regression( include_intercept=False, # Already demeaned, no intercept needed robust=True, cluster_ids=cluster_ids, + rank_deficient_action=self.rank_deficient_action, ).fit(X, y) coefficients = reg.coefficients_ @@ -1153,6 +1167,7 @@ def get_params(self) -> Dict[str, Any]: "cluster": self.cluster, "n_bootstrap": self.n_bootstrap, "seed": self.seed, + "rank_deficient_action": self.rank_deficient_action, } def set_params(self, **params) -> "SunAbraham": diff --git a/diff_diff/triple_diff.py b/diff_diff/triple_diff.py index 1c82fc05..126d30b0 100644 --- a/diff_diff/triple_diff.py +++ b/diff_diff/triple_diff.py @@ -330,6 +330,7 @@ def gradient(beta: np.ndarray) -> np.ndarray: def _linear_regression( X: np.ndarray, y: np.ndarray, + rank_deficient_action: str = "warn", ) -> Tuple[np.ndarray, np.ndarray, float]: """ Fit OLS regression. 
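The `rank_deficient_action` parameter threaded through these estimators controls the R-style handling described above. As a reviewer's aid, here is a minimal standalone sketch of the mechanism — pivoted-QR rank detection followed by a reduced solve with NaN (NA) for dropped columns. This is an illustration only, not the patch's `_detect_rank_deficiency` implementation; the `tol=1e-7` default mirrors R's `qr()` tolerance as documented in this PR.

```python
import numpy as np
from scipy.linalg import qr, lstsq

def detect_rank_deficiency(X, tol=1e-7):
    # Pivoted QR orders |diag(R)| non-increasingly, so comparing each diagonal
    # entry to the first (largest) reveals the numerical rank, as R's qr() does.
    _, R, pivot = qr(X, mode="economic", pivoting=True)
    d = np.abs(np.diag(R))
    rank = int(np.sum(d > tol * d[0]))
    return rank, np.sort(pivot[rank:])  # dropped column indices

rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([np.ones(100), x, 2.0 * x])   # third column duplicates second
y = 1.0 + 3.0 * x + rng.normal(scale=0.1, size=100)

rank, dropped = detect_rank_deficiency(X)          # rank == 2, one column dropped
kept = np.setdiff1d(np.arange(X.shape[1]), dropped)
beta = np.full(X.shape[1], np.nan)                 # NA for the dropped column
beta[kept] = lstsq(X[:, kept], y)[0]               # solve the full-rank reduced system
```

With `rank_deficient_action="warn"` the library emits a warning at this point; `"error"` raises instead, and `"silent"` drops the columns quietly.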
@@ -340,6 +341,11 @@ def _linear_regression( Feature matrix (n_samples, n_features). Intercept added automatically. y : np.ndarray Outcome variable. + rank_deficient_action : str, default "warn" + Action when design matrix is rank-deficient: + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Returns ------- @@ -355,7 +361,8 @@ def _linear_regression( # Use unified OLS backend beta, residuals, fitted, _ = solve_ols( - X_with_intercept, y, return_fitted=True, return_vcov=False + X_with_intercept, y, return_fitted=True, return_vcov=False, + rank_deficient_action=rank_deficient_action, ) # Compute R-squared @@ -400,6 +407,11 @@ class TripleDifference: pscore_trim : float, default=0.01 Trimming threshold for propensity scores. Scores below this value or above (1 - pscore_trim) are clipped to avoid extreme weights. + rank_deficient_action : str, default="warn" + Action when design matrix is rank-deficient (linearly dependent columns): + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Attributes ---------- @@ -478,17 +490,24 @@ def __init__( cluster: Optional[str] = None, alpha: float = 0.05, pscore_trim: float = 0.01, + rank_deficient_action: str = "warn", ): if estimation_method not in ("dr", "reg", "ipw"): raise ValueError( f"estimation_method must be 'dr', 'reg', or 'ipw', " f"got '{estimation_method}'" ) + if rank_deficient_action not in ["warn", "error", "silent"]: + raise ValueError( + f"rank_deficient_action must be 'warn', 'error', or 'silent', " + f"got '{rank_deficient_action}'" + ) self.estimation_method = estimation_method self.robust = robust self.cluster = cluster self.alpha = alpha self.pscore_trim = pscore_trim + self.rank_deficient_action = rank_deficient_action self.is_fitted_ = False self.results_: Optional[TripleDifferenceResults] = None @@ -744,6 
+763,7 @@ def _regression_adjustment( include_intercept=False, # Intercept already in design_matrix robust=self.robust, alpha=self.alpha, + rank_deficient_action=self.rank_deficient_action, ).fit(design_matrix, y) # ATT is the coefficient on G*P*T (index 7) @@ -937,7 +957,10 @@ def _doubly_robust( if np.sum(mask) > 1: X_cell = np.column_stack([X[mask], T[mask]]) try: - _, fitted, _ = _linear_regression(X_cell, y[mask]) + _, fitted, _ = _linear_regression( + X_cell, y[mask], + rank_deficient_action=self.rank_deficient_action, + ) mu_fitted[mask] = fitted except Exception: mu_fitted[mask] = np.mean(y[mask]) @@ -1166,6 +1189,7 @@ def get_params(self) -> Dict[str, Any]: "cluster": self.cluster, "alpha": self.alpha, "pscore_trim": self.pscore_trim, + "rank_deficient_action": self.rank_deficient_action, } def set_params(self, **params) -> "TripleDifference": @@ -1223,6 +1247,7 @@ def triple_difference( robust: bool = True, cluster: Optional[str] = None, alpha: float = 0.05, + rank_deficient_action: str = "warn", ) -> TripleDifferenceResults: """ Estimate Triple Difference (DDD) treatment effect. @@ -1256,6 +1281,11 @@ def triple_difference( Column name for cluster-robust standard errors. alpha : float, default=0.05 Significance level for confidence intervals. 
+ rank_deficient_action : str, default="warn" + Action when design matrix is rank-deficient: + - "warn": Issue warning and drop linearly dependent columns (default) + - "error": Raise ValueError + - "silent": Drop columns silently without warning Returns ------- @@ -1280,6 +1310,7 @@ def triple_difference( robust=robust, cluster=cluster, alpha=alpha, + rank_deficient_action=rank_deficient_action, ) return estimator.fit( data=data, diff --git a/diff_diff/twfe.py b/diff_diff/twfe.py index f3167ec0..45b08d64 100644 --- a/diff_diff/twfe.py +++ b/diff_diff/twfe.py @@ -127,12 +127,30 @@ def fit( # type: ignore[override] # Always use LinearRegression for initial fit (unified code path) # For wild bootstrap, we don't need cluster SEs from the initial fit cluster_ids = data[cluster_var].values - reg = LinearRegression( - include_intercept=False, # Intercept already in X - robust=True, # TWFE always uses robust/cluster SEs - cluster_ids=cluster_ids if self.inference != "wild_bootstrap" else None, - alpha=self.alpha, - ).fit(X, y, df_adjustment=df_adjustment) + + # Pass rank_deficient_action to LinearRegression + # If "error", let LinearRegression raise immediately + # If "warn" or "silent", suppress generic warning and use TWFE's context-specific + # error/warning messages (more informative for panel data) + if self.rank_deficient_action == "error": + reg = LinearRegression( + include_intercept=False, + robust=True, + cluster_ids=cluster_ids if self.inference != "wild_bootstrap" else None, + alpha=self.alpha, + rank_deficient_action="error", + ).fit(X, y, df_adjustment=df_adjustment) + else: + # Suppress generic warning, TWFE provides context-specific messages below + with warnings.catch_warnings(): + warnings.filterwarnings("ignore", message="Rank-deficient design matrix") + reg = LinearRegression( + include_intercept=False, + robust=True, + cluster_ids=cluster_ids if self.inference != "wild_bootstrap" else None, + alpha=self.alpha, + rank_deficient_action="silent", + 
).fit(X, y, df_adjustment=df_adjustment) coefficients = reg.coefficients_ residuals = reg.residuals_ @@ -140,6 +158,41 @@ def fit( # type: ignore[override] r_squared = reg.r_squared() att = coefficients[att_idx] + # Check for unidentified coefficients (collinearity) + # Build column names for informative error messages + column_names = ["intercept", "treatment×post"] + if covariates: + column_names.extend(covariates) + + nan_mask = np.isnan(coefficients) + if np.any(nan_mask): + dropped_indices = np.where(nan_mask)[0] + dropped_names = [column_names[i] if i < len(column_names) + else f"column {i}" for i in dropped_indices] + + # Determine the source of collinearity for better error message + if att_idx in dropped_indices: + # Treatment coefficient is unidentified + raise ValueError( + f"Treatment effect cannot be identified due to collinearity. " + f"Dropped columns: {', '.join(dropped_names)}. " + "This can happen when: (1) treatment is perfectly collinear with " + "unit/time fixed effects, (2) all treated units are treated in all " + "periods, or (3) a covariate is collinear with the treatment indicator. " + "Check your data structure and model specification." + ) + else: + # Only covariates are dropped - this is a warning, not an error + # The ATT can still be estimated + # Respect rank_deficient_action setting for warning + if self.rank_deficient_action == "warn": + warnings.warn( + f"Some covariates are collinear and were dropped: " + f"{', '.join(dropped_names)}. The treatment effect is still identified.", + UserWarning, + stacklevel=2, + ) + # Get inference - either from bootstrap or analytical if self.inference == "wild_bootstrap": # Override with wild cluster bootstrap inference diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md index 36da4796..80e46129 100644 --- a/docs/methodology/REGISTRY.md +++ b/docs/methodology/REGISTRY.md @@ -58,7 +58,9 @@ where τ is the ATT. 
*Edge cases:* - Empty cells (e.g., no treated-pre observations) raise ValueError - Singleton groups in clustering are dropped with warning -- Perfect collinearity in covariates raises LinAlgError +- Rank-deficient design matrix (collinearity): warns and sets NA for dropped coefficients (R-style, matches `lm()`) + - Tolerance: `1e-07` (matches R's `qr()` default), relative to largest diagonal element of R in QR decomposition + - Controllable via `rank_deficient_action` parameter: "warn" (default), "error", or "silent" **Reference implementation(s):** - R: `fixest::feols()` with interaction term @@ -98,6 +100,8 @@ where E_i is treatment time for unit i, and δ_e are event-study coefficients. - Unbalanced panels: only uses observations where event-time is defined - Never-treated units: event-time indicators are all zero - Endpoint binning: distant event times can be binned +- Rank-deficient design matrix (collinearity): warns and sets NA for dropped coefficients (R-style, matches `lm()`) +- Average ATT (`avg_att`) is NA if any post-period effect is unidentified (R-style NA propagation) **Reference implementation(s):** - R: `fixest::feols()` with `i(event_time, ref=-1)` @@ -139,7 +143,9 @@ where tildes denote demeaned variables. 
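The "sets NA for dropped coefficients" behavior in these registry entries can be sketched in a few lines of NumPy. This is an illustrative stand-in for the patch's `_expand_coefficients_with_nan` / `_expand_vcov_with_nan` helpers, not their exact code: estimates from the reduced (full-rank) system are re-embedded into full-size arrays, with NaN marking unidentified slots.

```python
import numpy as np

def expand_with_nan(beta_reduced, vcov_reduced, k, kept):
    """Re-embed reduced estimates into full-size arrays; NaN marks dropped slots."""
    beta = np.full(k, np.nan)
    beta[kept] = beta_reduced
    vcov = np.full((k, k), np.nan)
    vcov[np.ix_(kept, kept)] = vcov_reduced  # fill only kept rows/cols
    return beta, vcov

# Toy example: 3 columns, column 1 was dropped as linearly dependent
beta, vcov = expand_with_nan(np.array([1.0, 2.0]), np.eye(2),
                             k=3, kept=np.array([0, 2]))
```

Downstream inference then reports NA for the dropped coefficient while SEs for the identified coefficients remain valid, matching R's `lm()` output.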
*Edge cases:* - Singleton units/periods are automatically dropped -- Treatment perfectly collinear with FE raises error +- Treatment perfectly collinear with FE raises error with informative message listing dropped columns +- Covariate collinearity emits warning but estimation continues (ATT still identified) +- Rank-deficient design matrix: warns and sets NA for dropped coefficients (R-style, matches `lm()`) - Unbalanced panels handled via proper demeaning **Reference implementation(s):** @@ -194,6 +200,10 @@ Aggregations: - Groups with single observation: included but may have high variance - Missing group-time cells: ATT(g,t) set to NaN - Anticipation: `anticipation` parameter shifts reference period +- Rank-deficient design matrix (covariate collinearity): + - Detection: Pivoted QR decomposition with tolerance `1e-07` (R's `qr()` default) + - Handling: Warns and drops linearly dependent columns, sets NA for dropped coefficients (R-style, matches `lm()`) + - Parameter: `rank_deficient_action` controls behavior: "warn" (default), "error", or "silent" **Reference implementation(s):** - R: `did::att_gt()` (Callaway & Sant'Anna's official package) @@ -242,6 +252,10 @@ where weights ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at eve - Single cohort: reduces to standard event study - Cohorts with no observations at some event-times: weighted appropriately - Extrapolation beyond observed event-times: not estimated +- Rank-deficient design matrix (covariate collinearity): + - Detection: Pivoted QR decomposition with tolerance `1e-07` (R's `qr()` default) + - Handling: Warns and drops linearly dependent columns, sets NA for dropped coefficients (R-style, matches `lm()`) + - Parameter: `rank_deficient_action` controls behavior: "warn" (default), "error", or "silent" **Reference implementation(s):** - R: `fixest::sunab()` (Laurent Bergé's implementation) diff --git a/docs/tutorials/02_staggered_did.ipynb b/docs/tutorials/02_staggered_did.ipynb index 
ff75190d..f15f6c6a 100644 --- a/docs/tutorials/02_staggered_did.ipynb +++ b/docs/tutorials/02_staggered_did.ipynb @@ -29,10 +29,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:53.563913Z", - "iopub.status.busy": "2026-01-18T18:23:53.563787Z", - "iopub.status.idle": "2026-01-18T18:23:54.146380Z", - "shell.execute_reply": "2026-01-18T18:23:54.146080Z" + "iopub.execute_input": "2026-01-20T14:37:22.343991Z", + "iopub.status.busy": "2026-01-20T14:37:22.343878Z", + "iopub.status.idle": "2026-01-20T14:37:23.009082Z", + "shell.execute_reply": "2026-01-20T14:37:23.008730Z" } }, "outputs": [], @@ -66,24 +66,48 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.147667Z", - "iopub.status.busy": "2026-01-18T18:23:54.147584Z", - "iopub.status.idle": "2026-01-18T18:23:54.153830Z", - "shell.execute_reply": "2026-01-18T18:23:54.153627Z" + "iopub.execute_input": "2026-01-20T14:37:23.010593Z", + "iopub.status.busy": "2026-01-20T14:37:23.010493Z", + "iopub.status.idle": "2026-01-20T14:37:23.019860Z", + "shell.execute_reply": "2026-01-20T14:37:23.019620Z" } }, "outputs": [], - "source": "# Generate staggered adoption data using the library function\nfrom diff_diff import generate_staggered_data\n\n# Generate data with 100 units, 8 periods, two treatment cohorts (periods 3 and 5),\n# and 40% never-treated\ndf = generate_staggered_data(\n n_units=100,\n n_periods=8,\n cohort_periods=[3, 5], # Treatment cohorts at periods 3 and 5\n never_treated_frac=0.4,\n treatment_effect=2.0,\n dynamic_effects=True,\n effect_growth=0.5, # Effect grows 0.5 per period\n unit_fe_sd=2.0,\n noise_sd=0.5,\n seed=42\n)\n\n# Add a 'cohort' column that matches the old format (first_treat is already there)\ndf['cohort'] = df['first_treat']\n\nprint(f\"Dataset: {len(df)} observations, {df['unit'].nunique()} units, {df['period'].nunique()} periods\")\ndf.head(10)" + "source": [ + "# Generate staggered 
adoption data using the library function\n", + "from diff_diff import generate_staggered_data\n", + "\n", + "# Generate data with 100 units, 8 periods, two treatment cohorts (periods 3 and 5),\n", + "# and 40% never-treated\n", + "df = generate_staggered_data(\n", + " n_units=100,\n", + " n_periods=8,\n", + " cohort_periods=[3, 5], # Treatment cohorts at periods 3 and 5\n", + " never_treated_frac=0.4,\n", + " treatment_effect=2.0,\n", + " dynamic_effects=True,\n", + " effect_growth=0.5, # Effect grows 0.5 per period\n", + " unit_fe_sd=2.0,\n", + " noise_sd=0.5,\n", + " seed=42\n", + ")\n", + "\n", + "# Add a 'cohort' column that matches the old format (first_treat is already there)\n", + "df['cohort'] = df['first_treat']\n", + "\n", + "print(f\"Dataset: {len(df)} observations, {df['unit'].nunique()} units, {df['period'].nunique()} periods\")\n", + "df.head(10)" + ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.165680Z", - "iopub.status.busy": "2026-01-18T18:23:54.165600Z", - "iopub.status.idle": "2026-01-18T18:23:54.168804Z", - "shell.execute_reply": "2026-01-18T18:23:54.168561Z" + "iopub.execute_input": "2026-01-20T14:37:23.034321Z", + "iopub.status.busy": "2026-01-20T14:37:23.034210Z", + "iopub.status.idle": "2026-01-20T14:37:23.037701Z", + "shell.execute_reply": "2026-01-20T14:37:23.037433Z" } }, "outputs": [], @@ -115,10 +139,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.169846Z", - "iopub.status.busy": "2026-01-18T18:23:54.169789Z", - "iopub.status.idle": "2026-01-18T18:23:54.173444Z", - "shell.execute_reply": "2026-01-18T18:23:54.173244Z" + "iopub.execute_input": "2026-01-20T14:37:23.038882Z", + "iopub.status.busy": "2026-01-20T14:37:23.038806Z", + "iopub.status.idle": "2026-01-20T14:37:23.046891Z", + "shell.execute_reply": "2026-01-20T14:37:23.046635Z" } }, "outputs": [], @@ -160,10 +184,10 @@ "execution_count": null, 
"metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.174452Z", - "iopub.status.busy": "2026-01-18T18:23:54.174400Z", - "iopub.status.idle": "2026-01-18T18:23:54.180139Z", - "shell.execute_reply": "2026-01-18T18:23:54.179950Z" + "iopub.execute_input": "2026-01-20T14:37:23.048342Z", + "iopub.status.busy": "2026-01-20T14:37:23.048265Z", + "iopub.status.idle": "2026-01-20T14:37:23.056285Z", + "shell.execute_reply": "2026-01-20T14:37:23.056047Z" } }, "outputs": [], @@ -188,10 +212,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.181019Z", - "iopub.status.busy": "2026-01-18T18:23:54.180969Z", - "iopub.status.idle": "2026-01-18T18:23:54.283154Z", - "shell.execute_reply": "2026-01-18T18:23:54.282934Z" + "iopub.execute_input": "2026-01-20T14:37:23.057419Z", + "iopub.status.busy": "2026-01-20T14:37:23.057355Z", + "iopub.status.idle": "2026-01-20T14:37:23.190283Z", + "shell.execute_reply": "2026-01-20T14:37:23.189957Z" } }, "outputs": [], @@ -233,10 +257,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.284227Z", - "iopub.status.busy": "2026-01-18T18:23:54.284141Z", - "iopub.status.idle": "2026-01-18T18:23:54.289412Z", - "shell.execute_reply": "2026-01-18T18:23:54.289217Z" + "iopub.execute_input": "2026-01-20T14:37:23.191535Z", + "iopub.status.busy": "2026-01-20T14:37:23.191440Z", + "iopub.status.idle": "2026-01-20T14:37:23.218970Z", + "shell.execute_reply": "2026-01-20T14:37:23.218711Z" } }, "outputs": [], @@ -275,10 +299,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.290406Z", - "iopub.status.busy": "2026-01-18T18:23:54.290354Z", - "iopub.status.idle": "2026-01-18T18:23:54.291918Z", - "shell.execute_reply": "2026-01-18T18:23:54.291715Z" + "iopub.execute_input": "2026-01-20T14:37:23.220213Z", + "iopub.status.busy": "2026-01-20T14:37:23.220143Z", + "iopub.status.idle": 
"2026-01-20T14:37:23.222201Z", + "shell.execute_reply": "2026-01-20T14:37:23.221951Z" } }, "outputs": [], @@ -298,10 +322,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.292699Z", - "iopub.status.busy": "2026-01-18T18:23:54.292649Z", - "iopub.status.idle": "2026-01-18T18:23:54.295835Z", - "shell.execute_reply": "2026-01-18T18:23:54.295677Z" + "iopub.execute_input": "2026-01-20T14:37:23.223255Z", + "iopub.status.busy": "2026-01-20T14:37:23.223189Z", + "iopub.status.idle": "2026-01-20T14:37:23.227098Z", + "shell.execute_reply": "2026-01-20T14:37:23.226868Z" } }, "outputs": [], @@ -326,10 +350,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.296754Z", - "iopub.status.busy": "2026-01-18T18:23:54.296696Z", - "iopub.status.idle": "2026-01-18T18:23:54.298177Z", - "shell.execute_reply": "2026-01-18T18:23:54.297978Z" + "iopub.execute_input": "2026-01-20T14:37:23.228183Z", + "iopub.status.busy": "2026-01-20T14:37:23.228109Z", + "iopub.status.idle": "2026-01-20T14:37:23.229832Z", + "shell.execute_reply": "2026-01-20T14:37:23.229620Z" } }, "outputs": [], @@ -347,10 +371,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.299040Z", - "iopub.status.busy": "2026-01-18T18:23:54.298981Z", - "iopub.status.idle": "2026-01-18T18:23:54.300379Z", - "shell.execute_reply": "2026-01-18T18:23:54.300207Z" + "iopub.execute_input": "2026-01-20T14:37:23.230821Z", + "iopub.status.busy": "2026-01-20T14:37:23.230751Z", + "iopub.status.idle": "2026-01-20T14:37:23.232362Z", + "shell.execute_reply": "2026-01-20T14:37:23.232135Z" } }, "outputs": [], @@ -367,10 +391,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.301184Z", - "iopub.status.busy": "2026-01-18T18:23:54.301120Z", - "iopub.status.idle": "2026-01-18T18:23:54.302604Z", - "shell.execute_reply": 
"2026-01-18T18:23:54.302425Z" + "iopub.execute_input": "2026-01-20T14:37:23.233366Z", + "iopub.status.busy": "2026-01-20T14:37:23.233302Z", + "iopub.status.idle": "2026-01-20T14:37:23.235186Z", + "shell.execute_reply": "2026-01-20T14:37:23.234965Z" } }, "outputs": [], @@ -412,10 +436,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.303485Z", - "iopub.status.busy": "2026-01-18T18:23:54.303436Z", - "iopub.status.idle": "2026-01-18T18:23:54.309925Z", - "shell.execute_reply": "2026-01-18T18:23:54.309739Z" + "iopub.execute_input": "2026-01-20T14:37:23.236227Z", + "iopub.status.busy": "2026-01-20T14:37:23.236160Z", + "iopub.status.idle": "2026-01-20T14:37:23.246416Z", + "shell.execute_reply": "2026-01-20T14:37:23.246159Z" } }, "outputs": [], @@ -452,10 +476,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.310776Z", - "iopub.status.busy": "2026-01-18T18:23:54.310721Z", - "iopub.status.idle": "2026-01-18T18:23:54.312619Z", - "shell.execute_reply": "2026-01-18T18:23:54.312433Z" + "iopub.execute_input": "2026-01-20T14:37:23.247555Z", + "iopub.status.busy": "2026-01-20T14:37:23.247475Z", + "iopub.status.idle": "2026-01-20T14:37:23.249720Z", + "shell.execute_reply": "2026-01-20T14:37:23.249504Z" } }, "outputs": [], @@ -492,10 +516,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.313659Z", - "iopub.status.busy": "2026-01-18T18:23:54.313599Z", - "iopub.status.idle": "2026-01-18T18:23:54.343103Z", - "shell.execute_reply": "2026-01-18T18:23:54.342886Z" + "iopub.execute_input": "2026-01-20T14:37:23.250809Z", + "iopub.status.busy": "2026-01-20T14:37:23.250729Z", + "iopub.status.idle": "2026-01-20T14:37:23.288942Z", + "shell.execute_reply": "2026-01-20T14:37:23.288685Z" } }, "outputs": [], @@ -521,10 +545,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": 
"2026-01-18T18:23:54.344073Z", - "iopub.status.busy": "2026-01-18T18:23:54.344016Z", - "iopub.status.idle": "2026-01-18T18:23:54.382161Z", - "shell.execute_reply": "2026-01-18T18:23:54.381956Z" + "iopub.execute_input": "2026-01-20T14:37:23.290133Z", + "iopub.status.busy": "2026-01-20T14:37:23.290051Z", + "iopub.status.idle": "2026-01-20T14:37:23.342093Z", + "shell.execute_reply": "2026-01-20T14:37:23.341852Z" } }, "outputs": [], @@ -557,10 +581,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.383226Z", - "iopub.status.busy": "2026-01-18T18:23:54.383149Z", - "iopub.status.idle": "2026-01-18T18:23:54.388355Z", - "shell.execute_reply": "2026-01-18T18:23:54.388156Z" + "iopub.execute_input": "2026-01-20T14:37:23.343322Z", + "iopub.status.busy": "2026-01-20T14:37:23.343236Z", + "iopub.status.idle": "2026-01-20T14:37:23.349197Z", + "shell.execute_reply": "2026-01-20T14:37:23.348958Z" } }, "outputs": [], @@ -600,10 +624,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.389338Z", - "iopub.status.busy": "2026-01-18T18:23:54.389272Z", - "iopub.status.idle": "2026-01-18T18:23:54.393588Z", - "shell.execute_reply": "2026-01-18T18:23:54.393388Z" + "iopub.execute_input": "2026-01-20T14:37:23.350293Z", + "iopub.status.busy": "2026-01-20T14:37:23.350229Z", + "iopub.status.idle": "2026-01-20T14:37:23.355209Z", + "shell.execute_reply": "2026-01-20T14:37:23.354990Z" } }, "outputs": [], @@ -639,10 +663,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.394514Z", - "iopub.status.busy": "2026-01-18T18:23:54.394463Z", - "iopub.status.idle": "2026-01-18T18:23:54.404705Z", - "shell.execute_reply": "2026-01-18T18:23:54.404487Z" + "iopub.execute_input": "2026-01-20T14:37:23.356328Z", + "iopub.status.busy": "2026-01-20T14:37:23.356250Z", + "iopub.status.idle": "2026-01-20T14:37:23.378022Z", + "shell.execute_reply": 
"2026-01-20T14:37:23.377751Z" } }, "outputs": [], @@ -674,7 +698,11 @@ "source": [ "## 11. Comparing with MultiPeriodDiD\n", "\n", - "For comparison, here's how you would use `MultiPeriodDiD` which estimates period-specific effects but doesn't handle staggered adoption as carefully." + "For comparison, here's how you would use `MultiPeriodDiD` which estimates period-specific effects. \n", + "\n", + "**Important**: `MultiPeriodDiD` assumes **simultaneous treatment timing** (all treated units get treated at the same time). For staggered adoption, always use `CallawaySantAnna` or `SunAbraham` instead.\n", + "\n", + "To demonstrate `MultiPeriodDiD` properly, we'll create a simple dataset where all treated units receive treatment at the same time." ] }, { @@ -682,24 +710,38 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.405752Z", - "iopub.status.busy": "2026-01-18T18:23:54.405696Z", - "iopub.status.idle": "2026-01-18T18:23:54.408664Z", - "shell.execute_reply": "2026-01-18T18:23:54.408454Z" + "iopub.execute_input": "2026-01-20T14:37:23.379194Z", + "iopub.status.busy": "2026-01-20T14:37:23.379121Z", + "iopub.status.idle": "2026-01-20T14:37:23.388215Z", + "shell.execute_reply": "2026-01-20T14:37:23.387986Z" } }, "outputs": [], "source": [ - "# For this comparison, let's use data from cohort 3 only (single treatment timing)\n", - "cohort3_df = df[df['cohort'].isin([0, 3])].copy()\n", + "# Create a simple dataset with simultaneous treatment timing\n", + "# This is the appropriate data structure for MultiPeriodDiD\n", + "from diff_diff import generate_did_data\n", + "\n", + "# Generate data with simultaneous treatment at period 4\n", + "mp_data = generate_did_data(\n", + " n_units=100,\n", + " n_periods=8,\n", + " treatment_period=4, # All treated units get treatment at period 4\n", + " treatment_fraction=0.5,\n", + " treatment_effect=2.5,\n", + " seed=42\n", + ")\n", + "\n", + "print(f\"MultiPeriodDiD dataset: 
{len(mp_data)} obs\")\n", + "print(f\"Treatment starts at period 4 for all treated units\")\n", "\n", "mp_did = MultiPeriodDiD()\n", "results_mp = mp_did.fit(\n", - " cohort3_df,\n", + " mp_data,\n", " outcome=\"outcome\",\n", " treatment=\"treated\",\n", " time=\"period\",\n", - " post_periods=[3, 4, 5, 6, 7]\n", + " post_periods=[4, 5, 6, 7]\n", ")\n", "\n", "print(results_mp.summary())" @@ -710,10 +752,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.409636Z", - "iopub.status.busy": "2026-01-18T18:23:54.409582Z", - "iopub.status.idle": "2026-01-18T18:23:54.411076Z", - "shell.execute_reply": "2026-01-18T18:23:54.410866Z" + "iopub.execute_input": "2026-01-20T14:37:23.389404Z", + "iopub.status.busy": "2026-01-20T14:37:23.389320Z", + "iopub.status.idle": "2026-01-20T14:37:23.391035Z", + "shell.execute_reply": "2026-01-20T14:37:23.390778Z" } }, "outputs": [], @@ -749,10 +791,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.412012Z", - "iopub.status.busy": "2026-01-18T18:23:54.411958Z", - "iopub.status.idle": "2026-01-18T18:23:54.422080Z", - "shell.execute_reply": "2026-01-18T18:23:54.421878Z" + "iopub.execute_input": "2026-01-20T14:37:23.392190Z", + "iopub.status.busy": "2026-01-20T14:37:23.392111Z", + "iopub.status.idle": "2026-01-20T14:37:23.410168Z", + "shell.execute_reply": "2026-01-20T14:37:23.409921Z" } }, "outputs": [], @@ -780,10 +822,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.423081Z", - "iopub.status.busy": "2026-01-18T18:23:54.423018Z", - "iopub.status.idle": "2026-01-18T18:23:54.425058Z", - "shell.execute_reply": "2026-01-18T18:23:54.424845Z" + "iopub.execute_input": "2026-01-20T14:37:23.411166Z", + "iopub.status.busy": "2026-01-20T14:37:23.411095Z", + "iopub.status.idle": "2026-01-20T14:37:23.413556Z", + "shell.execute_reply": "2026-01-20T14:37:23.413310Z" } }, "outputs": [], @@ 
-819,10 +861,10 @@ "execution_count": null, "metadata": { "execution": { - "iopub.execute_input": "2026-01-18T18:23:54.426005Z", - "iopub.status.busy": "2026-01-18T18:23:54.425952Z", - "iopub.status.idle": "2026-01-18T18:23:54.427888Z", - "shell.execute_reply": "2026-01-18T18:23:54.427679Z" + "iopub.execute_input": "2026-01-20T14:37:23.414763Z", + "iopub.status.busy": "2026-01-20T14:37:23.414691Z", + "iopub.status.idle": "2026-01-20T14:37:23.417345Z", + "shell.execute_reply": "2026-01-20T14:37:23.417101Z" } }, "outputs": [], @@ -907,7 +949,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.12.4" + "version": "3.9.6" } }, "nbformat": 4, diff --git a/rust/src/linalg.rs b/rust/src/linalg.rs index de7dace6..79c5f2e2 100644 --- a/rust/src/linalg.rs +++ b/rust/src/linalg.rs @@ -6,13 +6,27 @@ //! - Cluster-robust variance-covariance estimation use ndarray::{Array1, Array2, ArrayView1, ArrayView2, Axis}; -use ndarray_linalg::{FactorizeC, LeastSquaresSvd, Solve, SolveC, UPLO}; +use ndarray_linalg::{FactorizeC, Solve, SolveC, SVD, UPLO}; use numpy::{IntoPyArray, PyArray1, PyArray2, PyReadonlyArray1, PyReadonlyArray2}; use pyo3::prelude::*; use std::collections::HashMap; /// Solve OLS regression: β = (X'X)^{-1} X'y /// +/// Uses SVD with truncation for rank-deficient matrices: +/// - Computes SVD: X = U * S * V^T +/// - Truncates singular values below rcond * max(S) +/// - Computes solution: β = V * S^{-1}_truncated * U^T * y +/// +/// This matches scipy's 'gelsd' driver behavior for handling rank-deficient +/// design matrices that can occur in DiD estimation (e.g., MultiPeriodDiD +/// with redundant period dummies + treatment interactions). +/// +/// For rank-deficient matrices (rank < k), the vcov matrix is filled with NaN +/// since the sandwich estimator requires inverting the singular X'X matrix. +/// The Python wrapper should use the full R-style handling with QR pivoting +/// for proper rank-deficiency support. 
+/// /// # Arguments /// * `x` - Design matrix (n, k) /// * `y` - Response vector (n,) @@ -20,7 +34,8 @@ use std::collections::HashMap; /// * `return_vcov` - Whether to compute and return variance-covariance matrix /// /// # Returns -/// Tuple of (coefficients, residuals, vcov) where vcov is None if return_vcov=False +/// Tuple of (coefficients, residuals, vcov) where vcov is None if return_vcov=False, +/// or NaN-filled matrix if rank-deficient #[pyfunction] #[pyo3(signature = (x, y, cluster_ids=None, return_vcov=true))] pub fn solve_ols<'py>( @@ -37,25 +52,69 @@ let x_arr = x.as_array(); let y_arr = y.as_array(); - // Solve least squares using SVD (more stable than normal equations) + let n = x_arr.nrows(); + let k = x_arr.ncols(); + + // Solve using SVD with truncation for rank-deficient matrices + // This matches scipy's 'gelsd' behavior let x_owned = x_arr.to_owned(); let y_owned = y_arr.to_owned(); - let result = x_owned - .least_squares(&y_owned) - .map_err(|e| PyErr::new::(format!("Least squares failed: {}", e)))?; + // Compute SVD: X = U * S * V^T + let (u_opt, s, vt_opt) = x_owned + .svd(true, true) + .map_err(|e| PyErr::new::(format!("SVD failed: {}", e)))?; + + let u = u_opt.ok_or_else(|| { + PyErr::new::("SVD did not return U matrix") + })?; + let vt = vt_opt.ok_or_else(|| { + PyErr::new::("SVD did not return V^T matrix") + })?; + + // Compute rcond threshold to match R's lm() behavior + // R's qr() uses tol = 1e-07 by default (note: larger than sqrt(eps) ≈ 1.49e-08) + // We use 1e-07 for consistency with the Python backend and R + let rcond = 1e-07_f64; + let s_max = s.iter().cloned().fold(0.0_f64, f64::max); + let threshold = s_max * rcond; + + // Compute truncated pseudoinverse solution: β = V * S^{-1} * U^T * y + // Singular values below threshold are treated as zero (truncated) + let uty = u.t().dot(&y_owned); // length n; only the first min(n, k) entries pair with singular values + + // Build S^{-1} with truncation and count effective rank + let mut s_inv_uty = Array1::<f64>::zeros(k); + let mut
rank = 0usize; + for i in 0..s.len().min(k) { + if s[i] > threshold { + s_inv_uty[i] = uty[i] / s[i]; + rank += 1; + } + // else: leave as 0 (truncate this singular value) + } - let coefficients = result.solution; + // Compute coefficients: β = V * (S^{-1} * U^T * y) + let coefficients = vt.t().dot(&s_inv_uty); // Compute fitted values and residuals let fitted = x_arr.dot(&coefficients); let residuals = &y_arr - &fitted; // Compute variance-covariance if requested + // For rank-deficient matrices, return NaN vcov since X'X is singular let vcov = if return_vcov { - let cluster_arr = cluster_ids.as_ref().map(|c| c.as_array().to_owned()); - let vcov_arr = compute_robust_vcov_internal(&x_arr, &residuals.view(), cluster_arr.as_ref())?; - Some(vcov_arr.into_pyarray(py)) + if rank < k { + // Rank-deficient: cannot compute valid vcov, return NaN matrix + let mut nan_vcov = Array2::<f64>::zeros((k, k)); + nan_vcov.fill(f64::NAN); + Some(nan_vcov.into_pyarray(py)) + } else { + // Full rank: compute robust vcov normally + let cluster_arr = cluster_ids.as_ref().map(|c| c.as_array().to_owned()); + let vcov_arr = compute_robust_vcov_internal(&x_arr, &residuals.view(), cluster_arr.as_ref())?; + Some(vcov_arr.into_pyarray(py)) + } } else { None }; @@ -224,7 +283,9 @@ fn invert_symmetric(a: &Array2<f64>) -> PyResult<Array2<f64>> { let col = a.solve(&e_i).map_err(|e| { PyErr::new::(format!( - "Matrix inversion failed: {}", + "Matrix inversion failed (likely rank-deficient X'X): {}. 
\ + If the design matrix is rank-deficient, use solve_ols without \ + skip_rank_check=True to enable R-style handling.", e )) })?; diff --git a/tests/test_estimators.py b/tests/test_estimators.py index a2e383f9..141bd4e7 100644 --- a/tests/test_estimators.py +++ b/tests/test_estimators.py @@ -211,6 +211,79 @@ def test_unfitted_model_error(self): with pytest.raises(RuntimeError, match="fitted"): did.summary() + def test_rank_deficient_action_error_raises(self, simple_2x2_data): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + # Add a covariate that is perfectly collinear with treatment + data = simple_2x2_data.copy() + data["collinear_cov"] = data["treated"].copy() + + did = DifferenceInDifferences(rank_deficient_action="error") + with pytest.raises(ValueError, match="rank-deficient"): + did.fit( + data, + outcome="outcome", + treatment="treated", + time="post", + covariates=["collinear_cov"] + ) + + def test_rank_deficient_action_silent_no_warning(self, simple_2x2_data): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + + # Add a covariate that is perfectly collinear with treatment + data = simple_2x2_data.copy() + data["collinear_cov"] = data["treated"].copy() + + did = DifferenceInDifferences(rank_deficient_action="silent") + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = did.fit( + data, + outcome="outcome", + treatment="treated", + time="post", + covariates=["collinear_cov"] + ) + + # No warnings about rank deficiency should be emitted + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still have NaN for dropped coefficient + assert "collinear_cov" in results.coefficients + # Either collinear_cov or treated will be NaN + has_nan = 
(np.isnan(results.coefficients.get("collinear_cov", 0)) or + np.isnan(results.coefficients.get("treated", 0))) + assert has_nan, "Expected NaN for one of the collinear coefficients" + + def test_rank_deficient_action_warn_default(self, simple_2x2_data): + """Test that rank_deficient_action='warn' (default) emits warning.""" + import warnings + + # Add a covariate that is perfectly collinear with treatment + data = simple_2x2_data.copy() + data["collinear_cov"] = data["treated"].copy() + + did = DifferenceInDifferences() # Default is "warn" + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = did.fit( + data, + outcome="outcome", + treatment="treated", + time="post", + covariates=["collinear_cov"] + ) + + # Should have a warning about rank deficiency + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + class TestDiDResults: """Tests for DiDResults class.""" @@ -704,7 +777,9 @@ class TestEdgeCases: """Tests for edge cases and robustness.""" def test_multicollinearity_detection(self): - """Test that perfect multicollinearity is detected.""" + """Test that perfect multicollinearity is detected and warning is emitted.""" + import warnings + # Create data where a covariate is perfectly correlated with treatment data = pd.DataFrame({ "outcome": [10, 11, 15, 18, 9, 10, 12, 13], @@ -714,14 +789,23 @@ def test_multicollinearity_detection(self): }) did = DifferenceInDifferences() - with pytest.raises(ValueError, match="rank-deficient"): - did.fit( + + # With R-style rank deficiency handling, a warning is emitted + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + result = did.fit( data, outcome="outcome", treatment="treated", time="post", covariates=["duplicate_treated"] ) + # Should emit a warning about rank deficiency + rank_warnings = [x for x in w if 
"Rank-deficient" in str(x.message)] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + + # ATT should still be finite + assert np.isfinite(result.att) def test_wasserstein_custom_threshold(self): """Test that custom Wasserstein threshold is respected.""" @@ -930,6 +1014,102 @@ def test_twfe_clusters_at_unit_level(self, twfe_panel_data): # But the results should still reflect cluster-robust SEs were computed correctly assert results.se > 0 + def test_twfe_treatment_collinearity_raises_error(self): + """Test that TWFE raises informative error when treatment is collinear.""" + from diff_diff.estimators import TwoWayFixedEffects + + # Create data where treatment is perfectly collinear with fixed effects + # (all treated units are treated in all periods) + data = [] + for unit in range(10): + is_treated = unit < 5 + for period in range(4): + data.append({ + "unit": unit, + "period": period, + "treated": int(is_treated), # Same for all periods + "post": 1 if period >= 2 else 0, + "outcome": 10.0 + unit * 0.5 + period * 0.3 + np.random.normal(0, 0.1), + }) + df = pd.DataFrame(data) + + # Make treatment_post constant for treated units (collinear) + # by making treatment only occur in post periods + df_collinear = df.copy() + # This creates perfect collinearity: treatment is perfectly predicted by unit FE + # since treated units always have treated=1 and control units always have treated=0 + + twfe = TwoWayFixedEffects() + + # Should raise or warn about collinearity - depends on what columns get dropped + # The key is that it should NOT silently produce misleading results + try: + results = twfe.fit( + df_collinear, + outcome="outcome", + treatment="treated", + time="post", + unit="unit" + ) + # If we get here without error, the ATT should still be computed + # (this means only covariates were dropped, not the treatment) + assert results is not None + except ValueError as e: + # If treatment column is dropped, should get informative error + assert 
"collinear" in str(e).lower() or "Treatment effect cannot be identified" in str(e) + + def test_rank_deficient_action_error_raises(self, twfe_panel_data): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + from diff_diff.estimators import TwoWayFixedEffects + + # Add a covariate that is perfectly collinear with post + twfe_panel_data = twfe_panel_data.copy() + twfe_panel_data["collinear_cov"] = twfe_panel_data["post"].copy() + + twfe = TwoWayFixedEffects(rank_deficient_action="error") + with pytest.raises(ValueError, match="rank-deficient"): + twfe.fit( + twfe_panel_data, + outcome="outcome", + treatment="treated", + time="post", + unit="unit", + covariates=["collinear_cov"] + ) + + def test_rank_deficient_action_silent_no_warning(self, twfe_panel_data): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + from diff_diff.estimators import TwoWayFixedEffects + + # Add a covariate that is perfectly collinear with another + twfe_panel_data = twfe_panel_data.copy() + twfe_panel_data["size"] = np.random.normal(100, 10, len(twfe_panel_data)) + twfe_panel_data["size_dup"] = twfe_panel_data["size"].copy() # Perfect collinearity + + twfe = TwoWayFixedEffects(rank_deficient_action="silent") + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = twfe.fit( + twfe_panel_data, + outcome="outcome", + treatment="treated", + time="post", + unit="unit", + covariates=["size", "size_dup"] + ) + + # No warnings about rank deficiency or collinearity should be emitted + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower() + or "collinear" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still get valid results + assert results is not None + assert twfe.is_fitted_ + class TestClusterRobustSE: """Tests for cluster-robust standard errors.""" 
@@ -1569,6 +1749,137 @@ def test_coefficients_dict(self, multi_period_data): # Treatment interactions assert any("treated:period_" in k for k in results.coefficients) + def test_rank_deficient_design_warns_and_sets_nan(self, multi_period_data): + """Test that rank-deficient design matrix warns and sets NaN for dropped columns.""" + import warnings + + # Add a covariate that is perfectly collinear with an existing column + # Use exact duplicate to ensure perfect collinearity is detected + multi_period_data = multi_period_data.copy() + multi_period_data["collinear_cov"] = multi_period_data["treated"].copy() + + did = MultiPeriodDiD() + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = did.fit( + multi_period_data, + outcome="outcome", + treatment="treated", + time="period", + post_periods=[3, 4, 5], + covariates=["collinear_cov"] + ) + + # Should have warning about rank deficiency + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "collinear" in str(x.message).lower()] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + + # The collinear covariate should have NaN coefficient + assert "collinear_cov" in results.coefficients + assert np.isnan(results.coefficients["collinear_cov"]), \ + "Collinear covariate coefficient should be NaN" + + # Treatment effects should still be identified (not NaN) + for period in [3, 4, 5]: + pe = results.period_effects[period] + assert not np.isnan(pe.effect), f"Period {period} effect should be identified" + assert not np.isnan(pe.se), f"Period {period} SE should be valid" + assert pe.se > 0, f"Period {period} SE should be positive" + + # Vcov should have NaN for the collinear column + assert results.vcov is not None + assert np.any(np.isnan(results.vcov)), "Vcov should have NaN for dropped column" + + # avg_att should still be computed because all period effects are identified + assert not np.isnan(results.avg_att), "avg_att should be valid when 
all period effects are identified" + + def test_avg_att_nan_when_period_effect_nan(self, multi_period_data): + """Test that avg_att is NaN if any period effect is NaN (R-style NA propagation).""" + import warnings + + # Remove all treated observations in period 3 to make that interaction + # unidentified (column of zeros) + data_no_treated_period3 = multi_period_data[ + ~((multi_period_data["treated"] == 1) & (multi_period_data["period"] == 3)) + ].copy() + + did = MultiPeriodDiD() + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = did.fit( + data_no_treated_period3, + outcome="outcome", + treatment="treated", + time="period", + post_periods=[3, 4, 5] + ) + + # Should have warning about rank deficiency (treated:period_3 is all zeros) + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "collinear" in str(x.message).lower()] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + + # The treated×period_3 interaction should have NaN coefficient (unidentified) + pe_3 = results.period_effects[3] + assert np.isnan(pe_3.effect), "Period 3 effect should be NaN (unidentified)" + + # avg_att should be NaN because one period effect is NaN (R-style NA propagation) + assert np.isnan(results.avg_att), "avg_att should be NaN when any period effect is NaN" + assert np.isnan(results.avg_se), "avg_se should be NaN when avg_att is NaN" + assert np.isnan(results.avg_t_stat), "avg_t_stat should be NaN when avg_att is NaN" + assert np.isnan(results.avg_p_value), "avg_p_value should be NaN when avg_att is NaN" + + def test_rank_deficient_action_error_raises(self, multi_period_data): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + # Add a covariate that is perfectly collinear with treatment + multi_period_data = multi_period_data.copy() + multi_period_data["collinear_cov"] = multi_period_data["treated"].copy() + + did = 
MultiPeriodDiD(rank_deficient_action="error") + with pytest.raises(ValueError, match="rank-deficient"): + did.fit( + multi_period_data, + outcome="outcome", + treatment="treated", + time="period", + post_periods=[3, 4, 5], + covariates=["collinear_cov"] + ) + + def test_rank_deficient_action_silent_no_warning(self, multi_period_data): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + + # Add a covariate that is perfectly collinear with treatment + multi_period_data = multi_period_data.copy() + multi_period_data["collinear_cov"] = multi_period_data["treated"].copy() + + did = MultiPeriodDiD(rank_deficient_action="silent") + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = did.fit( + multi_period_data, + outcome="outcome", + treatment="treated", + time="period", + post_periods=[3, 4, 5], + covariates=["collinear_cov"] + ) + + # No warnings about rank deficiency should be emitted + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still have NaN for dropped coefficient + assert "collinear_cov" in results.coefficients + assert np.isnan(results.coefficients["collinear_cov"]), \ + "Collinear covariate coefficient should be NaN" + class TestSyntheticDiD: """Tests for SyntheticDiD estimator.""" @@ -2595,14 +2906,15 @@ def test_sdid_single_treated_unit(self): class TestCollinearityDetection: """Tests for handling perfect or near collinearity.""" - def test_did_with_redundant_covariate_handles_gracefully(self): - """Test DiD handles perfectly collinear covariates gracefully. + def test_did_with_redundant_covariate_emits_warning(self): + """Test DiD emits warning for perfectly collinear covariates. - Note: scipy's gelsy solver handles rank-deficiency gracefully via - QR with column pivoting. 
Results may be numerically unstable but - the computation completes. This is acceptable for performance reasons - as the alternative (upfront rank check) adds O(k^3) overhead. + Following R's lm() approach, rank-deficient matrices emit a warning + and set NaN for coefficients of dropped columns. The ATT should still + be identified if the treatment interaction is not in the dropped set. """ + import warnings + np.random.seed(42) data = pd.DataFrame({ "outcome": np.random.normal(10, 1, 100), @@ -2615,20 +2927,32 @@ def test_did_with_redundant_covariate_handles_gracefully(self): did = DifferenceInDifferences() - # With scipy's gelsy solver, collinear covariates are handled gracefully - # (may produce numerically unstable but finite results) - result = did.fit( - data, - outcome="outcome", - treatment="treated", - time="post", - covariates=["x1", "x2"] - ) - # Result should be finite (even if numerically unstable) + # With R-style rank deficiency handling, a warning is emitted + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + result = did.fit( + data, + outcome="outcome", + treatment="treated", + time="post", + covariates=["x1", "x2"] + ) + # Should emit a warning about rank deficiency + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message)] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + + # ATT should still be finite (the interaction term is identified) assert np.isfinite(result.att) - def test_did_with_constant_covariate_raises_error(self): - """Test DiD raises clear error for constant covariates.""" + def test_did_with_constant_covariate_emits_warning(self): + """Test DiD emits warning for constant covariates. + + Constant covariates are collinear with the intercept and are dropped. + Following R's lm() approach, a warning is emitted and the coefficient + for the constant covariate is set to NaN. 
+ """ + import warnings + np.random.seed(42) data = pd.DataFrame({ "outcome": np.random.normal(10, 1, 100), @@ -2640,15 +2964,22 @@ def test_did_with_constant_covariate_raises_error(self): did = DifferenceInDifferences() # Constant covariate is collinear with intercept - # Should raise clear error - with pytest.raises(ValueError, match="rank-deficient"): - did.fit( + # Should emit warning about rank deficiency + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + result = did.fit( data, outcome="outcome", treatment="treated", time="post", covariates=["constant_x"] ) + # Should emit a warning about rank deficiency + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message)] + assert len(rank_warnings) > 0, "Expected warning about rank deficiency" + + # ATT should still be finite + assert np.isfinite(result.att) def test_did_with_near_collinear_covariates(self): """Test DiD handles near-collinear covariates (not perfectly collinear).""" diff --git a/tests/test_linalg.py b/tests/test_linalg.py index 7fab517b..33b3e5f4 100644 --- a/tests/test_linalg.py +++ b/tests/test_linalg.py @@ -186,34 +186,230 @@ def test_inf_in_x_raises_error(self): solve_ols(X, y) def test_check_finite_false_skips_validation(self): - """Test that check_finite=False skips NaN/Inf validation.""" + """Test that check_finite=False skips the upfront NaN/Inf validation. + + Note: With the 'gelsd' driver, LAPACK may still error on NaN values + during computation, which is actually safer than producing garbage. + """ X = np.random.randn(100, 2) X[50, 0] = np.nan y = np.random.randn(100) - # Should not raise, but will return garbage results - coef, resid, vcov = solve_ols(X, y, check_finite=False) - # Coefficients will contain NaN due to bad input - assert np.isnan(coef).any() or np.isinf(coef).any() + # The gelsd driver may raise an error when encountering NaN during + # computation, or produce garbage results. 
Either is acceptable + # (the key is that we don't raise the "X contains NaN" user-friendly error) + try: + coef, resid, vcov = solve_ols(X, y, check_finite=False) + # If it completed, coefficients should contain NaN/Inf due to bad input + assert np.isnan(coef).any() or np.isinf(coef).any() + except ValueError as e: + # LAPACK may raise an error on NaN values (gelsd behavior) + # This is acceptable - the key is we skipped our own validation + assert "X contains NaN" not in str(e) and "y contains NaN" not in str(e) + + def test_rank_deficient_produces_nan_for_dropped_columns(self): + """Test that rank-deficient matrix returns NaN for dropped columns. + + Following R's lm() approach, coefficients for linearly dependent columns + are set to NaN while identified coefficients are computed normally. + """ + import warnings - def test_rank_deficient_still_solves(self): - """Test that rank-deficient matrix still returns a solution. + np.random.seed(42) + X = np.random.randn(100, 3) + X[:, 2] = X[:, 0] + X[:, 1] # Perfect collinearity: col 2 = col 0 + col 1 + y = np.random.randn(100) - Note: The gelsy driver doesn't always detect rank deficiency, - but it still returns a valid least-squares solution. 
- """ + # Should emit warning about rank deficiency + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coef, resid, vcov = solve_ols(X, y) + assert len(w) == 1 + assert "Rank-deficient" in str(w[0].message) + + assert coef.shape == (3,) + assert resid.shape == (100,) + + # Exactly one coefficient should be NaN (the dropped one) + nan_mask = np.isnan(coef) + assert np.sum(nan_mask) == 1, f"Expected 1 NaN coefficient, got {np.sum(nan_mask)}: {coef}" + + # Non-NaN coefficients should be finite and reasonable + finite_coef = coef[~nan_mask] + assert np.all(np.isfinite(finite_coef)), f"Finite coefficients contain non-finite values: {finite_coef}" + assert np.all(np.abs(finite_coef) < 1e6), f"Finite coefficients are unreasonably large: {finite_coef}" + + # VCoV should have NaN for dropped column's row and column + assert vcov is not None + dropped_idx = np.where(nan_mask)[0][0] + assert np.all(np.isnan(vcov[dropped_idx, :])), "VCoV row for dropped column should be NaN" + assert np.all(np.isnan(vcov[:, dropped_idx])), "VCoV column for dropped column should be NaN" + + # VCoV for identified coefficients should be finite + kept_idx = np.where(~nan_mask)[0] + vcov_kept = vcov[np.ix_(kept_idx, kept_idx)] + assert np.all(np.isfinite(vcov_kept)), "VCoV for kept coefficients should be finite" + + # Residuals should be finite (computed using only identified coefficients) + assert np.all(np.isfinite(resid)), f"Residuals contain non-finite values" + + def test_rank_deficient_error_mode(self): + """Test that rank_deficient_action='error' raises ValueError.""" np.random.seed(42) X = np.random.randn(100, 3) X[:, 2] = X[:, 0] + X[:, 1] # Perfect collinearity y = np.random.randn(100) - # Should still complete and return valid output - coef, resid, vcov = solve_ols(X, y) + with pytest.raises(ValueError, match="rank-deficient"): + solve_ols(X, y, rank_deficient_action="error") + + def test_rank_deficient_silent_mode(self): + """Test that 
rank_deficient_action='silent' produces no warning.""" + import warnings + np.random.seed(42) + X = np.random.randn(100, 3) + X[:, 2] = X[:, 0] + X[:, 1] # Perfect collinearity + y = np.random.randn(100) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coef, resid, vcov = solve_ols(X, y, rank_deficient_action="silent") + # No warnings should be emitted + assert len(w) == 0, f"Expected no warnings, got {len(w)}: {[str(x.message) for x in w]}" + + # Should still produce NaN for dropped column + assert np.sum(np.isnan(coef)) == 1 + + def test_rank_deficient_column_names_in_warning(self): + """Test that column names appear in warning message.""" + import warnings + + np.random.seed(42) + X = np.random.randn(100, 3) + X[:, 2] = X[:, 0] + X[:, 1] # Perfect collinearity + y = np.random.randn(100) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coef, resid, vcov = solve_ols( + X, y, + column_names=["intercept", "x1", "x2_collinear"] + ) + assert len(w) == 1 + # Column name should appear in warning (not just index) + assert "x2_collinear" in str(w[0].message) or "intercept" in str(w[0].message) or "x1" in str(w[0].message) + + def test_skip_rank_check_bypasses_qr_decomposition(self): + """Test that skip_rank_check=True skips QR rank detection. + + When skip_rank_check=True, the function should skip QR decomposition + and go directly to SVD solving, even in Python backend. 
+ """ + import warnings + import os + + np.random.seed(42) + n = 100 + X = np.random.randn(n, 3) + y = np.random.randn(n) + + # Force Python backend for this test + old_backend = os.environ.get("DIFF_DIFF_BACKEND") + os.environ["DIFF_DIFF_BACKEND"] = "python" + + try: + # With skip_rank_check=True, should not emit any warnings + # (even if we make X rank-deficient, since we're skipping the check) + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coef, resid, vcov = solve_ols(X, y, skip_rank_check=True) + # No rank-deficiency warning should be emitted + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message)] + assert len(rank_warnings) == 0 + finally: + if old_backend is not None: + os.environ["DIFF_DIFF_BACKEND"] = old_backend + elif "DIFF_DIFF_BACKEND" in os.environ: + del os.environ["DIFF_DIFF_BACKEND"] + + # Should produce valid coefficients assert coef.shape == (3,) - assert resid.shape == (100,) - # Residuals should still be valid (y - X @ coef) - np.testing.assert_allclose(resid, y - X @ coef, rtol=1e-10) + assert np.all(np.isfinite(coef)) + + def test_multiperiod_like_design_full_rank(self): + """Test that MultiPeriodDiD-like design matrices work when full-rank. + + This test creates a properly specified MultiPeriodDiD-like design that + is NOT rank-deficient to verify correct coefficient recovery. 
+ """ + import warnings + + np.random.seed(42) + n = 200 + n_periods = 5 + + # Create a design matrix similar to MultiPeriodDiD: + # [intercept, period_1, period_2, ..., period_k, treated*post] + + # Intercept + intercept = np.ones(n) + + # Period dummies (one-hot encoding for periods 1 to n_periods-1) + # Period 0 is the reference + period_assignment = np.random.randint(0, n_periods, n) + period_dummies = np.zeros((n, n_periods - 1)) + for i in range(1, n_periods): + period_dummies[:, i - 1] = (period_assignment == i).astype(float) + + # Treatment indicator (varies within periods to ensure identification) + treated = np.random.binomial(1, 0.5, n) + + # Post indicator (periods >= 3 are post) + post = (period_assignment >= 3).astype(float) + + # Treatment × post interaction + treat_post = treated * post + + # Build design matrix + X = np.column_stack([intercept, period_dummies, treat_post]) + + # True effect + true_effect = 2.5 + y = ( + 1.0 # intercept effect + + 0.5 * period_dummies[:, 0] # period 1 effect + + 0.3 * period_dummies[:, 1] # period 2 effect + + 0.7 * period_dummies[:, 2] # period 3 effect + + 0.9 * period_dummies[:, 3] # period 4 effect + + true_effect * treat_post # treatment effect + + np.random.randn(n) * 0.5 # noise + ) + + # Fit with solve_ols - should NOT produce warning if full-rank + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coef, resid, vcov = solve_ols(X, y) + # Check if any rank deficiency warnings (may or may not occur) + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message)] + + # If no rank deficiency, all coefficients should be finite + if len(rank_warnings) == 0: + assert np.all(np.isfinite(coef)), f"Full-rank matrix: coefficients should be finite" + assert np.all(np.abs(coef) < 1e6), f"Coefficients are unreasonably large: {coef}" + # The treatment effect coefficient (last one) should be close to true effect + assert abs(coef[-1] - true_effect) < 2.0, ( + f"Treatment effect 
{coef[-1]} is too far from true {true_effect}" + ) + else: + # If rank-deficient, check that identified coefficients are valid + finite_coef = coef[~np.isnan(coef)] + assert np.all(np.isfinite(finite_coef)), f"Identified coefficients should be finite" + # If treatment effect is identified, check it + if not np.isnan(coef[-1]): + assert abs(coef[-1] - true_effect) < 2.0, ( + f"Treatment effect {coef[-1]} is too far from true {true_effect}" + ) def test_single_cluster_error(self): """Test that single cluster raises error.""" @@ -824,6 +1020,239 @@ def test_matches_solve_ols(self, simple_data): np.testing.assert_allclose(reg.fitted_values_, fitted, rtol=1e-10) np.testing.assert_allclose(reg.vcov_, vcov, rtol=1e-10) + def test_rank_deficient_degrees_of_freedom(self): + """Test that degrees of freedom are computed correctly when columns are dropped. + + When a design matrix is rank-deficient, the effective number of parameters + is the rank, not the number of columns. The df should be n - rank. 
+ """ + import warnings + + np.random.seed(42) + n = 100 + # Create rank-deficient matrix: 4 columns but rank 3 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Column 3 = Column 0 + Column 1 + + y = np.random.randn(n) + + with warnings.catch_warnings(record=True): + warnings.simplefilter("always") + reg = LinearRegression(include_intercept=False).fit(X, y) + + # n_params_ should be total columns (4) + assert reg.n_params_ == 4 + + # n_params_effective_ should be the rank (3) + assert reg.n_params_effective_ == 3 + + # df_ should be n - effective_params = 100 - 3 = 97 + assert reg.df_ == n - 3 + + # Verify one coefficient is NaN (the dropped one) + assert np.sum(np.isnan(reg.coefficients_)) == 1 + + def test_rank_deficient_inference_uses_correct_df(self): + """Test that p-values and CIs use the correct df for rank-deficient matrices.""" + import warnings + from scipy import stats + + np.random.seed(42) + n = 100 + # Create rank-deficient matrix + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Perfect collinearity + + # True coefficients for the first 3 columns only + y = 2 * X[:, 0] + 3 * X[:, 1] + np.random.randn(n) * 0.5 + + with warnings.catch_warnings(record=True): + warnings.simplefilter("always") + reg = LinearRegression(include_intercept=False).fit(X, y) + + # Get inference for an identified coefficient + nan_mask = np.isnan(reg.coefficients_) + kept_idx = np.where(~nan_mask)[0][0] # First non-NaN coefficient + result = reg.get_inference(kept_idx) + + # Check that df is correct (should be n - rank = 97) + assert result.df == n - 3, f"Expected df={n-3}, got {result.df}" + + # Manually compute expected values using correct df + coef = result.coefficient + se = result.se + t_stat_expected = coef / se + p_value_expected = 2 * (1 - stats.t.cdf(abs(t_stat_expected), df=n - 3)) + + # Verify t-stat + np.testing.assert_allclose(result.t_stat, t_stat_expected, rtol=1e-10) + + # Verify p-value uses correct df (use 
atol for very small p-values) + np.testing.assert_allclose(result.p_value, p_value_expected, atol=1e-10) + + # Verify CI uses correct df + t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 3) + ci_expected = (coef - t_crit * se, coef + t_crit * se) + np.testing.assert_allclose(result.conf_int, ci_expected, rtol=1e-6) + + def test_rank_deficient_inference_nan_for_dropped_coef(self): + """Test that inference for dropped coefficients returns NaN values.""" + import warnings + + np.random.seed(42) + n = 100 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Column 3 is dropped + y = np.random.randn(n) + + with warnings.catch_warnings(record=True): + warnings.simplefilter("always") + reg = LinearRegression(include_intercept=False).fit(X, y) + + # Find the dropped coefficient index + nan_mask = np.isnan(reg.coefficients_) + dropped_idx = np.where(nan_mask)[0][0] + + # Get inference for dropped coefficient + result = reg.get_inference(dropped_idx) + + # All inference values should be NaN + assert np.isnan(result.coefficient) + assert np.isnan(result.se) + assert np.isnan(result.t_stat) + assert np.isnan(result.p_value) + assert np.isnan(result.conf_int[0]) + assert np.isnan(result.conf_int[1]) + + def test_rank_deficient_predict_uses_identified_coefficients(self): + """Test that predict() works correctly with rank-deficient fits. + + Predictions should use only identified coefficients (treating dropped + coefficients as zero), not produce all-NaN predictions. 
+ """ + import warnings + + np.random.seed(42) + n = 100 + # Create rank-deficient matrix + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Column 3 is collinear + + # True model uses only first 3 columns + y = 2 * X[:, 0] + 3 * X[:, 1] - 1 * X[:, 2] + np.random.randn(n) * 0.5 + + with warnings.catch_warnings(record=True): + warnings.simplefilter("always") + reg = LinearRegression(include_intercept=False).fit(X, y) + + # Verify one coefficient is NaN + assert np.sum(np.isnan(reg.coefficients_)) == 1 + + # Predictions should NOT be all NaN + X_new = np.random.randn(10, 4) + y_pred = reg.predict(X_new) + + assert y_pred.shape == (10,) + assert np.all(np.isfinite(y_pred)), "Predictions should be finite, not NaN" + + # Verify predictions match fitted values on training data + y_fitted = reg.predict(X) + np.testing.assert_allclose(y_fitted, reg.fitted_values_, rtol=1e-10) + + def test_rank_deficient_adjusted_r_squared_uses_effective_params(self): + """Test that adjusted R² uses effective params, not total params. + + For rank-deficient fits, adjusted R² should use n_params_effective_ + for consistency with the corrected degrees of freedom. 
+ """ + import warnings + + np.random.seed(42) + n = 100 + # Create rank-deficient matrix: 4 columns but rank 3 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Column 3 = Column 0 + Column 1 + + y = 2 * X[:, 0] + np.random.randn(n) * 0.5 + + with warnings.catch_warnings(record=True): + warnings.simplefilter("always") + reg = LinearRegression(include_intercept=False).fit(X, y) + + # Verify setup + assert reg.n_params_ == 4 + assert reg.n_params_effective_ == 3 + + # Get R² values + r2 = reg.r_squared() + r2_adj = reg.r_squared(adjusted=True) + + # Both should be valid numbers + assert 0 <= r2 <= 1 + assert r2_adj < r2 # Adjusted should be smaller + + # Manually compute adjusted R² using effective params + # r²_adj = 1 - (1 - r²) * (n - 1) / (n - k_effective) + r2_adj_expected = 1 - (1 - r2) * (n - 1) / (n - 3) + np.testing.assert_allclose(r2_adj, r2_adj_expected, rtol=1e-10) + + # Verify it's NOT using total params (which would give different result) + r2_adj_wrong = 1 - (1 - r2) * (n - 1) / (n - 4) + assert r2_adj != r2_adj_wrong, "Should use effective params, not total params" + + def test_rank_deficient_action_error_raises(self): + """Test that LinearRegression with rank_deficient_action='error' raises on collinear data.""" + np.random.seed(42) + n = 100 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Perfect collinearity + y = np.random.randn(n) + + reg = LinearRegression(include_intercept=False, rank_deficient_action="error") + with pytest.raises(ValueError, match="rank-deficient"): + reg.fit(X, y) + + def test_rank_deficient_action_silent_no_warning(self): + """Test that LinearRegression with rank_deficient_action='silent' produces no warning.""" + import warnings + + np.random.seed(42) + n = 100 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Perfect collinearity + y = np.random.randn(n) + + reg = LinearRegression(include_intercept=False, rank_deficient_action="silent") + + 
with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + reg.fit(X, y) + # No warnings should be emitted + assert len(w) == 0, f"Expected no warnings, got {len(w)}: {[str(x.message) for x in w]}" + + # Should still produce NaN for dropped column + assert np.sum(np.isnan(reg.coefficients_)) == 1 + + def test_rank_deficient_action_warn_default(self): + """Test that LinearRegression with rank_deficient_action='warn' (default) emits warning.""" + import warnings + + np.random.seed(42) + n = 100 + X = np.random.randn(n, 3) + X = np.column_stack([X, X[:, 0] + X[:, 1]]) # Perfect collinearity + y = np.random.randn(n) + + reg = LinearRegression(include_intercept=False) # Default is "warn" + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + reg.fit(X, y) + # Should have a warning about rank deficiency + assert len(w) > 0, "Expected warning about rank deficiency" + assert any("Rank-deficient" in str(x.message) or "rank-deficient" in str(x.message).lower() + for x in w), f"Expected rank-deficient warning, got: {[str(x.message) for x in w]}" + class TestNumericalStability: """Tests for numerical stability with ill-conditioned matrices.""" @@ -833,15 +1262,18 @@ def test_near_singular_matrix_stability(self): np.random.seed(42) n = 100 - # Create near-collinear design (high condition number) + # Create near-collinear design (high condition number but above rank tolerance) + # The rank detection tolerance is 1e-07 (matching R's qr()), so we use noise + # of 1e-5 which is clearly above the tolerance and provides a distinguishable + # signal. With noise < 1e-07, the column would be considered linearly dependent. 
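The 1e-07 tolerance described in the comment above can be checked with a standalone pivoted-QR rank sketch. This is plain numpy/scipy for illustration only: `qr_rank` is a hypothetical helper in the spirit of R's `qr()`, not the library's `_detect_rank_deficiency`.

```python
import numpy as np
from scipy import linalg

def qr_rank(M, tol=1e-7):
    # Pivoted QR: with pivoting, |diag(R)| is non-increasing, so comparing
    # each entry to tol * |R[0, 0]| counts the numerically independent columns.
    _, R, _ = linalg.qr(M, mode="economic", pivoting=True)
    d = np.abs(np.diag(R))
    return int(np.sum(d > tol * d[0]))

np.random.seed(0)
n = 100
base = np.random.randn(n, 3)

# Noise well above the tolerance: column 2 stays distinguishable (full rank).
X_above = base.copy()
X_above[:, 2] = base[:, 0] + base[:, 1] + np.random.randn(n) * 1e-5

# Noise far below the tolerance: column 2 is treated as linearly dependent.
X_below = base.copy()
X_below[:, 2] = base[:, 0] + base[:, 1] + np.random.randn(n) * 1e-12

assert qr_rank(X_above) == 3
assert qr_rank(X_below) == 2
```

This is why the tests above use 1e-5 noise: it sits comfortably on the full-rank side of the cutoff while still producing a badly conditioned matrix.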
X = np.random.randn(n, 3) - X[:, 2] = X[:, 0] + X[:, 1] + np.random.randn(n) * 1e-8 # Near-perfect collinearity + X[:, 2] = X[:, 0] + X[:, 1] + np.random.randn(n) * 1e-5 # Near but not perfect collinearity y = X[:, 0] + np.random.randn(n) * 0.1 reg = LinearRegression(include_intercept=True).fit(X, y) - # Should still produce finite coefficients + # Should still produce finite coefficients (noise is above tolerance) assert np.all(np.isfinite(reg.coefficients_)) # Compare with numpy's lstsq (gold standard for stability) @@ -849,7 +1281,7 @@ def test_near_singular_matrix_stability(self): expected, _, _, _ = np.linalg.lstsq(X_full, y, rcond=None) # Should be close (within reasonable tolerance for ill-conditioned problem) - np.testing.assert_allclose(reg.coefficients_, expected, rtol=1e-6) + np.testing.assert_allclose(reg.coefficients_, expected, rtol=1e-4) def test_high_condition_number_matrix(self): """Test that high condition number matrices don't lose precision.""" diff --git a/tests/test_rust_backend.py b/tests/test_rust_backend.py index ad42fcff..5362cb7f 100644 --- a/tests/test_rust_backend.py +++ b/tests/test_rust_backend.py @@ -337,6 +337,108 @@ def test_near_singular_with_clusters(self): assert np.all(np.isfinite(residuals)), "Residuals should be finite" assert vcov.shape == (3, 3), "VCoV should have correct shape" + # ========================================================================= + # Rank-Deficient Matrix Tests (Critical for MultiPeriodDiD) + # ========================================================================= + + def test_rank_deficient_matrix_produces_valid_coefficients(self): + """Test that rank-deficient matrices produce finite, reasonable coefficients. + + This test verifies the fix for the MultiPeriodDiD bug where rank-deficient + design matrices (with redundant columns) produced astronomically wrong + estimates (trillions instead of single digits). 
+ + The SVD-based solver should truncate small singular values and produce + a valid minimum-norm solution. + """ + from diff_diff._rust_backend import solve_ols + + np.random.seed(42) + n = 100 + + # Create perfectly collinear design matrix (rank-deficient) + # This mimics what can happen in MultiPeriodDiD with period dummies + X = np.random.randn(n, 3) + X[:, 2] = X[:, 0] + X[:, 1] # Column 3 = Column 1 + Column 2 + + y = X[:, 0] + np.random.randn(n) * 0.1 + + # Rust backend should handle this gracefully via SVD truncation + coeffs, residuals, vcov = solve_ols(X, y, None, True) + + # Coefficients must be finite (not NaN or Inf) + assert np.all(np.isfinite(coeffs)), f"Coefficients should be finite, got {coeffs}" + + # Coefficients should be reasonable (not astronomically large like 1e12) + assert np.all(np.abs(coeffs) < 1e6), f"Coefficients are unreasonably large: {coeffs}" + + # Residuals should be correct given coefficients + expected_residuals = y - X @ coeffs + np.testing.assert_array_almost_equal( + residuals, expected_residuals, decimal=8, + err_msg="Residuals should match y - X @ coeffs" + ) + + def test_multiperiod_did_like_design_matrix(self): + """Test design matrix structure similar to MultiPeriodDiD. + + MultiPeriodDiD creates design matrices with: + - Intercept + - Period dummies (one-hot encoded) + - Treatment × post interaction terms + + These can create rank-deficient matrices when period dummies and + interaction terms are not all linearly independent. 
+ """ + from diff_diff._rust_backend import solve_ols + + np.random.seed(42) + n = 200 + n_periods = 5 + + # Create MultiPeriodDiD-like design matrix + intercept = np.ones(n) + + # Period dummies (periods 1-4, period 0 is reference) + period_assignment = np.random.randint(0, n_periods, n) + period_dummies = np.zeros((n, n_periods - 1)) + for i in range(1, n_periods): + period_dummies[:, i - 1] = (period_assignment == i).astype(float) + + # Treatment indicator and post indicator + treated = np.random.binomial(1, 0.5, n) + post = (period_assignment >= 3).astype(float) + treat_post = treated * post + + # Build design matrix (potentially rank-deficient) + X = np.column_stack([intercept, period_dummies, treat_post]) + + # True effect + true_effect = 2.5 + y = ( + 1.0 + + 0.5 * period_dummies[:, 0] + + 0.3 * period_dummies[:, 1] + + 0.7 * period_dummies[:, 2] + + 0.9 * period_dummies[:, 3] + + true_effect * treat_post + + np.random.randn(n) * 0.5 + ) + + # Fit with Rust backend + coeffs, residuals, vcov = solve_ols(X, y, None, True) + + # Coefficients must be finite + assert np.all(np.isfinite(coeffs)), f"Coefficients should be finite, got {coeffs}" + + # Coefficients should be reasonable (not trillions) + assert np.all(np.abs(coeffs) < 1e6), f"Coefficients are unreasonably large: {coeffs}" + + # Treatment effect (last coefficient) should be close to true effect + assert abs(coeffs[-1] - true_effect) < 2.0, ( + f"Treatment effect {coeffs[-1]} is too far from true effect {true_effect}" + ) + @pytest.mark.skipif(not HAS_RUST_BACKEND, reason="Rust backend not available") class TestRustVsNumpy: @@ -393,6 +495,124 @@ def test_solve_ols_with_clusters_match(self): err_msg="Clustered OLS VCoV should match" ) + def test_rank_deficient_ols_residuals_match(self): + """Test Rust and NumPy produce matching residuals for rank-deficient matrices. + + The Rust backend uses SVD truncation while NumPy uses R-style NaN handling. 
+ Despite different approaches, both should produce equivalent residuals. + + Note: The coefficient representations differ: + - Rust: All finite (SVD minimum-norm solution) + - NumPy: NaN for dropped columns (R-style) + But both produce the same fitted values and residuals. + """ + import warnings + from diff_diff._rust_backend import solve_ols as rust_fn + from diff_diff.linalg import _solve_ols_numpy as numpy_fn + + np.random.seed(42) + n = 100 + + # Create rank-deficient design matrix (perfect collinearity) + X = np.random.randn(n, 3) + X[:, 2] = X[:, 0] + X[:, 1] # Column 3 = Column 1 + Column 2 + + y = X[:, 0] + 2 * X[:, 1] + np.random.randn(n) * 0.1 + + # Rust backend produces finite coefficients via SVD truncation + rust_coeffs, rust_resid, _ = rust_fn(X, y, None, True) + + # NumPy backend produces NaN for dropped columns (R-style) + with warnings.catch_warnings(): + warnings.simplefilter("ignore") # Suppress rank-deficient warning + numpy_coeffs, numpy_resid, _ = numpy_fn(X, y, cluster_ids=None) + + # Rust should produce finite coefficients + assert np.all(np.isfinite(rust_coeffs)), "Rust coefficients should be finite" + assert np.all(np.abs(rust_coeffs) < 1e6), "Rust coefficients should be reasonable" + + # NumPy should produce exactly one NaN coefficient (the dropped one) + assert np.sum(np.isnan(numpy_coeffs)) == 1, "NumPy should have one NaN coefficient" + + # Non-NaN NumPy coefficients should be reasonable + finite_numpy = numpy_coeffs[~np.isnan(numpy_coeffs)] + assert np.all(np.abs(finite_numpy) < 1e6), "NumPy finite coefficients should be reasonable" + + # Residuals should be very close (this is the key equivalence check) + # Both approaches should produce the same fitted values and residuals + np.testing.assert_array_almost_equal( + rust_resid, numpy_resid, decimal=5, + err_msg="Residuals should match despite different coefficient representations" + ) + + def test_multiperiod_did_design_residuals_equivalence(self): + """Test both backends produce 
equivalent residuals for MultiPeriodDiD-like matrices. + + For full-rank designs, both backends should produce identical results. + The design matrix in this test is typically full-rank. + """ + import warnings + from diff_diff._rust_backend import solve_ols as rust_fn + from diff_diff.linalg import _solve_ols_numpy as numpy_fn + + np.random.seed(42) + n = 200 + n_periods = 5 + + # Create MultiPeriodDiD-like design matrix + intercept = np.ones(n) + period_assignment = np.random.randint(0, n_periods, n) + period_dummies = np.zeros((n, n_periods - 1)) + for i in range(1, n_periods): + period_dummies[:, i - 1] = (period_assignment == i).astype(float) + + treated = np.random.binomial(1, 0.5, n) + post = (period_assignment >= 3).astype(float) + treat_post = treated * post + + X = np.column_stack([intercept, period_dummies, treat_post]) + + true_effect = 2.5 + y = ( + 1.0 + + 0.5 * period_dummies[:, 0] + + 0.3 * period_dummies[:, 1] + + 0.7 * period_dummies[:, 2] + + 0.9 * period_dummies[:, 3] + + true_effect * treat_post + + np.random.randn(n) * 0.5 + ) + + rust_coeffs, rust_resid, _ = rust_fn(X, y, None, True) + + with warnings.catch_warnings(): + warnings.simplefilter("ignore") # May or may not warn depending on rank + numpy_coeffs, numpy_resid, _ = numpy_fn(X, y, cluster_ids=None) + + # Rust should produce finite treatment effect + rust_effect = rust_coeffs[-1] + assert np.isfinite(rust_effect), "Rust treatment effect should be finite" + assert abs(rust_effect - true_effect) < 2.0, ( + f"Rust treatment effect {rust_effect} too far from true {true_effect}" + ) + + # NumPy treatment effect should be close (may be finite or NaN depending on rank) + numpy_effect = numpy_coeffs[-1] + if np.isfinite(numpy_effect): + assert abs(numpy_effect - true_effect) < 2.0, ( + f"NumPy treatment effect {numpy_effect} too far from true {true_effect}" + ) + # Effects should be close to each other + assert abs(rust_effect - numpy_effect) < 0.5, ( + f"Rust ({rust_effect}) and NumPy 
({numpy_effect}) effects should match" + ) + + # Residuals should be very close (key equivalence check) + np.testing.assert_array_almost_equal( + rust_resid, numpy_resid, decimal=5, + err_msg="Residuals should match for MultiPeriodDiD-like design" + ) + # ========================================================================= # Robust VCoV Equivalence # ========================================================================= @@ -563,6 +783,89 @@ def test_simplex_projection_match(self): err_msg=f"Simplex projection mismatch for input {v}" ) + def test_nan_vcov_fallback_to_python(self): + """Test that NaN vcov from Rust backend triggers fallback to Python. + + When Rust SVD detects rank-deficiency that Python QR missed (due to + different numerical properties), the vcov matrix may contain NaN values. + The high-level solve_ols should detect this and fall back to Python's + R-style handling, ensuring the user never receives silent NaN SEs. + + The key behavior being tested: + 1. When Rust returns NaN vcov, we emit a warning and re-run Python + 2. The Python re-run does fresh rank detection (not using cached info) + 3. R-style handling is applied: NaN coefficients for dropped columns + """ + import warnings + from diff_diff.linalg import solve_ols + + # Create an ill-conditioned matrix that might cause QR/SVD disagreement. + # The condition number is extremely high, which may cause the Rust SVD + # to detect numerical issues that QR doesn't catch. + np.random.seed(42) + n = 100 + + # Create a matrix with near-perfect but not exact collinearity. + # This is on the boundary where QR/SVD might disagree. 
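The residual-equivalence property these backend-comparison tests rely on can be sketched standalone: an SVD minimum-norm fit and an R-style drop-the-column fit produce different coefficient vectors, but identical fitted values, because both project y onto the same column space. Plain numpy, illustrative variable names only:

```python
import numpy as np

np.random.seed(0)
n = 100
X = np.random.randn(n, 3)
X[:, 2] = X[:, 0] + X[:, 1]                  # rank 2: columns span a plane
y = X[:, 0] + 2 * X[:, 1] + np.random.randn(n) * 0.1

# SVD minimum-norm solution: all three coefficients finite.
beta_svd, *_ = np.linalg.lstsq(X, y, rcond=None)

# R-style: solve on the independent columns, NaN for the dropped one.
beta_drop = np.full(3, np.nan)
beta_drop[:2] = np.linalg.lstsq(X[:, :2], y, rcond=None)[0]

# Coefficients differ, but fitted values (hence residuals) agree.
fit_svd = X @ beta_svd
fit_drop = X[:, :2] @ beta_drop[:2]
assert np.allclose(fit_svd, fit_drop)
```

The same argument explains why the Rust (SVD truncation) and Python (R-style NaN) backends can disagree on coefficient representation yet still match on residuals to high precision.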
+ X = np.random.randn(n, 4) + # Make column 3 almost (but not exactly) a linear combination of 0-2 + X[:, 3] = X[:, 0] + X[:, 1] + X[:, 2] + np.random.randn(n) * 1e-12 + + y = np.random.randn(n) + + # Capture any warnings that might be emitted + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + coeffs, residuals, vcov = solve_ols(X, y) + + # Check if fallback warning was emitted + fallback_warning_emitted = any( + "Re-running with Python backend" in str(warning.message) + for warning in w + ) + + # Key invariants that must hold regardless of which backend is used: + # 1. Coefficients must be finite (either via Rust SVD or Python R-style) + finite_coeffs = coeffs[np.isfinite(coeffs)] + assert len(finite_coeffs) >= 3, \ + "At least 3 coefficients should be finite (identifiable)" + assert np.all(np.abs(finite_coeffs) < 1e10), \ + f"Finite coefficients should be reasonable, got {finite_coeffs}" + + # 2. If vcov has any finite values, they should correspond to finite coefficients + if vcov is not None: + finite_coef_mask = np.isfinite(coeffs) + for i in range(len(coeffs)): + if finite_coef_mask[i]: + # This coefficient's variance should be finite + var_i = vcov[i, i] + assert np.isfinite(var_i) or np.isnan(var_i), \ + f"Variance for finite coef {i} should be finite or NaN (dropped)" + + # 3. Residuals must always be finite + assert np.all(np.isfinite(residuals)), "Residuals should be finite" + + # 4. R-style consistency: NaN coefficients must have NaN vcov diagonal + if vcov is not None: + nan_coef_indices = set(np.where(np.isnan(coeffs))[0]) + nan_vcov_diag_indices = set(np.where(np.isnan(np.diag(vcov)))[0]) + + # NaN in vcov diagonal should correspond exactly to NaN coefficients + assert nan_vcov_diag_indices == nan_coef_indices, \ + f"NaN vcov diagonal {nan_vcov_diag_indices} should match " \ + f"NaN coefficients {nan_coef_indices}" + + # 5. 
If fallback warning was emitted, R-style handling MUST have occurred + # This verifies that the fallback actually applies R-style NaN handling + # (not minimum-norm solution which would have all finite coefficients) + if fallback_warning_emitted: + assert np.any(np.isnan(coeffs)), \ + "Fallback warning emitted but no NaN coefficients - " \ + "R-style handling was not applied" + assert vcov is not None and np.any(np.isnan(vcov)), \ + "Fallback warning emitted but vcov has no NaN - " \ + "R-style handling was not applied" + @pytest.mark.skipif(not HAS_RUST_BACKEND, reason="Rust backend not available") class TestTROPRustBackend: diff --git a/tests/test_staggered.py b/tests/test_staggered.py index f78f756a..39fda7fa 100644 --- a/tests/test_staggered.py +++ b/tests/test_staggered.py @@ -685,8 +685,12 @@ def test_near_collinear_covariates(self): """Test that near-collinear covariates are handled gracefully.""" data = generate_staggered_data_with_covariates(seed=42) - # Add a near-collinear covariate (x1 + small noise) - data['x1_copy'] = data['x1'] + np.random.randn(len(data)) * 1e-8 + # Add a near-collinear covariate (x1 + noise above rank detection tolerance) + # The rank detection tolerance is 1e-07 (matching R's qr()), so we use noise + # of 1e-5 which is above the tolerance but still creates high collinearity. + # With noise < 1e-07, the column would be considered linearly dependent. 
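As background for the duplicated-covariate cases below: with an exactly duplicated column, a minimum-norm solution splits the true coefficient evenly across the two copies, while the R-style approach keeps one copy at the full value and marks the other NaN — fitted values coincide either way. A plain-numpy sketch, not the library API:

```python
import numpy as np

np.random.seed(0)
n = 500
x1 = np.random.randn(n)
X = np.column_stack([np.ones(n), x1, x1])     # duplicated covariate
y = 1.0 + 3.0 * x1 + np.random.randn(n) * 0.1

# Minimum-norm least squares spreads the slope over both copies (~1.5 each).
beta_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# R-style: keep one copy (slope ~3.0), NaN for the duplicate.
beta_r = np.full(3, np.nan)
beta_r[:2] = np.linalg.lstsq(X[:, :2], y, rcond=None)[0]

# Same fitted values despite different coefficient representations.
assert np.allclose(X @ beta_min, X[:, :2] @ beta_r[:2])
```

The split coefficients are why silently returning a minimum-norm solution can mislead: each copy's coefficient is individually wrong even though predictions are fine, whereas the NaN makes the redundancy explicit.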
+ np.random.seed(42) + data['x1_copy'] = data['x1'] + np.random.randn(len(data)) * 1e-5 cs = CallawaySantAnna(estimation_method='reg') results = cs.fit( @@ -698,7 +702,7 @@ def test_near_collinear_covariates(self): covariates=['x1', 'x1_copy'] # Nearly collinear ) - # Should still produce valid results (lstsq handles this) + # Should still produce valid results (noise is above tolerance) assert results.overall_att is not None assert np.isfinite(results.overall_att) @@ -726,6 +730,61 @@ def test_missing_values_in_covariates_warning(self): assert results.overall_att is not None assert results.overall_se > 0 + def test_rank_deficient_action_error_raises(self): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + data = generate_staggered_data_with_covariates(seed=42) + + # Add a covariate that is perfectly collinear with x1 + data["x1_dup"] = data["x1"].copy() + + cs = CallawaySantAnna( + estimation_method="reg", # Use regression method to test OLS path + rank_deficient_action="error" + ) + with pytest.raises(ValueError, match="rank-deficient"): + cs.fit( + data, + outcome='outcome', + unit='unit', + time='time', + first_treat='first_treat', + covariates=['x1', 'x1_dup'] + ) + + def test_rank_deficient_action_silent_no_warning(self): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + + data = generate_staggered_data_with_covariates(seed=42) + + # Add a covariate that is perfectly collinear with x1 + data["x1_dup"] = data["x1"].copy() + + cs = CallawaySantAnna( + estimation_method="reg", # Use regression method to test OLS path + rank_deficient_action="silent" + ) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = cs.fit( + data, + outcome='outcome', + unit='unit', + time='time', + first_treat='first_treat', + covariates=['x1', 'x1_dup'] + ) + + # No warnings about rank deficiency should be emitted + rank_warnings = [x for x in w if "Rank-deficient" 
in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still get valid results + assert results is not None + assert results.overall_att is not None + class TestCallawaySantAnnaBootstrap: """Tests for Callaway-Sant'Anna multiplier bootstrap inference.""" diff --git a/tests/test_sun_abraham.py b/tests/test_sun_abraham.py index 0096bf8a..c34ae26a 100644 --- a/tests/test_sun_abraham.py +++ b/tests/test_sun_abraham.py @@ -730,3 +730,56 @@ def test_cohort_level_dataframe(self): assert "relative_period" in df_cohort.columns assert "effect" in df_cohort.columns assert "weight" in df_cohort.columns + + def test_rank_deficient_action_error_raises(self): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + data = generate_staggered_data(seed=42) + + # Add covariates that are perfectly collinear + np.random.seed(42) + data["cov1"] = np.random.randn(len(data)) + data["cov1_dup"] = data["cov1"].copy() + + sa = SunAbraham(rank_deficient_action="error") + with pytest.raises(ValueError, match="rank-deficient"): + sa.fit( + data, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + covariates=["cov1", "cov1_dup"] + ) + + def test_rank_deficient_action_silent_no_warning(self): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + + data = generate_staggered_data(seed=42) + + # Add covariates that are perfectly collinear + np.random.seed(42) + data["cov1"] = np.random.randn(len(data)) + data["cov1_dup"] = data["cov1"].copy() + + sa = SunAbraham(rank_deficient_action="silent") + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = sa.fit( + data, + outcome="outcome", + unit="unit", + time="time", + first_treat="first_treat", + covariates=["cov1", "cov1_dup"] + ) + + # No warnings about rank deficiency should be emitted + rank_warnings = 
[x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still get valid results + assert results is not None + assert results.overall_att is not None diff --git a/tests/test_triple_diff.py b/tests/test_triple_diff.py index 4bbde03a..02d77b32 100644 --- a/tests/test_triple_diff.py +++ b/tests/test_triple_diff.py @@ -853,3 +853,108 @@ def test_repr_does_not_raise(self, simple_ddd_data): # Should not raise repr_str = repr(results) assert "TripleDifferenceResults" in repr_str + + +# ============================================================================= +# Rank Deficiency Tests +# ============================================================================= + + +class TestRankDeficientAction: + """Tests for rank_deficient_action parameter handling.""" + + @pytest.fixture + def ddd_data_with_covariates(self): + """Create DDD data with covariates for testing.""" + np.random.seed(42) + n = 400 + data = pd.DataFrame({ + "group": np.repeat([0, 1], n // 2), + "partition": np.tile(np.repeat([0, 1], n // 4), 2), + "time": np.tile([0, 1], n // 2), + "x1": np.random.randn(n), + }) + + # Generate outcome with effect + data["outcome"] = ( + 1.0 + + 0.5 * data["x1"] + + 0.5 * data["group"] + + 0.3 * data["partition"] + + 0.2 * data["time"] + + 2.0 * data["group"] * data["partition"] * data["time"] + + np.random.randn(n) * 0.5 + ) + + return data + + def test_rank_deficient_action_error_raises(self, ddd_data_with_covariates): + """Test that rank_deficient_action='error' raises ValueError on collinear data.""" + # Add a covariate that is perfectly collinear with x1 + ddd_data_with_covariates["x1_dup"] = ddd_data_with_covariates["x1"].copy() + + ddd = TripleDifference( + estimation_method="reg", # Use regression method to test OLS path + rank_deficient_action="error" + ) + with pytest.raises(ValueError, match="rank-deficient"): + ddd.fit( 
+ ddd_data_with_covariates, + outcome="outcome", + group="group", + partition="partition", + time="time", + covariates=["x1", "x1_dup"] + ) + + def test_rank_deficient_action_silent_no_warning(self, ddd_data_with_covariates): + """Test that rank_deficient_action='silent' produces no warning.""" + import warnings + + # Add a covariate that is perfectly collinear with x1 + ddd_data_with_covariates["x1_dup"] = ddd_data_with_covariates["x1"].copy() + + ddd = TripleDifference( + estimation_method="reg", # Use regression method to test OLS path + rank_deficient_action="silent" + ) + + with warnings.catch_warnings(record=True) as w: + warnings.simplefilter("always") + results = ddd.fit( + ddd_data_with_covariates, + outcome="outcome", + group="group", + partition="partition", + time="time", + covariates=["x1", "x1_dup"] + ) + + # No warnings about rank deficiency should be emitted + rank_warnings = [x for x in w if "Rank-deficient" in str(x.message) + or "rank-deficient" in str(x.message).lower()] + assert len(rank_warnings) == 0, f"Expected no rank warnings, got {rank_warnings}" + + # Should still get valid results + assert results is not None + assert results.att is not None + + def test_convenience_function_passes_rank_deficient_action(self, ddd_data_with_covariates): + """Test that triple_difference() convenience function passes rank_deficient_action.""" + from diff_diff import triple_difference + + # Add a covariate that is perfectly collinear with x1 + ddd_data_with_covariates["x1_dup"] = ddd_data_with_covariates["x1"].copy() + + # Should raise with "error" action + with pytest.raises(ValueError, match="rank-deficient"): + triple_difference( + ddd_data_with_covariates, + outcome="outcome", + group="group", + partition="partition", + time="time", + estimation_method="reg", + covariates=["x1", "x1_dup"], + rank_deficient_action="error" + )
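
The behavior these tests pin down (pivoted-QR rank detection, NaN coefficients for dropped columns, and the `"warn"`/`"error"`/`"silent"` actions) can be sketched in plain NumPy/SciPy. This is an illustrative sketch, not diff-diff's actual `solve_ols()` internals; the function names, signatures, and the 1e-7 tolerance mirroring R's `qr()` default are assumptions for demonstration:

```python
import warnings

import numpy as np
from scipy.linalg import lstsq, qr


def detect_rank_deficiency(X, tol=1e-7):
    """Pivoted-QR rank detection (illustrative, analogous to R's qr()).

    Columns whose |R| diagonal falls below tol * max|diag(R)| are treated
    as linearly dependent. Returns (rank, kept_cols, dropped_cols).
    """
    _, R, piv = qr(X, mode="economic", pivoting=True)
    diag = np.abs(np.diag(R))
    rank = int(np.sum(diag > tol * diag.max())) if diag.size else 0
    return rank, np.sort(piv[:rank]), np.sort(piv[rank:])


def solve_ols_r_style(X, y, rank_deficient_action="warn", column_names=None):
    """OLS that drops dependent columns and leaves NaN coefficients, like lm()."""
    rank, kept, dropped = detect_rank_deficiency(X)
    if dropped.size:
        labels = ([column_names[j] for j in dropped]
                  if column_names is not None else dropped.tolist())
        msg = f"Design matrix is rank-deficient; dropping columns {labels}"
        if rank_deficient_action == "error":
            raise ValueError(msg)
        if rank_deficient_action == "warn":
            warnings.warn(msg)
        # "silent": fall through without emitting anything
    beta = np.full(X.shape[1], np.nan)
    beta[kept] = lstsq(X[:, kept], y)[0]  # solve only on independent columns
    return beta


# Perfectly collinear design: third column duplicates x1, as in the tests above.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(200)
X = np.column_stack([np.ones(200), x1, x1])
y = 1.0 + 2.0 * x1 + rng.standard_normal(200) * 0.1

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    beta = solve_ols_r_style(X, y, column_names=["const", "x1", "x1_dup"])
# exactly one of the two collinear columns ends up with a NaN coefficient
```

This is why the assertions above check `np.any(np.isnan(coeffs))` rather than finiteness: a minimum-norm fallback (e.g. SVD-based `lstsq` on the full matrix) would instead spread the effect across both collinear columns and return all-finite coefficients, silently masking the deficiency.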