diff --git a/METHODOLOGY_REVIEW.md b/METHODOLOGY_REVIEW.md
index 24903bd3..3a7bf07e 100644
--- a/METHODOLOGY_REVIEW.md
+++ b/METHODOLOGY_REVIEW.md
@@ -24,7 +24,7 @@ Each estimator in diff-diff should be periodically reviewed to ensure:
 | MultiPeriodDiD | `estimators.py` | `fixest::feols()` | **Complete** | 2026-02-02 |
 | TwoWayFixedEffects | `twfe.py` | `fixest::feols()` | **Complete** | 2026-02-08 |
 | CallawaySantAnna | `staggered.py` | `did::att_gt()` | **Complete** | 2026-01-24 |
-| SunAbraham | `sun_abraham.py` | `fixest::sunab()` | Not Started | - |
+| SunAbraham | `sun_abraham.py` | `fixest::sunab()` | **Complete** | 2026-02-15 |
 | SyntheticDiD | `synthetic_did.py` | `synthdid::synthdid_estimate()` | **Complete** | 2026-02-10 |
 | TripleDifference | `triple_diff.py` | (forthcoming) | Not Started | - |
 | TROP | `trop.py` | (forthcoming) | Not Started | - |
@@ -294,14 +294,88 @@ variables appear to the left of the `|` separator.
 | Module | `sun_abraham.py` |
 | Primary Reference | Sun & Abraham (2021) |
 | R Reference | `fixest::sunab()` |
-| Status | Not Started |
-| Last Review | - |
+| Status | **Complete** |
+| Last Review | 2026-02-15 |
+
+**Verified Components:**
+- [x] Saturated TWFE regression with cohort × relative-time interactions
+- [x] Within-transformation for unit and time fixed effects
+- [x] Interaction-weighted event study effects (δ̂_e = Σ_g ŵ_{g,e} × δ̂_{g,e})
+- [x] IW weights match event-time sample shares (n_{g,e} / Σ_g n_{g,e})
+- [x] Overall ATT as weighted average of post-treatment effects
+- [x] Delta method SE for aggregated effects (Var = w' Σ w)
+- [x] Cluster-robust SEs at unit level
+- [x] Reference period normalized to zero (e=-1 excluded from design matrix)
+- [x] R comparison: ATT matches `fixest::sunab()` within machine precision (<1e-11)
+- [x] R comparison: SE matches within 0.3% (small scale) / 0.1% (1k scale)
+- [x] R comparison: Event study effects correlation = 1.000000
+- [x] R comparison: Event study max diff < 1e-11
+- [x] Bootstrap inference (pairs bootstrap)
+- [x] Rank deficiency handling (warn/error/silent)
+- [x] All REGISTRY.md edge cases tested
+
+**Test Coverage:**
+- 43 tests in `tests/test_sun_abraham.py` (36 existing + 7 methodology verification)
+- R benchmark tests via `benchmarks/run_benchmarks.py --estimator sunab`
+
+**R Comparison Results:**
+- Overall ATT matches within machine precision (diff < 1e-11 at both scales)
+- Cluster-robust SE matches within 0.3% (well within 1% threshold)
+- Event study effects match perfectly (correlation 1.0, max diff < 1e-11)
+- Validated at small (200 units) and 1k (1000 units) scales
 
 **Corrections Made:**
-- (None yet)
+1. **DF adjustment for absorbed FE** (`sun_abraham.py`, `_fit_saturated_regression()`):
+   Added `df_adjustment = n_units + n_times - 1` to `LinearRegression.fit()` to account
+   for absorbed unit and time fixed effects in degrees of freedom. Unlike TWFE (which uses
+   `-2` plus an explicit intercept column), SunAbraham's saturated regression has no
+   intercept, so all absorbed df must come from the adjustment. Affects t-distribution DoF
+   for cohort-level p-values/CIs (slightly larger p-values, slightly wider CIs) but does
+   NOT change VCV or SE values.
+
+2. **NaN return for no post-treatment effects** (`sun_abraham.py`, `_compute_overall_att()`):
+   Changed return from `(0.0, 0.0)` to `(np.nan, np.nan)` when no post-treatment effects
+   exist. All downstream inference fields (t_stat, p_value, conf_int) correctly propagate
+   NaN via existing guards in `fit()`.
+
+3. **Deprecation warnings for unused parameters** (`sun_abraham.py`, `fit()`):
+   Added `FutureWarning` for `min_pre_periods` and `min_post_periods` parameters that
+   are accepted but never used (no-op). These will be removed in a future version.
+
+4. **Removed event-time truncation at [-20, 20]** (`sun_abraham.py`):
+   Removed the hardcoded cap `max(min(...), -20)` / `min(max(...), 20)` to match
+   R's `fixest::sunab()` which has no such limit. All available relative times are
+   now estimated.
+
+5. **Warning for variance fallback path** (`sun_abraham.py`, `_compute_overall_att()`):
+   Added `UserWarning` when the full weight vector cannot be constructed and a
+   simplified variance (ignoring covariances between periods) is used as fallback.
+
+6. **IW weights use event-time sample shares** (`sun_abraham.py`, `_compute_iw_effects()`):
+   Changed IW weights from `n_g / Σ_g n_g` (cohort sizes) to `n_{g,e} / Σ_g n_{g,e}`
+   (per-event-time observation counts) to match the REGISTRY.md formula. For balanced
+   panels these are identical; for unbalanced panels the new formula correctly reflects
+   actual sample composition at each event-time. Added unbalanced panel test.
+
+7. **Normalize `np.inf` never-treated encoding** (`sun_abraham.py`, `fit()`):
+   `first_treat=np.inf` (documented as valid for never-treated) was included in
+   `treatment_groups` and `_rel_time` via `> 0` checks, producing `-inf` event times.
+   Fixed by normalizing `np.inf` to `0` immediately after computing `_never_treated`.
+   Same fix applied to `staggered.py` (`CallawaySantAnna`).
 
 **Outstanding Concerns:**
-- (None yet)
+- **Inference distribution**: Cohort-level p-values use t-distribution (via
+  `LinearRegression.get_inference()`), while aggregated event study and overall ATT
+  p-values use normal distribution (via `compute_p_value()`). This is asymptotically
+  equivalent and standard for delta-method-aggregated quantities. R's fixest uses
+  t-distribution at all levels, so aggregated p-values may differ slightly for small
+  samples — this is a documented deviation.
+
+**Deviations from R's fixest::sunab():**
+1. **NaN for no post-treatment effects**: Python returns `(NaN, NaN)` for overall ATT/SE
+   when no post-treatment effects exist. R would error.
+2. **Normal distribution for aggregated inference**: Aggregated p-values use normal
+   distribution (asymptotically equivalent). R uses t-distribution.
 
 ---
 
diff --git a/benchmarks/R/benchmark_sunab.R b/benchmarks/R/benchmark_sunab.R
new file mode 100644
index 00000000..9869a2ea
--- /dev/null
+++ b/benchmarks/R/benchmark_sunab.R
@@ -0,0 +1,131 @@
+#!/usr/bin/env Rscript
+# Benchmark: Sun-Abraham interaction-weighted estimator (R `fixest::sunab()`)
+#
+# This uses fixest::sunab() with unit+time FE and unit-level clustering,
+# matching the Python SunAbraham estimator's approach.
+#
+# Usage:
+#   Rscript benchmark_sunab.R --data path/to/data.csv --output path/to/results.json
+
+library(fixest)
+library(jsonlite)
+library(data.table)
+
+# Parse command line arguments
+args <- commandArgs(trailingOnly = TRUE)
+
+parse_args <- function(args) {
+  result <- list(
+    data = NULL,
+    output = NULL
+  )
+
+  i <- 1
+  while (i <= length(args)) {
+    if (args[i] == "--data") {
+      result$data <- args[i + 1]
+      i <- i + 2
+    } else if (args[i] == "--output") {
+      result$output <- args[i + 1]
+      i <- i + 2
+    } else {
+      i <- i + 1
+    }
+  }
+
+  if (is.null(result$data) || is.null(result$output)) {
+    stop("Usage: Rscript benchmark_sunab.R --data <path> --output <path>")
+  }
+
+  return(result)
+}
+
+config <- parse_args(args)
+
+# Load data
+message(sprintf("Loading data from: %s", config$data))
+data <- fread(config$data)
+
+# Convert first_treat to double before assigning Inf (integer column can't hold Inf)
+data[, first_treat := as.double(first_treat)]
+# Convert never-treated coding: first_treat=0 -> Inf (R's convention for never-treated)
+data[first_treat == 0, first_treat := Inf]
+
+# Run benchmark
+message("Running Sun-Abraham estimation with fixest::sunab()...")
+start_time <- Sys.time()
+
+# Sun-Abraham with unit+time FE, clustered at unit level
+# sunab(cohort, period) creates the interaction-weighted estimator
+model <- feols(
+  outcome ~ sunab(first_treat, time) | unit + time,
+  data = data,
+  cluster = ~unit
+)
+
+estimation_time <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))
+
+# Extract event study effects (per-relative-period IW coefficients)
+es_coefs <- coef(model)
+es_ses <- se(model)
+
+# Build event study list
+event_study <- list()
+coef_names <- names(es_coefs)
+for (i in seq_along(es_coefs)) {
+  name <- coef_names[i]
+  # fixest sunab names coefficients like "time::-4" or "time::2"
+  event_time <- as.numeric(gsub("^time::(-?[0-9]+)$", "\\1", name))
+
+  event_study[[length(event_study) + 1]] <- list(
+    event_time = event_time,
+    att = unname(es_coefs[i]),
+    se = unname(es_ses[i])
+  )
+}
+
+# Aggregate to get overall ATT (weighted by observation count per cell)
+# aggregate() returns a matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
+agg_result <- aggregate(model, agg = "ATT")
+
+overall_att <- agg_result[1, "Estimate"]
+overall_se <- agg_result[1, "Std. Error"]
+overall_pvalue <- agg_result[1, "Pr(>|t|)"]
+
+message(sprintf("Overall ATT: %.6f (SE: %.6f)", overall_att, overall_se))
+
+# Format output
+results <- list(
+  estimator = "fixest::sunab()",
+  cluster = "unit",
+
+  # Overall ATT (aggregated)
+  overall_att = overall_att,
+  overall_se = overall_se,
+  overall_pvalue = overall_pvalue,
+
+  # Event study effects
+  event_study = event_study,
+
+  # Timing
+  timing = list(
+    estimation_seconds = estimation_time,
+    total_seconds = estimation_time
+  ),
+
+  # Metadata
+  metadata = list(
+    r_version = R.version.string,
+    fixest_version = as.character(packageVersion("fixest")),
+    n_units = length(unique(data$unit)),
+    n_periods = length(unique(data$time)),
+    n_obs = nrow(data),
+    n_event_study_coefs = length(es_coefs)
+  )
+)
+
+# Write output
+message(sprintf("Writing results to: %s", config$output))
+write_json(results, config$output, auto_unbox = TRUE, pretty = TRUE, digits = 15)
+
+message(sprintf("Completed in %.3f seconds", estimation_time))
diff --git a/benchmarks/python/benchmark_sun_abraham.py b/benchmarks/python/benchmark_sun_abraham.py
new file mode 100644
index 00000000..87cb4574
--- /dev/null
+++ b/benchmarks/python/benchmark_sun_abraham.py
@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+"""
+Benchmark: SunAbraham interaction-weighted estimator (diff-diff SunAbraham class).
+
+This benchmarks the SunAbraham estimator with cluster-robust SEs,
+matching R's fixest::sunab() approach.
+
+Usage:
+    python benchmark_sun_abraham.py --data path/to/data.csv --output path/to/results.json
+"""
+
+import argparse
+import json
+import os
+import sys
+from pathlib import Path
+
+# IMPORTANT: Parse --backend and set environment variable BEFORE importing diff_diff
+def _get_backend_from_args():
+    """Parse --backend argument without importing diff_diff."""
+    parser = argparse.ArgumentParser(add_help=False)
+    parser.add_argument("--backend", default="auto", choices=["auto", "python", "rust"])
+    args, _ = parser.parse_known_args()
+    return args.backend
+
+_requested_backend = _get_backend_from_args()
+if _requested_backend in ("python", "rust"):
+    os.environ["DIFF_DIFF_BACKEND"] = _requested_backend
+
+# NOW import diff_diff and other dependencies (will see the env var)
+import numpy as np
+import pandas as pd
+
+# Add parent to path for imports
+sys.path.insert(0, str(Path(__file__).parent.parent.parent))
+
+from diff_diff import SunAbraham, HAS_RUST_BACKEND
+from benchmarks.python.utils import Timer
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(description="Benchmark SunAbraham estimator")
+    parser.add_argument("--data", required=True, help="Path to input CSV data")
+    parser.add_argument("--output", required=True, help="Path to output JSON results")
+    parser.add_argument(
+        "--backend", default="auto", choices=["auto", "python", "rust"],
+        help="Backend to use: auto (default), python (pure Python), rust (Rust backend)"
+    )
+    return parser.parse_args()
+
+
+def get_actual_backend() -> str:
+    """Return the actual backend being used based on HAS_RUST_BACKEND."""
+    return "rust" if HAS_RUST_BACKEND else "python"
+
+
+def main():
+    args = parse_args()
+
+    actual_backend = get_actual_backend()
+    print(f"Using backend: {actual_backend}")
+
+    # Load data
+    print(f"Loading data from: {args.data}")
+    data = pd.read_csv(args.data)
+
+    # Run benchmark using SunAbraham (analytical SEs, no bootstrap)
+    print("Running Sun-Abraham estimation...")
+
+    sa = SunAbraham(control_group="never_treated", n_bootstrap=0)
+
+    with Timer() as timer:
+        results = sa.fit(
+            data,
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+        )
+
+    overall_att = results.overall_att
+    overall_se = results.overall_se
+    overall_pvalue = results.overall_p_value
+
+    # Extract event study effects
+    event_study = []
+    for e in sorted(results.event_study_effects.keys()):
+        eff = results.event_study_effects[e]
+        event_study.append({
+            "event_time": int(e),
+            "att": float(eff["effect"]),
+            "se": float(eff["se"]),
+        })
+
+    total_time = timer.elapsed
+
+    # Build output
+    output = {
+        "estimator": "diff_diff.SunAbraham",
+        "backend": actual_backend,
+        "cluster": "unit",
+        # Overall ATT
+        "overall_att": float(overall_att),
+        "overall_se": float(overall_se),
+        "overall_pvalue": float(overall_pvalue),
+        # Event study effects
+        "event_study": event_study,
+        # Timing
+        "timing": {
+            "estimation_seconds": total_time,
+            "total_seconds": total_time,
+        },
+        # Metadata
+        "metadata": {
+            "n_units": len(data["unit"].unique()),
+            "n_periods": len(data["time"].unique()),
+            "n_obs": len(data),
+            "n_groups": len(results.groups),
+            "n_event_study_effects": len(event_study),
+        },
+    }
+
+    # Write output
+    print(f"Writing results to: {args.output}")
+    output_path = Path(args.output)
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    with open(output_path, "w") as f:
+        json.dump(output, f, indent=2)
+
+    print(f"Overall ATT: {overall_att:.6f} (SE: {overall_se:.6f})")
+    print(f"Completed in {total_time:.3f} seconds")
+    return output
+
+
+if __name__ == "__main__":
+    main()
diff --git a/benchmarks/run_benchmarks.py b/benchmarks/run_benchmarks.py
index a7cd9b11..f5a05ddc 100644
--- a/benchmarks/run_benchmarks.py
+++ b/benchmarks/run_benchmarks.py
@@ -1090,6 +1090,139 @@ def run_imputation_benchmark(
     return results
 
 
+def run_sunab_benchmark(
+    data_path: Path,
+    name: str = "sunab",
+    scale: str = "small",
+    n_replications: int = 1,
+    backends: Optional[List[str]] = None,
+) -> Dict[str, Any]:
+    """Run Sun-Abraham benchmarks (Python and R) with replications."""
+    print(f"\n{'='*60}")
+    print(f"SUN-ABRAHAM BENCHMARK ({scale})")
+    print(f"{'='*60}")
+
+    if backends is None:
+        backends = ["python", "rust"]
+
+    timeouts = TIMEOUT_CONFIGS.get(scale, TIMEOUT_CONFIGS["small"])
+    results = {
+        "name": name,
+        "scale": scale,
+        "n_replications": n_replications,
+        "python_pure": None,
+        "python_rust": None,
+        "r": None,
+        "comparison": None,
+    }
+
+    # Run Python benchmark for each backend
+    for backend in backends:
+        backend_label = f"python_{'pure' if backend == 'python' else backend}"
+        print(f"\nRunning Python (diff_diff.SunAbraham, backend={backend}) - {n_replications} replications...")
+        py_output = RESULTS_DIR / "accuracy" / f"{backend_label}_{name}_{scale}.json"
+        py_output.parent.mkdir(parents=True, exist_ok=True)
+
+        py_timings = []
+        py_result = None
+        for rep in range(n_replications):
+            try:
+                py_result = run_python_benchmark(
+                    "benchmark_sun_abraham.py", data_path, py_output,
+                    timeout=timeouts["python"],
+                    backend=backend,
+                )
+                py_timings.append(py_result["timing"]["total_seconds"])
+                if rep == 0:
+                    print(f"  ATT: {py_result['overall_att']:.4f}")
+                    print(f"  SE:  {py_result['overall_se']:.4f}")
+                print(f"  Rep {rep+1}/{n_replications}: {py_timings[-1]:.3f}s")
+            except Exception as e:
+                print(f"  Rep {rep+1} failed: {e}")
+
+        if py_result and py_timings:
+            timing_stats = compute_timing_stats(py_timings)
+            py_result["timing"] = timing_stats
+            results[backend_label] = py_result
+            print(f"  Mean time: {timing_stats['stats']['mean']:.3f}s ± {timing_stats['stats']['std']:.3f}s")
+
+    # For backward compatibility, also store as "python" (use rust if available)
+    if results.get("python_rust"):
+        results["python"] = results["python_rust"]
+    elif results.get("python_pure"):
+        results["python"] = results["python_pure"]
+
+    # R benchmark with replications
+    print(f"\nRunning R (fixest::sunab) - {n_replications} replications...")
+    r_output = RESULTS_DIR / "accuracy" / f"r_{name}_{scale}.json"
+
+    r_timings = []
+    r_result = None
+    for rep in range(n_replications):
+        try:
+            r_result = run_r_benchmark(
+                "benchmark_sunab.R", data_path, r_output,
+                timeout=timeouts["r"]
+            )
+            r_timings.append(r_result["timing"]["total_seconds"])
+            if rep == 0:
+                print(f"  ATT: {r_result['overall_att']:.4f}")
+                print(f"  SE:  {r_result['overall_se']:.4f}")
+            print(f"  Rep {rep+1}/{n_replications}: {r_timings[-1]:.3f}s")
+        except Exception as e:
+            print(f"  Rep {rep+1} failed: {e}")
+
+    if r_result and r_timings:
+        timing_stats = compute_timing_stats(r_timings)
+        r_result["timing"] = timing_stats
+        results["r"] = r_result
+        print(f"  Mean time: {timing_stats['stats']['mean']:.3f}s ± {timing_stats['stats']['std']:.3f}s")
+
+    # Compare results
+    if results.get("python") and results.get("r"):
+        print("\nComparison (Python vs R):")
+        comparison = compare_estimates(
+            results["python"], results["r"], "SunAbraham", scale=scale,
+            se_rtol=0.01,
+            python_pure_results=results.get("python_pure"),
+            python_rust_results=results.get("python_rust"),
+        )
+        results["comparison"] = comparison
+        print(f"  ATT diff: {comparison.att_diff:.2e}")
+        print(f"  SE rel diff: {comparison.se_rel_diff:.1%}")
+        print(f"  Status: {'PASS' if comparison.passed else 'FAIL'}")
+
+        # Event study comparison
+        py_effects = results["python"].get("event_study", [])
+        r_effects = results["r"].get("event_study", [])
+        if py_effects and r_effects:
+            corr, max_diff, all_close = compare_event_study(py_effects, r_effects)
+            print(f"  Event study correlation: {corr:.6f}")
+            print(f"  Event study max diff: {max_diff:.2e}")
+            print(f"  Event study all close: {all_close}")
+
+    # Print timing comparison table
+    print("\nTiming Comparison:")
+    print(f"  {'Backend':<15} {'Time (s)':<12} {'vs R':<12} {'vs Pure Python':<15}")
+    print(f"  {'-'*54}")
+
+    r_mean = results["r"]["timing"]["stats"]["mean"] if results["r"] else None
+    pure_mean = results["python_pure"]["timing"]["stats"]["mean"] if results.get("python_pure") else None
+    rust_mean = results["python_rust"]["timing"]["stats"]["mean"] if results.get("python_rust") else None
+
+    if r_mean:
+        print(f"  {'R':<15} {r_mean:<12.3f} {'1.00x':<12} {'-':<15}")
+    if pure_mean:
+        r_speedup = f"{r_mean/pure_mean:.2f}x" if r_mean else "-"
+        print(f"  {'Python (pure)':<15} {pure_mean:<12.3f} {r_speedup:<12} {'1.00x':<15}")
+    if rust_mean:
+        r_speedup = f"{r_mean/rust_mean:.2f}x" if r_mean else "-"
+        pure_speedup = f"{pure_mean/rust_mean:.2f}x" if pure_mean else "-"
+        print(f"  {'Python (rust)':<15} {rust_mean:<12.3f} {r_speedup:<12} {pure_speedup:<15}")
+
+    return results
+
+
 def main():
     parser = argparse.ArgumentParser(
         description="Run diff-diff benchmarks against R packages"
@@ -1101,7 +1234,7 @@ def main():
     )
     parser.add_argument(
         "--estimator",
-        choices=["callaway", "synthdid", "basic", "twfe", "multiperiod", "imputation"],
+        choices=["callaway", "synthdid", "basic", "twfe", "multiperiod", "imputation", "sunab"],
         help="Run specific estimator benchmark",
     )
     parser.add_argument(
@@ -1223,6 +1356,17 @@ def main():
                 )
                 all_results.append(results)
 
+        if args.all or args.estimator == "sunab":
+            # Sun-Abraham uses the same staggered data as Callaway-Sant'Anna
+            stag_key = f"staggered_{scale}"
+            if stag_key in datasets:
+                results = run_sunab_benchmark(
+                    datasets[stag_key],
+                    scale=scale,
+                    n_replications=args.replications,
+                )
+                all_results.append(results)
+
     # Generate summary report
     if all_results:
         print(f"\n{'='*60}")
diff --git a/diff_diff/staggered.py b/diff_diff/staggered.py
index f01400e0..f2eb0829 100644
--- a/diff_diff/staggered.py
+++ b/diff_diff/staggered.py
@@ -415,6 +415,7 @@ def _precompute_structures(
             cohort_masks[g] = (unit_cohorts == g)
 
         # Never-treated mask
+        # np.inf was normalized to 0 in fit(), so the np.inf check is defensive only
         never_treated_mask = (unit_cohorts == 0) | (unit_cohorts == np.inf)
 
         # Pre-compute covariate matrices by time period if needed
@@ -639,13 +640,15 @@ def fit(
         # This avoids hardcoding column names in internal methods
         df['first_treat'] = df[first_treat]
 
+        # Never-treated indicator (must precede treatment_groups to exclude np.inf)
+        df['_never_treated'] = (df[first_treat] == 0) | (df[first_treat] == np.inf)
+        # Normalize np.inf → 0 so all downstream `> 0` checks exclude never-treated
+        df.loc[df[first_treat] == np.inf, first_treat] = 0
+
         # Identify groups and time periods
         time_periods = sorted(df[time].unique())
         treatment_groups = sorted([g for g in df[first_treat].unique() if g > 0])
 
-        # Never-treated indicator (first_treat = 0 or inf)
-        df['_never_treated'] = (df[first_treat] == 0) | (df[first_treat] == np.inf)
-
         # Get unique units
         unit_info = df.groupby(unit).agg({
             first_treat: 'first',
diff --git a/diff_diff/sun_abraham.py b/diff_diff/sun_abraham.py
index 9cd0a6a9..79e3bad7 100644
--- a/diff_diff/sun_abraham.py
+++ b/diff_diff/sun_abraham.py
@@ -456,9 +456,9 @@ def fit(
         covariates : list, optional
             List of covariate column names to include in regression.
         min_pre_periods : int, default=1
-            Minimum number of pre-treatment periods to include in event study.
+            **Deprecated**: Accepted but ignored. Will be removed in a future version.
         min_post_periods : int, default=1
-            Minimum number of post-treatment periods to include in event study.
+            **Deprecated**: Accepted but ignored. Will be removed in a future version.
 
         Returns
         -------
@@ -470,6 +470,22 @@ def fit(
         ValueError
             If required columns are missing or data validation fails.
         """
+        # Deprecation warnings for unimplemented parameters
+        if min_pre_periods != 1:
+            warnings.warn(
+                "min_pre_periods is not yet implemented and will be ignored. "
+                "This parameter will be removed in a future version.",
+                FutureWarning,
+                stacklevel=2,
+            )
+        if min_post_periods != 1:
+            warnings.warn(
+                "min_post_periods is not yet implemented and will be ignored. "
+                "This parameter will be removed in a future version.",
+                FutureWarning,
+                stacklevel=2,
+            )
+
         # Validate inputs
         required_cols = [outcome, unit, time, first_treat]
         if covariates:
@@ -486,13 +502,15 @@ def fit(
         df[time] = pd.to_numeric(df[time])
         df[first_treat] = pd.to_numeric(df[first_treat])
 
+        # Never-treated indicator (must precede treatment_groups to exclude np.inf)
+        df["_never_treated"] = (df[first_treat] == 0) | (df[first_treat] == np.inf)
+        # Normalize np.inf → 0 so all downstream `> 0` checks exclude never-treated
+        df.loc[df[first_treat] == np.inf, first_treat] = 0
+
         # Identify groups and time periods
         time_periods = sorted(df[time].unique())
         treatment_groups = sorted([g for g in df[first_treat].unique() if g > 0])
 
-        # Never-treated indicator
-        df["_never_treated"] = (df[first_treat] == 0) | (df[first_treat] == np.inf)
-
         # Get unique units
         unit_info = (
             df.groupby(unit)
@@ -533,9 +551,9 @@ def fit(
 
         all_rel_times_sorted = sorted(all_rel_times)
 
-        # Filter to reasonable range
-        min_rel = max(min(all_rel_times_sorted), -20)  # cap at -20
-        max_rel = min(max(all_rel_times_sorted), 20)   # cap at +20
+        # Use full range of relative times (no artificial truncation, matches R's fixest::sunab())
+        min_rel = min(all_rel_times_sorted)
+        max_rel = max(all_rel_times_sorted)
 
         # Reference period: last pre-treatment period (typically -1)
         self._reference_period = -1 - self.anticipation
@@ -765,12 +783,18 @@ def _fit_saturated_regression(
 
         # Fit OLS using LinearRegression helper (more stable than manual X'X inverse)
         cluster_ids = df_demeaned[cluster_var].values
+
+        # Degrees of freedom adjustment for absorbed unit and time fixed effects
+        n_units_fe = df[unit].nunique()
+        n_times_fe = df[time].nunique()
+        df_adj = n_units_fe + n_times_fe - 1
+
         reg = LinearRegression(
             include_intercept=False,  # Already demeaned, no intercept needed
             robust=True,
             cluster_ids=cluster_ids,
             rank_deficient_action=self.rank_deficient_action,
-        ).fit(X, y)
+        ).fit(X, y, df_adjustment=df_adj)
 
         coefficients = reg.coefficients_
         vcov = reg.vcov_
@@ -821,7 +845,8 @@ def _compute_iw_effects(
 
         β_e = Σ_g w_{g,e} × δ_{g,e}
 
-        where w_{g,e} is the share of cohort g among treated units at relative time e.
+        where w_{g,e} = n_{g,e} / Σ_g n_{g,e} is the share of observations from cohort g
+        at event-time e among all treated observations at that event-time.
 
         Returns
         -------
@@ -833,9 +858,8 @@ def _compute_iw_effects(
         event_study_effects: Dict[int, Dict[str, Any]] = {}
         cohort_weights: Dict[int, Dict[Any, float]] = {}
 
-        # Get cohort sizes
-        unit_cohorts = df.groupby(unit)[first_treat].first()
-        cohort_sizes = unit_cohorts[unit_cohorts > 0].value_counts().to_dict()
+        # Pre-compute per-event-time observation counts: n_{g,e}
+        event_time_counts = df[df[first_treat] > 0].groupby([first_treat, "_rel_time"]).size()
 
         for e in rel_periods:
             # Get cohorts that have observations at this relative time
@@ -847,13 +871,13 @@ def _compute_iw_effects(
             if not cohorts_at_e:
                 continue
 
-            # Compute IW weights: share of each cohort among those observed at e
+            # Compute IW weights: n_{g,e} / Σ_g n_{g,e}
             weights = {}
             total_size = 0
             for g in cohorts_at_e:
-                n_g = cohort_sizes.get(g, 0)
-                weights[g] = n_g
-                total_size += n_g
+                n_g_e = event_time_counts.get((g, e), 0)
+                weights[g] = n_g_e
+                total_size += n_g_e
 
             if total_size == 0:
                 continue
@@ -915,7 +939,7 @@ def _compute_overall_att(
         ]
 
         if not post_effects:
-            return 0.0, 0.0
+            return np.nan, np.nan
 
         # Weight by number of treated observations at each relative time
         post_weights = []
@@ -948,7 +972,13 @@ def _compute_overall_att(
                         overall_weights_by_coef[key] += period_weight * cw
 
         if not overall_weights_by_coef:
-            # Fallback to simple variance calculation
+            # Fallback to simplified variance that ignores covariances between periods
+            warnings.warn(
+                "Could not construct full weight vector for overall ATT SE. "
+                "Using simplified variance that ignores covariances between periods.",
+                UserWarning,
+                stacklevel=2,
+            )
             overall_var = float(
                 np.sum((post_weights ** 2) * np.array([eff["se"] ** 2 for _, eff in post_effects]))
             )
@@ -1029,6 +1059,7 @@ def _run_bootstrap(
                 df_b[time] - df_b[first_treat],
                 np.nan
             )
+            # np.inf was normalized to 0 in fit(), so the np.inf check is defensive only
             df_b["_never_treated"] = (
                 (df_b[first_treat] == 0) | (df_b[first_treat] == np.inf)
             )
@@ -1113,11 +1144,16 @@ def _run_bootstrap(
             event_study_p_values[e] = p_value
 
         # Overall ATT statistics
-        overall_se = float(np.std(bootstrap_overall, ddof=1))
-        overall_ci = self._compute_percentile_ci(bootstrap_overall, self.alpha)
-        overall_p = self._compute_bootstrap_pvalue(
-            original_overall_att, bootstrap_overall
-        )
+        if not np.isfinite(original_overall_att):
+            overall_se = np.nan
+            overall_ci = (np.nan, np.nan)
+            overall_p = np.nan
+        else:
+            overall_se = float(np.std(bootstrap_overall, ddof=1))
+            overall_ci = self._compute_percentile_ci(bootstrap_overall, self.alpha)
+            overall_p = self._compute_bootstrap_pvalue(
+                original_overall_att, bootstrap_overall
+            )
 
         return SABootstrapResults(
             n_bootstrap=self.n_bootstrap,
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index 378c25d0..4f81ebc0 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -424,6 +424,10 @@ where weights ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at eve
 - Single cohort: reduces to standard event study
 - Cohorts with no observations at some event-times: weighted appropriately
 - Extrapolation beyond observed event-times: not estimated
+- Event-time range: no artificial cap (estimates all available relative times, matching R's `fixest::sunab()`)
+- No post-treatment effects: returns `(NaN, NaN)` for overall ATT/SE; all inference fields (t_stat, p_value, conf_int) propagate NaN via `np.isfinite()` guards
+- `min_pre_periods`/`min_post_periods` parameters: deprecated (accepted but ignored, emit `FutureWarning`)
+- Variance fallback: when full weight vector cannot be constructed for overall ATT SE, uses simplified variance (ignores covariances between periods) with `UserWarning`
 - Rank-deficient design matrix (covariate collinearity):
   - Detection: Pivoted QR decomposition with tolerance `1e-07` (R's `qr()` default)
   - Handling: Warns and drops linearly dependent columns, sets NA for dropped coefficients (R-style, matches `lm()`)
@@ -434,17 +438,25 @@ where weights ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at eve
   - Bootstrap inference: p_value and CI computed from bootstrap distribution, may be valid even when SE/t_stat is NaN (only NaN if <50% of bootstrap samples are valid)
   - Applies to overall ATT, per-effect event study, and aggregated event study
   - **Note**: Defensive enhancement matching CallawaySantAnna behavior; R's `fixest::sunab()` may produce Inf/NaN without warning
+- Inference distribution:
+  - Cohort-level p-values: t-distribution (via `LinearRegression.get_inference()`)
+  - Aggregated event study and overall ATT p-values: normal distribution (via `compute_p_value()`)
+  - This is asymptotically equivalent and standard for delta-method-aggregated quantities
+  - **Deviation from R**: R's fixest uses t-distribution at all levels; aggregated p-values may differ slightly for small samples
 
 **Reference implementation(s):**
 - R: `fixest::sunab()` (Laurent Bergé's implementation)
 - Stata: `eventstudyinteract`
 
 **Requirements checklist:**
-- [ ] Never-treated units required as controls
-- [ ] Interaction weights sum to 1 within each relative time period
-- [ ] Reference period defaults to e=-1, coefficient normalized to zero
-- [ ] Cohort-specific effects recoverable from results
-- [ ] Cluster-robust SEs with delta method for aggregates
+- [x] Never-treated units required as controls
+- [x] Interaction weights sum to 1 within each relative time period
+- [x] Reference period defaults to e=-1, coefficient normalized to zero
+- [x] Cohort-specific effects recoverable from results
+- [x] Cluster-robust SEs with delta method for aggregates
+- [x] R comparison: ATT matches within machine precision (<1e-11)
+- [x] R comparison: SE matches within 0.3% (well within 1% threshold)
+- [x] R comparison: Event study effects match perfectly (correlation 1.0)
 
 ---
 
diff --git a/tests/test_staggered.py b/tests/test_staggered.py
index 71e01f50..79651454 100644
--- a/tests/test_staggered.py
+++ b/tests/test_staggered.py
@@ -102,6 +102,29 @@ def test_zero_treatment_effect(self):
         # Effect should be close to zero
         assert abs(results.overall_att) < 3 * results.overall_se
 
+    def test_never_treated_inf_encoding(self):
+        """Test that first_treat=np.inf is handled as never-treated, not as a cohort."""
+        data = generate_staggered_data(n_units=200, n_periods=10, n_cohorts=3, seed=42)
+
+        cs = CallawaySantAnna(n_bootstrap=0)
+        results_zero = cs.fit(
+            data.copy(), outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # Re-encode never-treated from 0 to np.inf (cast to float first for pandas compat)
+        data_inf = data.copy()
+        data_inf["first_treat"] = data_inf["first_treat"].astype(float)
+        data_inf.loc[data_inf["first_treat"] == 0, "first_treat"] = np.inf
+
+        results_inf = cs.fit(
+            data_inf, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # Results should be identical
+        assert np.isclose(results_inf.overall_att, results_zero.overall_att), (
+            f"ATT differs: inf={results_inf.overall_att}, zero={results_zero.overall_att}"
+        )
+
     def test_event_study_aggregation(self):
         """Test event study aggregation."""
         data = generate_staggered_data()
diff --git a/tests/test_sun_abraham.py b/tests/test_sun_abraham.py
index cbf9079d..ce60fb0b 100644
--- a/tests/test_sun_abraham.py
+++ b/tests/test_sun_abraham.py
@@ -1065,3 +1065,418 @@ def test_aggregated_event_study_tstat_nan(self):
                     f"Aggregated t_stat for e={e} should be effect/SE={expected_t}, "
                     f"got {t_stat}"
                 )
+
+
+class TestSunAbrahamMethodology:
+    """Tests for methodology review fixes (Steps 5a-5e)."""
+
+    def test_no_post_effects_returns_nan(self):
+        """Test that no post-treatment effects returns NaN for overall ATT/SE (Step 5b).
+
+        When there are no post-treatment periods, overall_att and overall_se should be NaN,
+        and all downstream inference fields should propagate NaN correctly.
+        """
+        # Create data where all periods are pre-treatment
+        np.random.seed(42)
+        n_units = 40
+        n_periods = 6
+
+        units = np.repeat(np.arange(n_units), n_periods)
+        times = np.tile(np.arange(n_periods), n_units)
+
+        # All treated units have first_treat at period 100 (well beyond data range)
+        first_treat = np.zeros(n_units)
+        first_treat[12:] = 100  # treated at period 100, but data only goes to period 5
+        first_treat_expanded = np.repeat(first_treat, n_periods)
+
+        unit_fe = np.repeat(np.random.randn(n_units), n_periods)
+        time_fe = np.tile(np.arange(n_periods) * 0.1, n_units)
+        outcomes = unit_fe + time_fe + np.random.randn(len(units)) * 0.3
+
+        data = pd.DataFrame({
+            "unit": units,
+            "time": times,
+            "outcome": outcomes,
+            "first_treat": first_treat_expanded.astype(int),
+        })
+
+        sa = SunAbraham(n_bootstrap=0)
+        results = sa.fit(
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # Overall ATT and SE should be NaN
+        assert np.isnan(results.overall_att), (
+            f"Expected NaN overall_att, got {results.overall_att}"
+        )
+        assert np.isnan(results.overall_se), (
+            f"Expected NaN overall_se, got {results.overall_se}"
+        )
+        # Downstream inference should propagate NaN
+        assert np.isnan(results.overall_t_stat), (
+            f"Expected NaN overall_t_stat, got {results.overall_t_stat}"
+        )
+        assert np.isnan(results.overall_p_value), (
+            f"Expected NaN overall_p_value, got {results.overall_p_value}"
+        )
+        assert np.isnan(results.overall_conf_int[0]) and np.isnan(results.overall_conf_int[1]), (
+            f"Expected (NaN, NaN) overall_conf_int, got {results.overall_conf_int}"
+        )
+
+    def test_no_post_effects_bootstrap_returns_nan(self, ci_params):
+        """Test that no post-treatment effects returns NaN even with bootstrap.
+
+        When there are no post-treatment periods, overall_att/se/t_stat/p_value/ci
+        should all be NaN. The bootstrap path must not overwrite NaN with non-NaN
+        values (regression test for P0 bug where _compute_bootstrap_pvalue returned
+        1/(B+1) instead of NaN when original_effect was NaN).
+        """
+        # Create data where all periods are pre-treatment
+        np.random.seed(42)
+        n_units = 40
+        n_periods = 6
+
+        units = np.repeat(np.arange(n_units), n_periods)
+        times = np.tile(np.arange(n_periods), n_units)
+
+        # All treated units have first_treat at period 100 (well beyond data range)
+        first_treat = np.zeros(n_units)
+        first_treat[12:] = 100  # treated at period 100, but data only goes to period 5
+        first_treat_expanded = np.repeat(first_treat, n_periods)
+
+        unit_fe = np.repeat(np.random.randn(n_units), n_periods)
+        time_fe = np.tile(np.arange(n_periods) * 0.1, n_units)
+        outcomes = unit_fe + time_fe + np.random.randn(len(units)) * 0.3
+
+        data = pd.DataFrame({
+            "unit": units,
+            "time": times,
+            "outcome": outcomes,
+            "first_treat": first_treat_expanded.astype(int),
+        })
+
+        n_boot = ci_params.bootstrap(50)
+        sa = SunAbraham(n_bootstrap=n_boot, seed=42)
+        results = sa.fit(
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # All overall inference fields should be NaN
+        assert np.isnan(results.overall_att), (
+            f"Expected NaN overall_att, got {results.overall_att}"
+        )
+        assert np.isnan(results.overall_se), (
+            f"Expected NaN overall_se, got {results.overall_se}"
+        )
+        assert np.isnan(results.overall_t_stat), (
+            f"Expected NaN overall_t_stat, got {results.overall_t_stat}"
+        )
+        assert np.isnan(results.overall_p_value), (
+            f"Expected NaN overall_p_value with bootstrap, got {results.overall_p_value}"
+        )
+        assert np.isnan(results.overall_conf_int[0]) and np.isnan(results.overall_conf_int[1]), (
+            f"Expected (NaN, NaN) overall_conf_int, got {results.overall_conf_int}"
+        )
+
+    def test_deprecated_min_pre_periods_warning(self):
+        """Test that min_pre_periods emits FutureWarning (Step 5c)."""
+        data = generate_staggered_data(seed=42)
+
+        sa = SunAbraham(n_bootstrap=0)
+        with pytest.warns(FutureWarning, match="min_pre_periods"):
+            sa.fit(
+                data, outcome="outcome", unit="unit", time="time",
+                first_treat="first_treat", min_pre_periods=2,
+            )
+
+    def test_deprecated_min_post_periods_warning(self):
+        """Test that min_post_periods emits FutureWarning (Step 5c)."""
+        data = generate_staggered_data(seed=42)
+
+        sa = SunAbraham(n_bootstrap=0)
+        with pytest.warns(FutureWarning, match="min_post_periods"):
+            sa.fit(
+                data, outcome="outcome", unit="unit", time="time",
+                first_treat="first_treat", min_post_periods=2,
+            )
+
+    def test_event_time_no_truncation(self):
+        """Test that event times beyond ±20 are estimated (Step 5d).
+
+        Creates data with event times spanning beyond ±20 and verifies
+        that effects are estimated for all available relative times.
+        """
+        np.random.seed(42)
+        n_units = 60
+        n_periods = 50  # 50 periods to get event times beyond ±20
+
+        units = np.repeat(np.arange(n_units), n_periods)
+        times = np.tile(np.arange(1, n_periods + 1), n_units)
+
+        # 30% never treated, rest treated at period 25 (giving rel times from -24 to +25)
+        first_treat = np.zeros(n_units)
+        first_treat[18:] = 25
+        first_treat_expanded = np.repeat(first_treat, n_periods)
+
+        unit_fe = np.repeat(np.random.randn(n_units) * 2, n_periods)
+        time_fe = np.tile(np.arange(1, n_periods + 1) * 0.1, n_units)
+        post = (times >= first_treat_expanded) & (first_treat_expanded > 0)
+        outcomes = unit_fe + time_fe + 2.0 * post + np.random.randn(len(units)) * 0.3
+
+        data = pd.DataFrame({
+            "unit": units,
+            "time": times,
+            "outcome": outcomes,
+            "first_treat": first_treat_expanded.astype(int),
+        })
+
+        sa = SunAbraham(n_bootstrap=0)
+        results = sa.fit(
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # Verify that event times beyond ±20 are present
+        event_times = sorted(results.event_study_effects.keys())
+        assert min(event_times) < -20, (
+            f"Expected event times < -20, got min={min(event_times)}"
+        )
+        assert max(event_times) > 20, (
+            f"Expected event times > 20, got max={max(event_times)}"
+        )
+
+    def test_df_adjustment_sets_regression_df(self):
+        """Test that df_adjustment for absorbed FE is applied correctly (Step 5a).
+
+        After fitting, the internal LinearRegression's df_ should account for
+        absorbed unit and time fixed effects.
+        """
+        from unittest.mock import patch
+        from diff_diff.linalg import LinearRegression
+
+        data = generate_staggered_data(n_units=100, n_periods=8, seed=42)
+        captured_df = {}
+
+        original_fit = LinearRegression.fit
+
+        # Wraps LinearRegression.fit as an unbound method replacement.
+        # self_reg is the LinearRegression instance (not the test class self).
+        # SunAbraham currently calls LinearRegression.fit exactly once in
+        # _fit_saturated_regression(); if that changes, this test captures only
+        # the last call's state.
+        def capturing_fit(self_reg, X, y, **kwargs):
+            result = original_fit(self_reg, X, y, **kwargs)
+            captured_df['df'] = self_reg.df_
+            captured_df['n_obs'] = self_reg.n_obs_
+            captured_df['n_params_effective'] = self_reg.n_params_effective_
+            captured_df['df_adjustment'] = kwargs.get('df_adjustment', 0)
+            return result
+
+        sa = SunAbraham(n_bootstrap=0)
+        with patch.object(LinearRegression, 'fit', capturing_fit):
+            results = sa.fit(data, outcome="outcome", unit="unit",
+                            time="time", first_treat="first_treat")
+
+        # Verify df_adjustment was passed and applied
+        n_units = data["unit"].nunique()
+        n_times = data["time"].nunique()
+        expected_df_adj = n_units + n_times - 1
+
+        assert captured_df['df_adjustment'] == expected_df_adj, (
+            f"Expected df_adjustment={expected_df_adj}, got {captured_df['df_adjustment']}"
+        )
+        expected_df = captured_df['n_obs'] - captured_df['n_params_effective'] - expected_df_adj
+        assert captured_df['df'] == expected_df, (
+            f"Expected df={expected_df}, got {captured_df['df']}"
+        )
+        assert captured_df['df'] > 0, "Regression df must be positive"
+
+    def test_variance_fallback_warning(self):
+        """Test that the variance fallback path emits a warning (Step 5e).
+
+        Mocks the overall_weights_by_coef to be empty to trigger the fallback.
+        """
+        import warnings
+        from unittest.mock import patch
+
+        data = generate_staggered_data(seed=42)
+
+        sa = SunAbraham(n_bootstrap=0)
+
+        # Patch _compute_overall_att to simulate the fallback path
+        original_method = sa._compute_overall_att
+
+        def patched_compute_overall_att(df, first_treat, event_study_effects,
+                                        cohort_effects, cohort_weights,
+                                        vcov_cohort, coef_index_map):
+            # Pass an empty coef_index_map to trigger the fallback
+            return original_method(
+                df, first_treat, event_study_effects,
+                cohort_effects, cohort_weights,
+                vcov_cohort, {},  # Empty coef_index_map forces fallback
+            )
+
+        with patch.object(sa, '_compute_overall_att', side_effect=patched_compute_overall_att):
+            with warnings.catch_warnings(record=True) as w:
+                warnings.simplefilter("always")
+                results = sa.fit(
+                    data, outcome="outcome", unit="unit", time="time",
+                    first_treat="first_treat",
+                )
+
+                fallback_warnings = [
+                    x for x in w
+                    if "simplified variance" in str(x.message).lower()
+                ]
+                assert len(fallback_warnings) > 0, (
+                    "Expected warning about simplified variance fallback"
+                )
+
+        # The result should still have a positive SE (simplified variance)
+        assert results.overall_se > 0, (
+            f"Expected positive SE from fallback, got {results.overall_se}"
+        )
+
+    def test_iw_weights_match_cohort_shares(self):
+        """Test that IW weights match event-time sample shares.
+
+        For each relative period, Σ_g w_{g,e} = 1.0 and individual weights
+        match n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at event-time e).
+        """
+        data = generate_staggered_data(n_units=200, n_periods=10, n_cohorts=3, seed=42)
+
+        sa = SunAbraham(n_bootstrap=0)
+        results = sa.fit(
+            data, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        for e, weights in results.cohort_weights.items():
+            # Weights should sum to 1
+            total = sum(weights.values())
+            assert abs(total - 1.0) < 1e-10, (
+                f"Weights for e={e} sum to {total}, expected 1.0"
+            )
+
+            # Individual weights should match event-time sample shares
+            cohort_counts = {}
+            for g in weights.keys():
+                cohort_counts[g] = len(
+                    data[
+                        (data["first_treat"] == g)
+                        & (data["time"] - data["first_treat"] == e)
+                    ]
+                )
+            total_count = sum(cohort_counts.values())
+            for g, w in weights.items():
+                expected_w = cohort_counts[g] / total_count
+                assert abs(w - expected_w) < 1e-10, (
+                    f"Weight for cohort {g} at e={e}: got {w}, expected {expected_w}"
+                )
+
+    def test_iw_weights_unbalanced_panel(self):
+        """Test that IW weights use event-time counts, not cohort sizes, for unbalanced panels."""
+        data = generate_staggered_data(n_units=200, n_periods=10, n_cohorts=3, seed=42)
+
+        # Make panel unbalanced by dropping some observations from one cohort
+        # at specific time periods
+        cohorts = data.groupby("unit")["first_treat"].first()
+        first_cohort = sorted(cohorts[cohorts > 0].unique())[0]
+        units_in_first_cohort = cohorts[cohorts == first_cohort].index.tolist()
+
+        # Drop ~half the units from first cohort at the last time period
+        units_to_drop = units_in_first_cohort[: len(units_in_first_cohort) // 2]
+        max_time = data["time"].max()
+        drop_mask = data["unit"].isin(units_to_drop) & (data["time"] == max_time)
+        data_unbal = data[~drop_mask].copy()
+
+        sa = SunAbraham(n_bootstrap=0)
+        results = sa.fit(
+            data_unbal,
+            outcome="outcome",
+            unit="unit",
+            time="time",
+            first_treat="first_treat",
+        )
+
+        # Find an event-time where the dropped observations cause n_{g,e} != n_g
+        # The dropped units are from the first cohort at max_time
+        affected_e = max_time - first_cohort
+
+        assert affected_e in results.cohort_weights, (
+            f"Expected event-time {affected_e} in cohort_weights but not found"
+        )
+
+        weights = results.cohort_weights[affected_e]
+        # Verify weights use actual observation counts, not total cohort sizes
+        cohort_counts = {}
+        for g in weights.keys():
+            cohort_counts[g] = len(
+                data_unbal[
+                    (data_unbal["first_treat"] == g)
+                    & (data_unbal["time"] - data_unbal["first_treat"] == affected_e)
+                ]
+            )
+        total_count = sum(cohort_counts.values())
+        for g, w in weights.items():
+            expected_w = cohort_counts[g] / total_count
+            assert abs(w - expected_w) < 1e-10, (
+                f"Weight for cohort {g} at e={affected_e}: got {w}, expected {expected_w}"
+            )
+
+    def test_never_treated_inf_encoding(self):
+        """Test that first_treat=np.inf is handled as never-treated, not as a cohort."""
+        data = generate_staggered_data(n_units=200, n_periods=10, n_cohorts=3, seed=42)
+
+        # Run with first_treat=0 as baseline
+        sa = SunAbraham(n_bootstrap=0)
+        results_zero = sa.fit(
+            data.copy(), outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # Re-encode never-treated from 0 to np.inf (cast to float first for pandas compat)
+        data_inf = data.copy()
+        data_inf["first_treat"] = data_inf["first_treat"].astype(float)
+        data_inf.loc[data_inf["first_treat"] == 0, "first_treat"] = np.inf
+
+        results_inf = sa.fit(
+            data_inf, outcome="outcome", unit="unit", time="time", first_treat="first_treat"
+        )
+
+        # np.inf must not appear as a cohort in weights
+        for e, weights in results_inf.cohort_weights.items():
+            assert np.inf not in weights, (
+                f"np.inf found as cohort key in weights at e={e}"
+            )
+
+        # No ±inf in event study periods
+        for e in results_inf.event_study_effects.keys():
+            assert np.isfinite(e), f"Non-finite event time {e} in event study"
+
+        # np.inf must not appear in results.groups
+        assert np.inf not in results_inf.groups, (
+            f"np.inf found in results.groups: {results_inf.groups}"
+        )
+
+        # Results should be identical to first_treat=0 encoding
+        assert np.isclose(results_inf.overall_att, results_zero.overall_att), (
+            f"ATT differs: inf={results_inf.overall_att}, zero={results_zero.overall_att}"
+        )
+        assert np.isclose(results_inf.overall_se, results_zero.overall_se), (
+            f"SE differs: inf={results_inf.overall_se}, zero={results_zero.overall_se}"
+        )
+
+    def test_all_never_treated_inf_raises(self):
+        """Test that all-never-treated data with np.inf encoding raises ValueError."""
+        data = generate_staggered_data(n_units=100, n_periods=10, n_cohorts=3, seed=42)
+        # Set ALL units to never-treated via np.inf (cast to float first for pandas compat)
+        data["first_treat"] = data["first_treat"].astype(float)
+        data["first_treat"] = np.inf
+
+        sa = SunAbraham(n_bootstrap=0)
+        with pytest.raises(ValueError, match="No treated units found"):
+            sa.fit(
+                data,
+                outcome="outcome",
+                unit="unit",
+                time="time",
+                first_treat="first_treat",
+            )