diff --git a/README.md b/README.md index fccc4dc3..ae28d3cc 100644 --- a/README.md +++ b/README.md @@ -70,7 +70,7 @@ Signif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1 - **Wild cluster bootstrap**: Valid inference with few clusters (<50) using Rademacher, Webb, or Mammen weights - **Panel data support**: Two-way fixed effects estimator for panel designs - **Multi-period analysis**: Event-study style DiD with period-specific treatment effects -- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), and Stacked DiD (Wing, Freedman & Hollingsworth 2024) estimators for heterogeneous treatment timing +- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing, Freedman & Hollingsworth 2024), and Efficient DiD (Chen, Sant'Anna & Xie 2025) estimators for heterogeneous treatment timing - **Triple Difference (DDD)**: Ortiz-Villavicencio & Sant'Anna (2025) estimators with proper covariate handling - **Synthetic DiD**: Combined DiD with synthetic control for improved robustness - **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 2025) @@ -125,6 +125,7 @@ We provide Jupyter notebook tutorials in `docs/tutorials/`: | `11_imputation_did.ipynb` | Imputation DiD (Borusyak et al. 2024), pre-trend test, efficiency comparison | | `12_two_stage_did.ipynb` | Two-Stage DiD (Gardner 2022), GMM sandwich variance, per-observation effects | | `13_stacked_did.ipynb` | Stacked DiD (Wing et al. 2024), Q-weights, sub-experiment inspection, trimming, clean control definitions | +| `15_efficient_did.ipynb` | Efficient DiD (Chen et al. 
2025), optimal weighting, PT-All vs PT-Post, efficiency gains, bootstrap inference | ## Data Preparation @@ -1071,6 +1072,56 @@ results = stacked_did( ) ``` +### Efficient DiD (Chen, Sant'Anna & Xie 2025) + +Efficient DiD achieves the semiparametric efficiency bound for ATT estimation in staggered adoption designs. It optimally weights across all valid comparison groups and baselines via the inverse covariance matrix Omega*, producing tighter confidence intervals than standard estimators like Callaway-Sant'Anna when the stronger PT-All assumption holds. + +```python +from diff_diff import EfficientDiD, generate_staggered_data + +# Generate sample data +data = generate_staggered_data(n_units=300, n_periods=10, + cohort_periods=[4, 6, 8], seed=42) + +# Fit with PT-All (overidentified, tighter SEs) +edid = EfficientDiD(pt_assumption="all") +results = edid.fit(data, outcome='outcome', unit='unit', + time='period', first_treat='first_treat', + aggregate='all') +results.print_summary() + +# PT-Post mode (matches CS for post-treatment effects) +edid_post = EfficientDiD(pt_assumption="post") +results_post = edid_post.fit(data, outcome='outcome', unit='unit', + time='period', first_treat='first_treat') +``` + +**Parameters:** + +```python +EfficientDiD( + pt_assumption='all', # 'all' (overidentified) or 'post' (matches CS post-treatment ATT) + alpha=0.05, # Significance level + n_bootstrap=0, # Bootstrap iterations (0 = analytical only) + bootstrap_weights='rademacher', # 'rademacher', 'mammen', or 'webb' + seed=None, # Random seed + anticipation=0, # Anticipation periods +) +``` + +> **Note:** Phase 1 supports the no-covariates path only. Use CallawaySantAnna with +> `estimation_method='dr'` if you need covariate adjustment. 
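The Omega*-weighting described above is classic GLS logic: combine several unbiased estimates of the same ATT using inverse-covariance weights. A minimal NumPy sketch with made-up numbers (illustrative only — not the estimator's internals):

```python
import numpy as np

# Three hypothetical unbiased estimates of the same ATT(g, t), each from a
# different (comparison group, baseline) pair, with an assumed covariance Omega.
theta = np.array([2.1, 1.9, 2.3])
omega = np.array([[0.40, 0.10, 0.05],
                  [0.10, 0.25, 0.08],
                  [0.05, 0.08, 0.50]])

# GLS-optimal weights: w* = Omega^{-1} 1 / (1' Omega^{-1} 1)
ones = np.ones(len(theta))
w = np.linalg.solve(omega, ones)
w /= ones @ w                      # normalize so weights sum to 1

att = w @ theta                    # minimum-variance combination
var = w @ omega @ w                # equals 1 / (1' Omega^{-1} 1)

# The combined variance can never exceed the best single comparison's variance,
# because putting all weight on one comparison is itself a feasible choice.
print(f"weights={w.round(3)}, ATT={att:.3f}, var={var:.4f} "
      f"(best single: {omega.diagonal().min():.4f})")
```

EfficientDiD applies this weighting per ATT(g,t) target with Omega* estimated from the data; the sketch only shows why pooling the extra comparisons available under PT-All tightens the confidence interval.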
+ +**When to use Efficient DiD vs Callaway-Sant'Anna:** + +| Aspect | Efficient DiD | Callaway-Sant'Anna | +|--------|--------------|-------------------| +| Approach | Optimal EIF-based weighting | Separate 2x2 DiD aggregation | +| PT assumption | PT-All (stronger) or PT-Post | Conditional PT | +| Efficiency | Achieves semiparametric bound | Not efficient | +| Covariates | Not yet (Phase 2) | Supported (OR, IPW, DR) | +| When to choose | Maximum efficiency, PT-All credible | Covariates needed, weaker PT | + ### Triple Difference (DDD) Triple Difference (DDD) is used when treatment requires satisfying two criteria: belonging to a treated **group** AND being in an eligible **partition**. The `TripleDifference` class implements the methodology from Ortiz-Villavicencio & Sant'Anna (2025), which correctly handles covariate adjustment (unlike naive implementations). diff --git a/docs/api/efficient_did.rst b/docs/api/efficient_did.rst new file mode 100644 index 00000000..0216b08d --- /dev/null +++ b/docs/api/efficient_did.rst @@ -0,0 +1,150 @@ +Efficient Difference-in-Differences +==================================== + +Semiparametrically efficient ATT estimator for staggered adoption designs +from Chen, Sant'Anna & Xie (2025). + +This module implements the efficiency-bound-attaining estimator that: + +1. **Achieves the semiparametric efficiency bound** for ATT(g,t) estimation +2. **Optimally weights** across comparison groups and baselines via the + inverse covariance matrix Ω* +3. **Supports two PT assumptions**: PT-All (overidentified, tighter SEs) and + PT-Post (just-identified, matches CS for post-treatment effects) +4. **Uses EIF-based inference** for analytical standard errors and multiplier + bootstrap + +.. note:: + + Phase 1 supports the **no-covariates** path only. The with-covariates + path (Phase 2) will be added in a future version. 
+ +**When to use EfficientDiD:** + +- Staggered adoption design where you want **maximum efficiency** +- You believe parallel trends holds across all pre-treatment periods (PT-All) +- You want tighter confidence intervals than Callaway-Sant'Anna +- You need a formal efficiency benchmark for comparing estimators + +**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient +Difference-in-Differences and Event Study Estimators. + +.. module:: diff_diff.efficient_did + +EfficientDiD +------------- + +Main estimator class for Efficient Difference-in-Differences. + +.. autoclass:: diff_diff.EfficientDiD + :members: + :undoc-members: + :show-inheritance: + :inherited-members: + + .. rubric:: Methods + + .. autosummary:: + + ~EfficientDiD.fit + ~EfficientDiD.get_params + ~EfficientDiD.set_params + +EfficientDiDResults +------------------- + +Results container for Efficient DiD estimation. + +.. autoclass:: diff_diff.efficient_did_results.EfficientDiDResults + :members: + :undoc-members: + :show-inheritance: + + .. rubric:: Methods + + .. autosummary:: + + ~EfficientDiDResults.summary + ~EfficientDiDResults.print_summary + ~EfficientDiDResults.to_dataframe + +EDiDBootstrapResults +-------------------- + +Bootstrap inference results for Efficient DiD. + +.. 
autoclass:: diff_diff.efficient_did_bootstrap.EDiDBootstrapResults + :members: + :undoc-members: + :show-inheritance: + +Example Usage +------------- + +Basic usage:: + + from diff_diff import EfficientDiD, generate_staggered_data + + data = generate_staggered_data(n_units=300, n_periods=10, + cohort_periods=[4, 6, 8], seed=42) + + edid = EfficientDiD(pt_assumption="all") + results = edid.fit(data, outcome='outcome', unit='unit', + time='period', first_treat='first_treat', + aggregate='all') + results.print_summary() + +PT-Post mode (matches CS for post-treatment ATT):: + + edid_post = EfficientDiD(pt_assumption="post") + results_post = edid_post.fit(data, outcome='outcome', unit='unit', + time='period', first_treat='first_treat', + aggregate='all') + print(f"PT-All ATT: {results.overall_att:.4f} (SE={results.overall_se:.4f})") + print(f"PT-Post ATT: {results_post.overall_att:.4f} (SE={results_post.overall_se:.4f})") + +Bootstrap inference:: + + edid_boot = EfficientDiD(pt_assumption="all", n_bootstrap=999, seed=42) + results_boot = edid_boot.fit(data, outcome='outcome', unit='unit', + time='period', first_treat='first_treat', + aggregate='all') + print(f"Bootstrap SE: {results_boot.overall_se:.4f}") + print(f"Bootstrap CI: [{results_boot.overall_conf_int[0]:.4f}, " + f"{results_boot.overall_conf_int[1]:.4f}]") + +Comparison with Other Staggered Estimators +------------------------------------------ + +.. 
list-table:: + :header-rows: 1 + :widths: 20 27 27 26 + + * - Feature + - EfficientDiD + - CallawaySantAnna + - ImputationDiD + * - Approach + - Optimal EIF-based weighting + - Separate 2x2 DiD aggregation + - Impute Y(0) via FE model + * - PT assumption + - PT-All (stronger) or PT-Post + - Conditional PT + - Strict exogeneity + * - Efficiency + - Achieves semiparametric bound + - Not efficient + - Efficient under homogeneity + * - Covariates + - Not yet (Phase 2) + - Supported (OR, IPW, DR) + - Supported + * - Bootstrap + - Multiplier bootstrap (EIF) + - Multiplier bootstrap + - Multiplier bootstrap + * - PT-Post equivalence + - Matches CS post-treatment ATT(g,t) + - Baseline + - Different framework diff --git a/docs/api/index.rst b/docs/api/index.rst index 4f41b4bd..5a57ee66 100644 --- a/docs/api/index.rst +++ b/docs/api/index.rst @@ -23,6 +23,7 @@ Core estimator classes for DiD analysis: diff_diff.TripleDifference diff_diff.TROP diff_diff.ContinuousDiD + diff_diff.EfficientDiD Results Classes --------------- @@ -49,6 +50,8 @@ Result containers returned by estimators: diff_diff.trop.TROPResults diff_diff.ContinuousDiDResults diff_diff.DoseResponseCurve + diff_diff.EfficientDiDResults + diff_diff.EDiDBootstrapResults Visualization ------------- @@ -195,6 +198,7 @@ Detailed documentation by module: triple_diff trop continuous_did + efficient_did results visualization diagnostics diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst index 2261acee..7b5af2fb 100644 --- a/docs/choosing_estimator.rst +++ b/docs/choosing_estimator.rst @@ -16,7 +16,7 @@ Start here and follow the questions: 1. **Is treatment staggered?** (Different units treated at different times) - **No** → Go to question 2 - - **Yes** → Use :class:`~diff_diff.CallawaySantAnna` + - **Yes** → Use :class:`~diff_diff.CallawaySantAnna` (or :class:`~diff_diff.EfficientDiD` for tighter SEs under PT-All) 2. 
**Do you have panel data?** (Multiple observations per unit over time) @@ -63,6 +63,10 @@ Quick Reference - Few treated units, many controls - Synthetic parallel trends - ATT with unit/time weights + * - ``EfficientDiD`` + - Staggered adoption with optimal efficiency + - PT-All (overidentified) or PT-Post + - Group-time ATT(g,t), aggregations * - ``ContinuousDiD`` - Continuous dose / treatment intensity - Strong Parallel Trends (SPT) for dose-response; PT for binarized ATT @@ -214,6 +218,32 @@ Use :class:`~diff_diff.ContinuousDiD` when: print(f"Overall ATT: {results.overall_att:.3f}") att_curve = results.dose_response_att.to_dataframe() +Efficient DiD +~~~~~~~~~~~~~ + +Use :class:`~diff_diff.EfficientDiD` when: + +- You have staggered adoption and want **maximum statistical efficiency** +- You believe parallel trends holds across all pre-treatment periods (PT-All) +- You want tighter confidence intervals than Callaway-Sant'Anna +- You need a formal efficiency benchmark for comparing estimators + +.. note:: + + Phase 1 supports the **no-covariates** path only. If you need covariate + adjustment, use :class:`~diff_diff.CallawaySantAnna` with ``estimation_method='dr'`` + or :class:`~diff_diff.ImputationDiD`. + +.. 
code-block:: python + + from diff_diff import EfficientDiD + + edid = EfficientDiD(pt_assumption="all") # or "post" for post-treatment CS match + results = edid.fit(data, outcome='y', unit='unit_id', + time='period', first_treat='first_treat', + aggregate='all') + results.print_summary() + Common Pitfalls --------------- diff --git a/docs/tutorials/15_efficient_did.ipynb b/docs/tutorials/15_efficient_did.ipynb new file mode 100644 index 00000000..f5d8fea3 --- /dev/null +++ b/docs/tutorials/15_efficient_did.ipynb @@ -0,0 +1,588 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "264408ff", + "metadata": {}, + "source": [ + "# Efficient DiD (Chen, Sant'Anna & Xie 2025)\n", + "\n", + "This tutorial demonstrates the `EfficientDiD` estimator, which implements the semiparametrically efficient ATT estimator from Chen, Sant'Anna & Xie (2025).\n", + "\n", + "**What EDiD does:** Standard staggered DiD estimators like Callaway-Sant'Anna use one comparison per target ATT(g,t). When parallel trends holds across *all* pre-treatment periods (PT-All), this leaves valid information on the table. EDiD optimally weights across all valid comparison groups and baselines to achieve the **semiparametric efficiency bound** --- the tightest possible confidence intervals.\n", + "\n", + "**When to use EDiD:**\n", + "- Staggered adoption design where you want **maximum statistical efficiency**\n", + "- You believe parallel trends holds across all pre-treatment periods (PT-All)\n", + "- You want tighter confidence intervals than Callaway-Sant'Anna\n", + "- You need a formal efficiency benchmark for comparing estimators\n", + "\n", + "**Topics covered:**\n", + "1. Basic usage and overall ATT\n", + "2. Group-time effects\n", + "3. PT-All vs PT-Post assumptions\n", + "4. Demonstrating efficiency gains over Callaway-Sant'Anna\n", + "5. Event study aggregation and visualization\n", + "6. Group-level aggregation\n", + "7. Bootstrap inference and weight distributions\n", + "8. 
Diagnostics: efficient weights and condition numbers\n", + "9. Anticipation periods\n", + "10. Three-way comparison: EDiD vs CS vs ImputationDiD\n", + "\n", + "*Prerequisites: [Tutorial 02](02_staggered_did.ipynb) (Staggered DiD) and [Tutorial 04](04_parallel_trends.ipynb) (Parallel Trends).*\n", + "\n", + "*See also: [Tutorial 11](11_imputation_did.ipynb) for Imputation DiD, [Tutorial 13](13_stacked_did.ipynb) for Stacked DiD.*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4df7de15", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "from diff_diff import (\n", + " EfficientDiD, CallawaySantAnna, ImputationDiD,\n", + " generate_staggered_data,\n", + ")\n", + "\n", + "# For nicer plots (optional)\n", + "try:\n", + " import matplotlib.pyplot as plt\n", + " plt.style.use('seaborn-v0_8-whitegrid')\n", + " HAS_MATPLOTLIB = True\n", + "except ImportError:\n", + " HAS_MATPLOTLIB = False\n", + " print(\"matplotlib not installed - visualization examples will be skipped\")" + ] + }, + { + "cell_type": "markdown", + "id": "4d734cd9", + "metadata": {}, + "source": [ + "## What Makes EDiD Different?\n", + "\n", + "Consider a staggered adoption design with cohorts treated at periods 3, 5, and 7, plus a never-treated group. To estimate ATT(g=5, t=6), **Callaway-Sant'Anna** uses a single 2x2 comparison:\n", + "\n", + "> *Compare the outcome change from period 4 to 6 for cohort 5 versus the never-treated group.*\n", + "\n", + "But under **PT-All** (parallel trends across all pre-treatment periods), there are *additional* valid comparisons. Cohort 7 is also untreated at period 6, so it can serve as a comparison group too. And periods 2 and 3 can serve as additional valid baselines beyond CS's default period 4. 
(Period 1 is excluded --- it is the fixed $Y_1$ reference used in every comparison's differencing, so using it as a baseline adds no information.)\n", + "\n", + "Each of these comparisons provides an unbiased estimate of ATT(g=5, t=6), but with different variances. **EDiD finds the optimal linear combination** --- the one that minimizes variance --- by computing the inverse covariance matrix of these \"generated outcomes\" (the paper calls this $\\Omega^*$).\n", + "\n", + "The result: **matching post-treatment ATT(g,t) with CS under PT-Post**, but **tighter standard errors under PT-All** because EDiD exploits the overidentification.\n", + "\n", + "> **Key equation (for the curious):** The efficient weight vector is $w^* = \\frac{\\mathbf{1}' \\Omega^{*-1}}{\\mathbf{1}' \\Omega^{*-1} \\mathbf{1}}$, where $\\Omega^*$ is the covariance matrix of the generated outcomes across all valid (comparison group, baseline) pairs. This is the classic GLS optimal weighting. See REGISTRY.md or the paper for full derivations." + ] + }, + { + "cell_type": "markdown", + "id": "ca2db785", + "metadata": {}, + "source": [ + "## Data Setup\n", + "\n", + "We use `generate_staggered_data()` to create a balanced panel with 3 treatment cohorts, a never-treated group, and a known treatment effect of 2.0." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "22a3880c", + "metadata": {}, + "outputs": [], + "source": [ + "data = generate_staggered_data(n_units=300, n_periods=10, treatment_effect=2.0,\n", + " dynamic_effects=False, seed=42)\n", + "\n", + "print(f\"Shape: {data.shape}\")\n", + "print(f\"Cohorts: {sorted(data['first_treat'].unique())}\")\n", + "print(f\"Periods: {sorted(data['period'].unique())}\")\n", + "print(f\"Units per cohort:\")\n", + "print(data.groupby('first_treat')['unit'].nunique().to_string())\n", + "print()\n", + "data.head(10)" + ] + }, + { + "cell_type": "markdown", + "id": "00f2b2ec", + "metadata": {}, + "source": [ + "## Basic Estimation\n", + "\n", + "The `EfficientDiD` API follows the same pattern as other staggered estimators: create the estimator, call `fit()`, and inspect results. The key parameter is `pt_assumption` --- we start with `\"all\"` (the default) which uses all valid pre-treatment periods for tighter inference." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5c9547bd", + "metadata": {}, + "outputs": [], + "source": [ + "edid = EfficientDiD(pt_assumption=\"all\")\n", + "results = edid.fit(data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "results.print_summary()" + ] + }, + { + "cell_type": "markdown", + "id": "9023ba23", + "metadata": {}, + "source": [ + "## Group-Time Effects\n", + "\n", + "Like Callaway-Sant'Anna, EDiD estimates ATT(g,t) for each (cohort, time period) pair. These are the building blocks for all aggregations. Use `to_dataframe(level='group_time')` to access them." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "db4c59f4", + "metadata": {}, + "outputs": [], + "source": [ + "gt_df = results.to_dataframe(level='group_time')\n", + "gt_df" + ] + }, + { + "cell_type": "markdown", + "id": "e1ad14f5", + "metadata": {}, + "source": [ + "## PT-All vs PT-Post\n", + "\n", + "EDiD supports two parallel trends assumptions:\n", + "\n", + "- **PT-All** (`pt_assumption=\"all\"`): Parallel trends holds across *all* pre-treatment periods. The model is overidentified --- more valid comparisons exist than needed --- and EDiD exploits this for tighter SEs.\n", + "- **PT-Post** (`pt_assumption=\"post\"`): Parallel trends holds only from `g-1` onward (the weaker, standard assumption). EDiD uses a single baseline (`g-1`) per cohort, matching `CallawaySantAnna(control_group='never_treated')` for post-treatment ATT(g,t). Pre-treatment diagnostics may differ from CS's default `base_period='varying'`.\n", + "\n", + "PT-All is the default because it delivers efficiency gains when the assumption holds. Use PT-Post if you're concerned about violations in early pre-treatment periods." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "35f70199", + "metadata": {}, + "outputs": [], + "source": [ + "# Fit under both assumptions\n", + "results_all = EfficientDiD(pt_assumption=\"all\").fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "\n", + "results_post = EfficientDiD(pt_assumption=\"post\").fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "\n", + "# Compare with Callaway-Sant'Anna\n", + "results_cs = CallawaySantAnna().fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + "\n", + "print(\"PT-All vs PT-Post vs Callaway-Sant'Anna\")\n", + "print(\"=\" * 65)\n", + "print(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'CI Width':>12}\")\n", + "print(\"-\" * 65)\n", + "for name, r in [(\"EDiD (PT-All)\", results_all),\n", + " (\"EDiD (PT-Post)\", results_post),\n", + " (\"CallawaySantAnna\", results_cs)]:\n", + " ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n", + " print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} {ci_width:>12.4f}\")\n", + "print()\n", + "print(\"PT-Post and CS produce identical post-treatment ATTs.\")" + ] + }, + { + "cell_type": "markdown", + "id": "27e17bd9", + "metadata": {}, + "source": [ + "## Demonstrating Efficiency Gains\n", + "\n", + "The efficiency gain from PT-All is not a one-off coincidence --- it holds systematically across datasets. Here we run a small Monte Carlo to show that EDiD (PT-All) consistently produces smaller SEs than Callaway-Sant'Anna." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e1ea9738", + "metadata": {}, + "outputs": [], + "source": [ + "n_seeds = 10\n", + "se_edid_list = []\n", + "se_cs_list = []\n", + "\n", + "for seed in range(n_seeds):\n", + " sim_data = generate_staggered_data(n_units=200, n_periods=8,\n", + " treatment_effect=2.0,\n", + " dynamic_effects=False, seed=seed)\n", + " r_edid = EfficientDiD(pt_assumption=\"all\").fit(\n", + " sim_data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + " r_cs = CallawaySantAnna().fit(\n", + " sim_data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + " se_edid_list.append(r_edid.overall_se)\n", + " se_cs_list.append(r_cs.overall_se)\n", + "\n", + "se_edid = np.array(se_edid_list)\n", + "se_cs = np.array(se_cs_list)\n", + "\n", + "print(\"Efficiency Comparison: EDiD (PT-All) vs CallawaySantAnna\")\n", + "print(\"=\" * 55)\n", + "print(f\"{'Metric':<30} {'EDiD':>10} {'CS':>10}\")\n", + "print(\"-\" * 55)\n", + "print(f\"{'Mean SE':<30} {se_edid.mean():>10.4f} {se_cs.mean():>10.4f}\")\n", + "print(f\"{'Median SE':<30} {np.median(se_edid):>10.4f} {np.median(se_cs):>10.4f}\")\n", + "print(f\"{'Mean SE ratio (EDiD/CS)':<30} {(se_edid / se_cs).mean():>10.4f}\")\n", + "print()\n", + "print(f\"EDiD SEs are on average {(1 - (se_edid / se_cs).mean()) * 100:.1f}% \"\n", + " f\"smaller than CS SEs across {n_seeds} simulations.\")" + ] + }, + { + "cell_type": "markdown", + "id": "cb620c75", + "metadata": {}, + "source": [ + "## Event Study Aggregation\n", + "\n", + "Event study effects aggregate ATT(g,t) by relative time $e = t - g$, averaging across cohorts at each horizon. This shows how treatment effects evolve over time since adoption. Pre-treatment coefficients ($e < 0$) serve as a diagnostic for parallel trends." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f894c40a", + "metadata": {}, + "outputs": [], + "source": [ + "edid_es = EfficientDiD(pt_assumption=\"all\")\n", + "results_es = edid_es.fit(data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='event_study')\n", + "\n", + "es_df = results_es.to_dataframe(level='event_study')\n", + "es_df" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "85d6719c", + "metadata": {}, + "outputs": [], + "source": [ + "if HAS_MATPLOTLIB:\n", + " fig, ax = plt.subplots(figsize=(10, 6))\n", + " ax.errorbar(es_df['relative_period'], es_df['effect'],\n", + " yerr=[es_df['effect'] - es_df['conf_int_lower'],\n", + " es_df['conf_int_upper'] - es_df['effect']],\n", + " fmt='o-', capsize=4, color='steelblue', label='EDiD (PT-All)')\n", + " ax.axhline(y=0, color='black', linestyle='--', linewidth=0.8)\n", + " ax.axvline(x=-0.5, color='red', linestyle=':', linewidth=0.8, label='Treatment onset')\n", + " ax.set_xlabel('Relative Period (e = t - g)')\n", + " ax.set_ylabel('Effect')\n", + " ax.set_title('Efficient DiD Event Study')\n", + " ax.legend()\n", + " plt.tight_layout()\n", + " plt.show()\n", + "else:\n", + " print(\"Install matplotlib to see visualizations: pip install matplotlib\")" + ] + }, + { + "cell_type": "markdown", + "id": "cdc7d91b", + "metadata": {}, + "source": [ + "## Group-Level Aggregation\n", + "\n", + "Group aggregation averages post-treatment effects within each cohort, showing how the treatment effect varies by adoption timing." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d6cc6f6b", + "metadata": {}, + "outputs": [], + "source": [ + "grp_df = results.to_dataframe(level='group')\n", + "grp_df" + ] + }, + { + "cell_type": "markdown", + "id": "a89342da", + "metadata": {}, + "source": [ + "## Bootstrap Inference\n", + "\n", + "EDiD supports multiplier bootstrap for inference. 
The bootstrap perturbs the influence function values with random weights to obtain bootstrap distributions of all parameters.\n", + "\n", + "Three weight distributions are available:\n", + "- **Rademacher** (default): $\\pm 1$ with equal probability --- standard choice, works well in most settings\n", + "- **Mammen**: Two-point distribution that matches third moments\n", + "- **Webb**: Six-point distribution with wider support" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25449111", + "metadata": {}, + "outputs": [], + "source": [ + "# Analytical vs bootstrap inference\n", + "results_boot = EfficientDiD(pt_assumption=\"all\", n_bootstrap=499, seed=42).fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "\n", + "print(\"Analytical vs Bootstrap Inference\")\n", + "print(\"=\" * 70)\n", + "print(f\"{'Method':<20} {'ATT':>10} {'SE':>10} {'CI Lower':>12} {'CI Upper':>12}\")\n", + "print(\"-\" * 70)\n", + "print(f\"{'Analytical':<20} {results.overall_att:>10.4f} {results.overall_se:>10.4f} \"\n", + " f\"{results.overall_conf_int[0]:>12.4f} {results.overall_conf_int[1]:>12.4f}\")\n", + "print(f\"{'Bootstrap (499)':<20} {results_boot.overall_att:>10.4f} {results_boot.overall_se:>10.4f} \"\n", + " f\"{results_boot.overall_conf_int[0]:>12.4f} {results_boot.overall_conf_int[1]:>12.4f}\")\n", + "print()\n", + "\n", + "# Compare weight distributions\n", + "print(\"Bootstrap Weight Distributions\")\n", + "print(\"-\" * 45)\n", + "for wt in ['rademacher', 'mammen', 'webb']:\n", + " r = EfficientDiD(pt_assumption=\"all\", n_bootstrap=499,\n", + " bootstrap_weights=wt, seed=42).fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + " print(f\"{wt:<15} SE={r.overall_se:.4f} \"\n", + " f\"CI=[{r.overall_conf_int[0]:.4f}, {r.overall_conf_int[1]:.4f}]\")" + ] + }, + { + "cell_type": "markdown", + "id": "4ff64c97", + "metadata": {}, + 
"source": [ + "## Diagnostics: Efficient Weights and Condition Numbers\n", + "\n", + "EDiD exposes two diagnostic quantities:\n", + "\n", + "- **`efficient_weights`**: The optimal weight vector for each (g, t) target. These weights sum to 1 and show how much each (comparison group, baseline) pair contributes to the estimate.\n", + "- **`omega_condition_numbers`**: The condition number of the $\\Omega^*$ covariance matrix for each target. High condition numbers (> 100) indicate near-singular matrices where the weight estimates may be unstable." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2576e628", + "metadata": {}, + "outputs": [], + "source": [ + "if results.efficient_weights:\n", + " print(\"Efficient Weights by (g, t)\")\n", + " print(\"=\" * 55)\n", + " for (g, t), w in sorted(results.efficient_weights.items()):\n", + " print(f\" (g={int(g)}, t={int(t)}): {len(w)} weights, sum={w.sum():.4f}\")\n", + "\n", + "print()\n", + "\n", + "if results.omega_condition_numbers:\n", + " print(\"Omega* Condition Numbers\")\n", + " print(\"=\" * 55)\n", + " for (g, t), cond in sorted(results.omega_condition_numbers.items()):\n", + " flag = \" << HIGH\" if cond > 100 else \"\"\n", + " print(f\" (g={int(g)}, t={int(t)}): {cond:.2f}{flag}\")\n", + " print()\n", + " print(\"Condition numbers measure matrix stability. Values > 100 may\")\n", + " print(\"indicate near-singular covariance and less reliable weights.\")" + ] + }, + { + "cell_type": "markdown", + "id": "1593c1c2", + "metadata": {}, + "source": [ + "## Anticipation\n", + "\n", + "If treatment effects begin before the official treatment date (e.g., firms adjust behavior in anticipation of a policy), use `anticipation=k` to shift the effective treatment boundary forward by `k` periods. This reclassifies periods `e >= -k` as post-treatment." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1781e0d0", + "metadata": {}, + "outputs": [], + "source": [ + "r_no_antic = EfficientDiD(pt_assumption=\"all\").fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "\n", + "r_antic = EfficientDiD(pt_assumption=\"all\", anticipation=1).fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "\n", + "print(\"Anticipation Comparison\")\n", + "print(\"=\" * 55)\n", + "print(f\"{'Setting':<30} {'ATT':>10} {'SE':>10}\")\n", + "print(\"-\" * 55)\n", + "print(f\"{'No anticipation':<30} {r_no_antic.overall_att:>10.4f} {r_no_antic.overall_se:>10.4f}\")\n", + "print(f\"{'1-period anticipation':<30} {r_antic.overall_att:>10.4f} {r_antic.overall_se:>10.4f}\")\n", + "print()\n", + "print(\"Anticipation shifts the effective treatment boundary forward,\")\n", + "print(\"reclassifying the period before treatment as post-treatment.\")" + ] + }, + { + "cell_type": "markdown", + "id": "4e9fa28e", + "metadata": {}, + "source": [ + "## Comparison: EDiD vs Callaway-Sant'Anna vs Imputation DiD\n", + "\n", + "These three estimators address TWFE bias in staggered settings via different approaches:\n", + "\n", + "- **EfficientDiD**: Optimal EIF-based weighting across all valid comparisons\n", + "- **CallawaySantAnna**: Separate 2x2 DiD regressions, then aggregate\n", + "- **ImputationDiD**: Impute Y(0) via a fixed effects model, compute unit-level effects\n", + "\n", + "Under the DGP used here (homogeneous effects, PT holds everywhere), all three should produce similar point estimates. The key difference is in standard errors: EDiD (PT-All) should be the tightest." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb054d26", + "metadata": {}, + "outputs": [], + "source": [ + "edid_r = EfficientDiD(pt_assumption=\"all\").fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='all')\n", + "cs_r = CallawaySantAnna().fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + "imp_r = ImputationDiD().fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat')\n", + "\n", + "print(\"Estimator Comparison (True effect = 2.0)\")\n", + "print(\"=\" * 70)\n", + "print(f\"{'Estimator':<25} {'ATT':>10} {'SE':>10} {'p-value':>10} {'CI Width':>12}\")\n", + "print(\"-\" * 70)\n", + "for name, r in [(\"EfficientDiD (PT-All)\", edid_r),\n", + " (\"CallawaySantAnna\", cs_r),\n", + " (\"ImputationDiD\", imp_r)]:\n", + " ci_width = r.overall_conf_int[1] - r.overall_conf_int[0]\n", + " print(f\"{name:<25} {r.overall_att:>10.4f} {r.overall_se:>10.4f} \"\n", + " f\"{r.overall_p_value:>10.4f} {ci_width:>12.4f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1257e262", + "metadata": {}, + "outputs": [], + "source": [ + "# Side-by-side event study comparison\n", + "edid_es_r = EfficientDiD(pt_assumption=\"all\").fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='event_study')\n", + "cs_es_r = CallawaySantAnna().fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='event_study')\n", + "imp_es_r = ImputationDiD().fit(\n", + " data, outcome='outcome', unit='unit', time='period',\n", + " first_treat='first_treat', aggregate='event_study')\n", + "\n", + "edid_es_df = edid_es_r.to_dataframe(level='event_study')\n", + "cs_es_df = cs_es_r.to_dataframe(level='event_study')\n", + "imp_es_df = imp_es_r.to_dataframe(level='event_study')\n", + "\n", + "if 
HAS_MATPLOTLIB:\n", + " fig, ax = plt.subplots(figsize=(10, 6))\n", + " offset = 0.15\n", + "\n", + " ax.errorbar(edid_es_df['relative_period'] - offset, edid_es_df['effect'],\n", + " yerr=[edid_es_df['effect'] - edid_es_df['conf_int_lower'],\n", + " edid_es_df['conf_int_upper'] - edid_es_df['effect']],\n", + " fmt='o-', capsize=3, color='steelblue', label='EfficientDiD (PT-All)')\n", + " ax.errorbar(cs_es_df['relative_period'], cs_es_df['effect'],\n", + " yerr=[cs_es_df['effect'] - cs_es_df['conf_int_lower'],\n", + " cs_es_df['conf_int_upper'] - cs_es_df['effect']],\n", + " fmt='s-', capsize=3, color='darkorange', label='CallawaySantAnna')\n", + " ax.errorbar(imp_es_df['relative_period'] + offset, imp_es_df['effect'],\n", + " yerr=[imp_es_df['effect'] - imp_es_df['conf_int_lower'],\n", + " imp_es_df['conf_int_upper'] - imp_es_df['effect']],\n", + " fmt='^-', capsize=3, color='forestgreen', label='ImputationDiD')\n", + "\n", + " ax.axhline(y=0, color='black', linestyle='--', linewidth=0.8)\n", + " ax.axvline(x=-0.5, color='red', linestyle=':', linewidth=0.8)\n", + " ax.set_xlabel('Relative Period (e = t - g)')\n", + " ax.set_ylabel('Effect')\n", + " ax.set_title('Event Study Comparison: EDiD vs CS vs ImputationDiD')\n", + " ax.legend()\n", + " plt.tight_layout()\n", + " plt.show()\n", + "else:\n", + " print(\"Install matplotlib to see visualizations: pip install matplotlib\")" + ] + }, + { + "cell_type": "markdown", + "id": "ef99ee47", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "**Key takeaways:**\n", + "\n", + "1. EDiD achieves the **semiparametric efficiency bound** for ATT estimation in staggered designs\n", + "2. Under **PT-All**, EDiD exploits overidentification for tighter SEs than CS\n", + "3. Under **PT-Post**, EDiD matches CS for post-treatment ATT(g,t); pre-treatment diagnostics use a fixed baseline and may differ from CS's default varying baseline\n", + "4. 
The efficiency gain comes from optimally weighting across all valid (comparison group, baseline) pairs\n", + "5. **Event study** and **group** aggregations work just like CS\n", + "6. **Multiplier bootstrap** provides robust inference with Rademacher, Mammen, or Webb weights\n", + "7. **Condition numbers** flag potentially unstable weight matrices\n", + "8. **Anticipation** shifts the effective treatment boundary for pre-treatment effects\n", + "9. Phase 1 is **no-covariates only** --- Phase 2 will add covariate support\n", + "10. When in doubt, run both EDiD and CS --- if ATTs agree, report EDiD for tighter CIs\n", + "\n", + "**Parameter reference:**\n", + "\n", + "| Parameter | Default | Description |\n", + "|-----------|---------|-------------|\n", + "| `pt_assumption` | `\"all\"` | `\"all\"` (overidentified) or `\"post\"` (just-identified, matches CS post-treatment ATT) |\n", + "| `alpha` | `0.05` | Significance level |\n", + "| `n_bootstrap` | `0` | Number of bootstrap iterations (0 = analytical only) |\n", + "| `bootstrap_weights` | `\"rademacher\"` | Bootstrap weight distribution: `\"rademacher\"`, `\"mammen\"`, `\"webb\"` |\n", + "| `seed` | `None` | Random seed for reproducibility |\n", + "| `anticipation` | `0` | Anticipation periods |\n", + "\n", + "**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient Difference-in-Differences and Event Study Estimators.\n", + "\n", + "*See also: [Choosing an Estimator](../choosing_estimator.rst) for guidance on when to use EDiD vs other estimators.*" + ] + } + ], + "metadata": { + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/tutorials/README.md b/docs/tutorials/README.md index c26dcde0..201da9b4 100644 --- a/docs/tutorials/README.md +++ b/docs/tutorials/README.md @@ -43,6 +43,14 @@ Testing assumptions and diagnostics: - Event study as a diagnostic - What to do if parallel trends fails +### 15. 
Efficient DiD (`15_efficient_did.ipynb`) +Efficient Difference-in-Differences (Chen, Sant'Anna & Xie 2025): +- Optimal weighting across comparison groups and baselines +- PT-All vs PT-Post assumptions +- Efficiency gains vs Callaway-Sant'Anna +- Event study and group-level aggregation +- Bootstrap inference and diagnostics + ## Running the Notebooks 1. Install diff-diff with dependencies: