Commit a4b3881

igerber and claude committed
Address AI review: tighten fixture guard, derive truth, add provenance
P2: Assert exact cohort counts (656/252/176/163/65) and wave support since the CSV fixture is deterministic; approximate tolerances could mask fixture drift.

P3: Derive _TRUE_ES_AVG_COMPUSTAT programmatically from DGP parameters instead of hard-coding, so changes to the DGP definition propagate automatically.

P3: Add tests/data/README.md documenting the HRS fixture source, sample selection steps, and expected counts for future audit/rebuild.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent cbedde9 commit a4b3881

2 files changed: +66 -12 lines changed


tests/data/README.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# Test Data Fixtures
+
+## hrs_edid_validation.csv
+
+**Source:** Dobkin, C., Finkelstein, A., Kluender, R., & Notowidigdo, M. J. (2018).
+"The Economic Consequences of Hospital Admissions." *American Economic Review*, 108(2), 308-352.
+Replication kit: https://www.openicpsr.org/openicpsr/project/116186/version/V1/view
+
+**Sample selection:** Follows Sun & Abraham (2021), as used by Chen, Sant'Anna & Xie (2025)
+Section 6:
+
+1. Read `HRS_long.dta` from the Dobkin et al. replication kit
+2. Keep waves 7-11, retain only individuals present in all 5 waves
+3. Filter to ever-hospitalized individuals with `first_hosp >= 8`
+4. Filter to ages 50-59 at hospitalization (`age_hosp`)
+5. Drop wave 11 (no valid comparison group)
+6. Recode `first_hosp == 11` as never-treated (`inf`)
+
+**Expected counts:**
+
+| Column | Values |
+|--------|--------|
+| Total individuals | 656 |
+| Waves | 7, 8, 9, 10 |
+| Rows | 2,624 |
+| G=8 | 252 |
+| G=9 | 176 |
+| G=10 | 163 |
+| G=inf | 65 |
+
+**Columns:** `unit` (hhidpn), `time` (wave), `outcome` (oop_spend, 2005 dollars), `first_treat` (first_hosp)
+
+**Regeneration:** Requires the Dobkin et al. replication kit (`.gitignore`d as `replication_data/`).
+The extraction logic is documented in the plan file and was executed as a one-time preprocessing step.
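
The six sample-selection steps in the README could be sketched in pandas roughly as follows. This is an illustrative reconstruction, not the extraction script referenced in the plan file: `select_hrs_sample` is a hypothetical helper, the input column names (`hhidpn`, `wave`, `oop_spend`, `first_hosp`, `age_hosp`) are taken from the README, and the sketch assumes that "ever-hospitalized" corresponds to a non-missing `first_hosp` and that `age_hosp` is constant within an individual.

```python
import numpy as np
import pandas as pd


def select_hrs_sample(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply README steps 2-6 to a long-format HRS frame (hypothetical helper)."""
    # Step 2: keep waves 7-11, then retain individuals observed in all 5 waves.
    df = raw[raw["wave"].between(7, 11)].copy()
    n_waves = df.groupby("hhidpn")["wave"].transform("nunique")
    df = df[n_waves == 5]

    # Step 3: ever hospitalized (assumed: non-missing first_hosp), first in wave 8+.
    df = df[df["first_hosp"].notna() & (df["first_hosp"] >= 8)]

    # Step 4: aged 50-59 at hospitalization.
    df = df[df["age_hosp"].between(50, 59)]

    # Step 5: drop wave 11, which has no valid comparison group.
    df = df[df["wave"] != 11]

    # Step 6: the wave-11 cohort is recoded as never-treated (inf).
    first_treat = df["first_hosp"].astype(float).where(df["first_hosp"] != 11, np.inf)

    # Rename to the fixture's column layout.
    return pd.DataFrame(
        {
            "unit": df["hhidpn"].to_numpy(),
            "time": df["wave"].to_numpy(),
            "outcome": df["oop_spend"].to_numpy(),
            "first_treat": first_treat.to_numpy(),
        }
    )
```

With the replication kit available, the input frame would presumably come from `pd.read_stata` on `HRS_long.dta`. The expected counts in the README are internally consistent: 252 + 176 + 163 + 65 = 656 individuals, and 656 individuals over 4 waves gives 2,624 rows.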

tests/test_efficient_did_validation.py

Lines changed: 32 additions & 12 deletions
@@ -142,10 +142,28 @@ def _compute_es_avg(result):
     return np.mean(list(es.values()))


-# Ground truth ES_avg for Compustat DGP (see plan for derivation)
-_TRUE_ES_AVG_COMPUSTAT = np.mean(
-    [0.1235, 0.247, 0.3705, 0.494, 0.770, 0.924, 1.078]
-)
+# Ground truth derived from DGP parameters (not hard-coded)
+_ATT_COEFS = {5: 0.154, 8: 0.093}  # ATT(g,t) = coef * (t - g + 1) for t >= g
+_N_PERIODS = 11
+
+
+def _true_es_avg_from_dgp():
+    """Derive ES_avg from DGP treatment effect parameters."""
+    max_e = {g: _N_PERIODS - g for g in _ATT_COEFS}
+    all_e = range(0, max(max_e.values()) + 1)
+    es_values = []
+    for e in all_e:
+        contributing = [
+            coef * (e + 1)
+            for g, coef in _ATT_COEFS.items()
+            if e <= max_e[g]
+        ]
+        if contributing:
+            es_values.append(np.mean(contributing))
+    return np.mean(es_values)
+
+
+_TRUE_ES_AVG_COMPUSTAT = _true_es_avg_from_dgp()


 def _true_overall_att_compustat():
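
As a sanity check on the hunk above, the derived value can be compared against the list that was previously hard-coded. The sketch below re-implements the same logic under public names (`ATT_COEFS`, `N_PERIODS`, `es_by_event_time` are illustrative names, not identifiers from the test module) and keeps the per-event-time values so they can be lined up with the old literals. Like the committed function, it averages cohorts with equal weight at each event time.

```python
import numpy as np

# DGP parameters copied from the diff: ATT(g, t) = coef * (t - g + 1) for t >= g.
ATT_COEFS = {5: 0.154, 8: 0.093}
N_PERIODS = 11

# Values that were hard-coded before this commit.
OLD_HARD_CODED = [0.1235, 0.247, 0.3705, 0.494, 0.770, 0.924, 1.078]


def es_by_event_time() -> dict:
    """Equal-weight average of ATT across cohorts observed at each event time e."""
    max_e = {g: N_PERIODS - g for g in ATT_COEFS}
    es = {}
    for e in range(max(max_e.values()) + 1):
        effects = [coef * (e + 1) for g, coef in ATT_COEFS.items() if e <= max_e[g]]
        es[e] = float(np.mean(effects))
    return es


es = es_by_event_time()
es_avg = float(np.mean(list(es.values())))

# The derived per-event-time effects reproduce the old literals.
assert np.allclose(list(es.values()), OLD_HARD_CODED)
```

Event times 0 to 3 average both cohorts (0.1235 per step), while event times 4 to 6 are only reached by the g = 5 cohort (0.154 per step), which is why the old list changes slope after its fourth entry.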
@@ -226,24 +244,26 @@ class TestHRSReplication:
     """Validate EDiD against Table 6 of Chen, Sant'Anna & Xie (2025)."""

     def test_sample_selection_yields_expected_counts(self, hrs_data):
+        # Fixture is deterministic — assert exact counts
         n_units = hrs_data["unit"].nunique()
-        assert abs(n_units - 652) <= 10, f"Expected ~652 units, got {n_units}"
+        assert n_units == 656, f"Expected 656 units, got {n_units}"

         groups = hrs_data.groupby("unit")["first_treat"].first()

-        # Check 4 groups exist
         finite_groups = sorted(g for g in groups.unique() if np.isfinite(g))
         assert finite_groups == [8, 9, 10], f"Expected groups [8,9,10], got {finite_groups}"
         assert any(np.isinf(g) for g in groups.unique()), "Missing never-treated group"

-        # Check approximate sizes
-        for g, expected in [(8, 252), (9, 176), (10, 163)]:
+        expected_sizes = {8: 252, 9: 176, 10: 163}
+        for g, expected in expected_sizes.items():
             actual = (groups == g).sum()
-            assert abs(actual - expected) <= 15, (
-                f"G={g}: expected ~{expected}, got {actual}"
-            )
+            assert actual == expected, f"G={g}: expected {expected}, got {actual}"
         n_inf = groups.apply(np.isinf).sum()
-        assert abs(n_inf - 65) <= 10, f"G=inf: expected ~65, got {n_inf}"
+        assert n_inf == 65, f"G=inf: expected 65, got {n_inf}"
+
+        assert sorted(hrs_data["time"].unique()) == [7, 8, 9, 10], (
+            f"Expected waves [7,8,9,10], got {sorted(hrs_data['time'].unique())}"
+        )

     def test_group_time_effects_match_table6(self, edid_hrs_result):
         for (g, t), (expected_effect, _) in TABLE6_EDID.items():
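
To illustrate why the commit message says approximate tolerances could mask fixture drift, the small sketch below contrasts the removed unit-count check with the new exact one. The two helper functions are hypothetical stand-ins for the old and new assertions; the targets and tolerance (652 ± 10 versus exactly 656) are taken from the diff.

```python
def tolerant_check(n_units: int, target: int = 652, tol: int = 10) -> bool:
    """Stand-in for the removed assertion: pass if within `tol` of `target`."""
    return abs(n_units - target) <= tol


def exact_check(n_units: int, target: int = 656) -> bool:
    """Stand-in for the new assertion: pass only on the exact fixture count."""
    return n_units == target


# 656 is the real fixture size; 660 and 645 mimic a fixture that has drifted.
for n_units in (656, 660, 645):
    print(n_units, tolerant_check(n_units), exact_check(n_units))
```

A fixture that drifted to 660 or 645 individuals would still satisfy the old tolerance check, whereas the exact check fails on anything other than 656.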
