Add DGP functions to prep.py for all supported DiD designs #80
Consolidate Data Generating Process functions from tutorials and tests into diff_diff/prep.py as reusable library utilities:
- generate_staggered_data(): Staggered adoption DiD (CallawaySantAnna, SunAbraham)
- generate_factor_data(): Factor model data (TROP, SyntheticDiD)
- generate_ddd_data(): Triple Difference (DDD) designs
- generate_panel_data(): Panel data with optional parallel trends violations
- generate_event_study_data(): Event study with simultaneous treatment

Changes:
- Add 5 new DGP functions to diff_diff/prep.py with full documentation
- Export new functions from diff_diff/__init__.py
- Add 33 tests covering all new functions in tests/test_prep.py
- Update test files to use library functions where compatible
- Update tutorials 02, 04, 07, 08, 10 to import from library
- Fix pre-existing API bug in tutorial 07 (show_mdv -> mdv parameter)

Users can now generate synthetic data via:
from diff_diff import generate_staggered_data, generate_factor_data, ...

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
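For readers unfamiliar with what a staggered-adoption DGP produces, the following is a minimal sketch using only the Python standard library. It is not the diff_diff implementation: the function name `simulate_staggered_panel`, its parameters, and the column names are hypothetical and chosen only for illustration. It assumes a simple additive model (unit effect + linear time trend + constant treatment effect + Gaussian noise).

```python
import random


def simulate_staggered_panel(n_units=60, n_periods=8, cohorts=(0, 3, 5),
                             effect=2.0, seed=0):
    """Hypothetical staggered-adoption DGP (illustration only).

    Each unit gets a first-treatment period from `cohorts`;
    0 means the unit is never treated.
    """
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        # Round-robin assignment keeps the cohorts balanced.
        first_treat = cohorts[unit % len(cohorts)]
        unit_effect = rng.gauss(0.0, 1.0)
        for period in range(n_periods):
            # A unit is treated from its first-treatment period onward.
            treated = int(first_treat > 0 and period >= first_treat)
            outcome = (unit_effect
                       + 0.5 * period          # common time trend
                       + effect * treated      # constant treatment effect
                       + rng.gauss(0.0, 0.5))  # idiosyncratic noise
            rows.append({
                "unit": unit,
                "period": period,
                "first_treat": first_treat,
                "treated": treated,
                "outcome": outcome,
            })
    return rows


panel = simulate_staggered_panel()
```

The result is a long-format panel with one row per unit and period, which is the shape that staggered-adoption estimators typically expect. The actual signatures and defaults of the library functions are documented in diff_diff/prep.py.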
Code Review: PR #80 - Add DGP functions to prep.py for all supported DiD designs
Author: igerber

Executive Summary
This PR consolidates Data Generating Process (DGP) functions from tutorials and tests into the main library (

Part 1: Methodology Review
Statistical Correctness
All 5 new DGP functions implement correct data generating processes:
Bug Fix Validation
The PR fixes a legitimate API bug in tutorial 07 where

Part 2: Issues Found
Critical Issues
None.
Medium Issues
None.
Minor Issues
Part 3: Security Assessment
No security issues identified. The new functions:

Part 4: Documentation Assessment
Excellent documentation quality:
Minor gap: The CLAUDE.md file was not updated to document the new functions in the Module Structure section. However, since the functions follow the existing pattern in

Part 5: Performance Assessment
No performance concerns:
Part 6: Maintainability Assessment
Positive:
Code organization:
Recommendations
Must Fix (before merge)
None.
Should Fix
Nice to Have
Final Assessment
Overall Verdict: Approved

The PR achieves its goal of consolidating DGP functions into a reusable library API. The implementation is statistically correct, well-documented, and thoroughly tested. The tutorial updates reduce code duplication and the bug fix in tutorial 07 is legitimate. All 236 tests pass (94 in test_prep.py + 142 in related test files).

Review generated by Claude Code
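One way to sanity-check the kind of statistical correctness assessed in the review is to simulate data with a known effect and confirm that a basic estimator recovers it. The sketch below is an assumption-laden illustration using only the standard library, not a test from this PR: `simulate_two_group_panel` and `did_estimate` are hypothetical helpers, and the DGP is a plain two-group, two-period design with an optional parallel trends violation.

```python
import random
from statistics import mean


def simulate_two_group_panel(n_units=400, effect=3.0,
                             trend_violation=0.0, seed=1):
    """Hypothetical 2-group, 2-period DGP (illustration only)."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        treated_group = unit % 2  # alternate control / treated units
        unit_effect = rng.gauss(0.0, 1.0)
        for post in (0, 1):
            outcome = (unit_effect
                       + 1.0 * post                              # common trend
                       + trend_violation * treated_group * post  # extra trend, treated only
                       + effect * treated_group * post           # true treatment effect
                       + rng.gauss(0.0, 0.5))
            rows.append((treated_group, post, outcome))
    return rows


def did_estimate(rows):
    """Classic 2x2 difference-in-differences of cell means."""
    def cell(group, period):
        return mean(y for g, p, y in rows if g == group and p == period)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))


# With parallel trends, the estimate should sit close to the true effect (3.0).
estimate = did_estimate(simulate_two_group_panel())

# With a violation of size 1.0, the estimate absorbs the differential trend.
biased = did_estimate(simulate_two_group_panel(trend_violation=1.0))
```

Because both calls share the same seed, the two estimates differ by the size of the violation, which shows how a differential trend feeds directly into the DiD estimate.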
- Add new DGP functions to CLAUDE.md Module Structure section
- Restore trailing newlines to modified notebook files
- Add RuntimeWarnings investigation items to TODO.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>