forked from sunlabuiuc/PyHealth
Closed
Description
HALO PyHealth Integration - Task Overview
This issue tracks the integration of the HALO synthetic data generator with PyHealth standards. The work is broken down into 9 sequential tasks.
Goal
Integrate HALO to follow PyHealth conventions:
- Use `BaseModel` for the model class
- Use task functions instead of custom datasets
- Use `NestedSequenceProcessor` for data handling
- Follow PyHealth example patterns
- Maintain HALO's custom training loop
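As a rough illustration of the "nested sequence" data shape these goals refer to (a patient is a list of visits, and each visit is a list of codes), the sketch below encodes that structure as a padded grid of integer indices. It is a plain-Python stand-in, not PyHealth's `NestedSequenceProcessor`; the helper names `build_vocab` and `encode`, the `PAD` index, and the sample codes are all assumptions made for illustration.

```python
from typing import Dict, List

PAD = 0  # index reserved for padding (an assumption of this sketch)


def build_vocab(patients: List[List[List[str]]]) -> Dict[str, int]:
    """Assign each distinct code an integer index, starting at 1."""
    vocab: Dict[str, int] = {}
    for visits in patients:
        for visit in visits:
            for code in visit:
                vocab.setdefault(code, len(vocab) + 1)
    return vocab


def encode(
    visits: List[List[str]],
    vocab: Dict[str, int],
    max_visits: int,
    max_codes: int,
) -> List[List[int]]:
    """Encode one patient as a max_visits x max_codes grid of indices.

    Visits and codes beyond the limits are truncated; empty slots hold PAD.
    """
    grid = [[PAD] * max_codes for _ in range(max_visits)]
    for i, visit in enumerate(visits[:max_visits]):
        for j, code in enumerate(visit[:max_codes]):
            grid[i][j] = vocab[code]
    return grid


# Two hypothetical patients: visits -> codes.
patients = [
    [["250.00", "401.9"], ["428.0"]],
    [["401.9"]],
]
vocab = build_vocab(patients)
encoded = encode(patients[0], vocab, max_visits=3, max_codes=2)
print(encoded)  # [[1, 2], [3, 0], [0, 0]]
```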
Architecture
Hybrid Integration Approach:
- ✅ HALO inherits from `BaseModel` for compatibility
- ✅ Uses PyHealth's task/processor infrastructure for data loading
- ✅ Preserves HALO's custom training loop (does not use PyHealth's `Trainer`)
- ✅ Keeps proven HALO transformer implementation unchanged
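The shape of this hybrid approach can be sketched without any PyHealth or PyTorch dependency. In the toy below, `BaseModelStandIn` is a hypothetical placeholder for PyHealth's `BaseModel`, and `HALOSketch` replaces the transformer with a single scalar parameter. Only the structure is meant to carry over: the model subclasses the shared base class, while training is driven by a hand-written loop rather than a generic trainer.

```python
from typing import Dict, List


class BaseModelStandIn:
    """Hypothetical stand-in for PyHealth's BaseModel (not the real class)."""

    def __init__(self, dataset=None):
        self.dataset = dataset

    def forward(self, **batch) -> Dict[str, float]:
        raise NotImplementedError


class HALOSketch(BaseModelStandIn):
    """Toy model: one scalar parameter fitted to the mean of the inputs."""

    def __init__(self, dataset=None):
        super().__init__(dataset)
        self.weight = 0.0
        self._grad = 0.0

    def forward(self, values: List[float]) -> Dict[str, float]:
        errors = [v - self.weight for v in values]
        # Mean squared error and its gradient with respect to self.weight.
        loss = sum(e * e for e in errors) / len(errors)
        self._grad = -2.0 * sum(errors) / len(errors)
        return {"loss": loss}

    def step(self, lr: float) -> None:
        self.weight -= lr * self._grad


def custom_train_loop(model: HALOSketch, batches, epochs: int, lr: float):
    """Hand-written loop, standing in for HALO's own training procedure."""
    losses = []
    for _ in range(epochs):
        for batch in batches:
            out = model.forward(**batch)
            model.step(lr)
            losses.append(out["loss"])
    return losses


model = HALOSketch()
losses = custom_train_loop(model, [{"values": [1.0, 2.0, 3.0]}], epochs=20, lr=0.1)
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```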
Tasks
Phase 1: Merge with Master
- #8 - Task 1: Complete upstream/master merge ⚠️ CRITICAL
Phase 2: Create Task Function
- #9 - Task 2: Create halo_generation task function
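A minimal sketch of what a task function of this kind might look like, assuming the convention that a task function maps one patient record to a list of sample dictionaries. The `Patient` and `Visit` dataclasses, the `min_visits` filter, and the output keys are hypothetical and do not reflect the actual `halo_generation` signature or PyHealth's data classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Visit:
    """Hypothetical visit record holding the codes observed in one visit."""
    codes: List[str]


@dataclass
class Patient:
    """Hypothetical patient record; not PyHealth's Patient class."""
    patient_id: str
    visits: List[Visit] = field(default_factory=list)


def halo_generation_sketch(patient: Patient, min_visits: int = 2) -> List[Dict]:
    """Turn one patient into zero or one nested-sequence sample.

    Empty visits are dropped; patients with fewer than `min_visits`
    non-empty visits yield no sample (an assumed filtering rule).
    """
    visits = [v.codes for v in patient.visits if v.codes]
    if len(visits) < min_visits:
        return []
    return [{"patient_id": patient.patient_id, "visits": visits}]


patient = Patient("p1", [Visit(["250.00", "401.9"]), Visit([]), Visit(["428.0"])])
samples = halo_generation_sketch(patient)
print(samples)
```

Returning a list rather than a single dictionary lets the same function both filter out patients and, if needed later, emit several samples per patient.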
Phase 3: Refactor HALO Model
- #10 - Task 3: Refactor HALO model to inherit BaseModel
Phase 4: Update Examples
- #11 - Task 4: Update training example script
- #12 - Task 5: Update generation example script
Phase 5: Cleanup
- #13 - Task 6: Remove old HALO_MIMIC3Dataset class
- #14 - Task 7: Update docstrings and documentation
Phase 6: Testing
- #15 - Task 8: Create end-to-end integration test
Phase 7: Final Verification
- #16 - Task 9: Run all tests and final verification
Dependencies
```
Task 1 (merge) → Task 2 (task function) → Task 3 (refactor model)
↓
Task 4 (training example)
↓
Task 5 (generation example)
↓
Task 6 (cleanup)
↓
Task 8 (integration tests) ←──────────── Task 7 (docs)
↓
Task 9 (final verification)
```
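The diagram can also be written down as a dependency map and checked mechanically. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to derive one valid execution order. It assumes Task 7 (docs) has no prerequisite of its own, since the diagram only shows it feeding into Task 8.

```python
from graphlib import TopologicalSorter

# Each task number maps to the set of tasks it depends on.
# Assumption: Task 7 has no predecessor; the diagram only shows 7 -> 8.
deps = {
    2: {1},
    3: {2},
    4: {3},
    5: {4},
    6: {5},
    7: set(),
    8: {6, 7},
    9: {8},
}

# static_order() raises graphlib.CycleError if the map contains a cycle.
order = list(TopologicalSorter(deps).static_order())
print(" -> ".join(f"Task {n}" for n in order))
```

Because Task 7 is independent of Tasks 1 through 6 under this assumption, its exact position in the printed order may vary; it only has to precede Task 8.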
Documentation
- Implementation Plan: `docs/plans/2026-02-16-halo-pyhealth-integration.md`
- Design Document: `docs/plans/2026-02-16-halo-pyhealth-integration-design.md`
- Branch: `halo-pr-528`
Current Status
- The branch is missing 9 processor files, including `NestedSequenceProcessor`
- Imports in `pyhealth/processors/__init__.py` are broken
- Both problems must be fixed before proceeding to Task 2
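One way to confirm that the import problems are resolved is a small smoke check that tries to import each required symbol and reports what fails. The helper `find_missing` below is a hypothetical sketch. The `REQUIRED` list takes `NestedSequenceProcessor` and `BaseModel` from this issue, and the module paths `pyhealth.processors` and `pyhealth.models` are assumed locations for them.

```python
import importlib
from typing import Iterable, List, Tuple


def find_missing(symbols: Iterable[Tuple[str, str]]) -> List[str]:
    """Return a description of every (module, attribute) pair that fails to load."""
    missing = []
    for module_name, attr in symbols:
        try:
            module = importlib.import_module(module_name)
        except Exception as exc:
            # A broken __init__.py can raise more than ImportError,
            # so any exception is recorded together with its type.
            missing.append(f"{module_name} ({type(exc).__name__})")
            continue
        if not hasattr(module, attr):
            missing.append(f"{module_name}.{attr}")
    return missing


# Symbols named in this issue; the module paths are assumptions.
REQUIRED = [
    ("pyhealth.processors", "NestedSequenceProcessor"),
    ("pyhealth.models", "BaseModel"),
]

if __name__ == "__main__":
    problems = find_missing(REQUIRED)
    print(problems or "all required symbols import cleanly")
```

Run on the branch before and after the merge; an empty result suggests the processor imports are in place, although it does not exercise the classes themselves.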
Success Criteria
- HALO inherits from `BaseModel`
- Task function registered in `pyhealth/tasks/__init__.py`
- Uses `NestedSequenceProcessor` from master
- Examples run without errors
- Old `HALO_MIMIC3Dataset` removed
- All tests pass
- Ready for PR to master
Labels
All tasks are tagged with `halo-pr-prep` for tracking.