[halo-pr-prep] Overview: HALO PyHealth Integration Tasks #17

@jalengg

HALO PyHealth Integration - Task Overview

This issue tracks the integration of the HALO synthetic data generator with PyHealth standards. The work is broken down into 9 sequential tasks.

Goal

Integrate HALO to follow PyHealth conventions:

  • Use `BaseModel` for the model class
  • Use task functions instead of custom datasets
  • Use `NestedSequenceProcessor` for data handling
  • Follow PyHealth example patterns
  • Maintain HALO's custom training loop

Architecture

Hybrid Integration Approach:

  • ✅ HALO inherits from `BaseModel` for compatibility
  • ✅ Uses PyHealth's task/processor infrastructure for data loading
  • ✅ Preserves custom training loop (doesn't use Trainer)
  • ✅ Keeps proven HALO transformer implementation unchanged

Tasks

Phase 1: Merge with Master

Phase 2: Create Task Function

Phase 3: Refactor HALO Model

Phase 4: Update Examples

Phase 5: Cleanup

Phase 6: Testing

Phase 7: Final Verification

Dependencies

```
Task 1 (merge) → Task 2 (task function) → Task 3 (refactor model)

Task 4 (training example)

Task 5 (generation example)

Task 6 (cleanup)

Task 8 (integration tests) ←──────────── Task 7 (docs)

Task 9 (final verification)
```
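The ordering above can be checked mechanically. The following is a minimal sketch using Python's standard `graphlib`; it encodes only the arrows that are visible in the diagram (1 → 2 → 3 and 7 → 8) and treats every other task as unconstrained, which is an assumption. The names `DEPENDS_ON` and `execution_order` are hypothetical and not part of the repository.

```python
from graphlib import TopologicalSorter

# Map each task number to the set of tasks it depends on.
# Assumption: only the arrows visible in the diagram are encoded here;
# tasks without a visible arrow are left unconstrained.
DEPENDS_ON = {
    1: set(),   # merge
    2: {1},     # task function
    3: {2},     # refactor model
    4: set(),   # training example
    5: set(),   # generation example
    6: set(),   # cleanup
    7: set(),   # docs
    8: {7},     # integration tests
    9: set(),   # final verification
}


def execution_order(depends_on):
    """Return one valid order in which the tasks can be worked on."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(execution_order(DEPENDS_ON))
```

Any additional dependencies can be added to `DEPENDS_ON`; `TopologicalSorter` raises `graphlib.CycleError` if they ever form a loop.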

Documentation

  • Implementation Plan: `docs/plans/2026-02-16-halo-pyhealth-integration.md`
  • Design Document: `docs/plans/2026-02-16-halo-pyhealth-integration-design.md`
  • Branch: `halo-pr-528`

Current Status

⚠️ Task 1 has critical issues. The merge is incomplete:

  • Missing 9 processor files, including `NestedSequenceProcessor`
  • Broken imports in `pyhealth/processors/__init__.py`
  • Must be fixed before proceeding to Task 2
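One way to confirm the merge is fixed before starting Task 2 is a small import smoke check. This is a sketch, not part of the repository: the helper name `check_symbol` is hypothetical, and it relies only on the standard `importlib` module. It assumes the processor should be importable from the `pyhealth.processors` package.

```python
import importlib


def check_symbol(module_name: str, symbol: str) -> tuple[bool, str]:
    """Return (ok, message) for whether `symbol` can be loaded from `module_name`."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:
        # A broken package __init__ can raise more than ImportError,
        # so any exception is reported as an import failure.
        return False, f"cannot import {module_name}: {exc!r}"
    if not hasattr(module, symbol):
        return False, f"{module_name} imported, but {symbol} is missing"
    return True, f"{module_name}.{symbol} is available"


if __name__ == "__main__":
    ok, message = check_symbol("pyhealth.processors", "NestedSequenceProcessor")
    print(("OK: " if ok else "FAIL: ") + message)
```

Run on the `halo-pr-528` branch, a `FAIL` line would indicate that the merge still needs attention.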

Success Criteria

  • HALO inherits from `BaseModel`
  • Task function registered in `pyhealth/tasks/__init__.py`
  • Uses `NestedSequenceProcessor` from master
  • Examples run without errors
  • Old `HALO_MIMIC3Dataset` removed
  • All tests pass
  • Ready for PR to master

Labels

All tasks are tagged with `halo-pr-prep` for tracking.
