forked from sunlabuiuc/PyHealth
Open
Description
Tracking issue for bringing the MedGAN integration up to PyHealth 2.0 standards. It follows the same 9-task pattern used for previous model integrations.
Background
MedGAN is a GAN-based EHR synthetic data generator using:
- A two-phase training process: pretrain autoencoder first, then adversarial training
- Standard BCE loss (NOT WGAN — discriminator uses sigmoid + BCE)
- Minibatch averaging in the discriminator for training stability
- `multi_hot` schema: flat ICD code lists → binary vectors
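To make the multi-hot encoding concrete, here is a minimal pure-Python sketch of turning flat ICD code lists into binary vectors. The helper names (`build_vocab`, `multi_hot`) and the sample codes are illustrative assumptions, not the names or data used in `medgan.py`.

```python
def build_vocab(patients):
    """Map each distinct code to a column index (sorted for determinism)."""
    codes = sorted({code for patient in patients for code in patient})
    return {code: idx for idx, code in enumerate(codes)}


def multi_hot(codes, vocab):
    """Return a binary vector with a 1 for every code present in `codes`."""
    vec = [0] * len(vocab)
    for code in codes:
        if code in vocab:  # unknown codes are ignored in this sketch
            vec[vocab[code]] = 1
    return vec


# Illustrative input: one flat list of ICD codes per patient.
patients = [
    ["250.00", "401.9"],
    ["401.9"],
    ["428.0", "250.00", "250.00"],  # repeated codes still map to a single 1
]
vocab = build_vocab(patients)
matrix = [multi_hot(p, vocab) for p in patients]
```

Repeated codes collapse to a single 1, so the vector records presence rather than counts.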
Source: pyhealth/models/generators/medgan.py (ported from corgan-medgan-port branch)
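As a rough illustration of minibatch averaging, the sketch below appends the per-feature batch mean to every sample before it would reach the discriminator. It is written in plain Python for readability and assumes the usual formulation (sample concatenated with the batch mean); the function name `minibatch_average` is hypothetical and the code in `medgan.py` may differ.

```python
def minibatch_average(batch):
    """Concatenate each sample with the feature-wise mean of the whole batch."""
    n = len(batch)
    dim = len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(dim)]
    # Each discriminator input becomes [sample features, batch-mean features].
    return [row + mean for row in batch]


batch = [
    [1.0, 0.0],
    [0.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
]
augmented = minibatch_average(batch)
```

Because every row carries the same batch statistics, the discriminator input width doubles and a batch with collapsed diversity becomes easier to spot.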
Key differences from CorGAN
| | CorGAN | MedGAN |
|---|---|---|
| GAN loss | Wasserstein (WGAN) | BCE (standard) |
| Discriminator | WGAN critic, no sigmoid | Sigmoid + minibatch averaging |
| Training | Single phase | Two-phase: AE pretrain then GAN |
| Autoencoder | No AE | Linear AE for latent space |
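The loss row of the table can be sketched as follows. These are the textbook formulations of the two discriminator objectives, written in plain Python under the assumption that both models use the standard forms; the function names are illustrative and do not come from either implementation.

```python
import math


def sigmoid(x):
    """Squash a raw discriminator output into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce_discriminator_loss(real_probs, fake_probs, eps=1e-12):
    """Standard GAN loss: real samples should score 1, fake samples 0."""
    real_term = sum(-math.log(p + eps) for p in real_probs) / len(real_probs)
    fake_term = sum(-math.log(1.0 - p + eps) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


def wasserstein_critic_loss(real_scores, fake_scores):
    """WGAN critic loss on raw, unbounded scores (no sigmoid applied)."""
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


# MedGAN-style: raw outputs pass through a sigmoid before BCE.
bce = bce_discriminator_loss([sigmoid(0.0)], [sigmoid(0.0)])

# CorGAN-style: the critic's raw scores are used directly.
wgan = wasserstein_critic_loss([2.0, 1.0], [-1.0, 0.0])
```

An undecided discriminator (probability 0.5 on everything) yields a BCE loss of 2·ln 2, while the critic loss is simply the gap between mean fake and mean real scores.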
Tasks
- T1: Merge upstream + copy `medgan.py` into worktree
- T2: Create `MedGANGenerationMIMIC3(BaseTask)` task function
- T3: Refactor `MedGAN` to remove DummyWrapper, add `train_model()`/`synthesize_dataset()`
- T4: Update training example
- T5: Update generation example
- T6: Check/remove bespoke dataset class (likely no-op)
- T7: Update docstrings
- T8: Integration tests (8 MECE in-memory + 4 MIMIC-III skip-gracefully)
- T9: Final verification + push to `jalengg/PyHealth:medgan-pr-integration`
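For the skip-gracefully tests in T8, one possible shape is sketched below using only the standard library's `unittest`. The `MIMIC3_ROOT` environment variable, the `mimic3_available` helper, and the test class are hypothetical placeholders; the real tests may locate the data differently.

```python
import os
import unittest

# Hypothetical: path to a local MIMIC-III copy, supplied via the environment.
MIMIC3_ROOT = os.environ.get("MIMIC3_ROOT", "")


def mimic3_available(root=MIMIC3_ROOT):
    """True only when a non-empty path is given and it is a directory."""
    return bool(root) and os.path.isdir(root)


class TestMedGANMimic3(unittest.TestCase):
    @unittest.skipUnless(mimic3_available(), "MIMIC-III data not found, skipping")
    def test_data_directory_exists(self):
        # Placeholder body; a real test would build the dataset and run MedGAN.
        self.assertTrue(os.path.isdir(MIMIC3_ROOT))
```

With this pattern the in-memory tests always run, while the MIMIC-III tests are reported as skipped rather than failed on machines without the data.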
Branch
medgan-pr-integration (worktree at ~/.config/superpowers/worktrees/PyHealth/medgan-pr-integration)