feat: Ghost State Hydrator - Remote Investigation State Hydration #103

Merged
bordumb merged 11 commits into main from feat/fn-49-ghost-state-hydrator
Jan 29, 2026

Conversation

@bordumb
Owner

@bordumb bordumb commented Jan 29, 2026

Summary

Implements epic fn-49 "Ghost State Hydrator" - enabling engineers to hydrate production investigation state into local JupyterLab notebooks for debugging.

  • Add InvestigationSnapshot model for state serialization with schema versioning
  • Integrate snapshot capture into Temporal workflow at each checkpoint
  • Build streaming snapshot download API with tenant auth
  • Implement SDK load_snapshot() deserializer with lazy DataFrame loading
  • Add JupyterLab magic commands (%dataing hydrate, %dataing diff, %dataing list)
  • Add snapshot diff comparison with markdown/HTML export

Test plan

  • Unit tests for snapshot model serialization
  • Unit tests for SDK deserializer (32 tests)
  • Unit tests for diff comparison (28 tests)
  • Unit tests for magic commands (10 tests, IPython-dependent)
  • All 70 SDK tests pass

🤖 Generated with Claude Code

bordumb and others added 7 commits January 28, 2026 23:18
- Add InvestigationSnapshot Pydantic model with version field for forward compatibility
- Add EnvironmentMetadata to capture Python version, platform, package versions
- Add SnapshotCheckpoint enum (start, hypothesis_generated, evidence_collected, complete, failed)
- Add LineageSnapshot for capturing upstream/downstream dependencies
- Add SampleDataReference for externally stored large datasets
- Include size estimation and oversized detection methods
- Add 18 unit tests covering all models and methods

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
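A minimal sketch of the snapshot model described in this commit, for readers who want a concrete picture. It uses stdlib dataclasses as a stand-in for the Pydantic model the commit adds, so it runs without dependencies. The field names other than `version`, the `SCHEMA_VERSION` value, and the `MAX_INLINE_BYTES` limit are assumptions for illustration, not the PR's actual definitions.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any, Dict, List

SCHEMA_VERSION = "1"          # hypothetical schema version string
MAX_INLINE_BYTES = 1_000_000  # hypothetical threshold for "oversized"


class SnapshotCheckpoint(str, Enum):
    """Checkpoint names as listed in the commit message."""
    START = "start"
    HYPOTHESIS_GENERATED = "hypothesis_generated"
    EVIDENCE_COLLECTED = "evidence_collected"
    COMPLETE = "complete"
    FAILED = "failed"


@dataclass
class InvestigationSnapshot:
    investigation_id: str                      # assumed field name
    checkpoint: SnapshotCheckpoint
    version: str = SCHEMA_VERSION              # lets future readers detect old payloads
    hypotheses: List[Dict[str, Any]] = field(default_factory=list)  # assumed shape

    def to_json(self) -> str:
        payload = asdict(self)
        payload["checkpoint"] = self.checkpoint.value  # store the plain string
        return json.dumps(payload, sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "InvestigationSnapshot":
        data = json.loads(raw)
        data["checkpoint"] = SnapshotCheckpoint(data["checkpoint"])
        return cls(**data)

    def estimated_size_bytes(self) -> int:
        # Size of the serialized form, used as a cheap estimate.
        return len(self.to_json().encode("utf-8"))

    def is_oversized(self, limit: int = MAX_INLINE_BYTES) -> bool:
        return self.estimated_size_bytes() > limit
```

Estimating size from the serialized payload keeps the check simple; a real model might instead move large sample data out of line, as the `SampleDataReference` bullet suggests.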
- Create SnapshotStore protocol with LocalSnapshotStore and S3SnapshotStore
- Add capture_snapshot Temporal activity for non-blocking state capture
- Integrate snapshot capture at START, HYPOTHESIS_GENERATED, COMPLETE checkpoints
- Add enable_snapshots flag to InvestigationInput
- Add snapshot_paths to InvestigationResult for tracking captured snapshots

Implements fn-49.2: Snapshot capture in Temporal workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
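The store abstraction in this commit can be pictured roughly as follows. This is a sketch under assumptions: the method names `save` and `load`, the on-disk layout, and the `try_capture` helper are hypothetical, and "non-blocking" is read here as "a failed capture is logged and never fails the investigation", which is only one possible interpretation. The S3 variant is omitted.

```python
import logging
import tempfile
from pathlib import Path
from typing import Optional, Protocol

logger = logging.getLogger("snapshot_sketch")


class SnapshotStore(Protocol):
    """Structural interface; any object with these methods qualifies."""

    def save(self, investigation_id: str, checkpoint: str, payload: bytes) -> str: ...

    def load(self, investigation_id: str, checkpoint: str) -> bytes: ...


class LocalSnapshotStore:
    def __init__(self, root: Path) -> None:
        self.root = root

    def _path(self, investigation_id: str, checkpoint: str) -> Path:
        # Hypothetical layout: <root>/<investigation_id>/<checkpoint>.json
        return self.root / investigation_id / f"{checkpoint}.json"

    def save(self, investigation_id: str, checkpoint: str, payload: bytes) -> str:
        path = self._path(investigation_id, checkpoint)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(payload)
        return str(path)

    def load(self, investigation_id: str, checkpoint: str) -> bytes:
        return self._path(investigation_id, checkpoint).read_bytes()


def try_capture(
    store: SnapshotStore, investigation_id: str, checkpoint: str, payload: bytes
) -> Optional[str]:
    """Save a snapshot; on storage errors, log and return None instead of raising."""
    try:
        return store.save(investigation_id, checkpoint, payload)
    except OSError:
        logger.warning("snapshot capture failed at %s", checkpoint, exc_info=True)
        return None


with tempfile.TemporaryDirectory() as tmp:
    store = LocalSnapshotStore(Path(tmp))
    saved_path = try_capture(store, "inv-1", "start", b'{"version": "1"}')
    round_trip = store.load("inv-1", "start")
```

The returned path is the kind of value that could be collected into `snapshot_paths` on the result.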
- Add GET /investigations/{id}/snapshots to list available snapshots
- Add GET /investigations/{id}/snapshots/{checkpoint} to download
- Support gzip compression via Accept-Encoding header
- Include X-Snapshot-Version, X-Snapshot-Checkpoint headers
- Enforce tenant isolation for snapshot access
- Add unit tests for snapshot models

Implements fn-49.3: Snapshot Download API

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
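The gzip negotiation and response headers listed above could look something like this framework-free sketch. The function names are hypothetical, the `Vary` header is an addition not mentioned in the commit, and the parser is deliberately simplified (it ignores the `*` wildcard); the PR's actual endpoint may rely on its web framework for this.

```python
import gzip
from typing import Dict, Optional, Tuple


def wants_gzip(accept_encoding: Optional[str]) -> bool:
    """Return True if the Accept-Encoding header lists gzip with a non-zero q-value."""
    for item in (accept_encoding or "").split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "gzip":
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as a refusal
        return quality > 0
    return False


def encode_snapshot(
    body: bytes, accept_encoding: Optional[str], version: str, checkpoint: str
) -> Tuple[bytes, Dict[str, str]]:
    """Build the response body and headers for a snapshot download."""
    headers = {
        "X-Snapshot-Version": version,
        "X-Snapshot-Checkpoint": checkpoint,
        "Vary": "Accept-Encoding",  # assumption: caches should key on the encoding
    }
    if wants_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers
```

A client that sends `Accept-Encoding: gzip` receives a compressed body it can restore with `gzip.decompress`; any other client receives the raw bytes.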
- Create dataing.sdk package with load_snapshot() -> HydratedState
- HydratedState contains alert, hypotheses, evidence, synthesis
- NavigableSchema with .tables['name'].columns access
- QueryableLineage with .upstream() and .downstream() queries
- Lazy DataFrame deserialization for large sample data
- Version compatibility warnings for schema mismatches
- Comprehensive unit tests (33 tests)

Implements fn-49.4: Snapshot Deserializer in SDK

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
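One way to picture the lazy deserialization mentioned in this commit: defer decoding until the data is first touched, then cache the result. The sketch below uses a list of row dicts in place of a pandas DataFrame so it runs with the stdlib alone; `LazySample` and its `rows` attribute are hypothetical names, not the SDK's API.

```python
import json
from functools import cached_property
from typing import Any, Callable, Dict, List

Rows = List[Dict[str, Any]]


class LazySample:
    """Holds a loader and runs it only on first access to `rows`."""

    def __init__(self, loader: Callable[[], Rows]) -> None:
        self._loader = loader

    @cached_property
    def rows(self) -> Rows:
        # Runs once; cached_property stores the result on the instance.
        return self._loader()


load_calls: List[str] = []
serialized = json.dumps([{"order_id": 1, "amount": 9.5}, {"order_id": 2, "amount": 3.0}])


def decode_sample() -> Rows:
    load_calls.append("decoded")  # record each real decode for demonstration
    return json.loads(serialized)


sample = LazySample(decode_sample)
calls_before_access = len(load_calls)  # nothing decoded yet
first = sample.rows
second = sample.rows                   # served from the cache, no second decode
```

With real DataFrames the loader would wrap the expensive parse step, so a notebook that never inspects the sample data never pays for it.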
- Add IPython magic commands: %dataing hydrate, list, help
- Add Recent Investigations section to JupyterLab sidebar widget
- Add Hydrate button for each investigation
- Wire up kernel execution for in-notebook hydration
- Support --checkpoint, --namespace, --overwrite options
- Variables injected: dataing_alert, dataing_hypotheses, etc.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
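To illustrate how the options and injected variables above might fit together, here is a sketch that parses a hydrate argument line and injects prefixed variables into a namespace dict. The argument layout, the defaults, and the reading of `--namespace` as a variable-name prefix are all assumptions; only the option names and the `dataing_alert` style of variable come from the commit message.

```python
import argparse
import shlex
from typing import Any, Dict, List


def parse_hydrate_args(line: str) -> argparse.Namespace:
    """Parse the text following `%dataing hydrate` (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="%dataing hydrate", add_help=False)
    parser.add_argument("investigation_id")
    parser.add_argument("--checkpoint", default="complete")  # assumed default
    parser.add_argument("--namespace", default="dataing")    # assumed to be a prefix
    parser.add_argument("--overwrite", action="store_true")
    return parser.parse_args(shlex.split(line))


def inject(
    user_ns: Dict[str, Any], values: Dict[str, Any], namespace: str, overwrite: bool
) -> List[str]:
    """Copy values into user_ns as <namespace>_<key>, refusing to clobber by default."""
    prefixed = {f"{namespace}_{key}": value for key, value in values.items()}
    clashes = sorted(name for name in prefixed if name in user_ns)
    if clashes and not overwrite:
        raise KeyError(f"variables already defined: {', '.join(clashes)}")
    user_ns.update(prefixed)
    return sorted(prefixed)


args = parse_hydrate_args("inv-123 --checkpoint hypothesis_generated --overwrite")
notebook_ns: Dict[str, Any] = {"dataing_alert": "stale value"}
injected = inject(
    notebook_ns,
    {"alert": {"table": "orders"}, "hypotheses": []},
    args.namespace,
    args.overwrite,
)
```

Inside a real magic, `user_ns` would be the IPython shell's user namespace rather than a plain dict.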
Add compare_snapshots() function to diff investigation states across
checkpoints. Includes diff dataclasses for schema, hypotheses, evidence,
synthesis, and dataframes. Supports markdown and HTML export for Jupyter
notebooks. Adds %dataing diff magic command.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
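As a rough illustration of the diff idea, the sketch below compares hypotheses between two checkpoints and renders the result as markdown. It covers only one of the diff categories named above, and `HypothesisDiff`, `diff_hypotheses`, and the id-keyed dict shape are hypothetical; the PR's `compare_snapshots()` may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class HypothesisDiff:
    added: List[str]
    removed: List[str]
    changed: List[str]

    def to_markdown(self) -> str:
        lines = ["### Hypotheses"]
        for label, ids in (
            ("Added", self.added),
            ("Removed", self.removed),
            ("Changed", self.changed),
        ):
            lines.append(f"- {label}: {', '.join(ids) if ids else 'none'}")
        return "\n".join(lines)


def diff_hypotheses(
    before: Dict[str, Dict[str, Any]], after: Dict[str, Dict[str, Any]]
) -> HypothesisDiff:
    """Compare two id-keyed hypothesis mappings."""
    shared = before.keys() & after.keys()
    return HypothesisDiff(
        added=sorted(after.keys() - before.keys()),
        removed=sorted(before.keys() - after.keys()),
        changed=sorted(key for key in shared if before[key] != after[key]),
    )


at_start = {"h1": {"confidence": 0.4}, "h2": {"confidence": 0.2}}
at_complete = {"h1": {"confidence": 0.9}, "h3": {"confidence": 0.5}}
report = diff_hypotheses(at_start, at_complete).to_markdown()
```

Sorting the ids keeps the report stable between runs, which matters when the output is pasted into a notebook or a review.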
@vercel
vercel bot commented Jan 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| dataing | Ready | Preview, Comment | Jan 29, 2026 0:27am |
| dataing-app | Ready | Preview, Comment | Jan 29, 2026 0:27am |
| dataing-docs | Ready | Preview, Comment | Jan 29, 2026 0:27am |

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Refactor magic.py to use typed stub functions at module level that get
reassigned when IPython is available. Add mypy override for sdk.magic
module to handle warn_unused_ignores since type ignores are only needed
when IPython is installed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
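The stub-and-reassign pattern described in this commit might look roughly like the following. This is a sketch, not the contents of `magic.py`: only `line_magic` is stubbed here, and the `DataingMagics` body is a placeholder. The real IPython decorator lives in `IPython.core.magic`.

```python
from typing import Any, Callable, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def line_magic(name: str) -> Callable[[F], F]:
    """Typed stub used when IPython is absent: a no-op decorator factory."""

    def decorator(func: F) -> F:
        return func

    return decorator


try:
    from IPython.core.magic import line_magic as _ipython_line_magic
except ImportError:
    pass  # keep the stub; the module still imports and type-checks
else:
    # Reassign the module-level name to the real decorator. mypy flags this
    # reassignment only when IPython is installed, which is why an ignore
    # comment can be reported as unused in other environments.
    line_magic = _ipython_line_magic  # type: ignore


class DataingMagics:
    """Placeholder class; the real one would subclass IPython's Magics."""

    @line_magic("dataing")
    def dataing(self, line: str) -> str:
        return f"received: {line}"
```

Because the stub and the real decorator are called the same way, the class body is identical in both environments and the method stays directly callable in tests.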
Create DataingMagics instance with shell=None to avoid traitlets
validation error, then manually set shell attribute for tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Patch dataing.sdk.snapshot.load_snapshot instead of dataing.sdk.magic.load_snapshot
since load_snapshot is imported inside the _hydrate method from
dataing.sdk.snapshot, not at the magic module level.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
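The reasoning behind this fix can be reproduced in isolation: `unittest.mock.patch` replaces an attribute on the module it is pointed at, so a function imported inside a method body must be patched on its defining module. The sketch below builds two throwaway modules, `demo_snapshot` and `demo_magic` (hypothetical stand-ins for `dataing.sdk.snapshot` and `dataing.sdk.magic`), to show both outcomes.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for the module that defines load_snapshot.
snapshot_mod = types.ModuleType("demo_snapshot")


def load_snapshot(path: str) -> str:
    return f"real:{path}"


snapshot_mod.load_snapshot = load_snapshot
sys.modules["demo_snapshot"] = snapshot_mod

# Stand-in for the magic module, which imports the function at call time.
magic_mod = types.ModuleType("demo_magic")


def hydrate(path: str) -> str:
    from demo_snapshot import load_snapshot  # looked up on demo_snapshot per call

    return load_snapshot(path)


magic_mod.hydrate = hydrate
sys.modules["demo_magic"] = magic_mod

# Patching the defining module works, because that is where the name is read.
with patch("demo_snapshot.load_snapshot", return_value="fake"):
    patched_result = magic_mod.hydrate("snap.json")

unpatched_result = magic_mod.hydrate("snap.json")

# Patching the magic module fails: it has no module-level load_snapshot.
try:
    with patch("demo_magic.load_snapshot"):
        pass
    wrong_target_failed = False
except AttributeError:
    wrong_target_failed = True
```

Had `load_snapshot` been imported at the top of the magic module instead, the opposite target would be the right one, since the name would then live on that module.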
@bordumb bordumb merged commit 494dd5d into main Jan 29, 2026
6 checks passed
@bordumb bordumb deleted the feat/fn-49-ghost-state-hydrator branch January 29, 2026 00:39