
feat: add optional DataFog redaction for RunRecorder #19

Draft
sidmohan0 wants to merge 2 commits into sauravvenkat:main from sidmohan0:codex/forkline-datafog-recorder-redaction

@sidmohan0

Summary

This PR adds an optional, backward-compatible DataFog integration to Forkline’s RunRecorder storage redaction boundary (recorder path only).

Why this PR

  • Preserves current RedactionPolicy behavior and deterministic redaction.
  • Adds an opt-in semantic layer for free-text values via DataFog.
  • Keeps DataFog out of the default runtime dependency path.

What changed

Code

  • forkline/storage/recorder.py

    • Added constructor options:
      • enable_datafog: bool = False
      • datafog_mode: str = "redact"
      • datafog_entity_types: Optional[list[str]] = None
    • Applies DataFog after redaction_policy.redact(...) when enabled.
    • Raises a clear error when the opt-in is enabled but DataFog is unavailable or misconfigured.
  • forkline/storage/datafog_adapter.py (new)

    • Optional DataFog loading + adapter utilities.
    • Applies DataFog recursively to string leaves.
    • Keeps behavior explicit and fail-fast.
  • tests/unit/test_redaction_policy.py

    • Added TestRecorderIntegration::test_recorder_applies_datafog_when_enabled.
    • Verifies policy-first redaction + opt-in semantic redaction + deterministic output.
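
The policy-first ordering and the recursive walk over string leaves described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: stub_policy_redact, stub_redact_text, and apply_to_string_leaves are hypothetical names, and the regex-based stub stands in for whatever DataFog actually does to free text.

```python
import re
from typing import Any, Callable

# Hypothetical stand-in for a DataFog-backed text redactor (NOT DataFog's API).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def stub_redact_text(text: str) -> str:
    """Replace anything that looks like an email address with a fixed token."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def stub_policy_redact(payload: dict) -> dict:
    """Hypothetical stand-in for redaction_policy.redact(...): masks by key name."""
    sensitive = {"api_key", "password"}
    return {k: "[REDACTED]" if k in sensitive else v for k, v in payload.items()}


def apply_to_string_leaves(value: Any, redact_text: Callable[[str], str]) -> Any:
    """Return a copy of value with redact_text applied to every string leaf."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {k: apply_to_string_leaves(v, redact_text) for k, v in value.items()}
    if isinstance(value, list):
        return [apply_to_string_leaves(v, redact_text) for v in value]
    if isinstance(value, tuple):
        return tuple(apply_to_string_leaves(v, redact_text) for v in value)
    return value  # numbers, None, booleans, etc. pass through untouched


payload = {
    "api_key": "sk-123",
    "note": "mail alice@example.com",
    "tags": ["bob@example.org", 7],
}

# Policy first, then the opt-in semantic layer on whatever strings remain.
after_policy = stub_policy_redact(payload)
result = apply_to_string_leaves(after_policy, stub_redact_text)
```

Because the walk builds new containers instead of mutating in place, the caller's payload is left untouched, and running the same input twice yields the same output as long as the text redactor itself is deterministic.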

Backward compatibility

  • DataFog stays off by default: enable_datafog=False.
  • Existing callers do not need any code changes.
  • Existing redaction policy remains primary and unchanged.
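
One way to get this "off by default, fail fast when opted in" behavior is to defer the optional import until the flag is set. The sketch below is an assumption about the general pattern, not the PR's implementation: RecorderSketch, DataFogUnavailableError, load_optional_module, and the _module_name parameter are hypothetical, and only the three constructor options come from the description above.

```python
import importlib
import importlib.util
from typing import Optional


class DataFogUnavailableError(RuntimeError):
    """Hypothetical error raised when the opt-in is set but the module is missing."""


def load_optional_module(name: str = "datafog"):
    """Import an optional top-level module, failing fast with a clear message."""
    if importlib.util.find_spec(name) is None:
        raise DataFogUnavailableError(
            f"enable_datafog=True requires the optional '{name}' package; "
            "install it or leave enable_datafog=False."
        )
    return importlib.import_module(name)


class RecorderSketch:
    """Hypothetical stand-in for RunRecorder showing only the opt-in wiring."""

    def __init__(
        self,
        enable_datafog: bool = False,
        datafog_mode: str = "redact",
        datafog_entity_types: Optional[list[str]] = None,
        _module_name: str = "datafog",  # assumption: exposed here only for testing
    ) -> None:
        self.enable_datafog = enable_datafog
        self.datafog_mode = datafog_mode
        self.datafog_entity_types = datafog_entity_types
        # With the default flag the optional dependency is never imported.
        self._datafog = load_optional_module(_module_name) if enable_datafog else None


# Existing callers construct the recorder exactly as before.
default_recorder = RecorderSketch()
```

With this shape, a misconfigured opt-in surfaces at construction time instead of at the first recorded event, which matches the fail-fast intent stated in the description.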

Validation

  • python -m black --check .
  • python -m ruff check .
  • python -m unittest discover -s tests

Results (on this branch)

  • black --check: ✅
  • ruff check: ✅
  • unit tests: Ran 188 tests in 0.331s

Implementation details

  • Scoped to RunRecorder only (not the tracer path) to keep the PR small and low-risk.
  • The integration is optional and must be enabled explicitly, which reduces the risk of breaking existing callers.

Follow-up (optional)

  • Add optional support in the tracer/storage SQLiteStore path if downstream consumers need it.

@sidmohan0
Author

hey @sauravvenkat kept it opt-in but lmk if you'd prefer it handled a different way
