Add structured metadata to log records for programmatic consumption #386

@andreatgretel

Description


Priority Level

Medium (Nice to have)

Is your feature request related to a problem? Please describe.

Downstream tools and pipelines that build on Data Designer currently have no programmatic way to access per-call telemetry (model used, token counts, latency, retries, failure reasons). The only signals available are human-readable log messages, which means consumers must either:

  • Parse log strings with brittle regex to extract stats
  • Set DD's log level to INFO/DEBUG and hope the right messages propagate
  • Infer failures indirectly (e.g. by diffing input vs output record IDs)

This makes it difficult to build observability dashboards, compute cost estimates, track model usage across runs, or diagnose slow pipelines — all without resorting to fragile log parsing.

Describe the solution you'd like

Attach structured metadata to Python LogRecord objects via the extra parameter at key call sites. This requires no new dependencies and no changes to existing log output — the human-readable messages stay exactly as they are.

Concretely, add a namespaced extra dict (e.g. dd_event) to log calls at these sites:

  1. Per-LLM call — model alias, input/output tokens, latency, retries, status (success/failed/filtered), error message if any
  2. Record failure — record ID, column, failure reason, attempt number
  3. Model usage summary — aggregate token counts and request counts per model at workflow end

Example:

# Existing log line stays unchanged for human readers
logger.debug(
    "LLM call to %s completed in %dms",
    model_alias, latency_ms,
    extra={"dd_event": {
        "type": "llm_call",
        "model": model_alias,
        "column": column_name,
        "record_id": record_id,
        "input_tokens": 285,
        "output_tokens": 130,
        "latency_ms": 340,
        "retries": 0,
        "status": "success",
    }},
)

Consumers attach a lightweight handler to collect events:

class EventCollector(logging.Handler):
    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event:
            self.events.append(event)
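End to end, attaching the collector might look like the following sketch. The "data_designer" logger name is an assumption; the real integration would use whichever logger DD actually configures:

```python
import logging

class EventCollector(logging.Handler):
    """Collects dd_event payloads from log records into a plain list."""
    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event is not None:
            self.events.append(event)

# Logger name is an assumption -- use whichever logger DD configures.
dd_logger = logging.getLogger("data_designer")
collector = EventCollector()
dd_logger.addHandler(collector)
dd_logger.setLevel(logging.DEBUG)

# ... run create()/preview() here ...

# Afterwards, events are plain dicts ready for aggregation:
total_tokens = sum(
    e.get("input_tokens", 0) + e.get("output_tokens", 0)
    for e in collector.events
    if e.get("type") == "llm_call"
)
```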

This enables several use cases without DD needing to expose new public API:

  • Live monitoring — attach a handler during create()/preview() to stream events as they happen
  • Offline analytics — write events to JSONL for post-hoc cost analysis, latency profiling, or model comparison
  • Observability integration — forward events to Prometheus, OpenTelemetry, or custom dashboards
  • Debugging — filter events by record ID to trace a single record's journey through the pipeline

Describe alternatives you've considered

  • Structured fields on CreateResult/PreviewResult (e.g. result.model_usage, result.failed_records). This would be cleaner for aggregate stats but doesn't support live streaming and requires new public API surface. Could complement the logging approach for summary data.

  • Callback / event hook parameter on create()/preview() (e.g. on_event=my_handler). More explicit contract with typed event dataclasses, but a larger API change and less idiomatic Python than logging.

  • Adopting structlog for structured logging throughout DD. Powerful but adds a dependency and is a bigger architectural change. The extra approach is a stepping stone that's compatible with a future structlog migration.
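For comparison, the callback alternative might look like this sketch; every name here (the event dataclass, the `on_event` parameter, the `create` signature) is hypothetical and does not exist in DD today:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class LLMCallEvent:
    """Hypothetical typed event a callback-based API might deliver."""
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: int
    status: str

# Hypothetical signature -- DD does not currently expose this.
def create(config, on_event: Optional[Callable[[LLMCallEvent], None]] = None):
    ...
```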

Additional context

Using a single namespaced key (dd_event) rather than flat extra keys avoids collisions with LogRecord's built-in attributes and with any other libraries that use extra. The convention is simple: if dd_event is present on a record, it contains a typed dict with a "type" discriminator field.
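The "type" discriminator makes consumer-side routing straightforward. A sketch of a dispatching handler (again a hypothetical consumer helper, not DD API):

```python
import logging

class DispatchingHandler(logging.Handler):
    """Routes dd_event payloads to callbacks keyed by their 'type' field."""
    def __init__(self):
        super().__init__()
        self.callbacks = {}

    def on(self, event_type, fn):
        self.callbacks[event_type] = fn

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event is not None:
            fn = self.callbacks.get(event.get("type"))
            if fn is not None:
                fn(event)
```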

This is a non-breaking, additive change — existing users who don't attach a custom handler see zero difference in behavior or output.

Metadata

Labels: enhancement (New feature or request)