Description
Priority Level
Medium (Nice to have)
Is your feature request related to a problem? Please describe.
Downstream tools and pipelines that build on Data Designer currently have no programmatic way to access per-call telemetry (model used, token counts, latency, retries, failure reasons). The only signals available are human-readable log messages, which means consumers must either:
- Parse log strings with brittle regex to extract stats
- Set DD's log level to INFO/DEBUG and hope the right messages propagate
- Infer failures indirectly (e.g. by diffing input vs output record IDs)
This makes it difficult to build observability dashboards, compute cost estimates, track model usage across runs, or diagnose slow pipelines — all without resorting to fragile log parsing.
Describe the solution you'd like
Attach structured metadata to Python `LogRecord` objects via the `extra` parameter at key call sites. This requires no new dependencies and no changes to existing log output — the human-readable messages stay exactly as they are.
Concretely, add a namespaced `extra` dict (e.g. `dd_event`) to log calls at these sites:
- Per-LLM call — model alias, input/output tokens, latency, retries, status (success/failed/filtered), error message if any
- Record failure — record ID, column, failure reason, attempt number
- Model usage summary — aggregate token counts and request counts per model at workflow end
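As a rough sketch of what those three event shapes could look like as `TypedDict`s (field names beyond those listed above are illustrative, extrapolated from this list — not an existing DD schema):

```python
from typing import Optional, TypedDict

class LLMCallEvent(TypedDict):
    """Emitted once per LLM call; 'type' is always 'llm_call'."""
    type: str
    model: str
    column: str
    record_id: str
    input_tokens: int
    output_tokens: int
    latency_ms: int
    retries: int
    status: str            # "success" | "failed" | "filtered"
    error: Optional[str]   # populated when status == "failed"

class RecordFailureEvent(TypedDict):
    """Emitted when a record fails; 'type' is always 'record_failure'."""
    type: str
    record_id: str
    column: str
    reason: str
    attempt: int

class ModelUsageSummaryEvent(TypedDict):
    """Emitted per model at workflow end; 'type' is 'model_usage_summary'."""
    type: str
    model: str
    request_count: int
    input_tokens: int
    output_tokens: int
```

These are plain dicts at runtime, so consumers that don't care about typing can ignore the classes entirely.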
Example:

```python
# Existing log line stays unchanged for human readers
logger.debug(
    "LLM call to %s completed in %dms",
    model_alias, latency_ms,
    extra={"dd_event": {
        "type": "llm_call",
        "model": model_alias,
        "column": column_name,
        "record_id": record_id,
        "input_tokens": 285,
        "output_tokens": 130,
        "latency_ms": 340,
        "retries": 0,
        "status": "success",
    }},
)
```

Consumers attach a lightweight handler to collect events:
```python
class EventCollector(logging.Handler):
    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event:
            self.events.append(event)
```

This enables several use cases without DD needing to expose new public API:
- Live monitoring — attach a handler during `create()`/`preview()` to stream events as they happen
- Offline analytics — write events to JSONL for post-hoc cost analysis, latency profiling, or model comparison
- Observability integration — forward events to Prometheus, OpenTelemetry, or custom dashboards
- Debugging — filter events by record ID to trace a single record's journey through the pipeline
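The offline-analytics path, for instance, needs nothing but the stdlib. A minimal end-to-end sketch (the logger name `"data_designer"` is illustrative, not a confirmed DD internal; the `logger.debug` call simulates what DD itself would emit):

```python
import json
import logging

class EventCollector(logging.Handler):
    """Collects structured dd_event payloads attached via `extra`."""
    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event:
            self.events.append(event)

# Logger name is illustrative -- attach to whatever logger DD emits on.
logger = logging.getLogger("data_designer")
logger.setLevel(logging.DEBUG)
collector = EventCollector()
logger.addHandler(collector)

# Simulate the per-call log DD would emit internally.
logger.debug(
    "LLM call to %s completed in %dms", "gpt-x", 340,
    extra={"dd_event": {"type": "llm_call", "model": "gpt-x", "latency_ms": 340}},
)

# Post-hoc: dump collected events to JSONL for cost/latency analysis.
with open("events.jsonl", "w") as f:
    for event in collector.events:
        f.write(json.dumps(event) + "\n")
```

Because the handler ignores records without a `dd_event` attribute, it can be attached alongside existing console or file handlers without affecting them.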
Describe alternatives you've considered
- Structured fields on `CreateResult`/`PreviewResult` (e.g. `result.model_usage`, `result.failed_records`). This would be cleaner for aggregate stats but doesn't support live streaming and requires new public API surface. Could complement the logging approach for summary data.
- Callback / event hook parameter on `create()`/`preview()` (e.g. `on_event=my_handler`). More explicit contract with typed event dataclasses, but a larger API change and less idiomatic Python than logging.
- Adopting `structlog` for structured logging throughout DD. Powerful but adds a dependency and is a bigger architectural change. The `extra` approach is a stepping stone that's compatible with a future structlog migration.
Additional context
Using a single namespaced key (`dd_event`) rather than flat `extra` keys avoids collisions with `LogRecord`'s built-in attributes and with any other libraries that use `extra`. The convention is simple: if `dd_event` is present on a record, it contains a typed dict with a `"type"` discriminator field.
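The discriminator makes per-type dispatch trivial on the consumer side. A sketch of a handler that routes events to registered callbacks (all names here are illustrative, not proposed DD API):

```python
import logging

class TypedEventHandler(logging.Handler):
    """Routes dd_event payloads to callbacks registered per 'type'."""
    def __init__(self):
        super().__init__()
        self._callbacks = {}

    def on(self, event_type, callback):
        self._callbacks.setdefault(event_type, []).append(callback)

    def emit(self, record):
        event = getattr(record, "dd_event", None)
        if event:
            for cb in self._callbacks.get(event.get("type"), []):
                cb(event)

failures = []

logger = logging.getLogger("dd_demo")  # logger name illustrative
logger.setLevel(logging.DEBUG)
handler = TypedEventHandler()
handler.on("record_failure", lambda e: failures.append(e["record_id"]))
logger.addHandler(handler)

# Only the record_failure event reaches the callback; llm_call is ignored.
logger.debug("record failed", extra={"dd_event": {"type": "record_failure", "record_id": "r-42"}})
logger.debug("llm call ok", extra={"dd_event": {"type": "llm_call", "model": "m"}})
```

Events without a registered callback are silently dropped, so consumers opt in to exactly the event types they care about.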
This is a non-breaking, additive change — existing users who don't attach a custom handler see zero difference in behavior or output.