Improve record failure logging for max_tokens-truncated parse/recipe failures #411

@eric-tramel

Summary

When a record fails because the model response was truncated at max_tokens and the response recipe/parser then fails, the record-failure log is too generic. Users see a dropped-record warning and a parse/recipe failure, but not the actionable cause: the model most likely stopped because it exhausted its token budget.

Current paths

  • packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py:487
    _worker_error_callback() only logs a generic dropped-record warning.
  • packages/data-designer-engine/src/data_designer/engine/models/facade.py:323
  • packages/data-designer-engine/src/data_designer/engine/models/facade.py:421
    Parse/recipe failures are wrapped as GenerationValidationFailureError after retries/restarts.
  • packages/data-designer-engine/src/data_designer/engine/models/errors.py:216
    The validation failure text has a generic max_tokens hint, but it does not specifically call out truncation-driven parse failures at the record level.
  • packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py:35
    Raw provider responses are already preserved in ChatCompletionResponse.raw, so the finish/stop reason may be recoverable from response metadata.
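
As a rough illustration of the last point, the stop reason could be recovered from the raw payload along these lines. This is only a sketch: the exact shape of ChatCompletionResponse.raw is not shown here, so the two payload layouts below (OpenAI-style `choices[0].finish_reason` and Anthropic-style top-level `stop_reason`) are assumptions, and `extract_stop_reason` / `is_token_limit_truncation` are hypothetical helper names, not existing functions in the codebase.

```python
from collections.abc import Mapping
from typing import Any, Optional

# Assumed truncation markers: "length" (OpenAI-style finish_reason) and
# "max_tokens" (Anthropic-style stop_reason). Other providers may differ.
TRUNCATION_REASONS = frozenset({"length", "max_tokens"})


def _get(obj: Any, key: str) -> Any:
    """Read a field from either a dict-like payload or an SDK object."""
    if isinstance(obj, Mapping):
        return obj.get(key)
    return getattr(obj, key, None)


def extract_stop_reason(raw: Any) -> Optional[str]:
    """Best-effort lookup of the finish/stop reason (hypothetical helper)."""
    if raw is None:
        return None
    choices = _get(raw, "choices")
    if choices:
        reason = _get(choices[0], "finish_reason")
        if reason is not None:
            return str(reason)
    reason = _get(raw, "stop_reason")
    return str(reason) if reason is not None else None


def is_token_limit_truncation(raw: Any) -> bool:
    """True when the raw payload indicates the token limit was hit."""
    reason = extract_stop_reason(raw)
    return reason is not None and reason.lower() in TRUNCATION_REASONS
```

Returning `None` when no reason is found keeps the lookup safe for providers that omit the field, so callers can fall back to the existing generic path.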

Requested change

If the completion stop reason indicates token-limit truncation (finish_reason == "length" or equivalent provider-specific stop_reason) and the failure is a parse/recipe failure, emit a more specific error/log message.

The message should say the response appears to have been cut off by max_tokens, that this caused the parse/recipe failure, and that the user should increase inference_parameters.max_tokens in the model config.
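
One possible shape for that message selection, as a minimal sketch. `build_record_failure_message` is a hypothetical name, the wording is a suggestion only, and the set of truncation markers is an assumption about provider stop reasons.

```python
from typing import Optional

# Assumed truncation markers; extend for other providers as needed.
_TRUNCATION_REASONS = frozenset({"length", "max_tokens"})


def build_record_failure_message(
    stop_reason: Optional[str],
    is_parse_failure: bool,
    generic_message: str,
) -> str:
    """Return a truncation-specific message only when both conditions hold."""
    truncated = (
        stop_reason is not None and stop_reason.lower() in _TRUNCATION_REASONS
    )
    if truncated and is_parse_failure:
        return (
            "The model response appears to have been cut off by max_tokens "
            f"(stop reason: {stop_reason!r}), which caused the parse/recipe "
            "failure. Increase inference_parameters.max_tokens in the model "
            "config."
        )
    # Any other combination keeps the existing generic text unchanged.
    return generic_message
```

Passing the generic text in as an argument keeps the existing wording untouched whenever truncation is not indicated.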

Acceptance criteria

  • Record-failure logging explicitly mentions max_tokens truncation when both conditions are true:
    1. the stop reason indicates length/max-tokens termination
    2. the failure is parse/recipe-related
  • The message recommends increasing max_tokens in the model config.
  • Generic parse-failure messaging remains unchanged when truncation is not indicated.
  • Tests cover the targeted logging/error text for this path.
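
The criteria above could be exercised with tests roughly like the following. This is a self-contained sketch: `log_record_failure` and the logger name are stand-ins for whatever the real logging path ends up being, not existing code.

```python
import logging
import unittest
from typing import Optional

logger = logging.getLogger("record_failures_sketch")  # hypothetical logger name


def log_record_failure(stop_reason: Optional[str], is_parse_failure: bool) -> None:
    """Stand-in for the real record-failure logging path (hypothetical)."""
    if is_parse_failure and stop_reason in {"length", "max_tokens"}:
        logger.warning(
            "Dropped record: the response appears to have been cut off by the "
            "token limit, which caused the parse/recipe failure. Increase "
            "inference_parameters.max_tokens in the model config."
        )
    else:
        logger.warning("Dropped record: generation failed.")


class RecordFailureLoggingTest(unittest.TestCase):
    def _logged(self, stop_reason: Optional[str], is_parse_failure: bool) -> str:
        # assertLogs captures everything emitted on the logger in the block.
        with self.assertLogs(logger, level="WARNING") as captured:
            log_record_failure(stop_reason, is_parse_failure)
        return "\n".join(captured.output)

    def test_truncated_parse_failure_mentions_max_tokens(self) -> None:
        self.assertIn("inference_parameters.max_tokens", self._logged("length", True))

    def test_non_truncated_parse_failure_stays_generic(self) -> None:
        self.assertNotIn("max_tokens", self._logged("stop", True))

    def test_truncation_without_parse_failure_stays_generic(self) -> None:
        self.assertNotIn("max_tokens", self._logged("length", False))
```

Each test maps to one acceptance criterion: the specific message when both conditions hold, and the unchanged generic message when either one is missing.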
