Open
Description
When using ChatDatabricks.with_structured_output() with method="json_schema", the API returns errors because the constructed response_format payload is missing required fields that the OpenAI-compatible API expects. There are three cascading issues, all stemming from the same root cause.
from pydantic import BaseModel, Field
from databricks_langchain import ChatDatabricks

class TerminationReason(BaseModel):
    """Structured termination reason of a conversation between agent and user."""
    non_english: bool = Field(description="Whether or not the conversation is in English")
    frustration: bool = Field(description="If user sounds frustrated, angry or threatening, return True.")

conversation = [
    {"role": "user", "content": "I'm really upset that my order hasn't arrived yet."},
    {"role": "ai", "content": "I'm sorry to hear that. Let me check the status for you."},
]

llm = ChatDatabricks(endpoint="databricks-gpt-5")
result = llm.with_structured_output(
    schema=TerminationReason,
    method="json_schema",
).invoke(conversation)
Errors
- Missing response_format.json_schema.name
BadRequestError: Error code: 400 "Missing required parameter: 'response_format.json_schema.name'."
The API requires a name field in the json_schema object. The current code does not include one.
- Missing additionalProperties: false (if name is added in response format)
When strict: true, the OpenAI API Spec requires additionalProperties: false at every object-level node. model_json_schema() does not include this by default.
BadRequestError: Error code: 400 "Invalid schema for response_format 'json_schema': In context=(), 'additionalProperties' is required to be supplied and to be false."
- required array mismatch (if name and additionalProperties are both specified)
BadRequestError: Error code: 400 "Invalid schema for response_format 'generic-schema-name': In context=(), 'required' is required to be supplied and to be an array including every key in properties. Extra required key 'x' supplied."
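To make the three checks concrete, here is a minimal local sketch that inspects a response_format payload before it is sent. The function name find_strict_schema_violations and the example payload are hypothetical (they are not part of databricks_langchain or the OpenAI SDK), and the checks only approximate what the API enforces:

```python
from typing import Any, Dict, List


def find_strict_schema_violations(response_format: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list strict-mode problems in a response_format payload."""
    problems: List[str] = []
    json_schema = response_format.get("json_schema", {})

    # Check 1: the json_schema object itself needs a name.
    if not json_schema.get("name"):
        problems.append("missing json_schema.name")

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                # Check 2: every object node must set additionalProperties to false.
                if node.get("additionalProperties") is not False:
                    problems.append(f"{path}: additionalProperties must be false")
                # Check 3: required must name exactly the keys in properties.
                if sorted(node.get("required", [])) != sorted(node["properties"]):
                    problems.append(f"{path}: required must list every property")
            for key, value in node.items():
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}/{index}")

    walk(json_schema.get("schema", {}), "#")
    return problems


# Hand-written payload (an assumption, not captured from the library): no name,
# no additionalProperties, and only one of the two properties marked required.
payload = {
    "type": "json_schema",
    "json_schema": {
        "strict": True,
        "schema": {
            "title": "TerminationReason",
            "type": "object",
            "properties": {
                "non_english": {"type": "boolean"},
                "frustration": {"type": "boolean"},
            },
            "required": ["non_english"],
        },
    },
}

problems = find_strict_schema_violations(payload)
for problem in problems:
    print(problem)
```

For this payload the helper reports all three problems at once, whereas the API surfaces them one at a time as each earlier one is fixed.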
Root Cause
The current implementation constructs the response_format manually using raw model_json_schema() output, which is not compliant with the OpenAI structured output API requirements:
response_format = {
    "type": "json_schema",
    "json_schema": {
        "strict": True,
        "schema": (pydantic_schema.model_json_schema() if pydantic_schema else schema),
    },
}
The errors raised also depend on the model and the provider. For example:
- With databricks-gpt-oss, adding only the name field fixes everything.
- With databricks-gpt-5... OpenAI also enforces the other attributes, so the request still fails.
Possible Fixes
- Adding required fields directly within the response_format
- How langchain-openai handles this: it uses _convert_to_openai_response_format, which calls convert_to_openai_function from langchain_core with strict=True and handles all three requirements:
name: extracted from the Pydantic class name or JSON schema title key
additionalProperties: false: recursively set on all object nodes via _recursive_set_additional_properties_false
required: set to exactly list(properties.keys()) so it matches every property
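As a rough illustration of the first option, the three requirements can be applied to a plain JSON schema dict before it is placed in response_format. This is a sketch under stated assumptions, not the langchain-openai implementation: build_strict_response_format and _tighten are hypothetical names, and the input schema is hand-written to resemble Pydantic output.

```python
import copy
from typing import Any, Dict, Optional


def _tighten(node: Any) -> Any:
    """Recursively apply the strict-mode rules to every object node (in place)."""
    if isinstance(node, dict):
        if node.get("type") == "object" and "properties" in node:
            node["additionalProperties"] = False
            # required must match the property keys exactly.
            node["required"] = list(node["properties"].keys())
        for value in node.values():
            _tighten(value)
    elif isinstance(node, list):
        for item in node:
            _tighten(item)
    return node


def build_strict_response_format(
    schema: Dict[str, Any], name: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical helper: wrap a JSON schema in a strict response_format payload."""
    # Fall back to the schema title, which Pydantic sets to the class name.
    resolved_name = name or schema.get("title")
    if not resolved_name:
        raise ValueError("a name is required: pass name= or give the schema a title")
    return {
        "type": "json_schema",
        "json_schema": {
            "name": resolved_name,
            "strict": True,
            # Work on a copy so the caller's schema is left untouched.
            "schema": _tighten(copy.deepcopy(schema)),
        },
    }


# Hand-written schema with a nested definition and an optional field (assumption).
raw_schema = {
    "title": "TerminationReason",
    "type": "object",
    "properties": {
        "non_english": {"type": "boolean"},
        "details": {"$ref": "#/$defs/Details"},
    },
    "required": ["non_english"],
    "$defs": {
        "Details": {
            "type": "object",
            "properties": {"note": {"type": "string"}},
        }
    },
}

response_format = build_strict_response_format(raw_schema)
```

One consequence of this approach is that every property becomes required, so fields that had defaults in the Pydantic model no longer behave as optional. OpenAI's structured output guidance suggests expressing optional fields as a union with null instead.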