Skip to content

Fixes #211 with_structured_output for json_schema method#238

Open
WIll-Xu35 wants to merge 2 commits intodatabricks:mainfrom
WIll-Xu35:fix_with_structured_output
Open

Fixes #211 with_structured_output for json_schema method#238
WIll-Xu35 wants to merge 2 commits intodatabricks:mainfrom
WIll-Xu35:fix_with_structured_output

Conversation

@WIll-Xu35
Copy link

@WIll-Xu35 WIll-Xu35 commented Dec 12, 2025

Fixes #211

Add "name" key for response_format. It can optional input with key "schema_name". Otherwise default to "json_schema"

Change schema value type from tuple to dict.

@WIll-Xu35 WIll-Xu35 changed the title Fix with_structured_output for json_schema method Fixes #211 with_structured_output for json_schema method Dec 12, 2025
Copy link
Contributor

@jennsun jennsun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! It looks like this only fixes the first of several cascading errors users hit with method="json_schema" (as reported in #211 (comment))

When strict: true is set, the OpenAI-compatible API also requires:

  1. additionalProperties: false at every object-level node in the schema
  2. required to match all keys in properties (which can break with nested models / $ref)

raw model_json_schema() doesn't satisfy either of these, so users will still hit 400 errors after this fix. Could you take a look at my comment here, and include test coverage (unit tests + manual testing info in PR description) after you have made updates?

also make sure you rebase newest changes as there is a current conflict

response_format = {
"type": "json_schema",
"json_schema": {
"name": kwargs.get("schema_name", "json_schema"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name should be derived and not hardcoded - here you should use convert_to_openai_function() ,(add an additional import from the langchain core function calling class for from langchain_core.utils.function_calling import convert_to_openai_function . we should also do this to address the other cascading errors (see my overall review comment)

can refer to how the partner library is doing this as a reference which calls convert_to_openai_function:
https://github.com/langchain-ai/langchain/blob/202d7f6c4a2ca8c7e5949d935bcf0ba9b0c23fb0/libs/partners/openai/langchain_openai/chat_models/base.py#L1449

suggested fix may look something like this, but please be sure to test!

from langchain_core.utils.function_calling import (
      convert_to_openai_function,  # add this import
      convert_to_openai_tool,
  )

  # ... in with_structured_output, json_schema branch:

  elif method == "json_schema":
      if schema is None:
          raise ValueError(
              "schema must be specified when method is 'json_schema'. Received None."
          )
      function = convert_to_openai_function(
          pydantic_schema if pydantic_schema else schema, strict=True
      )
      function["schema"] = function.pop("parameters")
      response_format = {"type": "json_schema", "json_schema": function}
      llm = self.bind(response_format=response_format)

schema.model_json_schema() if is_pydantic_schema else schema # type: ignore[union-attr]
),
"schema": schema.model_json_schema() if is_pydantic_schema else schema, # type: ignore[union-attr]
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pr is also missing unit test coverage - we have an existing test at tests/unit_tests/test_chat_models.py (test_chat_model_with_structured_output)

in which this line
assert bind["response_format"]["json_schema"]["schema"] == JSON_SCHEMA would likely fail.

can you update this test and include asserts for the name/strict/schema fields? may look something like:

elif method == "json_schema":
      js = bind["response_format"]["json_schema"]
      assert js["name"] == "AnswerWithJustification"
      assert js["strict"] is True
      assert js["schema"]["additionalProperties"] is False

can you also add the follow unit tests:

  • pydantic schema produces a valid response_format
  • raw dict schema produces a valid response_format
  • nested Pydantic model to validate $ref dereferencing?

@jennsun
Copy link
Contributor

jennsun commented Mar 2, 2026

see #351

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

missing "name" for ChatDatabricks.with_structured_output(method="json_schema")

2 participants