
Conversation

@devin-ai-integration bot commented Nov 13, 2025

Fix Azure structured outputs to use json_object instead of json_schema

Summary

Fixes issue #3906, where Azure models in CrewAI 1.4.1 fail with an "Unsupported response_format {'type': 'json_schema', 'json_schema': {...}}" error when using structured outputs.

Root Cause: The Azure AI Inference SDK doesn't support the json_schema response_format type that was being sent.

Solution: Changed the Azure completion path to send {"type": "json_object"} instead; this prompts the model to return JSON, which is then validated client-side by the existing Pydantic validation logic in _handle_completion.

Changes:

  • Modified AzureCompletion._prepare_completion_params() to use the json_object format instead of json_schema (see the sketch after this list)
  • Added 3 comprehensive tests covering structured output scenarios with Azure models
  • Existing client-side Pydantic validation (already in place) handles the JSON parsing and validation
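
The diff itself is small. A minimal sketch of the change, assuming the method shape in AzureCompletion (the signature and parameter names here are illustrative, not the exact source):

from typing import Any

def prepare_completion_params(
    messages: list[dict[str, Any]],
    response_model: type | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Illustrative sketch of AzureCompletion._prepare_completion_params after the fix."""
    params: dict[str, Any] = {"messages": messages, **kwargs}
    if response_model is not None:
        # Before the fix: {"type": "json_schema", "json_schema": {...}}, which the
        # Azure AI Inference SDK rejects. Now: request generic JSON and rely on the
        # client-side Pydantic validation in _handle_completion.
        params["response_format"] = {"type": "json_object"}
    return params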

Review & Testing Checklist for Human

Risk Level: Yellow - Core functionality change with limited local testing

  • Verify the Azure AI Inference SDK accepts {"type": "json_object"} - I couldn't test this locally due to environment issues, so the fix assumes the SDK supports this format. Test with a real Azure deployment to confirm it doesn't error (see the smoke-test sketch after this list).
  • Test end-to-end structured output flow - Create an agent with Azure LLM and a Pydantic response_model, verify it returns properly validated structured data (not just raw JSON string).
  • Check streaming mode - _prepare_completion_params is shared by the streaming path (see the Implementation Notes below), but streaming responses are not validated against response_model. Verify structured outputs work in streaming mode too (or confirm they're not supported in streaming).
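
For the first checklist item, a condensed variant of the Test Plan below can serve as a smoke test; it exercises the structured-output path directly, and before the fix the kickoff call raised the unsupported-format error (endpoint and key are placeholders):

from crewai import LLM, Agent
from pydantic import BaseModel

class Ping(BaseModel):
    ok: bool

llm = LLM(model="azure/gpt-4", api_key="<your-key>", endpoint="<your-endpoint>")
agent = Agent(role="Tester", goal="Return JSON", backstory="Smoke test", llm=llm)

# Before the fix this raised "Unsupported response_format {'type': 'json_schema', ...}".
result = agent.kickoff("Reply that everything is ok", response_format=Ping)
print(result)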

Test Plan

from crewai import LLM, Agent
from pydantic import BaseModel

class Greeting(BaseModel):
    name: str
    message: str

# Test with real Azure credentials
llm = LLM(
    model="azure/gpt-4",
    api_key="<your-key>",
    endpoint="<your-endpoint>"
)

agent = Agent(
    role="Greeter",
    goal="Greet users",
    backstory="Friendly assistant",
    llm=llm
)

# This should work without "Unsupported response_format" error
result = agent.kickoff("My name is Alice", response_format=Greeting)
print(result)  # Should be validated Greeting object

Notes

  • Session: https://app.devin.ai/sessions/6c2b7a7b2f6e4f2e808f8881f2512168
  • Requested by: João (joao@crewai.com)
  • The uv.lock file in the repo appears corrupted (pre-existing issue), preventing local test execution. Relying on CI for validation.
  • Client-side Pydantic validation code already exists in _handle_completion (lines 393-411), so the fix leverages existing infrastructure (see the validation sketch after these notes).
  • Known limitation: Streaming mode (stream=True) does not currently support structured outputs - this is a pre-existing limitation, not introduced by this PR.
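
For reference, the client-side validation the fix leans on amounts to parsing the model's JSON reply against the Pydantic response model. A minimal sketch of that pattern (the function name is hypothetical; the real logic lives in _handle_completion):

from pydantic import BaseModel

def validate_structured_output(raw_json: str, response_model: type[BaseModel]) -> BaseModel:
    # model_validate_json parses and validates in one step and raises
    # pydantic.ValidationError if the reply doesn't match the schema.
    return response_model.model_validate_json(raw_json)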

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@devin-ai-integration

CI Status Summary

The PR is ready for review. Here's the CI status:

✅ Passing:

  • All linting checks (ruff)
  • All type-checking (mypy on Python 3.10, 3.11, 3.12, 3.13)
  • CodeQL security analysis
  • Tests for Python 3.11 (all passed)
  • Tests for Python 3.12 (all passed)

⚠️ Cancelled (CI infrastructure timeouts, not code issues):

  • Tests for Python 3.10: Cancelled during dependency installation (NVIDIA CUDA packages download timeout)
  • Tests for Python 3.13: All 207 tests passed, then cancelled during post-test cleanup due to job timeout

The Python 3.13 logs show:

====================== 207 passed, 12 warnings in 34.75s =======================
##[error]The operation was canceled.

Request: Could a maintainer please re-run the cancelled jobs? The cancellations are due to CI infrastructure timeouts (dependency downloads), not code issues.

Implementation Notes

Streaming Mode: The _prepare_completion_params method is shared by both streaming and non-streaming code paths, so the response_format change applies to both. However, _handle_streaming_completion does not currently validate against response_model - this is a pre-existing limitation, not introduced by this PR. Structured outputs in streaming mode may need a follow-up PR if that functionality is desired.
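
If streaming support is picked up in a follow-up, one straightforward approach is to buffer the streamed chunks and validate the concatenated text once the stream ends. A sketch under that assumption (the helper name is hypothetical):

from collections.abc import Iterable

from pydantic import BaseModel

def validate_streamed_output(chunks: Iterable[str], response_model: type[BaseModel]) -> BaseModel:
    # A JSON document can only be validated once it is complete, so collect
    # every chunk before handing the full text to Pydantic.
    return response_model.model_validate_json("".join(chunks))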

Model Detection: The is_openai_model flag checks for model prefixes: gpt-, o1-, text-. If Azure adds new OpenAI-family model prefixes in the future (e.g., o3-), the list may need to be expanded.
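
For illustration, that prefix check is equivalent to the following (constant and function names are illustrative, not the exact source):

OPENAI_MODEL_PREFIXES = ("gpt-", "o1-", "text-")

def is_openai_model(model_name: str) -> bool:
    # str.startswith accepts a tuple, so supporting a new family (e.g. "o3-")
    # only means extending OPENAI_MODEL_PREFIXES.
    return model_name.lower().startswith(OPENAI_MODEL_PREFIXES)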
