| layout | default |
|---|---|
| title | Instructor Tutorial - Chapter 6: Multiple Providers |
| nav_order | 6 |
| has_children | false |
| parent | Instructor Tutorial |
Welcome to Chapter 6: Using Multiple Providers. In this part of Instructor Tutorial: Structured LLM Outputs, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
Swap between OpenAI, Anthropic, Google, and local models (Ollama) without changing schemas.
One of Instructor's most powerful features is provider abstraction. You define your Pydantic schemas once, and then use them with any supported LLM backend -- OpenAI, Anthropic, Google Gemini, local Ollama models, or even a universal proxy like LiteLLM. Your extraction logic stays the same; only the client setup and model string change.
This matters more than you might think. In production, you'll often want to route different tasks to different models (fast models for classification, powerful models for complex extraction), fall back to a secondary provider when one goes down, or develop locally without burning API credits. Instructor makes all of this straightforward.
Instructor sits between your application code and the underlying LLM SDK. Each provider has a thin adapter that translates Instructor's unified interface into the provider's native API, and then parses the response back into your Pydantic model.
graph TD
A["Your Application Code<br/>(Pydantic models + prompts)"] --> B["Instructor Core<br/>(Validation, retries, parsing)"]
B --> C["OpenAI Adapter"]
B --> D["Anthropic Adapter"]
B --> E["Google Gemini Adapter"]
B --> F["OpenAI-Compatible Adapter<br/>(Ollama, LiteLLM, vLLM)"]
C --> G["OpenAI API<br/>GPT-4o, GPT-4o-mini"]
D --> H["Anthropic API<br/>Claude 3.5 Sonnet, Claude 3 Opus"]
E --> I["Google AI API<br/>Gemini 1.5 Pro, Gemini 1.5 Flash"]
F --> J["Local / Proxy<br/>Ollama, LiteLLM, vLLM"]
style A fill:#e8f4f8,stroke:#333
style B fill:#fff3cd,stroke:#333
The key insight is that every adapter produces the same output type -- your Pydantic model. This means your downstream code never needs to know which provider generated the data.
OpenAI is the default provider and the one you've been using throughout this tutorial. Let's look at the full configuration options.
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
# Default setup -- reads OPENAI_API_KEY from environment
client = instructor.from_openai(OpenAI())
class ContactInfo(BaseModel):
name: str = Field(..., description="Full name of the person")
email: str = Field(..., description="Email address")
company: str = Field(..., description="Company or organization")
contact = client.responses.create(
model="gpt-4o",
messages=[
{"role": "user", "content": "John Smith works at Acme Corp, email john@acme.io"}
],
response_model=ContactInfo,
)
print(contact)
# name='John Smith' email='john@acme.io' company='Acme Corp'OpenAI offers several models, and picking the right one for your task makes a real difference in both quality and cost:
# GPT-4o -- best quality, higher cost
# Use for: complex extraction, nuanced classification, multi-step reasoning
result = client.responses.create(
model="gpt-4o",
messages=[{"role": "user", "content": complex_document}],
response_model=DetailedAnalysis,
)
# GPT-4o-mini -- fast, cheap, surprisingly capable
# Use for: simple extraction, classification, high-volume tasks
result = client.responses.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": short_text}],
response_model=SimpleLabel,
)from openai import OpenAI
# Custom timeout and retry settings at the SDK level
raw_client = OpenAI(
api_key="sk-...", # or set OPENAI_API_KEY env var
timeout=30.0, # seconds before the request times out
max_retries=2, # SDK-level retries (separate from Instructor retries)
organization="org-...", # if you belong to multiple orgs
)
client = instructor.from_openai(raw_client)
# Instructor-level retries and temperature control
result = client.responses.create(
model="gpt-4o",
messages=[{"role": "user", "content": text}],
response_model=MyModel,
max_retries=3, # Instructor will retry on validation failure
temperature=0.0, # lower = more deterministic output
)- Rate limits: OpenAI enforces per-minute token and request limits. If you're doing batch extraction, add a small delay or use exponential backoff.
- JSON mode vs. function calling: Instructor uses function calling under the hood for OpenAI, which is more reliable than raw JSON mode. You generally don't need to configure this yourself.
- Model deprecation: OpenAI periodically retires model versions. Pin to a specific snapshot (like
gpt-4o-2024-08-06) in production, but be prepared to update.
Anthropic's Claude models are excellent at following instructions and tend to produce very clean structured output. Instructor supports Claude through the from_anthropic adapter.
import instructor
from anthropic import Anthropic
from pydantic import BaseModel, Field
# Reads ANTHROPIC_API_KEY from environment
client = instructor.from_anthropic(Anthropic())
class Summary(BaseModel):
headline: str = Field(..., description="One-line summary")
key_points: list[str] = Field(..., description="3-5 bullet points")
sentiment: str = Field(..., description="positive, negative, or neutral")
summary = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
messages=[
{"role": "user", "content": "Summarize this product review: ..."}
],
response_model=Summary,
)Here's something important about how Instructor works with Anthropic. Unlike OpenAI, which has a dedicated function-calling mechanism, Anthropic uses tool use (also called tool calling) to produce structured output. Instructor translates your Pydantic model into a tool definition, asks Claude to "call" that tool with the appropriate arguments, and then parses the tool call arguments back into your model.
You don't need to configure this manually -- Instructor handles it behind the scenes. But it helps to know this is happening, because:
max_tokensis required: Anthropic's API always requires amax_tokensparameter. Forget it and you'll get an error.- Tool use adds a small overhead: The tool definition is included in the prompt, which uses tokens. Keep your models concise.
- System prompts work differently: With Anthropic, system messages go in a separate
systemparameter, not in themessageslist.
# Using a system prompt with Anthropic
summary = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
system="You are a precise data extraction assistant. Extract exactly what is asked.",
messages=[
{"role": "user", "content": document_text}
],
response_model=Summary,
)# Claude 3.5 Sonnet -- excellent balance of speed and quality
# The go-to choice for most structured extraction tasks
result = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
messages=[{"role": "user", "content": text}],
response_model=MyModel,
)
# Claude 3 Opus -- highest quality, slower, more expensive
# Use for: complex reasoning, ambiguous extraction, critical tasks
result = client.messages.create(
model="claude-3-opus-20240229",
max_tokens=2048,
messages=[{"role": "user", "content": complex_text}],
response_model=DetailedModel,
)- Always set
max_tokens: Unlike OpenAI, this parameter is mandatory. A good default for structured output is 1024. - No
response_formatparameter: Don't try to passresponse_format="json"-- Anthropic doesn't support it. Instructor uses tool use instead. - Different message format: Anthropic doesn't support a
"role": "system"entry inmessages. Use thesystemparameter instead.
Google's Gemini models offer strong performance and competitive pricing. Instructor supports them through the from_gemini adapter.
import instructor
from google import genai
from pydantic import BaseModel, Field
# Set up the Google AI client
# Reads GOOGLE_API_KEY from environment
google_client = genai.Client()
client = instructor.from_gemini(google_client)
class Recipe(BaseModel):
name: str
ingredients: list[str]
prep_time_minutes: int = Field(..., ge=1)
difficulty: str = Field(..., description="easy, medium, or hard")
recipe = client.chat.completions.create(
model="gemini-1.5-pro",
messages=[
{"role": "user", "content": "Give me a recipe for chocolate chip cookies"}
],
response_model=Recipe,
)
print(f"{recipe.name} -- {recipe.prep_time_minutes} min, {recipe.difficulty}")# Gemini 1.5 Pro -- strong reasoning, large context window (up to 1M tokens)
# Great for processing long documents
result = client.chat.completions.create(
model="gemini-1.5-pro",
messages=[{"role": "user", "content": long_document}],
response_model=MyModel,
)
# Gemini 1.5 Flash -- fast and cheap
# Good for high-throughput, simpler tasks
result = client.chat.completions.create(
model="gemini-1.5-flash",
messages=[{"role": "user", "content": short_text}],
response_model=SimpleModel,
)- API key setup: You need a Google AI API key (from Google AI Studio), not a GCP service account. Set it as
GOOGLE_API_KEY. - Context window advantage: Gemini 1.5 Pro supports up to 1 million tokens of context, making it ideal for extracting from very long documents.
- Slightly different behavior: Gemini's structured output can sometimes differ in edge cases from OpenAI or Anthropic. Test your specific schemas with Gemini before deploying.
Ollama lets you run open-source models locally, which is fantastic for development -- no API costs, no rate limits, and no data leaving your machine. Since Ollama exposes an OpenAI-compatible API, Instructor can talk to it through the regular OpenAI adapter.
First, install Ollama and pull a model:
# Install Ollama (macOS)
brew install ollama
# Start the Ollama server
ollama serve
# Pull a model that handles structured output well
ollama pull llama3.1
ollama pull mistralThen point the OpenAI client at your local Ollama instance:
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
# Point the OpenAI client at Ollama's local API
ollama_client = OpenAI(
base_url="http://localhost:11434/v1",
api_key="ollama", # Ollama doesn't need a real key, but the SDK requires one
)
client = instructor.from_openai(ollama_client)
class KeyPoints(BaseModel):
points: list[str] = Field(..., min_length=1, max_length=5)
topic: str
result = client.responses.create(
model="llama3.1",
messages=[
{"role": "user", "content": "Extract key points from: AI is transforming healthcare..."}
],
response_model=KeyPoints,
)Not all open-source models handle structured output equally well. Here are some that work reliably with Instructor:
| Model | Size | Structured Output Quality | Speed | Best For |
|---|---|---|---|---|
llama3.1:8b |
4.7 GB | Good | Fast | Development, simple extraction |
llama3.1:70b |
40 GB | Very good | Slow | Complex extraction (needs lots of RAM) |
mistral |
4.1 GB | Good | Fast | General tasks, multilingual |
mixtral |
26 GB | Very good | Medium | Complex tasks on beefier hardware |
qwen2.5:7b |
4.4 GB | Good | Fast | Code-related extraction |
- Smaller models struggle with complex schemas: If your Pydantic model has deeply nested fields or many optional properties, smaller models (7B-8B) may produce validation errors. Simplify your schema or use a larger model.
- Instructor retries help a lot here: Set
max_retries=3or higher. Local models are more likely to need a second attempt at producing valid JSON. - No streaming partial objects: Some Ollama setups don't support the streaming features that Instructor relies on for partial objects. Test before depending on it.
- Cold start: The first request after pulling a model can be slow as Ollama loads it into memory. Subsequent requests are faster.
LiteLLM is a Python library that provides a unified interface to 100+ LLM providers. When combined with Instructor, it gives you an incredibly flexible setup -- one configuration layer that can route to any provider.
pip install litellm instructorimport instructor
import litellm
from pydantic import BaseModel, Field
# LiteLLM can be used with Instructor via the OpenAI-compatible interface
from openai import OpenAI
client = instructor.from_openai(
OpenAI(
api_key="anything", # LiteLLM reads provider keys from env
base_url="http://localhost:4000" # LiteLLM proxy server
)
)
# Or use LiteLLM directly without the proxy
from litellm import completion
# LiteLLM uses a provider prefix to route to the right backend
# "openai/gpt-4o" -> OpenAI
# "anthropic/claude-sonnet-4-20250514" -> Anthropic
# "gemini/gemini-1.5-pro" -> Google
class Sentiment(BaseModel):
label: str = Field(..., description="positive, negative, or neutral")
confidence: float = Field(..., ge=0.0, le=1.0)
# With the LiteLLM proxy, just switch the model string
result = client.responses.create(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "I love this product!"}],
response_model=Sentiment,
)LiteLLM shines when you need to:
- Switch providers via config: Change the model string in an environment variable, and your entire app switches providers.
- Track usage across providers: LiteLLM's proxy can log all requests, making cost tracking easier.
- Access niche providers: If you need to hit Azure OpenAI, Bedrock, Vertex AI, or any of dozens of other backends, LiteLLM probably supports it.
Here's a quick reference to help you choose the right provider for each task:
| Feature | OpenAI (GPT-4o) | Anthropic (Claude 3.5 Sonnet) | Google (Gemini 1.5 Pro) | Ollama (Llama 3.1 8B) |
|---|---|---|---|---|
| Structured output quality | Excellent | Excellent | Very good | Good (with retries) |
| Pricing tier | Mid-high | Mid-high | Mid | Free (local) |
| Max context window | 128K tokens | 200K tokens | 1M tokens | Varies by model |
| Speed (time to first token) | Fast | Fast | Fast | Depends on hardware |
| Best for | General extraction, broad tool ecosystem | Instruction following, long documents | Very long documents, multimodal | Development, privacy-sensitive |
| Structured output method | Function calling | Tool use | Function calling | JSON mode / function calling |
| Instructor adapter | from_openai() |
from_anthropic() |
from_gemini() |
from_openai() (with base_url) |
In a real application, you'll often want to route different requests to different providers. Maybe enterprise customers get GPT-4o while free-tier users get GPT-4o-mini, or maybe you use Claude for summarization and GPT-4o for data extraction.
import instructor
from openai import OpenAI
from anthropic import Anthropic
from enum import Enum
class TaskType(Enum):
EXTRACTION = "extraction"
SUMMARIZATION = "summarization"
CLASSIFICATION = "classification"
class Tier(Enum):
FREE = "free"
PRO = "pro"
ENTERPRISE = "enterprise"
def get_client_and_model(task: TaskType, tier: Tier):
"""Pick the best provider based on task type and customer tier."""
if tier == Tier.FREE:
# Free tier always uses the cheapest option
return instructor.from_openai(OpenAI()), "gpt-4o-mini"
if task == TaskType.SUMMARIZATION:
# Claude excels at summarization
return instructor.from_anthropic(Anthropic()), "claude-sonnet-4-20250514"
if task == TaskType.EXTRACTION:
# GPT-4o is great at structured extraction
return instructor.from_openai(OpenAI()), "gpt-4o"
# Default fallback
return instructor.from_openai(OpenAI()), "gpt-4o-mini"
# Usage
client, model = get_client_and_model(TaskType.EXTRACTION, Tier.ENTERPRISE)Provider outages happen. Rate limits get hit. Models return garbage. A robust production system needs fallback logic. Here's a pattern that tries multiple providers in sequence, moving to the next one if the current one fails.
import instructor
import logging
from openai import OpenAI
from anthropic import Anthropic
from pydantic import BaseModel, Field
from dataclasses import dataclass
from typing import TypeVar, Type
logger = logging.getLogger(__name__)
T = TypeVar("T", bound=BaseModel)
@dataclass
class ProviderConfig:
"""Configuration for a single provider in the fallback chain."""
name: str
client: object # The Instructor-wrapped client
model: str
max_retries: int = 2
use_anthropic_api: bool = False # True if using Anthropic's messages.create
class FallbackChain:
"""Try multiple providers in sequence until one succeeds."""
def __init__(self, providers: list[ProviderConfig]):
self.providers = providers
def create(
self,
messages: list[dict],
response_model: Type[T],
**kwargs,
) -> T:
last_error = None
for provider in self.providers:
try:
logger.info(f"Trying provider: {provider.name} ({provider.model})")
if provider.use_anthropic_api:
# Anthropic uses messages.create and needs max_tokens
result = provider.client.messages.create(
model=provider.model,
max_tokens=kwargs.get("max_tokens", 1024),
messages=messages,
response_model=response_model,
max_retries=provider.max_retries,
)
else:
# OpenAI-compatible providers use responses.create
result = provider.client.responses.create(
model=provider.model,
messages=messages,
response_model=response_model,
max_retries=provider.max_retries,
)
logger.info(f"Success with provider: {provider.name}")
return result
except Exception as e:
logger.warning(f"Provider {provider.name} failed: {e}")
last_error = e
continue
# All providers failed
raise RuntimeError(
f"All {len(self.providers)} providers failed. Last error: {last_error}"
)
# Set up the fallback chain
chain = FallbackChain([
ProviderConfig(
name="openai-primary",
client=instructor.from_openai(OpenAI()),
model="gpt-4o",
max_retries=2,
),
ProviderConfig(
name="anthropic-backup",
client=instructor.from_anthropic(Anthropic()),
model="claude-sonnet-4-20250514",
max_retries=2,
use_anthropic_api=True,
),
ProviderConfig(
name="openai-fallback",
client=instructor.from_openai(OpenAI()),
model="gpt-4o-mini",
max_retries=3, # more retries for the last resort
),
])
# Use it just like a regular client
class UserProfile(BaseModel):
name: str
age: int = Field(..., ge=0, le=150)
interests: list[str]
profile = chain.create(
messages=[{"role": "user", "content": "John, 32, likes hiking and cooking"}],
response_model=UserProfile,
)
print(profile)
# name='John' age=32 interests=['hiking', 'cooking']This pattern gives you resilience without complicating your application code. The caller just calls chain.create() and doesn't need to know which provider actually handled the request.
When you support multiple providers, you need to test your schemas against each one. Small differences in how models interpret instructions can lead to surprising validation failures. Here's a practical approach using pytest.
import pytest
import instructor
from openai import OpenAI
from anthropic import Anthropic
from pydantic import BaseModel, Field
class MovieReview(BaseModel):
title: str
rating: float = Field(..., ge=1.0, le=10.0)
pros: list[str] = Field(..., min_length=1)
cons: list[str] = Field(..., min_length=1)
# Define providers as pytest fixtures
@pytest.fixture
def openai_client():
return instructor.from_openai(OpenAI()), "gpt-4o-mini", False
@pytest.fixture
def anthropic_client():
return instructor.from_anthropic(Anthropic()), "claude-sonnet-4-20250514", True
# Parameterize tests across providers
@pytest.fixture(params=["openai_client", "anthropic_client"])
def provider(request):
return request.getfixturevalue(request.param)
TEST_INPUT = """
Review: The Matrix is a groundbreaking sci-fi film. Amazing visual effects
and deep philosophical themes. However, some dialogue feels stilted and
the pacing drags in the second act. 8.5/10.
"""
def test_movie_review_extraction(provider):
"""Verify that every provider can extract a valid MovieReview."""
client, model, is_anthropic = provider
if is_anthropic:
result = client.messages.create(
model=model,
max_tokens=1024,
messages=[{"role": "user", "content": f"Extract a review: {TEST_INPUT}"}],
response_model=MovieReview,
)
else:
result = client.responses.create(
model=model,
messages=[{"role": "user", "content": f"Extract a review: {TEST_INPUT}"}],
response_model=MovieReview,
)
# Structural assertions -- these should pass regardless of provider
assert result.title, "Title should not be empty"
assert 1.0 <= result.rating <= 10.0, "Rating should be in range"
assert len(result.pros) >= 1, "Should have at least one pro"
assert len(result.cons) >= 1, "Should have at least one con"
def test_rating_accuracy(provider):
"""Check that the extracted rating is close to the stated value."""
client, model, is_anthropic = provider
if is_anthropic:
result = client.messages.create(
model=model,
max_tokens=1024,
messages=[{"role": "user", "content": f"Extract a review: {TEST_INPUT}"}],
response_model=MovieReview,
)
else:
result = client.responses.create(
model=model,
messages=[{"role": "user", "content": f"Extract a review: {TEST_INPUT}"}],
response_model=MovieReview,
)
# The review explicitly says "8.5/10", so the model should pick that up
assert 7.5 <= result.rating <= 9.5, f"Expected ~8.5, got {result.rating}"- Test structure, not exact values: Different models will phrase things differently. Assert on types, ranges, and list lengths rather than exact string matches.
- Use generous tolerances: A rating of 8.5 might be extracted as 8.0 or 9.0 by different models. Test that it's in a reasonable range.
- Mark slow tests: Provider API calls are slow. Use
@pytest.mark.slowor@pytest.mark.integrationso you can skip them during fast local iteration. - Mock for unit tests: For day-to-day development, mock the Instructor client and test your business logic separately from the LLM calls.
Switching providers doesn't have to be painful. Since Instructor keeps your schemas identical across providers, most of the work is in updating the client setup and adjusting for API differences.
The most common migration. Here's a checklist:
# BEFORE: OpenAI
import instructor
from openai import OpenAI
client = instructor.from_openai(OpenAI())
result = client.responses.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful extractor."},
{"role": "user", "content": text},
],
response_model=MyModel,
)
# AFTER: Anthropic
import instructor
from anthropic import Anthropic
client = instructor.from_anthropic(Anthropic())
result = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024, # REQUIRED for Anthropic
system="You are a helpful extractor.", # system prompt moves here
messages=[
{"role": "user", "content": text}, # no system message in the list
],
response_model=MyModel,
)What changes:
from_openai()becomesfrom_anthropic()responses.create()becomesmessages.create()- Add
max_tokens(required) - Move system messages from the
messageslist to thesystemparameter - Update the model string
What stays the same:
- Your Pydantic
response_model - The
messagesformat for user/assistant turns - Instructor's
max_retriesparameter - All your downstream code that consumes the model
This is a great pattern for development: use Ollama locally and a cloud provider in production.
import os
import instructor
from openai import OpenAI
def get_client():
"""Return an Instructor client based on the environment."""
if os.getenv("ENV", "development") == "production":
# Production: use OpenAI
return instructor.from_openai(OpenAI()), "gpt-4o"
else:
# Development: use local Ollama
ollama = OpenAI(
base_url="http://localhost:11434/v1",
api_key="ollama",
)
return instructor.from_openai(ollama), "llama3.1"
client, model = get_client()
# This code works identically in both environments
result = client.responses.create(
model=model,
messages=[{"role": "user", "content": text}],
response_model=MyModel,
max_retries=3, # extra retries help with local models
)For maximum flexibility, drive your provider choice entirely from environment variables:
import os
import instructor
from pydantic import BaseModel
def create_client():
"""Create an Instructor client from environment configuration."""
provider = os.getenv("LLM_PROVIDER", "openai")
model = os.getenv("LLM_MODEL", "gpt-4o-mini")
if provider == "openai":
from openai import OpenAI
return instructor.from_openai(OpenAI()), model
elif provider == "anthropic":
from anthropic import Anthropic
return instructor.from_anthropic(Anthropic()), model
elif provider == "ollama":
from openai import OpenAI
ollama = OpenAI(
base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1"),
api_key="ollama",
)
return instructor.from_openai(ollama), model
else:
raise ValueError(f"Unknown provider: {provider}")
# In .env for development:
# LLM_PROVIDER=ollama
# LLM_MODEL=llama3.1
#
# In .env for production:
# LLM_PROVIDER=openai
# LLM_MODEL=gpt-4o
# OPENAI_API_KEY=sk-...This way, switching providers is a config change -- no code changes, no redeployment of application logic.
- Match model strengths to tasks: Use fast, cheap models (GPT-4o-mini, Gemini Flash) for classification and routing. Save powerful models (GPT-4o, Claude Opus) for complex extraction and reasoning.
- Normalize errors across providers: Each provider has its own exception hierarchy. Wrap provider-specific exceptions in your own error types so your application code doesn't need to handle them individually.
- Store provider config externally: Keep provider name, model string, and API keys in environment variables or a config file. This lets you switch without code changes.
- Test with the cheapest model first: When developing new schemas, start with GPT-4o-mini or a local Ollama model. Only move to more expensive models once your schema is working.
- Monitor costs per provider: If you're using multiple providers in production, track cost per request. LiteLLM's proxy or simple logging can help with this.
- Pin model versions in production: Use snapshot identifiers (like
gpt-4o-2024-08-06) rather than aliases (likegpt-4o) in production to avoid unexpected behavior changes.
Previous: Chapter 5: Streaming | Next: Chapter 7: Advanced Patterns
Most teams struggle here because the hard part is not writing more code, but deciding clear boundaries for model, client, messages so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without clear rollback or observability strategy
After working through this chapter, you should be able to reason about Chapter 6: Using Multiple Providers as an operating subsystem inside Instructor Tutorial: Structured LLM Outputs, with explicit contracts for inputs, state transitions, and outputs.
Use the implementation notes around OpenAI, instructor, provider as your checklist when adapting these patterns to your own repository.
Under the hood, Chapter 6: Using Multiple Providers usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for
model. - Input normalization: shape incoming data so
clientreceives stable contracts. - Core execution: run the main logic branch and propagate intermediate state through
messages. - Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit logs/metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
Use the following upstream sources to verify implementation details while reading this chapter:
- View Repo
Why it matters: authoritative reference on
View Repo(github.com). - Awesome Code Docs
Why it matters: authoritative reference on
Awesome Code Docs(github.com).
Suggested trace strategy:
- search upstream code for
modelandclientto map concrete implementation paths - compare docs claims against actual runtime/config code before reusing patterns in production