feat: Native Anthropic adapter with shared HTTP client infrastructure#426

Open
nabinchha wants to merge 94 commits into main from
nm/overhaul-model-facade-guts-pr4

Conversation

@nabinchha
Contributor

@nabinchha nabinchha commented Mar 17, 2026

Summary

Fourth PR in the model facade overhaul series (plan, architecture notes). Adds the native Anthropic Messages API adapter and routes provider_type="anthropic" through the native client path. Extracts shared native httpx transport, lifecycle, timeout, and error-wrapping logic into HttpModelClient and http_helpers.py so AnthropicClient and OpenAICompatibleClient share the same HTTP machinery while keeping provider-specific translation separate.

Previous PRs:

Changes

Added

  • anthropic.py — native Anthropic Messages API adapter with /v1 endpoint handling
  • anthropic_translation.py — request/response translation (system message lifting, tool turns, image blocks, empty text block filtering)
  • http_helpers.py — shared HTTP transport utilities
  • http_model_client.py — abstract base for native HTTP adapters with lazy client init and close semantics
  • model-facade-overhaul-pr-4-architecture-notes.md — architecture notes for this PR
  • ProviderErrorKind.QUOTA_EXCEEDED — new error kind for credit/billing failures (e.g. Anthropic "credit balance is too low")
  • ModelQuotaExceededError — user-facing error for quota/billing issues
  • _attach_provider_message helper — surfaces raw provider error messages in formatted output for HTTP 400 errors
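
For readers unfamiliar with the "system message lifting" mentioned for anthropic_translation.py, here is a minimal sketch of the idea. It assumes only that the Anthropic Messages API takes the system prompt as a top-level `system` field instead of a `role: "system"` message. The helper names `lift_system_messages` and `build_payload` are hypothetical and are not the functions in this PR; the sketch handles plain-string system content only.

```python
from __future__ import annotations

from typing import Any


def lift_system_messages(
    messages: list[dict[str, Any]],
) -> tuple[str | None, list[dict[str, Any]]]:
    """Split OpenAI-style messages into (system prompt, remaining turns).

    Hypothetical helper for illustration only.
    """
    system_parts: list[str] = []
    turns: list[dict[str, Any]] = []
    for message in messages:
        if message.get("role") == "system":
            content = message.get("content")
            # This sketch keeps only non-empty string system content.
            if isinstance(content, str) and content:
                system_parts.append(content)
        else:
            turns.append(message)
    system = "\n\n".join(system_parts) if system_parts else None
    return system, turns


def build_payload(
    model: str, messages: list[dict[str, Any]], max_tokens: int = 1024
) -> dict[str, Any]:
    """Build a Messages-style payload with the system prompt lifted out."""
    system, turns = lift_system_messages(messages)
    payload: dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": turns,
    }
    if system is not None:
        payload["system"] = system
    return payload
```

Calling `build_payload` with one system message and one user message yields a payload whose `messages` list contains only the user turn, with the system text under `system`.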

Changed

  • openai_compatible.py — refactored to inherit from HttpModelClient
  • factory.py — routes provider_type="anthropic" to native adapter, preserves bridge fallback
  • errors.py (client) — extended _looks_like_unsupported_params_error for mutually exclusive params, added _looks_like_quota_exceeded_error
  • errors.py (model): FormattedLLMErrorMessage now includes optional provider_message (shown before Cause for 400s); _raise_from_provider_error handles QUOTA_EXCEEDED and attaches provider messages selectively
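
As a rough illustration of the message-based classification described for errors.py (client), the sketch below maps an HTTP 400 body to an error kind by substring. The enum, the marker tuples, and `classify_http_400` are hypothetical stand-ins; the PR's `_looks_like_quota_exceeded_error` and `_looks_like_unsupported_params_error` may match differently.

```python
from enum import Enum


class ErrorKind(Enum):
    """Trimmed, hypothetical stand-in for ProviderErrorKind."""

    QUOTA_EXCEEDED = "quota_exceeded"
    UNSUPPORTED_PARAMS = "unsupported_params"
    BAD_REQUEST = "bad_request"


# Assumed marker substrings, taken from the example messages quoted in this PR.
_QUOTA_MARKERS = ("credit balance is too low",)
_EXCLUSIVE_PARAM_MARKERS = ("cannot both be specified",)


def classify_http_400(message: str) -> ErrorKind:
    """Pick an error kind for an HTTP 400 response based on its message."""
    text = message.lower()
    if any(marker in text for marker in _QUOTA_MARKERS):
        return ErrorKind.QUOTA_EXCEEDED
    if any(marker in text for marker in _EXCLUSIVE_PARAM_MARKERS):
        return ErrorKind.UNSUPPORTED_PARAMS
    # Anything unrecognized stays a generic bad request.
    return ErrorKind.BAD_REQUEST
```

Checking the quota markers first keeps a billing failure from being reported as a parameter problem if a message happens to match both.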

Fixed

  • Anthropic tool calling flow: empty text blocks ({"type": "text", "text": ""}) in assistant messages with tool calls are now filtered out — Anthropic API rejects them (f5b0b39e)
  • Anthropic /v1 endpoint handling: _get_messages_route() avoids /v1/v1/messages duplication when endpoint already includes /v1
  • Error classification: Anthropic "credit balance is too low" (HTTP 400) now maps to ModelQuotaExceededError instead of generic ModelBadRequestError; "temperature and top_p cannot both be specified" maps to ModelUnsupportedParamsError
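
The first two fixes above are small enough to sketch. The code below is an assumption-laden illustration, not the PR's code: `messages_route` and `drop_empty_text_blocks` are hypothetical names, and the real `_get_messages_route()` may return a relative route rather than a full URL.

```python
from typing import Any


def messages_route(endpoint: str) -> str:
    """Return the Messages URL without duplicating a trailing /v1."""
    base = endpoint.rstrip("/")
    if base.endswith("/v1"):
        # Endpoint already carries the version segment.
        return f"{base}/messages"
    return f"{base}/v1/messages"


def drop_empty_text_blocks(blocks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop text blocks whose text is empty or missing; keep all other blocks."""
    return [
        block
        for block in blocks
        if not (block.get("type") == "text" and not block.get("text"))
    ]
```

With these helpers, an endpoint of `https://api.anthropic.com/v1/` and one of `https://api.anthropic.com` resolve to the same URL, and an assistant turn made of an empty text block plus a `tool_use` block is reduced to the `tool_use` block alone.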

Attention Areas

Reviewers: Please pay special attention to the following:

  • anthropic_translation.py — request/response translation, especially system message lifting, tool turns, image block conversion, and empty text block filtering
  • http_model_client.py — shared lazy client initialization and close semantics across native adapters
  • factory.pyanthropic routing and bridge fallback behavior
  • errors.py (model)_attach_provider_message helper and selective provider message surfacing logic

Test plan

  • uv run ruff check on all changed source files
  • uv run pytest on all new and updated test files
  • E2E smoke test against Anthropic endpoint (text generation, structured output, tool calling, image context)

Description updated with AI

nabinchha and others added 30 commits February 19, 2026 15:50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…provements

- Wrap all LiteLLM router calls in try/except to normalize raw exceptions
  into canonical ProviderError at the bridge boundary (blocking review item)
- Extract reusable response-parsing helpers into clients/parsing.py for
  shared use across future native adapters
- Add async image parsing path using httpx.AsyncClient to avoid blocking
  the event loop in agenerate_image
- Add retry_after field to ProviderError for future retry engine support
- Fix _to_int_or_none to parse numeric strings from providers
- Create test conftest.py with shared mock_router/bridge_client fixtures
- Parametrize duplicate image generation and error mapping tests
- Add tests for exception wrapping across all bridge methods
…larity

- Parse RFC 7231 HTTP-date strings in Retry-After header (used by
  Azure and Anthropic during rate-limiting) in addition to numeric
  delay-seconds
- Clarify collect_non_none_optional_fields docstring explaining why
  f.default is None is the correct check for optional field forwarding
- Add tests for HTTP-date and garbage Retry-After values
- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
- Handle list-format detail arrays in _extract_structured_message for
  FastAPI/Pydantic validation errors
- Document scope boundary for vision content in collect_raw_image_candidates
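
The Retry-After handling described in the commit message above can be sketched with the standard library alone. This is an illustration under assumptions, not the commit's code: `parse_retry_after` and its `now` parameter are hypothetical, and the sketch returns `None` for values it cannot parse.

```python
from __future__ import annotations

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str | None, now: datetime | None = None) -> float | None:
    """Convert a Retry-After header value into a delay in seconds."""
    if value is None:
        return None
    text = value.strip()
    if not text:
        return None
    # Form 1: delay-seconds, e.g. "120".
    if text.isdigit():
        return float(text)
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Unparseable input: report "no hint" instead of raising.
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry immediately.
    return max(0.0, (when - current).total_seconds())
```

Passing `now` explicitly keeps the function deterministic in tests; production callers would normally omit it.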
@nabinchha nabinchha requested a review from a team as a code owner March 17, 2026 19:04
@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds a native Anthropic Messages API adapter (AnthropicClient) as the fourth step in the model-facade overhaul, alongside a shared HttpModelClient base class that consolidates HTTP transport, lifecycle, pooling, and error-wrapping logic previously duplicated between OpenAICompatibleClient and the new adapter. The translation layer (anthropic_translation.py) correctly handles system message lifting, OpenAI→Anthropic tool schema and turn conversion, multimodal image block translation, and parallel tool result merging. factory.py is updated to route provider_type="anthropic" to the native adapter while preserving the LiteLLM bridge fallback. New QUOTA_EXCEEDED error classification, ModelQuotaExceededError, and FormattedLLMErrorMessage.provider_message round out the error-handling improvements.

Key findings:

  • The ProviderErrorKind.QUOTA_EXCEEDED entry added to _KIND_MAP in clients/errors.py is unreachable: there is no paired _MESSAGES entry and the explicit if kind == ProviderErrorKind.QUOTA_EXCEEDED: branch in _raise_from_provider_error fires first, making the map entry dead code.
  • The QUOTA_EXCEEDED raise path in models/errors.py does not call _attach_provider_message, silently dropping the provider's diagnostic text (e.g., Anthropic's "Your credit balance is too low") for 400-level quota errors — unlike BAD_REQUEST and the API_ERROR fallback which both forward that information.
  • Translation logic, routing, lifecycle management, and test coverage are high quality; the translate_content_blocks malformed-image-url and system-image-block issues raised in previous review threads are correctly handled in the committed code.

Confidence Score: 4/5

  • Safe to merge; the two flagged issues are minor — one is dead code, the other is a missing diagnostic detail for quota errors — neither affects correctness of the happy path or critical error paths.
  • The core translation, routing, and lifecycle logic is well-implemented and comprehensively tested (732-line test_anthropic.py, translation unit tests, lifecycle parity tests). The two issues identified are in the new error-classification layer: a dead _KIND_MAP entry and a missing _attach_provider_message call for quota errors. Both are low-risk and do not affect runtime correctness or the Anthropic API call path itself. Score is 4 rather than 5 because of those two small gaps in the error-handling design.
  • packages/data-designer-engine/src/data_designer/engine/models/clients/errors.py and packages/data-designer-engine/src/data_designer/engine/models/errors.py warrant a second look for the QUOTA_EXCEEDED handler inconsistencies.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py | New base class extracting shared lifecycle, lazy-init, pooling, and HTTP POST logic. The close()/aclose() asymmetry (transport explicitly closed only by sync close()) was flagged in a prior review thread and addressed by the author; the shared-transport teardown assumptions depend on httpx internals but are tested. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic.py | Clean Anthropic adapter. Header building (x-api-key, anthropic-version), capability flags, and route detection all look correct. _build_payload_or_raise wraps ValueError from translation as BAD_REQUEST. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic_translation.py | Well-structured translation layer. System message lifting (string vs block list), tool schema and turn translation, image_url → Anthropic image block conversion, and parallel tool result merging are all handled correctly and thoroughly tested. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/errors.py | Adds QUOTA_EXCEEDED detection and ModelQuotaExceededError mapping. Contains two issues: the QUOTA_EXCEEDED entry in _KIND_MAP is unreachable dead code (no paired _MESSAGES entry and an explicit branch fires first), and the QUOTA_EXCEEDED raise path skips _attach_provider_message, silently dropping the provider's diagnostic text for 400-level quota errors. |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Adds ModelQuotaExceededError, provider_message field on FormattedLLMErrorMessage, and _attach_provider_message. Provider message attachment is intentionally restricted to status 400; formatting changes look correct. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py | anthropic routing added correctly; bridge env-var override takes priority; case-insensitive matching handled by ModelProvider normalization; model_id removal from OpenAICompatibleClient call is safe since that field was not used in any methods. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Factory as factory.py
    participant AC as AnthropicClient
    participant HMC as HttpModelClient
    participant AT as anthropic_translation.py
    participant HH as http_helpers.py
    participant httpx

    Caller->>Factory: create_model_client(config, resolver, registry)
    Factory->>Factory: resolve provider_type ("anthropic")
    Factory-->>AC: AnthropicClient(provider_name, endpoint, api_key, ...)

    Caller->>AC: completion(ChatCompletionRequest)
    AC->>AT: build_anthropic_payload(request)
    AT->>AT: translate_request_messages() — lift system, merge tool results
    AT->>AT: translate_tool_definition() — OpenAI→Anthropic schema
    AT->>AT: translate_content_blocks() — image_url→Anthropic image
    AT-->>AC: payload dict

    AC->>AC: TransportKwargs.from_request() — exclude OpenAI-only fields
    AC->>HMC: _post_sync(route, payload, headers, model, timeout)
    HMC->>AC: _build_headers() — x-api-key, anthropic-version
    HMC->>HH: resolve_timeout(default, per_request)
    HMC->>httpx: client.post(url, json=payload, headers=headers)
    httpx-->>HMC: Response

    alt status >= 400
        HMC->>HMC: map_http_error_to_provider_error → ProviderError
        HMC-->>Caller: raises ProviderError
    else status 2xx
        HMC->>HH: parse_json_body(response)
        HH-->>HMC: dict
        HMC-->>AC: response_json
    end

    AC->>AT: parse_anthropic_response(response_json)
    AT->>AT: collect text/thinking/tool_use blocks
    AT-->>AC: ChatCompletionResponse
    AC-->>Caller: ChatCompletionResponse
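
The last leg of the diagram (collecting text, thinking, and tool_use blocks) can be sketched as follows. This is an illustration only: `collect_blocks` and the returned dict are hypothetical stand-ins for `parse_anthropic_response` and `ChatCompletionResponse`, whose real fields are not shown on this page. The block shapes follow the Anthropic Messages response format.

```python
from __future__ import annotations

import json
from typing import Any


def collect_blocks(response_json: dict[str, Any]) -> dict[str, Any]:
    """Fold Anthropic content blocks into one OpenAI-style result dict."""
    text_parts: list[str] = []
    thinking_parts: list[str] = []
    tool_calls: list[dict[str, Any]] = []
    for block in response_json.get("content", []):
        kind = block.get("type")
        if kind == "text":
            text_parts.append(block.get("text", ""))
        elif kind == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif kind == "tool_use":
            tool_calls.append(
                {
                    "id": block.get("id"),
                    "name": block.get("name"),
                    # OpenAI-style tool calls carry arguments as a JSON string.
                    "arguments": json.dumps(block.get("input", {})),
                }
            )
        # Unknown block types are ignored in this sketch.
    return {
        "content": "".join(text_parts),
        "reasoning": "".join(thinking_parts) or None,
        "tool_calls": tool_calls,
    }
```

Ignoring unknown block types is a design choice of the sketch; a stricter parser could raise instead so that new block types are noticed early.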
Prompt To Fix All With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/models/clients/errors.py
Line: 75

Comment:
**`QUOTA_EXCEEDED` entry in `_KIND_MAP` is unreachable dead code**

In `_raise_from_provider_error`, `ProviderErrorKind.QUOTA_EXCEEDED` is registered in `_KIND_MAP`, but the generic dispatch path that consumes `_KIND_MAP` (`if kind in _KIND_MAP and kind in _MESSAGES:`) will never fire for `QUOTA_EXCEEDED` because:

1. There is no `_MESSAGES[ProviderErrorKind.QUOTA_EXCEEDED]` entry, so `kind in _MESSAGES` is `False`.
2. Even if one were added, the explicit `if kind == ProviderErrorKind.QUOTA_EXCEEDED:` branch above it already short-circuits.

The `_KIND_MAP` entry for `QUOTA_EXCEEDED` is therefore never used. This could mislead a future developer into thinking the generic `_MESSAGES`/`_KIND_MAP` table drives `QUOTA_EXCEEDED` behavior, or cause confusion when they add a `_MESSAGES` entry expecting it to take effect.

Consider removing the `QUOTA_EXCEEDED` entry from `_KIND_MAP` since the dedicated branch fully owns that case:

```suggestion
    _KIND_MAP: dict[ProviderErrorKind, type[DataDesignerError]] = {
        ProviderErrorKind.RATE_LIMIT: ModelRateLimitError,
        ProviderErrorKind.TIMEOUT: ModelTimeoutError,
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/models/errors.py
Line: 1100-1109

Comment:
**`QUOTA_EXCEEDED` handler silently drops provider diagnostic message**

Every other error path that raises a `DataDesignerError` now wraps the formatted message through `_attach_provider_message`, which forwards the provider's raw diagnostic text when `status_code == 400`. Anthropic's quota errors arrive as HTTP 400s (e.g., `"Your credit balance is too low to access the Anthropic API."`), so the provider message would be attached — but `QUOTA_EXCEEDED` bypasses this entirely and raises with only the hardcoded generic string.

A user debugging a failed generation would see:

```
Cause: Model provider 'anthropic-prod' reported insufficient credits…
```

…but not the provider's own message (which might include the exact balance or a billing URL). The `BAD_REQUEST` and `API_ERROR` fallback paths show the provider message for the same 400 status.

The fix is to call `_attach_provider_message` here, mirroring the `BAD_REQUEST` branch:

```python
    if kind == ProviderErrorKind.QUOTA_EXCEEDED:
        raise ModelQuotaExceededError(
            _attach_provider_message(
                FormattedLLMErrorMessage(
                    cause=(
                        f"Model provider {model_provider_name!r} reported insufficient credits or quota for model "
                        f"{model_name!r} while {purpose}."
                    ),
                    solution=f"Add credits or increase quota/billing for model provider {model_provider_name!r} and try again.",
                ),
                exception,
            )
        ) from None
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Merge branch 'main' ..."

- Release _aclient in close() to prevent async connection pool leak
  when both sync and async clients were initialized
- Drop malformed image_url blocks (missing image_url key) instead of
  forwarding them unchanged to the Anthropic API
- Preserve image blocks in system messages by returning Anthropic
  block-list format when non-text content is present
- Rename extract_system_text to extract_system_content and add
  merge_system_parts helper for mixed string/block system parts

Made-with: Cursor
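
To make the "drop malformed image_url blocks" behavior in the commit above concrete, here is a minimal sketch of converting OpenAI-style `image_url` blocks into Anthropic image blocks. The function names are hypothetical and the real `translate_content_blocks` may differ; the sketch assumes the two image source shapes of the Anthropic Messages API (`base64` and `url`).

```python
from __future__ import annotations

from typing import Any


def translate_image_block(block: dict[str, Any]) -> dict[str, Any] | None:
    """Convert one image_url block; return None when it is malformed."""
    image_url = block.get("image_url")
    url = image_url.get("url") if isinstance(image_url, dict) else image_url
    if not isinstance(url, str) or not url:
        return None  # missing or unusable image_url: drop the block
    if url.startswith("data:"):
        header, sep, data = url.partition(",")
        if not sep or not header.endswith(";base64"):
            return None  # only base64 data URIs are handled here
        media_type = header[len("data:") : -len(";base64")]
        return {
            "type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": data},
        }
    return {"type": "image", "source": {"type": "url", "url": url}}


def translate_blocks(blocks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Translate image_url blocks, pass other blocks through, drop malformed ones."""
    translated: list[dict[str, Any]] = []
    for block in blocks:
        if block.get("type") != "image_url":
            translated.append(block)
            continue
        converted = translate_image_block(block)
        if converted is not None:
            translated.append(converted)
    return translated
```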
@andreatgretel
Contributor

(AR) Warning: AnthropicClient not exported from adapters __init__.py

packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/__init__.py

What: __all__ exports OpenAICompatibleClient but omits AnthropicClient, creating an inconsistent public surface.

Why: Both are peer adapters in the factory routing table. The asymmetry suggests AnthropicClient is secondary, which does not match the design.

Suggestion:

Add AnthropicClient to __all__ alongside OpenAICompatibleClient.

@andreatgretel
Contributor

(AR) Warning: Missing litellm_bridge env override test for anthropic provider

packages/data-designer-engine/tests/engine/models/clients/test_factory.py

What: test_bridge_env_override_forces_bridge_for_openai_provider exists but there is no equivalent test for the anthropic provider type.

Why: The env-var bypass is part of the routing contract. A refactor moving the check below the anthropic branch would go undetected.

Suggestion:

Add a parallel test verifying DATA_DESIGNER_MODEL_BACKEND=litellm_bridge overrides the anthropic provider to bridge.
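
A rough sketch of what such a test pins down is shown below. It does not use the real factory: `select_backend` is a hypothetical, simplified model of the routing order, and the backend labels it returns are invented for illustration. Only the environment variable name and value come from the suggestion above.

```python
from collections.abc import Mapping

# Variable name and value as quoted in the review suggestion.
BACKEND_ENV_VAR = "DATA_DESIGNER_MODEL_BACKEND"


def select_backend(provider_type: str, env: Mapping[str, str]) -> str:
    """Hypothetical routing model: the env override is checked first."""
    if env.get(BACKEND_ENV_VAR) == "litellm_bridge":
        return "litellm_bridge"
    normalized = provider_type.strip().lower()
    if normalized == "anthropic":
        return "anthropic_native"
    if normalized == "openai":
        return "openai_compatible"
    # Unknown provider types fall back to the bridge in this model.
    return "litellm_bridge"


def test_bridge_env_override_forces_bridge_for_anthropic_provider() -> None:
    env = {BACKEND_ENV_VAR: "litellm_bridge"}
    # With the override set, anthropic must not reach the native adapter.
    assert select_backend("anthropic", env) == "litellm_bridge"
    # Without it, anthropic routes natively regardless of casing.
    assert select_backend("Anthropic", {}) == "anthropic_native"
```

If a refactor moved the override check below the anthropic branch, the first assertion would fail, which is the regression the suggestion is guarding against.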

@andreatgretel
Contributor

(AR) Warning: Missing ConnectionError and non-JSON response tests for AnthropicClient

packages/data-designer-engine/tests/engine/models/clients/test_anthropic.py

What: test_openai_compatible.py has transport_connection_error and non_json_response tests; test_anthropic.py does not.

Why: Symmetric test coverage ensures adapter-specific code (headers, payload building) does not short-circuit shared error paths.

Suggestion:

Add test_transport_connection_error_raises_provider_error and test_non_json_response_raises_provider_error to test_anthropic.py.

@andreatgretel
Contributor

(AR) Suggestion: Mock helpers duplicated across three test files

packages/data-designer-engine/tests/engine/models/clients/test_anthropic.py

What: _mock_httpx_response, _make_sync_client, and _make_async_client are defined independently in multiple test files with different resp.text defaults.

Why: AGENTS.md says shared fixtures belong in conftest.py. The text default difference (json.dumps vs empty string) may mask subtle bugs.

Suggestion:

Move common mock helpers to conftest.py and reconcile the text default.

@andreatgretel
Contributor

(AR) Suggestion: Factory tests don't assert resolved API key is forwarded

packages/data-designer-engine/tests/engine/models/clients/test_factory.py

What: Factory routing tests only check isinstance(client, ...) but don't verify the resolved API key reaches the client.

Why: Testing only the return type exercises ~30% of the function's routing logic while leaving key resolution untested.

Suggestion:

Add assert client._api_key == 'resolved-key' in the routing tests.

@andreatgretel
Contributor

(AR) This PR adds a native Anthropic Messages API adapter with clean separation between translation logic and HTTP transport, extracts shared infrastructure into an HttpModelClient base class, and updates the factory routing. The review covered all 12 changed files (+2455/-236 lines), ran linting (ruff, all passed) and tests (135 passed in 3.50s).

The architecture is well-executed — the stateless translation module is independently testable, the DRY refactoring of HttpModelClient removes real duplication without over-abstracting, and test coverage is thorough across translation, lifecycle, and error paths. The one critical finding is that close() accesses httpx.AsyncClient._transport, a private attribute that could break on httpx upgrades and prevent sync client cleanup. The remaining warnings cover a blocking threading.Lock in async paths (not urgent until AsyncTaskScheduler lands), unused _model_id state, an __init__.py export inconsistency, and a few test coverage gaps.

Verdict: Ship it (with nits) — 1 critical, 5 warnings, 3 suggestions.

- Handle /v1 in Anthropic endpoint gracefully to avoid path duplication
- Add QUOTA_EXCEEDED provider error kind for credit/billing failures
- Extend UNSUPPORTED_PARAMS detection for mutually exclusive params
- Surface raw provider message in formatted errors for 400 status codes
- Consolidate provider message helpers into single _attach_provider_message

Made-with: Cursor
- Fix close() double-close of shared transport by closing self._transport
  directly instead of accessing private aclient._transport (critical)
- Add TODO for threading.Lock → asyncio.Lock split (plan-346)
- Remove unused _model_id from HttpModelClient and all callers
- Export AnthropicClient from adapters __init__.py
- Filter empty text blocks in translate_tool_result_content join
- Move mock helpers to conftest.py with consistent json.dumps text default
- Add __init__.py files to enable absolute imports from test conftest
- Add bridge env override test for anthropic provider
- Add ConnectionError and non-JSON response tests for AnthropicClient
- Assert secret_resolver.resolve called with correct key ref in factory tests

Made-with: Cursor
@nabinchha
Contributor Author

Addressed all review findings in b8add5d:

Re: AnthropicClient not exported from adapters __init__.py (#426 (comment))
→ Added AnthropicClient to __all__ in adapters/__init__.py.

Re: Missing litellm_bridge env override test for anthropic provider (#426 (comment))
→ Added test_bridge_env_override_forces_bridge_for_anthropic_provider in test_factory.py.

Re: Missing ConnectionError and non-JSON response tests for AnthropicClient (#426 (comment))
→ Added test_transport_connection_error_raises_provider_error and test_non_json_response_raises_provider_error in test_anthropic.py.

Re: Mock helpers duplicated across three test files (#426 (comment))
→ Moved mock_httpx_response, make_mock_sync_client, make_mock_async_client to conftest.py with consistent json.dumps text default. Added __init__.py files to enable absolute imports from tests.engine.models.clients.conftest.

Re: Factory tests don't assert resolved API key is forwarded (#426 (comment))
→ Provider fixtures now set api_key and tests assert secret_resolver.resolve is called with the correct key ref (avoids accessing private _api_key).

@nabinchha nabinchha requested a review from andreatgretel March 18, 2026 19:54
)
return self._aclient

def close(self) -> None:
Contributor

I think close() still leaves the real AsyncClient open after an async request path. I hit the live Anthropic path with acompletion(), then called close(), and client._aclient.is_closed stayed False while aclose() flips it to True. Maybe worth tightening close() so sync teardown fully releases async-owned resources too.

)
) from None

# Fallback for API_ERROR and UNSUPPORTED_CAPABILITY
Contributor

the new Anthropic adapter raises UNSUPPORTED_CAPABILITY for embeddings/images, but this fallback turns it into a generic ModelAPIError. Right now native Anthropic fails fast, but the user sees “unexpected API error” instead of “this provider doesn’t support embeddings/image generation”. Maybe add an explicit branch here?
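
One possible shape for such a branch is sketched below. Everything here is hypothetical: the trimmed enum, the `ModelUnsupportedCapabilityError` class, the `raise_for_kind` function, and the message wording are invented for illustration and do not appear in this PR.

```python
from enum import Enum


class ProviderErrorKind(Enum):
    """Trimmed stand-in with only the two kinds relevant to the fallback."""

    API_ERROR = "api_error"
    UNSUPPORTED_CAPABILITY = "unsupported_capability"


class ModelAPIError(Exception):
    """Stand-in for the generic user-facing API error."""


class ModelUnsupportedCapabilityError(Exception):
    """Hypothetical user-facing error for operations a provider lacks."""


def raise_for_kind(kind: ProviderErrorKind, provider: str, operation: str) -> None:
    """Raise a specific error for unsupported capabilities before the fallback."""
    if kind is ProviderErrorKind.UNSUPPORTED_CAPABILITY:
        raise ModelUnsupportedCapabilityError(
            f"Model provider {provider!r} does not support {operation}."
        )
    # Fallback: everything else is still reported as a generic API error.
    raise ModelAPIError(
        f"Unexpected API error from model provider {provider!r} during {operation}."
    )
```

With a branch like this, an embeddings call against the native Anthropic path would surface the provider and operation by name instead of an "unexpected API error".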

@andreatgretel
Contributor

I ran some extra QA on this branch with live Anthropic smoke tests through both the native path and the LiteLLM bridge for text generation, tool calling, multimodal input, and an unsupported embedding call. Text and tool flows looked good on both paths, and native multimodal worked in the live check. Just out of curiosity, I also ran a small live latency check on Claude Sonnet; in that sample I did not see a consistent latency difference in favor of the native path. Not blocking, just sharing the extra validation data point. The unsupported-capability case still hits the generic error path noted inline above.

@andreatgretel
Contributor

quick docs question: now that provider_type="anthropic" routes to the native adapter, are you planning to update the model-provider docs in a follow-up? a couple of spots still read as if provider_type only means “openai-compatible API format”.

# releases the connection pool without re-closing the transport.
client.close()

async def aclose(self) -> None:
Contributor

same transport leak as the close() one from earlier, but on the async side. close() got explicit transport cleanup in b8add5d but aclose() doesn't have it. since the transport is injected, httpx won't close it for you - needs the same capture-and-close pattern.
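
A minimal sketch of the capture-and-close pattern on the async side follows. It deliberately avoids httpx: `StubTransport` and `StubAsyncClient` are stand-ins that model the premise of this comment (the client does not close an injected transport), and `NativeAdapterSketch` is a hypothetical class, not `HttpModelClient`.

```python
from __future__ import annotations

import asyncio


class StubTransport:
    """Stand-in for an injected transport shared by the adapter's clients."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class StubAsyncClient:
    """Stand-in client that, by assumption, leaves its transport open."""

    def __init__(self, transport: StubTransport) -> None:
        self.transport = transport
        self.is_closed = False

    async def aclose(self) -> None:
        self.is_closed = True


class NativeAdapterSketch:
    def __init__(self) -> None:
        self._transport: StubTransport | None = StubTransport()
        self._aclient: StubAsyncClient | None = None

    def get_async_client(self) -> StubAsyncClient:
        # Lazy init: the client is created on first use only.
        if self._aclient is None:
            if self._transport is None:
                raise RuntimeError("adapter is closed")
            self._aclient = StubAsyncClient(self._transport)
        return self._aclient

    async def aclose(self) -> None:
        # Capture references and clear the fields first, so a second
        # aclose() finds nothing to close and becomes a no-op.
        client, self._aclient = self._aclient, None
        transport, self._transport = self._transport, None
        if client is not None:
            await client.aclose()
        if transport is not None and not transport.closed:
            await transport.aclose()
```

Clearing the fields before awaiting also means a failure inside one `aclose()` call cannot leave the adapter pointing at a half-closed client.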

Contributor

@andreatgretel andreatgretel left a comment

lgtm. clean code, good test coverage, e2e works against live anthropic.

two nits inline - the close/aclose transport leak and the UNSUPPORTED_CAPABILITY error mapping. not blocking.

3 participants