feat: Constrain HttpModelClient to single concurrency mode... #439

Open
nabinchha wants to merge 8 commits into main from nm/overhaul-model-facade-guts-pr5

Conversation

@nabinchha

Summary

Fifth PR in the model facade overhaul series (plan, architecture notes). Constrains each HttpModelClient instance to a single execution mode — sync or async — at construction time, eliminating the dual-mode lifecycle complexity that caused transport leaks and cross-mode teardown bugs surfaced during PR-4 review (#426). Adds ModelRegistry.arun_health_check() so health checks use the async path when DATA_DESIGNER_ASYNC_ENGINE=1.

Previous PRs:

Changes

Added

  • ClientConcurrencyMode StrEnum (http_model_client.py) — replaces Literal["sync", "async"] type alias with a proper enum for runtime type identity and IDE autocomplete
  • ModelRegistry.arun_health_check() (registry.py) — async mirror of run_health_check() that calls agenerate / agenerate_text_embeddings / agenerate_image on model facades
  • Async health check dispatch (column_wise_builder.py) — submits arun_health_check() to the background event loop via asyncio.run_coroutine_threadsafe when DATA_DESIGNER_ASYNC_ENGINE=1
  • PR-5 architecture notes (plans/343/model-facade-overhaul-pr-5-architecture-notes.md)

Changed

  • HttpModelClient (http_model_client.py) — constructor accepts concurrency_mode parameter; _get_sync_client() / _get_async_client() raise RuntimeError if called in the wrong mode; close() and aclose() simplified to single-mode teardown (cross-mode calls are no-ops)
  • Factory chain: client_concurrency_mode parameter threaded through create_model_client, create_model_registry, and create_resource_provider, derived from the DATA_DESIGNER_ASYNC_ENGINE env var
  • ensure_async_engine_loop (async_concurrency.py) — renamed from _ensure_async_engine_loop (now public, used cross-module)
  • Test helpers (test_anthropic.py, test_openai_compatible.py) — auto-derive concurrency_mode from which mock client is injected
  • PR-4 architecture notes — updated planned follow-on section to reflect PR-5 scope change
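
A minimal sketch of the mode enforcement described above, for readers who want the shape without opening the diff. `SketchClient` and `mode_from_env` are hypothetical names, the `== "1"` check is an assumption about how the env var is read, and `object()` stands in for the real httpx clients. The PR uses a `StrEnum`; a `str` + `Enum` mix-in is used here only so the sketch also runs on Python versions older than 3.11.

```python
import os
from enum import Enum


class ClientConcurrencyMode(str, Enum):
    # Stand-in for the PR's StrEnum; members compare equal to their string values.
    SYNC = "sync"
    ASYNC = "async"


def mode_from_env() -> ClientConcurrencyMode:
    # Hypothetical helper: assumes "1" means the async engine is enabled.
    if os.environ.get("DATA_DESIGNER_ASYNC_ENGINE", "0") == "1":
        return ClientConcurrencyMode.ASYNC
    return ClientConcurrencyMode.SYNC


class SketchClient:
    def __init__(self, concurrency_mode=ClientConcurrencyMode.SYNC):
        # Accepts either the enum member or its string value.
        self._mode = ClientConcurrencyMode(concurrency_mode)
        self._sync_client = None
        self._async_client = None

    def _get_sync_client(self):
        if self._mode is not ClientConcurrencyMode.SYNC:
            raise RuntimeError("sync client requested on an async-mode instance")
        if self._sync_client is None:
            self._sync_client = object()  # placeholder for a lazily built sync client
        return self._sync_client

    def _get_async_client(self):
        if self._mode is not ClientConcurrencyMode.ASYNC:
            raise RuntimeError("async client requested on a sync-mode instance")
        if self._async_client is None:
            self._async_client = object()  # placeholder for a lazily built async client
        return self._async_client
```

Fixing the mode at construction means a wrong-mode call fails loudly at the accessor instead of silently creating a second client that teardown later has to account for.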

Fixed

  • Transport leak: close() on a dual-mode client left the async transport open; aclose() never touched the transport at all
  • Cross-mode teardown: close() could not await aclient.aclose(); aclose() had to also handle sync cleanup
  • Health check mode mismatch: async-engine registries ran sync health checks, hitting mode enforcement guards
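
The teardown fixes above can be illustrated with a small sketch. This is not the PR's code: `SingleModeClient` and the two fake client classes are hypothetical, and plain strings stand in for the concurrency-mode enum. It only shows why single-mode teardown is simpler: each close method owns exactly one resource, and the cross-mode call returns immediately.

```python
import asyncio


class FakeSyncClient:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeAsyncClient:
    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True


class SingleModeClient:
    def __init__(self, mode, sync_client=None, async_client=None):
        self._mode = mode  # "sync" or "async" (string stand-in for the enum)
        self._sync_client = sync_client
        self._async_client = async_client

    def close(self):
        if self._mode != "sync":
            return  # cross-mode call is a no-op
        # Swap the reference out first so a second close() finds nothing to do.
        client, self._sync_client = self._sync_client, None
        if client is not None:
            client.close()

    async def aclose(self):
        if self._mode != "async":
            return  # cross-mode call is a no-op
        client, self._async_client = self._async_client, None
        if client is not None:
            await client.aclose()


def demo():
    fake = FakeAsyncClient()
    client = SingleModeClient("async", async_client=fake)
    client.close()  # wrong mode: nothing happens
    closed_after_sync_close = fake.closed
    asyncio.run(client.aclose())
    asyncio.run(client.aclose())  # idempotent second call
    return closed_after_sync_close, fake.closed
```

Because a sync `close()` never needs to reach an async resource, the "cannot await from sync code" problem disappears by construction.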

Attention Areas

Reviewers: Please pay special attention to the following:

  • http_model_client.py — mode enforcement guards in _get_sync_client / _get_async_client and simplified close() / aclose()
  • Factory chain threading: concurrency_mode flows from env var through resource_provider.py → models/factory.py → clients/factory.py → adapter constructors
  • registry.py: arun_health_check() mirrors run_health_check() with async facade methods
  • column_wise_builder.py — async health check dispatch via run_coroutine_threadsafe

Test plan

  • uv run ruff check on all changed source files
  • uv run pytest on all new and updated test files
  • Lifecycle tests: sync close, async aclose, idempotency, cross-mode no-ops
  • Mode enforcement tests: wrong-mode access raises RuntimeError
  • Factory forwarding tests: client_concurrency_mode reaches adapter constructors
  • Async health check tests: success and auth error propagation

Made with Cursor

Constrain each HttpModelClient instance to sync or async at
construction time, eliminating dual-mode lifecycle complexity
that caused transport leaks and cross-mode teardown bugs.

- Add ClientConcurrencyMode StrEnum replacing Literal type alias
- Add concurrency_mode constructor param with mode enforcement
  guards on _get_sync_client / _get_async_client
- Simplify close()/aclose() to single-mode teardown (cross-mode
  calls are no-ops)
- Thread client_concurrency_mode through factory chain from
  DATA_DESIGNER_ASYNC_ENGINE env var
- Add ModelRegistry.arun_health_check() async mirror and wire
  async dispatch in ColumnWiseDatasetBuilder
- Make ensure_async_engine_loop public (used cross-module)
- Fix test helpers to derive concurrency mode from injected client
- Add PR-5 architecture notes
@nabinchha nabinchha requested a review from a team as a code owner March 19, 2026 18:21
@nabinchha nabinchha changed the title feat: Constrain HttpModelClient to single concurrency mode with async health checks feat: Constrain HttpModelClient to single concurrency mode... Mar 19, 2026
@greptile-apps

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR constrains each HttpModelClient instance to a single concurrency mode (sync or async) set at construction time, eliminating the dual-mode lifecycle complexity that previously caused transport leaks and cross-mode teardown bugs. It also adds ModelRegistry.arun_health_check() and wires it into ColumnWiseDatasetBuilder for async-engine health checks.

Key changes:

  • ClientConcurrencyMode enum replaces a Literal["sync", "async"] type alias for proper runtime identity
  • HttpModelClient now enforces mode at construction and in _get_sync_client()/_get_async_client() guards; close()/aclose() are simplified to single-path teardown
  • Constructor validation rejects mismatched injected clients (e.g. async_client into a sync-mode instance) with ValueError
  • Factory chain threads client_concurrency_mode from DATA_DESIGNER_ASYNC_ENGINE env var through resource_provider.py → models/factory.py → clients/factory.py → adapter constructors
  • arun_health_check() mirrors run_health_check() with async facade methods; dispatched via run_coroutine_threadsafe with a 180-second wall-clock guard
  • ensure_async_engine_loop renamed from private to public for cross-module reuse
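
The dispatch pattern summarized above (submit a coroutine to a loop on another thread, block with a wall-clock guard, cancel on timeout) can be sketched as follows. `start_background_loop` is a hypothetical stand-in for `ensure_async_engine_loop`, and `run_with_deadline` is an illustrative helper name, not something from the PR.

```python
import asyncio
import concurrent.futures
import threading


def start_background_loop() -> asyncio.AbstractEventLoop:
    # Hypothetical stand-in: an event loop running forever on a daemon thread.
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


def run_with_deadline(coro, loop, timeout_s):
    # Schedule the coroutine on the background loop from the calling (sync) thread.
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Request cancellation so the coroutine does not linger on the shared loop.
        future.cancel()
        raise
```

`concurrent.futures.TimeoutError` is caught here because it works across Python versions; from 3.11 onward it is an alias of the built-in `TimeoutError`, which is why catching the built-in is also valid on the async path.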

All previously surfaced issues (double-close, timeout guard, mismatch validation, traceback preservation) are fixed in c3e2abe5. One minor test inconsistency remains: test_arun_health_check_success uses await_count for the image mock while using call_count for the others.

Confidence Score: 5/5

  • This PR is safe to merge — it fixes known transport-leak and teardown bugs from PR-4, with solid test coverage across the mode lifecycle.
  • The implementation is clean and well-tested. All previously flagged issues (double-close, timeout guard, mismatch validation, raise e traceback corruption) are fixed. The factory chain is complete and consistent. The async health check dispatch follows the same pattern as the existing AsyncConcurrentExecutor. The one remaining issue is a trivial test inconsistency (await_count vs call_count) that does not affect correctness.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py | Core of the PR: adds ClientConcurrencyMode enum, mode enforcement guards, and simplified single-path close()/aclose(). Previous issues (double-close, mismatch validation) are fixed. Logic is sound. |
| packages/data-designer-engine/src/data_designer/engine/models/registry.py | Adds arun_health_check() as a clean async mirror of run_health_check(). Uses bare raise (preserving traceback). All three GenerationType branches covered. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py | Dispatches async health check via run_coroutine_threadsafe with a 180s timeout and proper cancellation on TimeoutError. No issues; the TimeoutError catch is correct because the async path enforces Python 3.11+. |
| packages/data-designer-engine/src/data_designer/engine/resources/resource_provider.py | Derives client_concurrency_mode from DATA_DESIGNER_ASYNC_ENGINE and threads it through to create_model_registry. Factory chain is complete and consistent. |
| packages/data-designer-engine/tests/engine/models/test_model_registry.py | Good async health check coverage for EMBEDDING and CHAT_COMPLETION paths. Minor inconsistency: mock_agenerate_image assertion uses await_count while the others use call_count. |
| packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py | Comprehensive lifecycle, mode enforcement, constructor validation, and lazy-initialization tests for both OpenAI and Anthropic adapters. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant CWB as ColumnWiseDatasetBuilder
    participant ACL as async event loop
    participant MR as ModelRegistry
    participant MF as ModelFacade
    participant HMC as HttpModelClient (ASYNC mode)

    Note over CWB: DATA_DESIGNER_ASYNC_ENGINE=1
    CWB->>ACL: ensure_async_engine_loop()
    CWB->>ACL: run_coroutine_threadsafe(arun_health_check, loop)
    CWB->>CWB: future.result(timeout=180)

    ACL->>MR: arun_health_check(model_aliases)
    loop for each model_alias
        MR->>MF: get_model(model_alias)
        alt EMBEDDING
            MR->>MF: await agenerate_text_embeddings(...)
            MF->>HMC: _get_async_client()
            HMC-->>MF: httpx.AsyncClient
            MF-->>MR: embeddings result
        else CHAT_COMPLETION
            MR->>MF: await agenerate(...)
            MF->>HMC: _get_async_client()
            HMC-->>MF: httpx.AsyncClient
            MF-->>MR: generation result
        else IMAGE
            MR->>MF: await agenerate_image(...)
            MF->>HMC: _get_async_client()
            HMC-->>MF: httpx.AsyncClient
            MF-->>MR: image result
        end
        MR->>MR: log ✅ Passed!
    end
    MR-->>CWB: (future resolves)

    Note over CWB: DATA_DESIGNER_ASYNC_ENGINE=0
    CWB->>MR: run_health_check(model_aliases) [sync path]
```
Path: packages/data-designer-engine/tests/engine/models/test_model_registry.py
Line: 396-398

Comment:
**Inconsistent assertion style for `mock_agenerate_image`**

`mock_agenerate_image` is asserted with `await_count` while both `mock_agenerate` and `mock_agenerate_text_embeddings` use `call_count`. For `AsyncMock`, when `await mock(...)` is executed both counters are incremented together, so both will pass — but the inconsistency masks the intent. The sync counterpart `test_run_health_check_success` uses `call_count` for all three checks including `mock_generate_image.call_count == 1`.

```suggestion
    assert mock_agenerate.call_count == 2
    assert mock_agenerate_text_embeddings.call_count == 1
    assert mock_agenerate_image.call_count == 1
```
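
For context on the two counters, here is a small standalone sketch (not taken from the PR's tests) showing where `call_count` and `await_count` can diverge on a `unittest.mock.AsyncMock`: calling the mock bumps `call_count` immediately, while `await_count` only moves once the returned coroutine is actually awaited. With a plain `await mock(...)` the two end up equal, which is why either assertion passes in the test under review.

```python
import asyncio
from unittest.mock import AsyncMock

mock = AsyncMock(return_value="ok")


async def main():
    await mock("a")  # called and awaited in one step
    coro = mock("b")  # called, but the coroutine is not awaited yet
    counts_before = (mock.call_count, mock.await_count)
    await coro  # now the second call is awaited as well
    counts_after = (mock.call_count, mock.await_count)
    return counts_before, counts_after
```

Asserting on `await_count` is the stricter check for async code paths, while `call_count` keeps the test symmetric with its sync counterpart; which one to standardize on is a style decision.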


Last reviewed commit: "fix: address Greptil..."

- Fix transport double-close in close()/aclose() by delegating
  teardown to the httpx client when one exists (if/elif pattern);
  only close transport directly if no client was ever created
- Reject mismatched client/mode injection in constructor (e.g.
  async_client on a sync-mode instance raises ValueError)
- Add 5-minute wall-clock timeout to future.result() in async
  health check dispatch
- Add constructor validation tests for both mismatch directions
- Update PR-5 architecture notes

Made-with: Cursor
@andreatgretel

packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py:75

self._transport = create_retry_transport(self._retry_config)

Claude Code caught this one: create_retry_transport() always runs even when a client is injected. In close(), the injected client takes the first if branch and the internal transport is never closed. Production code never injects clients, so this is test-only, but it might be worth guarding with `if sync_client is None and async_client is None:` for correctness.

andreatgretel previously approved these changes Mar 20, 2026

nice work on the single-mode lifecycle - the design is clean and the tests are thorough. two reviews ran on this (Claude Code + Codex). both agree the core design is solid. main items:

  • timeout cancellation: if future.result(timeout=300) times out, cancel the future so the coroutine doesn't linger on the shared loop
  • transport leak: eagerly created transport isn't closed when a client is injected (test-only in practice)
  • arun_health_check is a near-complete duplicate of run_health_check - consider extracting shared iteration logic
  • a few test coverage gaps: async health check dispatch, IMAGE generation type, env var derivation

nothing critical. ship it with a quick follow-up for the timeout/transport items

@nabinchha

packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py:75

self._transport = create_retry_transport(self._retry_config)

Claude Code caught this one: create_retry_transport() always runs even when a client is injected. In close(), the injected client takes the first if branch and the internal transport is never closed. Production code never injects clients, so this is test-only, but it might be worth guarding with `if sync_client is None and async_client is None:` for correctness.

Updated in 1afb1d7ffdf0eeed4cce9013dd156f69aa3d2314 so that the transport is created lazily and the constructor accepts one, in case a consumer provides a custom transport to go along with the client they inject.
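
A rough sketch of the lazy-transport idea in this reply, under stated assumptions: `LazyTransportClient`, `transport_factory`, and `_get_transport` are hypothetical names for illustration, and a plain callable stands in for `create_retry_transport`. The point is only that nothing is allocated in the constructor, so an injected client never leaves an unused transport behind.

```python
class LazyTransportClient:
    def __init__(self, transport_factory, transport=None, client=None):
        self._transport_factory = transport_factory  # stand-in for create_retry_transport
        self._transport = transport  # None until needed, or caller-supplied
        self._client = client  # optionally injected client

    def _get_transport(self):
        # Build the transport on first use only; reuse a supplied or cached one.
        if self._transport is None:
            self._transport = self._transport_factory()
        return self._transport
```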

- Type _transport as RetryTransport | None, removing type: ignore
  suppressions in close()/aclose()
- Make transport fully lazy (None by default) and accept optional
  transport constructor param so injected-client paths don't
  eagerly allocate an unused RetryTransport
- Cancel future on TimeoutError in async health check dispatch so
  timed-out coroutines don't linger on the shared event loop
- Set health check timeout to 180s (3 min) matching architecture
  notes
- Rename _SYNC_CLIENT_CASES to _CLIENT_FACTORY_CASES since the
  list parametrizes both sync and async mode tests
- Update architecture notes timeout from 180→300 back to 180 to
  match implementation

Made-with: Cursor
@nabinchha nabinchha requested a review from andreatgretel March 20, 2026 16:10
- Use bare `raise` instead of `raise e` in both run_health_check and
  arun_health_check to preserve original traceback frames
- Add GenerationType.IMAGE test coverage for sync and async health
  checks (stub-image config + generate_image/agenerate_image patches)

Made-with: Cursor
…egistry.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>