
Add explicit OpenAI API mode selection#1

Open
ThomasK33 wants to merge 1 commit into main from openai-api-mode-k3jj

Conversation


@ThomasK33 ThomasK33 commented Mar 12, 2026

Summary

  • add explicit OpenAI APIMode selection with deterministic chat/responses/auto behavior
  • keep WithUseResponsesAPI() as a backward-compatible heuristic shim and tighten provider-option type guards
  • expose the same API mode surface through Azure and add focused OpenAI/Azure tests

Validation

  • go test ./providers/openai ./providers/azure
  • go test ./providertests -run 'TestOpenAI|TestAzure'
  • go test ./... -count=1
  • go vet ./...

📋 Implementation Plan

Recommended path

  • Add a new provider-level APIMode enum to providers/openai, exposed via WithAPIMode(APIMode), with three caller-facing values: Auto, ChatCompletions, and Responses. Keep the provider's unconfigured behavior unchanged by storing "mode explicitly set" separately from the enum internally; if callers do nothing, Fantasy must follow today's exact path: instantiate the Chat Completions language model and never opt into the IsResponsesModel(modelID) heuristic.
  • Keep WithUseResponsesAPI() working as a backward-compatible heuristic shim equivalent to WithAPIMode(APIModeAuto). Do not repurpose it to mean "force Responses," because that would silently change existing behavior.
  • Update providers/openai/provider.LanguageModel() to switch on the effective mode: Responses => always build the Responses implementation; ChatCompletions or legacy default => always build the Chat Completions implementation; Auto => use IsResponsesModel(modelID) only as a heuristic. Forced Responses must bypass the allowlist entirely.
  • Tighten per-call provider-option guards so the chosen language-model implementation clearly expects only the matching option struct, with mode-aware errors. Reuse the same API-mode mechanism through Azure's wrapper and cover the behavior with focused unit tests plus OpenAI/Azure provider parity tests.
  • After this lands, Coder can safely map api_mode=responses|chat_completions to Fantasy without recreating the current mismatch where the language-model instance and provider-options type disagree.

Verified repo facts informing this plan

  • providers/openai/openai.go currently creates a Responses language model only when both useResponsesAPI is enabled and IsResponsesModel(modelID) returns true.
  • providers/azure/azure.go is a thin wrapper over openai.New(...), so Azure currently inherits OpenAI's exact selection logic instead of maintaining a separate implementation.
  • Current provider tests cover OpenAI/Azure chat and Responses flows, but there are no focused unit tests proving forced-selection behavior or provider-option/type compatibility across the API-mode boundary.

1. API design recommendation

Recommendation

Adopt a provider-level API-mode selector as the primary public API:

type APIMode string

const (
    APIModeAuto            APIMode = "auto"
    APIModeChatCompletions APIMode = "chat_completions"
    APIModeResponses       APIMode = "responses"
)

func WithAPIMode(mode APIMode) Option

Why provider-level, not model-level

  • The mode determines which concrete fantasy.LanguageModel implementation gets constructed in LanguageModel(...), so provider construction is the natural control point.
  • A model-level or call-level override would require broader interface changes or extra plumbing through every caller, which is unnecessary for the stated Coder use case.
  • If a caller genuinely needs mixed behavior, it can construct two provider instances with different API modes.

Why WithAPIMode(...) is better than WithForceResponsesAPI() / WithForceChatCompletionsAPI()

  • It maps directly to Coder's admin configuration shape (responses, chat_completions, and optionally auto).
  • It avoids a combinatorial/conflicting option surface (WithUseResponsesAPI() + WithForceResponsesAPI() + WithForceChatCompletionsAPI()).
  • It is more extensible if Fantasy later needs mode-specific validation or documentation.

Backward compatibility stance

  • Keep WithUseResponsesAPI() working and define it as the heuristic/auto path.
  • New code that needs deterministic behavior should use WithAPIMode(...).
  • Recommended migration messaging:
    • WithUseResponsesAPI() remains supported for compatibility.
    • Documentation should clarify that it is heuristic-only, not a forcing API.
    • A soft deprecation notice in Go doc is reasonable after downstreams like Coder migrate; it is not required for the initial change.

Implementation note for preserving defaults

Because APIModeAuto is a new opt-in behavior while today's unconfigured behavior is Chat Completions, the implementation must keep an internal distinction between:

  • mode unset / option not configured => preserve today's exact behavior: always choose Chat Completions and do not consult IsResponsesModel(modelID)
  • mode explicitly set to APIModeChatCompletions => also use Chat Completions, but now intentionally
  • mode explicitly set to APIModeAuto => use the current allowlist heuristic
  • mode explicitly set to APIModeResponses => force the Responses implementation

Use a pointer field or apiModeSet bool; do not let the enum's zero/uninitialized state silently mean APIModeAuto. Also avoid keeping both a legacy boolean and a new enum long-term.
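A minimal sketch of that internal distinction, using the `apiModeSet` flag approach; the `options`, `Option`, and `describe` names are assumptions for illustration, not Fantasy's actual identifiers:

```go
package main

import "fmt"

type APIMode string

const (
	APIModeAuto            APIMode = "auto"
	APIModeChatCompletions APIMode = "chat_completions"
	APIModeResponses       APIMode = "responses"
)

type options struct {
	apiMode    APIMode
	apiModeSet bool // distinguishes "never configured" from the enum zero value
}

type Option func(*options)

// WithAPIMode records both the mode and the fact that it was set at all.
func WithAPIMode(mode APIMode) Option {
	return func(o *options) {
		o.apiMode = mode
		o.apiModeSet = true
	}
}

// WithUseResponsesAPI survives as a shim equivalent to explicit auto mode.
func WithUseResponsesAPI() Option {
	return WithAPIMode(APIModeAuto)
}

func describe(o options) string {
	if !o.apiModeSet {
		return "legacy default (chat completions, heuristic never consulted)"
	}
	return "explicit " + string(o.apiMode)
}

func main() {
	var legacy, auto options
	WithUseResponsesAPI()(&auto)
	fmt.Println(describe(legacy))
	fmt.Println(describe(auto))
}
```

Because the flag lives next to the enum, the zero-valued `options` struct still means "legacy default" with no extra bookkeeping at call sites.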

Zero-value / empty-string recommendation

Do not make APIModeAuto equal to "".

Rationale:

  • In this design, empty string is more useful as "unset / not configured" so unconfigured callers keep today's exact behavior.
  • Making APIModeAuto the empty string would blur the line between "caller explicitly asked for heuristic mode" and "caller omitted the new field entirely", which is the opposite of what backward compatibility needs.
  • Keeping APIModeAuto as the explicit string "auto" makes logs, docs, validation, and config parsing clearer.

If a downstream config struct wants a zero-value-friendly string field, the recommended mapping is:

  • "" / omitted => do not call WithAPIMode(...); preserve legacy default behavior
  • "auto" => call WithAPIMode(APIModeAuto)
  • "chat_completions" => call WithAPIMode(APIModeChatCompletions)
  • "responses" => call WithAPIMode(APIModeResponses)

That keeps the public API explicit while still letting downstream config structs add a backward-compatible string field.
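The four-way mapping above can be sketched as one config-boundary helper; `parseAPIMode` is a hypothetical name, and the key design point is that `ok=false` tells the caller to skip `WithAPIMode(...)` entirely:

```go
package main

import (
	"errors"
	"fmt"
)

type APIMode string

const (
	APIModeAuto            APIMode = "auto"
	APIModeChatCompletions APIMode = "chat_completions"
	APIModeResponses       APIMode = "responses"
)

// parseAPIMode maps a downstream config string to a provider mode.
// "" means "option omitted", so legacy default behavior is preserved.
func parseAPIMode(raw string) (mode APIMode, ok bool, err error) {
	switch raw {
	case "":
		return "", false, nil // do not call WithAPIMode at all
	case "auto", "chat_completions", "responses":
		return APIMode(raw), true, nil
	default:
		return "", false, errors.New("unknown api_mode: " + raw)
	}
}

func main() {
	for _, raw := range []string{"", "responses", "bogus"} {
		mode, ok, err := parseAPIMode(raw)
		fmt.Printf("%q => mode=%q set=%v err=%v\n", raw, mode, ok, err)
	}
}
```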

2. Selection behavior in LanguageModel()

Recommended selection rules

Normalize selection to one helper owned by providers/openai/openai.go, e.g. effectiveAPIMode(modelID string) or shouldUseResponses(modelID string).

Expected behavior:

  1. legacy default / unset mode
    • Always construct the standard Chat Completions language model.
  2. APIModeChatCompletions
    • Always construct the standard Chat Completions language model, even if IsResponsesModel(modelID) would return true.
  3. APIModeResponses
    • Always construct the Responses language model, even if IsResponsesModel(modelID) returns false or the model ID is brand-new and not yet in Fantasy's allowlist.
  4. APIModeAuto
    • Use IsResponsesModel(modelID) as a heuristic exactly as WithUseResponsesAPI() does today.
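The four rules collapse into one small helper; this is a sketch under the assumptions above, with `isResponsesModel` standing in for Fantasy's real allowlist and `useResponses` as a hypothetical name for the mode-resolution function:

```go
package main

import "fmt"

type APIMode string

const (
	APIModeAuto            APIMode = "auto"
	APIModeChatCompletions APIMode = "chat_completions"
	APIModeResponses       APIMode = "responses"
)

// isResponsesModel is a placeholder for IsResponsesModel(modelID); the real
// function consults a maintained allowlist of Responses-capable model IDs.
func isResponsesModel(modelID string) bool {
	return modelID == "o1"
}

// useResponses resolves the effective mode. modeSet=false is the legacy
// default: always chat completions, heuristic never consulted.
func useResponses(mode APIMode, modeSet bool, modelID string) bool {
	if !modeSet {
		return false
	}
	switch mode {
	case APIModeResponses:
		return true // forced: bypasses the allowlist entirely
	case APIModeChatCompletions:
		return false // forced the other way, also bypassing the allowlist
	default: // APIModeAuto
		return isResponsesModel(modelID)
	}
}

func main() {
	fmt.Println(useResponses("", false, "o1"))                           // legacy default
	fmt.Println(useResponses(APIModeResponses, true, "brand-new-model")) // forced Responses
	fmt.Println(useResponses(APIModeAuto, true, "gpt-4o"))               // heuristic miss
}
```

Keeping this in a single function makes the allowlist's scope obvious: only the `APIModeAuto` arm ever calls the heuristic.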

Allowlist interaction

  • IsResponsesModel(modelID) should remain an auto-mode heuristic, not a global capability gate.
  • Forced Responses should bypass the allowlist entirely; that is the whole point of making api_mode=responses enforceable for arbitrary or newly introduced model IDs.
  • Forced Chat Completions should similarly bypass the allowlist in the other direction.

Validation stance

Use defensive validation for configuration shape, not for speculative model capability:

  • Validate that WithAPIMode(...) received a known enum value.
  • Do not reject APIModeResponses solely because IsResponsesModel(modelID) is false.
  • Let the upstream API surface the runtime error if the endpoint/model combination is actually unsupported.

That tradeoff is important because a local allowlist cannot be the source of truth for unknown future IDs like gpt-5.4.

3. Provider-options compatibility plan

Goal

Guarantee that the selected language-model implementation and the accepted provider-options type stay aligned:

  • Chat Completions implementation <-> *openai.ProviderOptions
  • Responses implementation <-> *openai.ResponsesProviderOptions

Recommended change

Centralize the typed extraction/validation logic used by both model implementations so the errors are explicit and consistent.

Concretely:

  • Keep validation in the model call path (not in LanguageModel()), because provider options are attached per call.
  • Update the chat path (providers/openai/language_model.go / language_model_hooks.go) to fail with a clearer, mode-aware message when it receives *ResponsesProviderOptions.
  • Update the Responses path (providers/openai/responses_language_model.go) to fail symmetrically when it receives *ProviderOptions.
  • Prefer errors that mention the selected API mode and the expected type, so downstreams can immediately diagnose configuration mismatches.
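A mode-aware guard for the chat path might look like the sketch below; `requireChatOptions` is a hypothetical helper name, and the two structs are stand-ins for Fantasy's real option types. A symmetric guard would cover the Responses path:

```go
package main

import "fmt"

// Stand-ins for the two provider-options structs.
type ProviderOptions struct{}
type ResponsesProviderOptions struct{}

// requireChatOptions validates per-call options for the chat implementation.
// nil stays valid so callers that omit provider options do not regress.
func requireChatOptions(opts any) (*ProviderOptions, error) {
	switch v := opts.(type) {
	case nil:
		return nil, nil
	case *ProviderOptions:
		return v, nil
	default:
		return nil, fmt.Errorf(
			"openai: chat completions mode expects *openai.ProviderOptions, got %T", v)
	}
}

func main() {
	if _, err := requireChatOptions(&ResponsesProviderOptions{}); err != nil {
		fmt.Println(err)
	}
}
```

Naming both the mode and the `%T` of the offending value in one error is what lets a downstream like Coder diagnose a mismatch without reading Fantasy's source.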

Why this matters even after the mode fix

The new API-mode forcing will prevent Coder from selecting the wrong language-model implementation, but Fantasy should still guard against callers manually passing the wrong option struct on individual requests.

4. Azure implications

Recommendation

OpenAI and Azure should share the same API-mode forcing mechanism.

Reasoning:

  • Azure currently delegates to OpenAI provider construction and language-model selection.
  • The requested behavior change is about how Fantasy chooses between two OpenAI-family implementations; that logic already lives in the shared OpenAI provider.
  • Keeping one mechanism avoids OpenAI/Azure drift in downstreams like Coder.

Concrete scope

  • Add the new public API in providers/openai.
  • Expose the same mode selector through Azure's package surface (either as a wrapper or re-export, matching how other options are already surfaced there).
  • Keep Azure's implementation thin; do not fork separate selection logic unless repo investigation later uncovers a real Azure-only behavior difference.
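If Azure's surface already aliases OpenAI options, the re-export can be a plain type alias; the single-file sketch below shows why an alias (as opposed to a defined type) keeps values interchangeable with no conversion, so no selection logic gets forked. All names here are hypothetical:

```go
package main

import "fmt"

// Stand-in for the OpenAI package's mode type.
type openaiAPIMode string

const openaiAPIModeResponses openaiAPIMode = "responses"

// A type alias, not a new defined type: azureAPIMode and openaiAPIMode
// are the same type, so Azure forwards values without conversion.
type azureAPIMode = openaiAPIMode

func main() {
	var m azureAPIMode = openaiAPIModeResponses
	var o openaiAPIMode = m // assignable directly because it is an alias
	fmt.Println(o)
}
```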

Known risk to note, but not solve in the first pass

Azure may eventually diverge from OpenAI in which models support the Responses API. That matters for auto mode heuristics, but it should not block this change:

  • forced Responses should still bypass the allowlist for both providers
  • forced ChatCompletions should still be deterministic for both providers
  • if Azure-specific heuristics are needed later, they can be layered on top of the same APIMode mechanism without changing the forcing API again

5. Definition of done / acceptance criteria

The implementation should be considered complete only when all of the following are true:

  1. providers/openai exposes a public API that can express three caller intents unambiguously:
    • heuristic / auto
    • force Chat Completions
    • force Responses
  2. Leaving the provider unconfigured behaves exactly as today: it selects the Chat Completions implementation even for allowlisted Responses-capable model IDs and never consults IsResponsesModel(modelID) unless callers explicitly opt into heuristic mode.
  3. WithUseResponsesAPI() keeps working and remains heuristic-only; it does not become a forcing API.
  4. provider.LanguageModel(...) deterministically selects:
    • Chat Completions for default / explicit chat mode
    • Responses for explicit responses mode
    • heuristic selection for explicit auto mode
  5. Explicit Responses bypasses IsResponsesModel(modelID) entirely, so unknown or newly released model IDs can still be forced onto the Responses implementation.
  6. Chat and Responses language-model implementations reject the wrong provider-options struct with clear, mode-aware errors.
  7. Azure exposes the same forcing capability through its wrapper surface and follows the same selection rules as OpenAI.
  8. Package-local unit tests prove the selection contract, and OpenAI/Azure provider tests prove parity on explicit mode selection.
  9. Repo validation succeeds with the targeted Go tests plus the normal repo-wide checks that are feasible in the implementation environment.

6. TDD implementation plan

Phase Red — write failing tests first

6.1 Package-local OpenAI unit tests (same package, no network)

Create focused tests under providers/openai/ — ideally a dedicated api_mode_test.go in the same openai package so tests can assert on unexported concrete types without widening the public API.

Add failing tests for these cases first:

  1. TestLanguageModel_DefaultModeUsesChatCompletions

    • Build an OpenAI provider with no API-mode option.
    • Call LanguageModel(ctx, allowlistedResponsesModelID) where the model ID currently returns true from IsResponsesModel(...).
    • Assert the returned concrete type is the standard chat-completions implementation, proving the unconfigured provider does not silently start behaving like explicit auto/heuristic mode.
  2. TestLanguageModel_WithUseResponsesAPI_RemainsHeuristicOnly

    • Build a provider with WithUseResponsesAPI().
    • Assert an allowlisted model ID returns the Responses implementation.
    • Assert a synthetic/non-allowlisted model ID still returns the chat-completions implementation.
  3. TestLanguageModel_WithAPIModeAuto_MatchesLegacyHeuristic

    • Build a provider with explicit WithAPIMode(APIModeAuto).
    • Assert it matches WithUseResponsesAPI() for both allowlisted and non-allowlisted model IDs.
  4. TestLanguageModel_WithAPIModeResponses_BypassesAllowlist

    • Build a provider with WithAPIMode(APIModeResponses).
    • Use a synthetic model ID that is intentionally absent from IsResponsesModel(...).
    • Assert the returned concrete type is the Responses implementation anyway.
  5. TestLanguageModel_WithAPIModeChatCompletions_BypassesAllowlist

    • Build a provider with WithAPIMode(APIModeChatCompletions).
    • Use a model ID that currently passes IsResponsesModel(...).
    • Assert the returned concrete type is still the chat-completions implementation.
  6. TestWithAPIMode_RejectsUnknownValue

    • Pass an invalid enum/string value.
    • Assert provider construction fails early, or the earliest practical validation path returns a deterministic error.
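The "assert the returned concrete type" technique from case 1 can be sketched as follows; the model types and `newLanguageModel` are simplified stand-ins for Fantasy's unexported implementations, shown as a runnable program rather than a `_test.go` file:

```go
package main

import "fmt"

// Stand-ins for the two unexported language-model implementations.
type chatModel struct{}
type responsesModel struct{}

type languageModel interface{ kind() string }

func (chatModel) kind() string      { return "chat" }
func (responsesModel) kind() string { return "responses" }

// newLanguageModel mimics provider.LanguageModel for an unconfigured
// provider: the legacy default path always returns the chat implementation
// and never consults the allowlist heuristic.
func newLanguageModel(modelID string) languageModel {
	return chatModel{}
}

func main() {
	m := newLanguageModel("o1") // allowlisted Responses-capable ID
	if _, ok := m.(chatModel); !ok {
		panic("default mode must use chat completions")
	}
	fmt.Println("default mode selects:", m.kind())
}
```

Because the test lives in the same package as the provider, the type assertion against the unexported concrete type works without widening the public API.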

6.2 Provider-options compatibility unit tests

Add failing tests that exercise the lowest-level call/prepare path that currently type-asserts provider options, without relying on live network calls.

Required cases:

  • chat implementation accepts *openai.ProviderOptions
  • chat implementation rejects *openai.ResponsesProviderOptions with a clear error mentioning the expected type and API mode
  • responses implementation accepts *openai.ResponsesProviderOptions
  • responses implementation rejects *openai.ProviderOptions with a clear error mentioning the expected type and API mode

Implementation note for the eventual code author: prefer unit-testing the shared prepare/extraction helper directly if one exists after refactoring; otherwise test the smallest existing function that performs the type assertion (language_model_hooks.go and responses_language_model.go are the currently verified hotspots).

6.3 Azure parity tests

Cover Azure in the cheapest layer that gives confidence:

  • If Azure already exposes a straightforward package-local seam, add a small unit test proving the new mode option is accepted and delegated.
  • Otherwise, put Azure coverage in providertests/ and keep package-local selection tests concentrated in providers/openai, since Azure is a thin wrapper over openai.New(...).

The minimum failing Azure/provider cases should prove:

  • explicit Responses mode works through Azure
  • explicit Chat Completions mode works through Azure
  • Azure does not invent different semantics for WithUseResponsesAPI() vs WithAPIMode(...)

Phase Green — implement the minimum code to satisfy the tests

6.4 Public API and internal state

Primary file: providers/openai/openai.go

  1. Add an exported APIMode type plus constants for Auto, ChatCompletions, and Responses.
  2. Add WithAPIMode(mode APIMode) Option as the new primary public API.
  3. Replace the internal useResponsesAPI bool with a representation that can distinguish:
    • mode unset (legacy default)
    • mode explicitly set to auto
    • mode explicitly set to chat
    • mode explicitly set to responses
  4. Validate the enum value defensively when the option is applied or when the provider is constructed.
  5. Reimplement WithUseResponsesAPI() as a backward-compatible shim that sets explicit auto mode.

6.5 Selection logic

Primary file: providers/openai/openai.go

  1. Add a private helper such as effectiveAPIMode(modelID string) or selectLanguageModelKind(modelID string).
  2. Make that helper the single place that resolves mode semantics.
  3. Implement the switch exactly as follows:
    • unset/default => chat-completions implementation
    • explicit APIModeChatCompletions => chat-completions implementation
    • explicit APIModeResponses => responses implementation
    • explicit APIModeAuto => use IsResponsesModel(modelID) heuristic
  4. Keep IsResponsesModel(modelID) confined to the auto path; do not let it veto explicit Responses mode.
  5. Preserve existing object-mode and shared client wiring when selecting either implementation.

6.6 Provider-options guards

Primary files:

  • providers/openai/language_model.go
  • providers/openai/language_model_hooks.go
  • providers/openai/responses_language_model.go
  • optionally providers/openai/provider_options.go or providers/openai/responses_options.go if a shared helper is the clearest home
  1. Align the chat and responses paths so both perform explicit provider-options type validation.
  2. Update the mismatch errors to name:
    • the selected API mode or implementation
    • the expected provider-options type
    • the actual incompatible type when available
  3. Preserve the no-provider-options path so callers that omit provider options entirely do not regress.

6.7 Azure wrapper surface

Primary file: providers/azure/azure.go

  1. Expose the same API-mode forcing mechanism for Azure.
  2. Preferred implementation: reuse/re-export the OpenAI option machinery if the current Azure API surface already aliases or forwards OpenAI options.
  3. If Azure cannot re-export cleanly, add the thinnest wrapper possible and keep all mode-selection logic in the shared OpenAI provider.
  4. Do not fork separate Azure selection logic in the first pass unless implementation work uncovers a real Azure-only constraint.

Phase Refactor — consolidate and document the solution after green

This phase is required even if the first working implementation is small.

  1. Extract any duplicated mode-resolution logic into one private helper in providers/openai/openai.go.
  2. Extract provider-options validation only if it improves clarity without introducing the wrong abstraction; prefer a small shared helper if both paths end up duplicating the same type-check and error-formatting logic.
  3. Update doc comments so the public contract is explicit:
    • WithAPIMode(...) is the deterministic API
    • WithUseResponsesAPI() is heuristic-only compatibility behavior
    • IsResponsesModel(...) is an allowlist heuristic, not a capability gate for forced mode
  4. Keep Azure thin: if refactoring starts to duplicate OpenAI logic into providers/azure, stop and pull the logic back into the shared OpenAI path.
  5. Re-read the tests after refactoring and confirm they still describe the intended contract rather than internal implementation trivia.

Phase Verify — run automated checks and review the contract

Use the repo's verified Go/Task entry points when implementation happens:

  1. Fast package-local checks while iterating:
    • go test ./providers/openai -v
    • go test ./providers/azure -v
  2. Provider parity checks when the environment supports them:
    • go test ./providertests -v -run "TestOpenAI|TestAzure"
  3. Full regression pass before claiming completion:
    • go test ./... -count=1
  4. Repo-quality checks expected by this repository:
    • task lint
    • task fmt if code formatting changes are introduced

Verification checklist for the implementer:

  • confirm default/no-option behavior still returns the chat-completions implementation for an allowlisted Responses-capable model ID, proving the new API is opt-in only
  • confirm forced Responses returns the Responses implementation for a synthetic non-allowlisted ID in unit tests
  • confirm forced Chat Completions overrides an allowlisted Responses model in unit tests
  • confirm error text for mismatched provider options is specific enough to diagnose downstream misconfiguration quickly
  • confirm Azure tests demonstrate the same semantics as OpenAI rather than a best-effort approximation

7. Specific files and symbols likely to change

Core implementation files

  • providers/openai/openai.go
    • options struct
    • WithUseResponsesAPI()
    • new APIMode type/constants
    • new WithAPIMode(...)
    • private mode-resolution helper
    • func (o *provider) LanguageModel(...)
  • providers/openai/language_model.go
    • chat implementation's provider-options handling / call preparation path
  • providers/openai/language_model_hooks.go
    • current *openai.ProviderOptions assertion path and error text
  • providers/openai/responses_language_model.go
    • responses implementation's provider-options assertion path and error text
  • providers/openai/provider_options.go
    • possible home for a small shared provider-options validation helper
  • providers/openai/responses_options.go
    • IsResponsesModel(...) documentation and any helper reuse that keeps the heuristic clearly scoped to auto mode
  • providers/azure/azure.go
    • wrapper or re-export plumbing for the new API mode surface

Test files

  • providers/openai/api_mode_test.go (recommended new focused test file)
  • existing providers/openai/openai_test.go only if colocating with the current suite is cleaner than a new file
  • providertests/openai_test.go
  • providertests/openai_responses_test.go
  • providertests/azure_test.go
  • providertests/azure_responses_test.go

8. Risks / compatibility concerns

  1. Default behavior drift

    • The biggest regression risk is accidentally making explicit auto behavior the new default. Leaving the option unset must continue to select Chat Completions.
  2. Silent semantic change to WithUseResponsesAPI()

    • Reinterpreting it as "force Responses" would break current callers. Keep it heuristic-only and document that clearly.
  3. Allowlist bypass surprise

    • Forced Responses will intentionally create a Responses language model for IDs Fantasy does not yet recognize. That is desired for forward compatibility, but the docs and tests should make the tradeoff explicit.
  4. Azure heuristic drift over time

    • Azure may eventually need a different allowlist for auto mode. Do not pre-emptively fork the design for that possibility; keep the forcing mechanism shared and revisit heuristics only if real divergence appears.
  5. Option-type mismatch can still be user-induced

    • Even with correct model selection, callers can still pass the wrong provider-options struct manually. Clear validation errors are part of the fix, not a nice-to-have.
  6. Over-abstraction during refactor

    • The selection logic is simple enough that one well-named helper is likely sufficient. Avoid building an elaborate strategy layer just to share tiny bits of code between chat and responses paths.

9. Coder follow-up once Fantasy support exists

Recommended downstream follow-up:

  1. Normalize Coder's admin api_mode handling with an explicit config-boundary mapping:
    • empty / omitted => do not call WithAPIMode(...); preserve today's default behavior
    • explicit responses => openai.WithAPIMode(openai.APIModeResponses)
    • explicit chat_completions => openai.WithAPIMode(openai.APIModeChatCompletions)
    • explicit auto => openai.WithAPIMode(openai.APIModeAuto) or the legacy WithUseResponsesAPI() shim
  2. Choose the per-call provider-options struct from that same normalized mode so the selected Fantasy language model and the option type cannot drift apart.
  3. Keep Coder's api_mode=responses|chat_completions override once Fantasy lands this API; the override becomes truly enforceable instead of being filtered through Fantasy's model allowlist.

Small nuance if Coder later adds auto

If Coder later exposes a third auto value, it should either:

  • reuse Fantasy's heuristic when deciding which provider-options struct to populate, or
  • avoid populating API-specific provider-options fields unless the effective mode is explicit

That nuance does not block the immediate responses|chat_completions fix.

10. Recommended implementation order

  1. Write the package-local failing tests for selection behavior and provider-options type compatibility.
  2. Add APIMode + WithAPIMode(...) and convert WithUseResponsesAPI() into the explicit-auto compatibility shim.
  3. Update LanguageModel(...) to use the new mode-resolution helper.
  4. Harden provider-options mismatch validation and error text.
  5. Expose the same option through Azure without forking the selection logic.
  6. Refactor duplicated helpers/comments only after the tests are green.
  7. Run the targeted package tests, provider tests when available, then the repo-wide Go/lint checks.

Generated with mux • Model: openai:gpt-5.4 • Thinking: high

