
Fix model configuration typo in GPT-5.3 Codex #2308

Draft
juanmichelini wants to merge 3 commits into main from jmj/fix-openhands-typo

Conversation


@juanmichelini juanmichelini commented Mar 4, 2026

Summary

The model configuration used the id gpt-5-3-codex, while the correct id is gpt-5.3-codex.
Tested here: https://github.com/OpenHands/software-agent-sdk/actions/runs/22690678388
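The failure mode is a simple key mismatch: a registry keyed by the dotted id cannot resolve a lookup when the hyphenated variant was stored (or vice versa). A minimal sketch, using a hypothetical config dict rather than the actual contents of resolve_model_config.py:

```python
# Hypothetical model registry (NOT the real resolve_model_config.py data):
# keys must match the id callers pass in exactly.
MODEL_CONFIGS = {
    "gpt-5.2-codex": {"max_output_tokens": 64_000},
    "gpt-5.3-codex": {"max_output_tokens": 64_000},  # was mistyped "gpt-5-3-codex"
}

def resolve_model_config(model_id: str) -> dict:
    """Return the config for model_id, raising KeyError for unknown ids."""
    try:
        return MODEL_CONFIGS[model_id]
    except KeyError:
        raise KeyError(f"Unknown model id: {model_id!r}") from None

print(resolve_model_config("gpt-5.3-codex"))
```

With the old hyphenated key in place, every caller using the dotted name would hit the KeyError branch, which is the class of bug this PR fixes.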

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image                                 | Docs / Tags
java    | amd64, arm64  | eclipse-temurin:17-jdk                     | Link
python  | amd64, arm64  | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang  | amd64, arm64  | golang:1.21-bookworm                       | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:19aa11c-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-19aa11c-python \
  ghcr.io/openhands/agent-server:19aa11c-python

All tags pushed for this build

ghcr.io/openhands/agent-server:19aa11c-golang-amd64
ghcr.io/openhands/agent-server:19aa11c-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:19aa11c-golang-arm64
ghcr.io/openhands/agent-server:19aa11c-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:19aa11c-java-amd64
ghcr.io/openhands/agent-server:19aa11c-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:19aa11c-java-arm64
ghcr.io/openhands/agent-server:19aa11c-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:19aa11c-python-amd64
ghcr.io/openhands/agent-server:19aa11c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:19aa11c-python-arm64
ghcr.io/openhands/agent-server:19aa11c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:19aa11c-golang
ghcr.io/openhands/agent-server:19aa11c-java
ghcr.io/openhands/agent-server:19aa11c-python

About Multi-Architecture Support

  • Each variant tag (e.g., 19aa11c-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 19aa11c-python-amd64) are also available if needed
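The tag list above follows a regular scheme: one multi-arch manifest tag per variant (`<short-sha>-<variant>`) plus one tag per architecture (`<short-sha>-<variant>-<arch>`). A small sketch that reproduces those names (the sha and variant values are taken from this build):

```python
# Reconstruct the per-variant tag names shown in "All tags pushed for
# this build" (base-image alias tags like golang_tag_1.21-bookworm-amd64
# are omitted for brevity).
SHORT_SHA = "19aa11c"
ARCHS = ["amd64", "arm64"]

def variant_tags(sha: str, variant: str) -> list[str]:
    tags = [f"{sha}-{variant}"]  # multi-arch manifest tag
    tags += [f"{sha}-{variant}-{arch}" for arch in ARCHS]  # per-arch tags
    return tags

for variant in ["golang", "java", "python"]:
    print(variant_tags(SHORT_SHA, variant))
```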

@juanmichelini juanmichelini changed the title Update model configuration for GPT-5.3 Codex Fix model configuration typo in GPT-5.3 Codex Mar 4, 2026

github-actions bot commented Mar 4, 2026

API breakage checks (Griffe)

Result: Failed

Log excerpt (first 1000 characters)

============================================================
Checking openhands-sdk (openhands.sdk)
============================================================
Comparing openhands-sdk 1.12.0 against 1.11.5
::warning file=openhands-sdk/openhands/sdk/conversation/conversation.py,line=103,title=Conversation.__new__(delete_on_close)::Parameter default was changed: `False` -> `True`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): temperature
::warning file=openhands-sdk/openhands/sdk/llm/llm.py,line=196,title=LLM.top_p::Attribute value was changed: `Field(default=1.0, ge=0, le=1)` -> `Field(default=None, ge=0, le=1, description='Nucleus sampling parameter. Defaults to None (uses provider default). Set to a value between 0 and 1 to control diversity of outputs.')`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): prompt_cache_retention
Breaking changes detected (2) and version bump policy satisfied (1.11.5 -> 1.12.0)
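The first Griffe warning flags a flipped parameter default (`delete_on_close`: `False` -> `True`) as breaking. A hedged illustration of why, using a hypothetical function rather than the actual SDK API: callers that omitted the argument keep compiling but silently change behavior.

```python
# Hypothetical stand-in for the flagged signature (not the real
# Conversation.__new__): the default flipped from False to True.
def close_conversation(delete_on_close: bool = True) -> str:
    # Callers written against the old default (False) that omit the
    # argument now delete state they previously kept.
    return "deleted" if delete_on_close else "kept"

print(close_conversation())                       # old call site, new behavior
print(close_conversation(delete_on_close=False))  # old behavior, now opt-in
```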

Action log


github-actions bot commented Mar 4, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Passed

Action log


@all-hands-bot all-hands-bot left a comment


Taste Rating: 🟡 Acceptable (with critical fixes needed)

Verdict: ❌ Needs rework - Tests not updated, incomplete change

Key Insight: Consistency fix is good taste, but you broke the tests.


The model naming change from gpt-5-3-codex to gpt-5.3-codex makes sense for consistency with other models (e.g., gpt-5.2-codex), but this PR is incomplete.


github-actions bot commented Mar 4, 2026

🧪 Integration Tests Results

Overall Success Rate: 0.0%
Total Cost: $0.26
Models Tested: 1
Timestamp: 2026-03-04 21:46:53 UTC

📊 Summary

Model                       | Overall | Tests Passed | Skipped | Total | Cost  | Tokens
litellm_proxy_gpt_5.3_codex | 0.0%    | 0/17         | 1       | 18    | $0.26 | 108,305

📋 Detailed Results

litellm_proxy_gpt_5.3_codex

  • Success Rate: 0.0% (0/17)
  • Total Cost: $0.26
  • Token Usage: prompt: 96,681, completion: 11,624, cache_read: 42,880, reasoning: 9,783
  • Run Suffix: litellm_proxy_gpt_5.3_codex_a3ba6f8_gpt_5_3_codex_run_N18_20260304_214230
  • Skipped Tests: 1

Skipped Tests:

  • c01_thinking_block_condenser: Model litellm_proxy/gpt-5.3-codex does not support extended thinking or reasoning effort

Failed Tests:

All 17 failures share the same root cause: the LiteLLM proxy has no deployment registered under the corrected model name. Each test's conversation run raised litellm.BadRequestError (code 400): "You passed in model=gpt-5.3-codex. There are no healthy deployments for this model. No fallback model group found for original model_group=gpt-5.3-codex. Fallbacks=[{'minimax-m2.5': ['minimax-m2.5-api']}]. Received Model Group=gpt-5.3-codex. Available Model Group Fallbacks=None."

  • t01_fix_simple_typo (id=be642c0d-24cf-464d-8543-ef1ba326ce28, Cost: $0.00)
  • t05_simple_browsing (id=5858d843-60af-44a2-a08d-91f1d702b556, Cost: $0.00)
  • t08_image_file_viewing (id=70fe802c-ba11-4739-b7d1-3e3cf67a2846, Cost: $0.00)
  • t07_interactive_commands (id=5857a01c-a2ea-4825-82a5-8f8f4b377eec, Cost: $0.01)
  • b04_each_tool_call_has_a_concise_explanation (id=7b4a87e9-26ca-4cb1-bb3d-5b52d7c7ca10, Cost: $0.01)
  • b02_no_oververification (id=9f84bbb7-dc9a-4018-982d-5254c069fdb8, Cost: $0.01)
  • t04_git_staging (id=26411fbb-ad8e-4dd0-918f-66147804b6bf, Cost: $0.00)
  • t03_jupyter_write_file (id=d9df62db-7aac-4c23-9c85-b98f8ac7485a, Cost: $0.00)
  • c03_delayed_condensation (id=88f3e485-c553-40de-9a17-1a1d436b0ce0, Cost: $0.00)
  • c05_size_condenser (id=9bc4dd49-524c-4154-a802-2d995f62a95d, Cost: $0.01)
  • b01_no_premature_implementation (id=8e45c88f-8cbb-408d-bd89-6af4611fe2d5, Cost: $0.00)
  • t06_github_pr_browsing (id=f590f57b-e9f2-4ace-845a-167568889266, Cost: $0.03)
  • c02_hard_context_reset (id=983b76b5-02b2-4270-9662-c4427af498b4, Cost: $0.01)
  • t02_add_bash_hello (id=20167995-ab2c-4ed3-a122-340b928b2700, Cost: $0.00)
  • b05_do_not_create_redundant_files (id=48e77552-71ee-4bba-b8c4-8f733c18d5e5, Cost: $0.01)
  • b03_no_useless_backward_compatibility (id=70c2858e-1ef8-4956-a8a3-a104b829ff36, Cost: $0.02)
  • c04_token_condenser (id=dadd4bbf-b413-4a81-b0d1-f42b69ad38ca, Cost: $0.14)
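The proxy reports each failure as a JSON error body embedded in the exception message. A short sketch of pulling the status code and message out of such a payload (the payload below is trimmed from the log above):

```python
import json

# Trimmed version of the error body the proxy returned for every test.
payload = (
    '{"error":{"message":"You passed in model=gpt-5.3-codex. '
    'There are no healthy deployments for this model",'
    '"type":null,"param":null,"code":"400"}}'
)

error = json.loads(payload)["error"]
print(error["code"], "-", error["message"])
```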

Fix test_gpt_5_3_codex_config to reference the corrected model key
'gpt-5.3-codex' (with dots) instead of the old 'gpt-5-3-codex'
(with hyphens) to match the fix in resolve_model_config.py.

Co-authored-by: openhands <openhands@all-hands.dev>
@juanmichelini juanmichelini marked this pull request as draft March 4, 2026 22:47
