
Fix model configuration typo in GPT-5.3 Codex #2308

Draft
juanmichelini wants to merge 3 commits into main from jmj/fix-openhands-typo

Conversation


@juanmichelini juanmichelini commented Mar 4, 2026

Summary

The model configuration used the id gpt-5-3-codex, while the correct id is gpt-5.3-codex.
Tested here: https://github.com/OpenHands/software-agent-sdk/actions/runs/22690678388
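The failure mode is a simple key mismatch: a registry keyed by the dotted id cannot resolve a lookup when the hyphenated variant was stored (or vice versa). A minimal sketch, using a hypothetical config dict rather than the actual contents of resolve_model_config.py:

```python
# Hypothetical model registry (NOT the real resolve_model_config.py data):
# keys must match the id callers pass in exactly.
MODEL_CONFIGS = {
    "gpt-5.2-codex": {"max_output_tokens": 64_000},
    "gpt-5.3-codex": {"max_output_tokens": 64_000},  # was mistyped "gpt-5-3-codex"
}

def resolve_model_config(model_id: str) -> dict:
    """Return the config for model_id, raising KeyError for unknown ids."""
    try:
        return MODEL_CONFIGS[model_id]
    except KeyError:
        raise KeyError(f"Unknown model id: {model_id!r}") from None

print(resolve_model_config("gpt-5.3-codex"))
```

With the old hyphenated key in place, every caller using the dotted name would hit the KeyError branch, which is the class of bug this PR fixes.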

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image                                 | Docs / Tags
java    | amd64, arm64  | eclipse-temurin:17-jdk                     | Link
python  | amd64, arm64  | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang  | amd64, arm64  | golang:1.21-bookworm                       | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:19aa11c-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-19aa11c-python \
  ghcr.io/openhands/agent-server:19aa11c-python

All tags pushed for this build

ghcr.io/openhands/agent-server:19aa11c-golang-amd64
ghcr.io/openhands/agent-server:19aa11c-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:19aa11c-golang-arm64
ghcr.io/openhands/agent-server:19aa11c-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:19aa11c-java-amd64
ghcr.io/openhands/agent-server:19aa11c-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:19aa11c-java-arm64
ghcr.io/openhands/agent-server:19aa11c-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:19aa11c-python-amd64
ghcr.io/openhands/agent-server:19aa11c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:19aa11c-python-arm64
ghcr.io/openhands/agent-server:19aa11c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:19aa11c-golang
ghcr.io/openhands/agent-server:19aa11c-java
ghcr.io/openhands/agent-server:19aa11c-python

About Multi-Architecture Support

  • Each variant tag (e.g., 19aa11c-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 19aa11c-python-amd64) are also available if needed
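The tag list above follows a regular scheme: one multi-arch manifest tag per variant (`<short-sha>-<variant>`) plus one tag per architecture (`<short-sha>-<variant>-<arch>`). A small sketch that reproduces those names (the sha and variant values are taken from this build):

```python
# Reconstruct the per-variant tag names shown in "All tags pushed for
# this build" (base-image alias tags like golang_tag_1.21-bookworm-amd64
# are omitted for brevity).
SHORT_SHA = "19aa11c"
ARCHS = ["amd64", "arm64"]

def variant_tags(sha: str, variant: str) -> list[str]:
    tags = [f"{sha}-{variant}"]  # multi-arch manifest tag
    tags += [f"{sha}-{variant}-{arch}" for arch in ARCHS]  # per-arch tags
    return tags

for variant in ["golang", "java", "python"]:
    print(variant_tags(SHORT_SHA, variant))
```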

@juanmichelini juanmichelini changed the title Update model configuration for GPT-5.3 Codex Fix model configuration typo in GPT-5.3 Codex Mar 4, 2026

github-actions bot commented Mar 4, 2026

API breakage checks (Griffe)

Result: Failed

Log excerpt (first 1000 characters)

============================================================
Checking openhands-sdk (openhands.sdk)
============================================================
Comparing openhands-sdk 1.12.0 against 1.11.5
::warning file=openhands-sdk/openhands/sdk/conversation/conversation.py,line=103,title=Conversation.__new__(delete_on_close)::Parameter default was changed: `False` -> `True`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): temperature
::warning file=openhands-sdk/openhands/sdk/llm/llm.py,line=196,title=LLM.top_p::Attribute value was changed: `Field(default=1.0, ge=0, le=1)` -> `Field(default=None, ge=0, le=1, description='Nucleus sampling parameter. Defaults to None (uses provider default). Set to a value between 0 and 1 to control diversity of outputs.')`
::notice title=openhands-sdk API::Ignoring Field metadata-only change (non-breaking): prompt_cache_retention
Breaking changes detected (2) and version bump policy satisfied (1.11.5 -> 1.12.0)
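The first Griffe warning flags a flipped parameter default (`delete_on_close`: `False` -> `True`) as breaking. A hedged illustration of why, using a hypothetical function rather than the actual SDK API: callers that omitted the argument keep compiling but silently change behavior.

```python
# Hypothetical stand-in for the flagged signature (not the real
# Conversation.__new__): the default flipped from False to True.
def close_conversation(delete_on_close: bool = True) -> str:
    # Callers written against the old default (False) that omit the
    # argument now delete state they previously kept.
    return "deleted" if delete_on_close else "kept"

print(close_conversation())                       # old call site, new behavior
print(close_conversation(delete_on_close=False))  # old behavior, now opt-in
```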

Action log


github-actions bot commented Mar 4, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Passed

Action log


@all-hands-bot all-hands-bot left a comment


Taste Rating: 🟡 Acceptable (with critical fixes needed)

Verdict: ❌ Needs rework - Tests not updated, incomplete change

Key Insight: Consistency fix is good taste, but you broke the tests.


The model naming change from gpt-5-3-codex to gpt-5.3-codex makes sense for consistency with other models (e.g., gpt-5.2-codex), but this PR is incomplete.


github-actions bot commented Mar 4, 2026

🧪 Integration Tests Results

Overall Success Rate: 0.0%
Total Cost: $0.26
Models Tested: 1
Timestamp: 2026-03-04 21:46:53 UTC

📊 Summary

Model                       | Overall | Tests Passed | Skipped | Total | Cost  | Tokens
litellm_proxy_gpt_5.3_codex | 0.0%    | 0/17         | 1       | 18    | $0.26 | 108,305

📋 Detailed Results

litellm_proxy_gpt_5.3_codex

  • Success Rate: 0.0% (0/17)
  • Total Cost: $0.26
  • Token Usage: prompt: 96,681, completion: 11,624, cache_read: 42,880, reasoning: 9,783
  • Run Suffix: litellm_proxy_gpt_5.3_codex_a3ba6f8_gpt_5_3_codex_run_N18_20260304_214230
  • Skipped Tests: 1

Skipped Tests:

  • c01_thinking_block_condenser: Model litellm_proxy/gpt-5.3-codex does not support extended thinking or reasoning effort

Failed Tests:

All 17 failures share the same root cause: the LiteLLM proxy has no deployment registered under the corrected model name. Each test's conversation run raised litellm.BadRequestError (code 400): "You passed in model=gpt-5.3-codex. There are no healthy deployments for this model. No fallback model group found for original model_group=gpt-5.3-codex. Fallbacks=[{'minimax-m2.5': ['minimax-m2.5-api']}]. Received Model Group=gpt-5.3-codex. Available Model Group Fallbacks=None."

  • t01_fix_simple_typo (id=be642c0d-24cf-464d-8543-ef1ba326ce28, Cost: $0.00)
  • t05_simple_browsing (id=5858d843-60af-44a2-a08d-91f1d702b556, Cost: $0.00)
  • t08_image_file_viewing (id=70fe802c-ba11-4739-b7d1-3e3cf67a2846, Cost: $0.00)
  • t07_interactive_commands (id=5857a01c-a2ea-4825-82a5-8f8f4b377eec, Cost: $0.01)
  • b04_each_tool_call_has_a_concise_explanation (id=7b4a87e9-26ca-4cb1-bb3d-5b52d7c7ca10, Cost: $0.01)
  • b02_no_oververification (id=9f84bbb7-dc9a-4018-982d-5254c069fdb8, Cost: $0.01)
  • t04_git_staging (id=26411fbb-ad8e-4dd0-918f-66147804b6bf, Cost: $0.00)
  • t03_jupyter_write_file (id=d9df62db-7aac-4c23-9c85-b98f8ac7485a, Cost: $0.00)
  • c03_delayed_condensation (id=88f3e485-c553-40de-9a17-1a1d436b0ce0, Cost: $0.00)
  • c05_size_condenser (id=9bc4dd49-524c-4154-a802-2d995f62a95d, Cost: $0.01)
  • b01_no_premature_implementation (id=8e45c88f-8cbb-408d-bd89-6af4611fe2d5, Cost: $0.00)
  • t06_github_pr_browsing (id=f590f57b-e9f2-4ace-845a-167568889266, Cost: $0.03)
  • c02_hard_context_reset (id=983b76b5-02b2-4270-9662-c4427af498b4, Cost: $0.01)
  • t02_add_bash_hello (id=20167995-ab2c-4ed3-a122-340b928b2700, Cost: $0.00)
  • b05_do_not_create_redundant_files (id=48e77552-71ee-4bba-b8c4-8f733c18d5e5, Cost: $0.01)
  • b03_no_useless_backward_compatibility (id=70c2858e-1ef8-4956-a8a3-a104b829ff36, Cost: $0.02)
  • c04_token_condenser (id=dadd4bbf-b413-4a81-b0d1-f42b69ad38ca, Cost: $0.14)
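The proxy reports each failure as a JSON error body embedded in the exception message. A short sketch of pulling the status code and message out of such a payload (the payload below is trimmed from the log above):

```python
import json

# Trimmed version of the error body the proxy returned for every test.
payload = (
    '{"error":{"message":"You passed in model=gpt-5.3-codex. '
    'There are no healthy deployments for this model",'
    '"type":null,"param":null,"code":"400"}}'
)

error = json.loads(payload)["error"]
print(error["code"], "-", error["message"])
```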

Fix test_gpt_5_3_codex_config to reference the corrected model key
'gpt-5.3-codex' (with dots) instead of the old 'gpt-5-3-codex'
(with hyphens) to match the fix in resolve_model_config.py.

Co-authored-by: openhands <openhands@all-hands.dev>
@juanmichelini juanmichelini marked this pull request as draft March 4, 2026 22:47
