
sdk: centralize programmatic settings schema#2361

Open
neubig wants to merge 5 commits into main from
openhands/issue-2228-sdk-settings-schema

Conversation


@neubig neubig commented Mar 8, 2026

Summary

  • centralize the programmatic settings schema in the SDK with AgentSettings plus schema export helpers for downstream clients
  • annotate the canonical SDK fields directly (notably LLM) so exported LLM settings come from the existing source of truth instead of a duplicate subset model
  • keep the schema metadata needed for generic CLI/GUI settings generation, and add targeted tests for schema export plus agent/settings round-tripping
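A minimal sketch of the approach the summary describes, assuming Pydantic v2. The names SETTINGS_METADATA_KEY and SettingsFieldMetadata follow this PR, but the field definitions and the export helper shown here are illustrative, not the SDK's actual code:

```python
# Illustrative sketch: settings metadata is co-located with the canonical
# Pydantic field definitions via json_schema_extra, so a schema export
# helper can read it back out for downstream CLI/GUI clients.
from pydantic import BaseModel, Field

SETTINGS_METADATA_KEY = "x-openhands-settings"  # hypothetical key


class SettingsFieldMetadata(BaseModel):
    label: str
    order: int


class LLMSettings(BaseModel):
    model: str = Field(
        default="claude-sonnet-4-20250514",
        description="Model name.",
        json_schema_extra={
            SETTINGS_METADATA_KEY: SettingsFieldMetadata(
                label="Model", order=10
            ).model_dump()
        },
    )


def export_settings_schema(cls: type[BaseModel]) -> dict:
    """Collect per-field settings metadata for generic menu generation."""
    schema = {}
    for name, field in cls.model_fields.items():
        extra = field.json_schema_extra or {}
        meta = extra.get(SETTINGS_METADATA_KEY)
        if meta is not None:
            schema[name] = meta
    return schema


print(export_settings_schema(LLMSettings))
# {'model': {'label': 'Model', 'order': 10}}
```

The point of the pattern is that the annotated field stays the single source of truth: the export helper only reads metadata, it never re-declares fields.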

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

Testing

  • uv run pre-commit run --files AGENTS.md openhands-sdk/openhands/sdk/__init__.py openhands-sdk/openhands/sdk/llm/llm.py openhands-sdk/openhands/sdk/settings.py openhands-sdk/openhands/sdk/settings_metadata.py tests/sdk/test_settings.py
  • uv run pytest tests/sdk/test_settings.py -q

Fixes

Fixes #2228


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                   Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                       Link
python   amd64, arm64   nikolaik/python-nodejs:python3.13-nodejs22   Link
golang   amd64, arm64   golang:1.21-bookworm                         Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:00354ec-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-00354ec-python \
  ghcr.io/openhands/agent-server:00354ec-python

All tags pushed for this build

ghcr.io/openhands/agent-server:00354ec-golang-amd64
ghcr.io/openhands/agent-server:00354ec-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:00354ec-golang-arm64
ghcr.io/openhands/agent-server:00354ec-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:00354ec-java-amd64
ghcr.io/openhands/agent-server:00354ec-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:00354ec-java-arm64
ghcr.io/openhands/agent-server:00354ec-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:00354ec-python-amd64
ghcr.io/openhands/agent-server:00354ec-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:00354ec-python-arm64
ghcr.io/openhands/agent-server:00354ec-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:00354ec-golang
ghcr.io/openhands/agent-server:00354ec-java
ghcr.io/openhands/agent-server:00354ec-python

About Multi-Architecture Support

  • Each variant tag (e.g., 00354ec-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 00354ec-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions bot commented Mar 8, 2026

API breakage checks (Griffe)

Result: Failed

Log excerpt (first 1000 characters)

============================================================
Checking openhands-sdk (openhands.sdk)
============================================================
Comparing openhands-sdk 1.12.0 against 1.12.0
::warning file=openhands-sdk/openhands/sdk/llm/llm.py,line=158,title=LLM.model::Attribute value was changed: `Field(default='claude-sonnet-4-20250514', description='Model name.')` -> `Field(default='claude-sonnet-4-20250514', description='Model name.', json_schema_extra={SETTINGS_METADATA_KEY: SettingsFieldMetadata(label='Model', order=10, placeholder='anthropic/claude-sonnet-4-5-20250929', slash_command='llm-model').model_dump()})`
::warning file=openhands-sdk/openhands/sdk/llm/llm.py,line=170,title=LLM.api_key::Attribute value was changed: `Field(default=None, description='API key.')` -> `Field(default=None, description='API key.', json_schema_extra={SETTINGS_METADATA_KEY: SettingsFieldMetadata(label='API key', order=20, widget='password', slash_command='llm-api-key').model_dump(



github-actions bot commented Mar 8, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Failed

Log excerpt (first 1000 characters)
{"asctime": "2026-03-09 18:08:51,862", "levelname": "WARNING", "name": "openhands.agent_server.config", "filename": "config.py", "lineno": 173, "message": "\u26a0\ufe0f OH_SECRET_KEY was not defined. Secrets will not be persisted between restarts."}
::error title=openhands-agent-server REST API::Breaking REST API change detected without MINOR version bump (1.12.0 -> 1.12.0).

Breaking REST API changes detected compared to baseline release:
- the 'file' request property type/format changed from 'string'/'' to 'string'/'binary'
/home/runner/work/software-agent-sdk/software-agent-sdk/.venv/lib/python3.13/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:66: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()



github-actions bot commented Mar 8, 2026

Coverage

Coverage Report

File                                       Stmts  Miss  Cover  Missing
openhands-sdk/openhands/sdk/__init__.py       23     2    91%  89–90
openhands-sdk/openhands/sdk/settings.py      200    18    91%  290, 318, 320, 347, 351, 362, 371, 375, 429, 455, 457, 459, 461, 463, 476, 488, 490, 493
openhands-sdk/openhands/sdk/llm/llm.py       481    77    83%  487, 540, 761, 867, 869–870, 898, 944, 955–957, 961–965, 973–975, 985–987, 990–991, 995, 997–998, 1000, 1198–1199, 1396–1397, 1406, 1419, 1421–1426, 1428–1445, 1448–1452, 1454–1455, 1461–1470, 1521, 1523
TOTAL                                      20022  5781    71%

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟢 Good taste

Analysis:

This PR solves a real problem - centralizing settings schema as a single source of truth for downstream clients. The data structure choice (Pydantic model with field metadata) is pragmatic and follows established patterns.

Data Structure: Clean. Settings live in one place, metadata is co-located with field definitions, and the schema export is straightforward.

Simplicity: The conversion logic handles real use cases (create vs update, custom factories) without over-engineering. Test coverage validates real behavior - agent round-tripping works, factories integrate correctly, schema export produces the right structure.

Backward Compatibility: Purely additive - no existing APIs changed.

Verdict: ✅ Worth merging

Key Insight: By making SDKSettings the declarative source of truth with exportable schema, this eliminates the need for clients to reverse-engineer agent configuration or maintain duplicate settings definitions.
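To make the insight concrete, here is a hedged sketch of how a downstream client could render a settings menu generically from an exported schema. The schema shape (a name-to-metadata mapping with "label" and "order" keys) is assumed for illustration, not the SDK's actual export format:

```python
# Hypothetical downstream-client side: consume the exported schema
# generically instead of maintaining a duplicate settings definition.
def render_settings_menu(schema: dict[str, dict]) -> list[str]:
    """Produce menu lines ordered by the exported 'order' metadata."""
    rows = sorted(schema.items(), key=lambda kv: kv[1].get("order", 0))
    return [f"{meta['label']} ({name})" for name, meta in rows]


# Example schema shaped like the export discussed in this PR (assumed keys).
example_schema = {
    "api_key": {"label": "API key", "order": 20},
    "model": {"label": "Model", "order": 10},
}
print(render_settings_menu(example_schema))
# ['Model (model)', 'API key (api_key)']
```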

def to_agent(
    self,
    agent: Agent | None = None,
    *,
Collaborator

🟢 Nit: to_agent() is just an alias for apply_to_agent() with identical signature. Consider picking one name - having both adds no value. But this is trivial and not worth blocking on.

Contributor Author

Thanks — I’m leaving both names for now to avoid unnecessary churn in the SDK/public migration path. to_agent() is the compatibility-facing entrypoint used by downstream repos, while apply_to_agent() keeps the implementation naming explicit. Since this is a non-blocking naming nit and behavior is already covered, I’m resolving the thread without further code changes.
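For what it's worth, one low-churn way to keep both names without risking behavioral drift is a direct class-level alias. This is a generic sketch, not the PR's actual implementation:

```python
# Generic sketch of the two-names-one-implementation pattern discussed
# above. AgentSettings here is a stand-in, not the SDK's real class.
class AgentSettings:
    def apply_to_agent(self, agent=None):
        """Canonical implementation: build or update an agent from settings."""
        # ... apply settings to the agent ...
        return agent

    # Compatibility-facing entrypoint: a direct alias of the same function
    # object, so the two names cannot diverge in behavior.
    to_agent = apply_to_agent
```

Because `to_agent` is the same function object, any fix to `apply_to_agent` automatically applies to both names.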

Collaborator

Could you elaborate, please: what does "compatibility-facing entrypoint used by downstream repos" mean, when this is a new file with new methods?

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@neubig neubig marked this pull request as draft March 9, 2026 03:07
Co-authored-by: openhands <openhands@all-hands.dev>
@neubig neubig marked this pull request as ready for review March 9, 2026 18:04
@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable - Solid engineering approach solving a real problem.

Data Structure: Clean. Settings metadata lives alongside field definitions via Pydantic annotations, avoiding the anti-pattern of duplicate config models that drift from the source of truth.

Verdict: ✅ Worth merging. The core design is sound - annotating canonical fields eliminates special cases and maintains a single source of truth. Minor documentation fix suggested below.

Co-authored-by: openhands <openhands@all-hands.dev>
model: str = Field(default="claude-sonnet-4-20250514", description="Model name.")
api_key: str | SecretStr | None = Field(default=None, description="API key.")
base_url: str | None = Field(default=None, description="Custom base URL.")
model: str = Field(
Collaborator

Sorry, I still don't understand: why only these five settings, and no others?

I think maybe the LLM settings should expose all LLM settings from the SDK. As noted in the issue, client applications are free to choose to input/use only some of them if they like.

SETTINGS_METADATA_KEY: SettingsFieldMetadata(
    label="Max input tokens",
    order=50,
    widget="number",
Collaborator

Nit: sorry, I'm still not sure the SDK should know about widgets.

@enyst enyst Mar 10, 2026

Widgets, order, which view (basic or advanced), slash commands, UI hints: these are UI concerns, presentation-layer details. They belong in the client application that wants to define them.

I think maybe we should not mix them in here.

label="Max input tokens",
order=50,
widget="number",
advanced=True,
Collaborator

What does advanced mean?

Contributor Author

It only shows up when "advanced settings" is selected.

Collaborator

The SDK doesn't know that, though? I mean, we are assuming we know exactly what the client applications are (CLI and GUI) and introducing user-menu information, that is, UI information, into the SDK. This creates tighter coupling than expected in the SDK <- CLI/GUI direction, and it doesn't seem in line with the SDK's ability to be used by multiple arbitrary clients, who may well want other things in basic, advanced, or expert UIs.

Are we sure about this? 🤔


class AgentSettings(BaseModel):
    llm: LLM = Field(
        default_factory=_default_llm_settings,
Collaborator

I think we already have in the SDK, thanks to Vasco, a default LLM profile for the LLM. This profile is saved, I think, under the name default.json (if I recall correctly), and it will contain all LLM settings. 🤔

This code looks very reasonable, admittedly... it's not clear to me how it will work in practice for us though 🤔

@enyst enyst left a comment

Hi — OpenHands-GPT-5.4 here. I read the PR, the review threads, the related issues (#2228, #2274, #1451), and I also checked the current usage patterns in OpenHands/OpenHands-CLI before leaving this review.

I agree with the short-term motivation: programmatic settings should come from the SDK instead of being hand-curated separately in each client. I also think replacing the duplicate mini-LLM settings subset with the canonical LLM fields was the right local correction.

That said, I don't think this PR lands on the right boundary for the SDK yet.

  1. The new AgentSettings round-trip is lossy. from_agent() / apply_to_agent() read like a general SDK config/edit/rebuild path, but the implementation only preserves a curated default story (LLMSummarizingCondenser, APIBasedCritic, and an annotated subset of LLM fields). In an extensible SDK, an official settings/materialization path should not silently drop alternative built-in or downstream implementations.

  2. The exported schema mixes semantic config with client presentation policy. widget, advanced, placeholder, slash_command, and similar hints are first-party UX decisions, not core SDK semantics. Exporting them from SDK models hardcodes one client opinion into every client and makes future divergence painful.

  3. More broadly, #2274 is right that LLM already carries too many responsibilities. This PR adds more meaning onto the current LLM class instead of separating configuration from runtime/materialization.

My preference would be either:

  • narrow this into an explicitly first-party client adapter (instead of a generic SDK settings abstraction), or
  • redesign around a canonical immutable config/profile object with neutral schema export, and keep presentation decisions in CLI/GUI.

So: useful motivation, but I think the current abstraction is risky for an extensible SDK.

else:
    base_agent = base_agent.model_copy(update={"llm": llm})

condenser = base_agent.condenser
Collaborator

OpenHands-GPT-5.4 here. This is the part that worries me most: from_agent() + apply_to_agent() read like a general SDK round-trip API, but the implementation only preserves the default built-in condenser/critic story. LLMSummarizingCondenser is special-cased, non-APIBasedCritic critics become None, and future/custom components will also get flattened. In an app that can be acceptable if the scope is explicit; in an SDK this makes the official config path lossy. I'd either make this adapter explicitly first-party/default-client only, or model condenser/critic kinds so the round-trip stays lossless.
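One way to make the round-trip lossless, as suggested, is a discriminated union over condenser kinds. This is an illustrative Pydantic sketch with hypothetical class names, not the SDK's actual types:

```python
# Sketch: a tagged union preserves WHICH condenser was configured, so the
# settings round-trip can rebuild it instead of special-casing one built-in.
from typing import Literal, Union

from pydantic import BaseModel, Field


class LLMSummarizingCondenserSettings(BaseModel):
    kind: Literal["llm_summarizing"] = "llm_summarizing"
    max_size: int = 80


class PassthroughCondenserSettings(BaseModel):
    kind: Literal["passthrough"] = "passthrough"


CondenserSettings = Union[
    LLMSummarizingCondenserSettings, PassthroughCondenserSettings
]


class AgentSettings(BaseModel):
    # The "kind" discriminator survives model_dump()/model_validate(), so
    # nothing gets flattened to a single default implementation.
    condenser: CondenserSettings = Field(discriminator="kind")


settings = AgentSettings(condenser=PassthroughCondenserSettings())
restored = AgentSettings.model_validate(settings.model_dump())
assert isinstance(restored.condenser, PassthroughCondenserSettings)
```

Adding a new condenser then means adding one settings class to the union, rather than growing the special-case logic in from_agent()/apply_to_agent().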

class SettingsFieldMetadata(BaseModel):
    label: str
    order: int
    widget: Literal["text", "password", "number", "boolean", "select"] | None = None
Collaborator

OpenHands-GPT-5.4 here. widget, advanced, placeholder, and slash_command feel like presentation policy rather than SDK semantics. The SDK should probably export neutral config/schema facts (type, optionality, choices, secret-ness, dependencies), while CLI/GUI decide how to group/render/expose them. Otherwise one first-party UX becomes the contract for every client.

model: str = Field(default="claude-sonnet-4-20250514", description="Model name.")
api_key: str | SecretStr | None = Field(default=None, description="API key.")
base_url: str | None = Field(default=None, description="Custom base URL.")
model: str = Field(
Collaborator

OpenHands-GPT-5.4 here. One extra architectural concern tied to #2274: LLM is already carrying config, provider-specific logic, formatting, execution, retries, metrics, telemetry, and runtime state. Attaching first-party settings metadata directly to LLM fields pushes even more responsibility into the same god class. Even if the local duplication win makes sense, I'd much rather see the schema source be a dedicated config/profile object so the refactor direction is toward separation of responsibilities, not further entrenching them.

Development

Successfully merging this pull request may close these issues.

Prepare settings for ingestion to programmatic settings menu in CLI and GUI

4 participants