DRAFT: feat: add execute_tool API endpoint for direct tool execution on conversations #2356

Draft
xingyaoww wants to merge 2 commits into main from feat/execute-tool-api

@xingyaoww (Collaborator) commented Mar 7, 2026

Description

Add the ability to execute tools (like the terminal) directly on a conversation without going through the agent loop. This enables pre-run setup operations like running .openhands/setup.sh through the agent's persistent terminal session so environment changes persist.

Problem

In V1, setup.sh was executed via AsyncRemoteWorkspace.execute_command() which runs commands in an ephemeral subprocess — environment changes (exported variables, sourced files, PATH modifications, etc.) don't persist to the agent's terminal session. This made setup.sh effectively broken in V1 compared to V0.
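The difference between an ephemeral subprocess and a persistent session can be reproduced with the standard library alone. The sketch below is an illustration, not the SDK's code: `run_ephemeral`, `PersistentShell`, and the variable `SETUP_SH_DEMO_VAR` are hypothetical names, and it assumes a POSIX `sh` is on the PATH.

```python
import subprocess


def run_ephemeral(command: str) -> str:
    """Run a command in a fresh shell; its environment dies with the process."""
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


class PersistentShell:
    """One long-lived shell: state set by one command is visible to the next."""

    _MARKER = "__COMMAND_DONE__"

    def __init__(self) -> None:
        self._proc = subprocess.Popen(
            ["sh"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered so each command is delivered promptly
        )

    def run(self, command: str) -> str:
        assert self._proc.stdin is not None and self._proc.stdout is not None
        # Echo a marker after the command so we know where its output ends.
        # (Simplification: assumes the command's output ends with a newline.)
        self._proc.stdin.write(f"{command}\necho {self._MARKER}\n")
        self._proc.stdin.flush()
        lines = []
        while True:
            line = self._proc.stdout.readline()
            if not line or line.strip() == self._MARKER:
                break
            lines.append(line.rstrip("\n"))
        return "\n".join(lines)

    def close(self) -> None:
        assert self._proc.stdin is not None and self._proc.stdout is not None
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


# Ephemeral: the export is lost as soon as the first subprocess exits.
run_ephemeral("export SETUP_SH_DEMO_VAR=1")
ephemeral_value = run_ephemeral('echo "${SETUP_SH_DEMO_VAR:-unset}"')

# Persistent: the second command sees what the first one exported.
shell = PersistentShell()
shell.run("export SETUP_SH_DEMO_VAR=1")
persistent_value = shell.run('echo "${SETUP_SH_DEMO_VAR:-unset}"')
shell.close()
```

Here `ephemeral_value` ends up as `unset` while `persistent_value` is `1`, which is the behavior gap the description attributes to running setup.sh outside the agent's terminal session.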

Solution

Add a new POST /api/conversations/{id}/execute_tool endpoint that allows executing a tool directly on a conversation. The endpoint:

  1. Lazy-initializes the agent and its tools (via _ensure_agent_ready())
  2. Validates the action against the tool's schema
  3. Executes through LocalConversation.execute_tool()
  4. Returns the observation as JSON
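The four steps above can be sketched in plain Python. This is a simplified model under stated assumptions, not the PR's implementation: `ToolSketch`, `ConversationSketch`, the `required_fields` check, and the `{"observation": ...}` response shape are all hypothetical stand-ins for the real agent, tool schema, and response model.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable


class ConversationNotFound(Exception):
    """Would map to HTTP 404 in a real router."""


class InvalidToolRequest(Exception):
    """Would map to HTTP 400 in a real router."""


@dataclass
class ToolSketch:
    required_fields: set[str]  # stand-in for a real action schema
    run: Callable[[dict[str, Any]], dict[str, Any]]


@dataclass
class ConversationSketch:
    tools: dict[str, ToolSketch] = field(default_factory=dict)
    ready: bool = False

    def ensure_agent_ready(self) -> None:
        # Step 1: lazily create the tools the first time they are needed.
        if not self.ready:
            self.tools["terminal"] = ToolSketch(
                required_fields={"command"},
                run=lambda action: {"output": f"ran: {action['command']}"},
            )
            self.ready = True


def execute_tool(
    conversations: dict[str, ConversationSketch],
    conversation_id: str,
    tool_name: str,
    action: dict[str, Any],
) -> str:
    conversation = conversations.get(conversation_id)
    if conversation is None:
        raise ConversationNotFound(conversation_id)
    conversation.ensure_agent_ready()

    tool = conversation.tools.get(tool_name)
    if tool is None:
        available = ", ".join(sorted(conversation.tools))
        raise InvalidToolRequest(f"unknown tool {tool_name!r}; available: {available}")

    # Step 2: validate the action before running anything.
    missing = tool.required_fields - set(action)
    if missing:
        raise InvalidToolRequest(f"missing fields: {sorted(missing)}")

    # Step 3: execute directly, bypassing the agent loop.
    observation = tool.run(action)

    # Step 4: return the observation as JSON.
    return json.dumps({"observation": observation})


conversations = {"abc": ConversationSketch()}
response = execute_tool(conversations, "abc", "terminal", {"command": "echo hi"})
```

Keeping "conversation not found" and "bad tool or action" as separate exception types makes the mapping to the 404 and 400 responses a one-line decision in the router.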

Changes

Agent Server:

  • models.py: Added ExecuteToolRequest / ExecuteToolResponse models
  • event_service.py: Added EventService.execute_tool() — runs a tool through LocalConversation.execute_tool() in a thread executor
  • conversation_service.py: Added ConversationService.execute_tool() — delegates to EventService
  • conversation_router.py: Added POST /conversations/{id}/execute_tool endpoint with 404/400 error handling
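Running a blocking tool call in a thread executor keeps the server's event loop responsive while the tool works. A minimal sketch of that pattern with asyncio follows; `blocking_execute_tool` is a hypothetical stand-in for the synchronous LocalConversation.execute_tool() call, and the real method signatures may differ.

```python
import asyncio
import time
from functools import partial
from typing import Any


def blocking_execute_tool(tool_name: str, action: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for a synchronous, potentially slow tool call."""
    time.sleep(0.05)  # simulate a command that takes a while
    return {"tool": tool_name, "action": action}


async def execute_tool_async(tool_name: str, action: dict[str, Any]) -> dict[str, Any]:
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor; the coroutine
    # suspends here, so other requests can be served in the meantime.
    return await loop.run_in_executor(
        None, partial(blocking_execute_tool, tool_name, action)
    )


result = asyncio.run(execute_tool_async("terminal", {"command": "echo hi"}))
```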

SDK:

  • remote_conversation.py: Implemented RemoteConversation.execute_tool() (was NotImplementedError) — calls the new server endpoint
  • async_remote_workspace.py: Added AsyncRemoteWorkspace.execute_tool() convenience method
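From the client side, calling the endpoint amounts to a single POST. The sketch below only builds the request with the standard library; the body field names `tool_name` and `action` are assumptions for illustration, since the actual ExecuteToolRequest schema is defined in models.py and is not shown in this description. The port matches the `docker run` mapping used elsewhere in this PR.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import Request


def build_execute_tool_request(
    base_url: str, conversation_id: str, tool_name: str, action: dict[str, Any]
) -> Request:
    """Build (but do not send) a POST to the execute_tool endpoint."""
    url = (
        f"{base_url.rstrip('/')}/api/conversations/"
        f"{quote(conversation_id, safe='')}/execute_tool"
    )
    # Hypothetical payload shape; check ExecuteToolRequest for the real fields.
    body = json.dumps({"tool_name": tool_name, "action": action}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_execute_tool_request(
    "http://localhost:8000",
    "abc",
    "terminal",
    {"command": "source .openhands/setup.sh"},
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running agent server; per the description, a 404 indicates an unknown conversation and a 400 an invalid tool or action.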

Companion PR

This PR is used by the OpenHands companion PR, OpenHands/OpenHands (feat/setup-sh-via-terminal-tool), which restructures the V1 startup flow to run setup.sh through the terminal tool.

Testing

  • All modified files parse correctly
  • No breaking changes to existing APIs

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:54b9759-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-54b9759-python \
  ghcr.io/openhands/agent-server:54b9759-python

All tags pushed for this build

ghcr.io/openhands/agent-server:54b9759-golang-amd64
ghcr.io/openhands/agent-server:54b9759-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:54b9759-golang-arm64
ghcr.io/openhands/agent-server:54b9759-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:54b9759-java-amd64
ghcr.io/openhands/agent-server:54b9759-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:54b9759-java-arm64
ghcr.io/openhands/agent-server:54b9759-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:54b9759-python-amd64
ghcr.io/openhands/agent-server:54b9759-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:54b9759-python-arm64
ghcr.io/openhands/agent-server:54b9759-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:54b9759-golang
ghcr.io/openhands/agent-server:54b9759-java
ghcr.io/openhands/agent-server:54b9759-python

About Multi-Architecture Support

  • Each variant tag (e.g., 54b9759-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 54b9759-python-amd64) are also available if needed

…ersations

Add the ability to execute tools (like the terminal) directly on a
conversation without going through the agent loop. This enables pre-run
setup operations like running .openhands/setup.sh through the agent's
persistent terminal session so environment changes persist.

Changes:
- Add ExecuteToolRequest/Response models
- Add EventService.execute_tool() method
- Add ConversationService.execute_tool() method
- Add POST /conversations/{id}/execute_tool endpoint
- Implement RemoteConversation.execute_tool() (was NotImplementedError)
- Add AsyncRemoteWorkspace.execute_tool() convenience method

Co-authored-by: openhands <openhands@all-hands.dev>

@github-actions bot (Contributor) commented Mar 7, 2026

API breakage checks (Griffe)

Result: Passed


@github-actions bot (Contributor) commented Mar 7, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Failed

Log excerpt (first 1000 characters)
{"asctime": "2026-03-10 16:52:25,688", "levelname": "WARNING", "name": "openhands.agent_server.config", "filename": "config.py", "lineno": 173, "message": "\u26a0\ufe0f OH_SECRET_KEY was not defined. Secrets will not be persisted between restarts."}
::error title=openhands-agent-server REST API::Breaking REST API change detected without MINOR version bump (1.12.0 -> 1.12.0).

Breaking REST API changes detected compared to baseline release:
- added '#/components/schemas/HookExecutionEvent' to the '/items/anyOf[#/components/schemas/Event]/' response property 'oneOf' list for the response status '200'
- added '#/components/schemas/HookExecutionEvent' to the 'items/items/' response property 'oneOf' list for the response status '200'
- added '#/components/schemas/HookExecutionEvent' to the response body 'oneOf' list for the response status '200'
- the 'file' request property type/format changed from 'string'/'' to 'string'/'binary'
- added the new 'hook' enum value to the '/items/anyOf[#/co
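
The rule behind this failure, as the error message states it, is that a breaking REST API change needs a MINOR version bump relative to the baseline release. A rough sketch of such a gate follows; the function names are hypothetical, and treating a MAJOR bump as also sufficient is an assumption rather than something the log confirms.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def breaking_change_allowed(baseline: str, current: str) -> bool:
    """Allow a breaking change only if (major, minor) increased."""
    return parse_version(current)[:2] > parse_version(baseline)[:2]


# The situation reported above: same version on both sides.
allowed = breaking_change_allowed("1.12.0", "1.12.0")
```

With the versions from the log (1.12.0 -> 1.12.0), `allowed` is `False`, which corresponds to the failed check.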


@xingyaoww (Collaborator, Author)

@OpenHands please fix https://github.com/OpenHands/software-agent-sdk/actions/runs/22794005415/job/66125654112?pr=2356 and https://github.com/OpenHands/software-agent-sdk/actions/runs/22794005420/job/66125654134?pr=2356

Then help me review the PR and test whether it works. When you are done, upload artifacts to the .pr/ folder and push to the PR.

@openhands-ai bot commented Mar 10, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

- Fix pyright errors in event_service.py by capturing _conversation
  in a local variable before the closure (type narrowing)
- Fix Observation ABC instantiation in remote_conversation.py by
  using a concrete _RemoteObservation subclass
- Update test to verify new execute_tool implementation behavior
- Fix line-too-long issues in docstrings and field descriptions

Co-authored-by: openhands <openhands@all-hands.dev>

@github-actions (Contributor)

📁 PR Artifacts Notice

This PR contains a .pr/ directory with PR-specific documents. This directory will be automatically removed when the PR is approved.

For fork PRs: Manual removal is required before merging.

@openhands-ai bot commented Mar 10, 2026

Summary of Work Done

I fixed both failing CI checks on PR #2356, reviewed the PR, functionally tested the new endpoint, and pushed everything (including .pr/ artifacts) to the feat/execute-tool-api branch.

CI Fixes (commit 045262b5, already pushed)

  1. Pyright type errors in event_service.py — Captured self._conversation in a local variable conversation before the nested _execute() closure, so pyright can narrow the type from LocalConversation | None to LocalConversation.

  2. Observation ABC instantiation in remote_conversation.py: Observation is abstract and can't be instantiated directly. Created a private concrete _RemoteObservation(Observation) subclass inside execute_tool().

  3. Updated stale test — Replaced test_remote_conversation_execute_tool_not_implemented (which expected NotImplementedError) with test_remote_conversation_execute_tool that mocks the API response and verifies the observation is correctly parsed.

  4. Line-too-long fixes — Fixed E501 in event_service.py docstring and models.py field description.
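
The first two fixes follow common Python patterns, illustrated below with stand-in classes. None of these names are the SDK's real types: `ConversationStub`, `EventServiceSketch`, `ObservationBase`, and `RemoteObservationSketch` are hypothetical, and the real Observation class may define different members.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional


class ConversationStub:
    def execute_tool(self, name: str) -> str:
        return f"ran {name}"


class EventServiceSketch:
    def __init__(self, conversation: Optional[ConversationStub]) -> None:
        self._conversation = conversation

    def make_runner(self, name: str) -> Callable[[], str]:
        # Bind the attribute to a local first. A type checker can narrow a
        # local from Optional[...] after the None check, but it cannot assume
        # self._conversation stays non-None inside the nested function.
        conversation = self._conversation
        if conversation is None:
            raise ValueError("conversation is not active")

        def _execute() -> str:
            return conversation.execute_tool(name)

        return _execute


class ObservationBase(ABC):
    @abstractmethod
    def to_text(self) -> str:
        """Render the observation as text."""


class RemoteObservationSketch(ObservationBase):
    """Concrete subclass: instantiable because it implements to_text()."""

    def __init__(self, text: str) -> None:
        self._text = text

    def to_text(self) -> str:
        return self._text


runner = EventServiceSketch(ConversationStub()).make_runner("terminal")
runner_output = runner()

try:
    ObservationBase()  # abstract classes refuse direct instantiation
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

observation_text = RemoteObservationSketch("hello").to_text()
```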

Functional Testing

Started the agent server and verified:

  • ✅ Terminal tool execution returns correct output
  • ✅ Environment variables persist across calls (the core setup.sh use case)
  • ✅ Nonexistent tool → 400 with helpful error listing available tools
  • ✅ Nonexistent conversation → 404

Artifacts

Detailed review notes uploaded to .pr/review.md.

Checklist

  • Fix pre-commit CI failure (pyright errors)
  • Fix sdk-tests CI failure (test + ABC bug)
  • Review the PR code
  • Functionally test the endpoint
  • Upload artifacts to .pr/ folder
  • Push to the PR branch

All changes are already pushed to origin/feat/execute-tool-api.

@github-actions (Contributor)

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-agent-server/openhands/agent_server | | | | |
| conversation_router.py | 121 | 18 | 85% | 254, 312–315, 327–330, 351–352, 355–361 |
| conversation_service.py | 384 | 84 | 78% | 81–82, 109, 112, 201, 207, 215–216, 225–228, 237, 246, 268–269, 308, 311, 322–326, 328–331, 334–339, 419, 426–430, 433–434, 438–442, 445–446, 455–460, 464–468, 471–472, 478–483, 490–491, 495, 497–498, 503–504, 510–511, 518–519, 523–525, 543, 567, 797 |
| event_service.py | 334 | 95 | 71% | 55–56, 74–76, 85–89, 92–95, 115, 219, 236, 290–291, 295, 303, 306, 351–352, 354–355, 357, 359–363, 367–369, 371, 392–393, 409, 411, 415–417, 421, 430–431, 433, 437, 443, 445, 453–458, 594, 596–597, 601, 615–617, 619, 623–626, 630–633, 641–644, 664, 668–673, 685–686, 688–689, 696–697, 699–700, 704, 710, 727–728 |
| openhands-sdk/openhands/sdk/conversation/impl | | | | |
| remote_conversation.py | 603 | 104 | 82% | 127, 154, 167, 169–172, 182, 204–205, 210–213, 289, 299–301, 307, 348, 480–483, 485, 505–509, 514–517, 520, 532–536, 673–674, 678–679, 690, 709–710, 729, 740–741, 761–764, 766–767, 791–793, 796–800, 802–803, 807, 809–817, 819, 856, 983, 1051–1052, 1056, 1061–1065, 1071–1077, 1090–1091, 1167, 1174, 1180–1181, 1262–1263, 1277–1278 |
| openhands-sdk/openhands/sdk/workspace/remote | | | | |
| async_remote_workspace.py | 68 | 16 | 76% | 25–30, 132–134, 148–150, 175, 179, 184–185 |
| TOTAL | 20432 | 5235 | 74% | |
