fix(terminal): filter terminal query sequences from captured output by jpshackelford · Pull Request #2334 · OpenHands/software-agent-sdk

jpshackelford · 2026-03-06T01:03:48Z

Summary

Fixes #2244 - Filter terminal query sequences from captured PTY output to prevent visible escape code garbage when displayed.

Problem

When CLI tools like gh, npm, or other progress-indicator tools run inside the SDK's PTY, they send terminal query sequences as part of their spinner/progress UI:

\x1b[6n - DSR (Device Status Report) - cursor position query
\x1b]11;? - OSC 11 - background color query
\x1b[c - DA (Device Attributes) query

These queries get captured as part of the command output. When the output is later displayed to the user's terminal, the terminal processes these queries and responds, causing visible garbage like:

^[[38;1R^[]11;rgb:30fb/3708/41af^G

How to Reproduce

Run any SDK example that executes terminal commands
Execute a command that uses progress indicators: gh pr list --repo OpenHands/openhands
Observe escape code garbage appearing in the output or corrupting the shell prompt after exit

Visual Example

Before fix:

$ gh pr list
Fetching PRs...^[[6n^[]11;?
#123  Fix bug    main
^[[38;1R

After fix:

$ gh pr list
Fetching PRs...
#123  Fix bug    main

Root Cause Analysis

The escape codes are IN the captured PTY output stream, not generated by terminal responses to the SDK's own queries. When gh (or similar tools) runs:

gh sends \x1b[6n to query cursor position (for spinner positioning)
This query is written to the PTY's stdout
The SDK captures all PTY output, including the query
When displayed, the user's terminal sees the query and responds
The response appears as visible garbage
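
The chain above can be checked directly on a captured buffer. The following is a minimal sketch, not code from this PR: `find_queries` and `QUERY_NAMES` are hypothetical names, and the sample bytes are modeled on the "Before fix" example.

```python
# Hypothetical helper: report which known query sequences sit in captured PTY bytes.
# Only the three query forms listed under "Problem" are covered here.
QUERY_NAMES = {
    b"\x1b[6n": "DSR (cursor position)",
    b"\x1b[c": "DA (device attributes)",
    b"\x1b]11;?": "OSC 11 (background color)",
}


def find_queries(captured: bytes) -> list:
    """Return the names of known query sequences present in the captured bytes."""
    return [name for seq, name in QUERY_NAMES.items() if seq in captured]


# Sample modeled on the "Before fix" output: the queries are part of the stream itself.
captured = b"Fetching PRs...\x1b[6n\x1b]11;?\x07\n#123  Fix bug    main\n"
print(find_queries(captured))
# ['DSR (cursor position)', 'OSC 11 (background color)']
```

If the list is non-empty, replaying the buffer to a real terminal would make that terminal answer the queries.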

Solution

Add filter_terminal_queries() to strip terminal query sequences from captured output before returning from the terminal tool. This removes the queries at the source, so the user's terminal never sees them.

Filtered sequences:

DSR (\x1b[6n) - cursor position query
OSC 10/11/4 (\x1b]10;?, \x1b]11;?, \x1b]4;?) - color queries
DA/DA2 (\x1b[c, \x1b[>c) - device attributes
DECRQSS (\x1bP$q...\x1b\\) - terminal state queries

Preserved sequences:

ANSI colors (\x1b[31m, \x1b[0m, etc.)
Cursor movement (\x1b[H, \x1b[5A, etc.)
Text formatting (bold, underline, etc.)
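
For illustration, a filter along these lines could look like the sketch below. It is an assumption-laden approximation, not the contents of escape_filter.py: the regexes and the name `filter_terminal_queries_sketch` are made up here, and the actual patterns in the PR may differ.

```python
import re

# Hypothetical patterns; the regexes actually used in the PR are not shown in this thread.
_QUERY_PATTERNS = [
    re.compile(rb"\x1b\[6n"),                                        # DSR
    re.compile(rb"\x1b\](?:10|11|4(?:;\d+)?);\?(?:\x07|\x1b\\)"),    # OSC 10/11/4 queries
    re.compile(rb"\x1b\[>?0?c"),                                     # DA / DA2
    re.compile(rb"\x1bP\$q[^\x1b]*\x1b\\"),                          # DECRQSS
]


def filter_terminal_queries_sketch(output: str) -> str:
    """Strip query sequences; leave colors, cursor movement and formatting alone."""
    data = output.encode("utf-8", errors="surrogateescape")
    for pattern in _QUERY_PATTERNS:
        data = pattern.sub(b"", data)
    return data.decode("utf-8", errors="surrogateescape")


sample = "\x1b[32mSuccess\x1b[0m\x1b[6n some text \x1b]11;?\x07\x1b[5A"
print(repr(filter_terminal_queries_sketch(sample)))
```

The color codes and the `\x1b[5A` cursor movement pass through unchanged because none of the patterns match sequences ending in `m` or `A`.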

Testing

# Run the unit tests
uv run pytest tests/tools/terminal/test_escape_filter.py -v

# Manual test with gh command
uv run python -c "
from openhands.tools.terminal.utils.escape_filter import filter_terminal_queries

# Simulated output with embedded queries
output = '\x1b[32mSuccess\x1b[0m\x1b[6n some text \x1b]11;?\x07'
filtered = filter_terminal_queries(output)
print(repr(filtered))
# Output: '\x1b[32mSuccess\x1b[0m some text '
"

Files Changed

File	Change
`openhands-tools/.../utils/escape_filter.py`	NEW - Filter implementation (83 lines)
`openhands-tools/.../utils/__init__.py`	NEW - Export function
`openhands-tools/.../terminal_session.py`	Apply filter in `_get_command_output()`
`tests/tools/terminal/test_escape_filter.py`	NEW - 15 unit tests

Design Decisions

Filter at source: Apply filter where output is captured (terminal tool), not where it's displayed. This is simpler and more reliable.
Byte-level regex: Use compiled regex patterns on bytes for accurate escape sequence matching.
Preserve formatting: Only remove query sequences that trigger responses; keep colors and cursor movement intact.
Minimal scope: This fix targets only the terminal tool output processing - no SDK-level changes needed.

Fixes: #2244

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:39dc01c-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-39dc01c-python \
  ghcr.io/openhands/agent-server:39dc01c-python

All tags pushed for this build

ghcr.io/openhands/agent-server:39dc01c-golang-amd64
ghcr.io/openhands/agent-server:39dc01c-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:39dc01c-golang-arm64
ghcr.io/openhands/agent-server:39dc01c-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:39dc01c-java-amd64
ghcr.io/openhands/agent-server:39dc01c-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:39dc01c-java-arm64
ghcr.io/openhands/agent-server:39dc01c-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:39dc01c-python-amd64
ghcr.io/openhands/agent-server:39dc01c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:39dc01c-python-arm64
ghcr.io/openhands/agent-server:39dc01c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:39dc01c-golang
ghcr.io/openhands/agent-server:39dc01c-java
ghcr.io/openhands/agent-server:39dc01c-python

About Multi-Architecture Support

Each variant tag (e.g., 39dc01c-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 39dc01c-python-amd64) are also available if needed

Filter terminal query sequences (DSR, OSC, DA, etc.) from captured PTY output before returning from terminal tool. These queries cause the terminal to respond when displayed, producing visible escape code garbage.

Root cause: CLI tools like `gh` send terminal queries as part of their progress/spinner UI. When captured and displayed, the terminal processes them and responds, causing visible garbage like `^[[38;1R`.

Solution: Add filter_terminal_queries() to strip query sequences while preserving legitimate formatting codes (colors, bold, etc.).

Fixes: #2244

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-03-06T01:04:17Z

API breakage checks (Griffe)

Result: Passed

github-actions · 2026-03-06T01:04:27Z

Agent server REST API breakage checks (OpenAPI)

Result: Failed

Log excerpt (first 1000 characters)

{"asctime": "2026-03-06 12:25:29,963", "levelname": "WARNING", "name": "openhands.agent_server.config", "filename": "config.py", "lineno": 173, "message": "\u26a0\ufe0f OH_SECRET_KEY was not defined. Secrets will not be persisted between restarts."}
::error title=openhands-agent-server REST API::Breaking REST API change detected without MINOR version bump (1.12.0 -> 1.12.0).

Breaking REST API changes detected compared to baseline release:
- the 'file' request property type/format changed from 'string'/'' to 'string'/'binary'
/home/runner/work/software-agent-sdk/software-agent-sdk/.venv/lib/python3.13/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:66: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()

all-hands-bot

🟡 Acceptable - Requires Eval Verification

Taste Rating: Code is clean and solves a real problem. However, this touches terminal output handling and needs eval verification before approval per repo guidelines.

Assessment:

✅ Solves real problem (visible escape code garbage)
✅ Simple, targeted solution (filter at source)
✅ Comprehensive tests with good coverage
⚠️ Touches terminal/stdout handling → flag for lightweight evals

The implementation is solid. Regex patterns are compiled, tests verify both removal and preservation, and the fix is applied at the right layer. Once evals confirm no regressions, this is ready to merge.

openhands-tools/openhands/tools/terminal/utils/escape_filter.py

openhands-tools/openhands/tools/terminal/terminal/terminal_session.py

github-actions · 2026-03-06T01:11:12Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-tools/openhands/tools/terminal/terminal
terminal_session.py	189	66	65%	93, 99, 103–105, 132–133, 165, 180–181, 220–222, 227, 230–231, 235, 241, 244, 259–261, 266, 269–270, 274, 280, 283, 303, 305, 308, 310, 326, 341, 347, 356, 359, 393, 397, 400, 403–404, 410–411, 417, 420, 427–428, 434–435, 494–496, 500, 505, 510–511, 515–516, 519–522, 528–529, 532
TOTAL	19505	9873	49%

Change the OSC filter pattern from matching specific codes (10, 11, 4) to matching any OSC query (sequences ending with ;? before terminator). This is more future-proof and catches additional query types like:

- OSC 12 (cursor color)
- OSC 17 (highlight background)
- Any other OSC queries that follow the standard format

The pattern now matches: ESC ] Ps [;param] ;? TERMINATOR

Where ;? indicates it's a query, not a set operation.

Importantly, SET operations are preserved:

- OSC 0 (window title)
- OSC 8 (hyperlinks)
- OSC 7 (working directory)

Co-authored-by: openhands <openhands@all-hands.dev>

jpshackelford · 2026-03-06T01:14:43Z

Good catch! I've updated the OSC pattern to be more general.

Before: Matched only OSC codes 10, 11, 4
After: Matches any OSC sequence ending with ;? (the query marker)

The new pattern: ESC ] Ps [;param] ;? TERMINATOR

This catches all OSC queries:

✅ OSC 10/11 (fg/bg color)
✅ OSC 4 (palette)
✅ OSC 12 (cursor color)
✅ OSC 17 (highlight background)
✅ Any future OSC query types

While preserving SET operations:

✅ OSC 0 (window title) - no ;? = preserved
✅ OSC 8 (hyperlinks) - no ;? = preserved
✅ OSC 7 (working directory) - no ;? = preserved
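
A regex of roughly this shape would express the pattern described above. This is a sketch written for this thread, not the pattern from commit a499579; `OSC_QUERY` is a hypothetical name, and BEL or ST (ESC \) is assumed as the terminator.

```python
import re

# ESC ] Ps [;param] ;? TERMINATOR, with TERMINATOR assumed to be BEL or ST (ESC \).
# Hypothetical pattern; the one in the PR may be written differently.
OSC_QUERY = re.compile(rb"\x1b\]\d+(?:;[^;\x07\x1b]*)?;\?(?:\x07|\x1b\\)")

queries = [
    b"\x1b]11;?\x07",      # OSC 11 background color query, BEL-terminated
    b"\x1b]4;1;?\x1b\\",   # OSC 4 palette entry 1 query, ST-terminated
    b"\x1b]12;?\x07",      # OSC 12 cursor color query
]
sets = [
    b"\x1b]0;My title\x07",               # OSC 0 window title (set)
    b"\x1b]8;;https://example.com\x1b\\",  # OSC 8 hyperlink (set)
    b"\x1b]7;file://host/tmp\x07",         # OSC 7 working directory (set)
]

for seq in queries:
    print(repr(seq), "->", repr(OSC_QUERY.sub(b"", seq)))  # each becomes b''
for seq in sets:
    print(repr(seq), "kept:", OSC_QUERY.sub(b"", seq) == seq)  # each stays intact
```

One caveat of keying on `;?`: a set operation whose payload happens to end in `;?` right before the terminator would also be removed.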

Added 5 new tests to verify the behavior. See commit a499579.

Adds .pr/test_real_world.py that runs an agent with the gh command to verify terminal query sequences are properly filtered.

Usage:

LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" LLM_API_KEY="$LLM_API_KEY" \
    uv run python .pr/test_real_world.py

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-03-06T01:25:14Z

📁 PR Artifacts Notice

This PR contains a .pr/ directory with PR-specific documents. This directory will be automatically removed when the PR is approved.

For fork PRs: Manual removal is required before merging.

jpshackelford · 2026-03-06T01:28:03Z

Manual Testing Instructions

A real-world test script is available in .pr/test_real_world.py to verify the fix works correctly.

How to Run

# Clone and checkout the branch
git fetch origin fix/terminal-escape-filter-minimal
git checkout fix/terminal-escape-filter-minimal

# Run the test (uses All-Hands LLM proxy)
LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" LLM_API_KEY="$LLM_API_KEY" \
    uv run python .pr/test_real_world.py

What the Test Does

Creates an agent with the terminal tool
Asks it to run gh pr list --repo OpenHands/openhands --limit 3
The gh command sends terminal query sequences (DSR, OSC) as part of its spinner UI
With the fix, these queries are filtered from the captured output

Success Criteria

✅ Pass if:

NO visible escape codes like ^[[38;1R or rgb:30fb/3708/41af in the output
NO garbage appears on your shell prompt after the script exits
Colors from gh output are still visible (formatting preserved)

❌ Fail if:

You see raw escape sequences in the terminal output
Garbage characters appear after the script completes

Without the Fix (for comparison)

To see the problem on main:

git checkout main
LLM_BASE_URL="https://llm-proxy.eval.all-hands.dev" LLM_API_KEY="$LLM_API_KEY" \
    uv run python .pr/test_real_world.py

You should see escape code garbage like ^[[6n or ]11;? in the output.

jpshackelford · 2026-03-06T11:30:36Z

I have confirmed that this solution works. Note that there is a change in what the agent displays. Since we filter out OSC queries, the CLI does not render the gh spinner. I think this is an acceptable limitation. (Note that we aren't preventing the spinner from displaying, but when gh doesn't get back terminal query results, it elects not to display the spinner.)

Why the `gh` Spinner Doesn't Render Properly

The spinner animation in gh (and similar CLI tools) relies on terminal query sequences to function correctly. Here's why:

How Spinners Work

Query cursor position: The spinner sends \x1b[6n (DSR) to ask "where is my cursor?"
Receive response: The terminal responds with \x1b[row;colR
Overwrite in place: Using the cursor position, the spinner moves back and overwrites itself with the next frame (⣾ → ⣽ → ⣻ → etc.)
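
The response in the second step has a fixed shape, which is also why the leaked garbage looks like `^[[38;1R`. A small sketch of parsing it follows; `parse_cursor_report` is a hypothetical helper written for illustration, not something from gh or the SDK.

```python
import re
from typing import Optional, Tuple

# CPR (cursor position report), the reply to DSR: ESC [ row ; col R
_CPR = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cursor_report(response: str) -> Optional[Tuple[int, int]]:
    """Return (row, col) from a DSR reply, or None if the string is not one."""
    match = _CPR.fullmatch(response)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_cursor_report("\x1b[38;1R"))  # (38, 1)
print(parse_cursor_report("\x1b[31m"))    # None
```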

What Our Filter Does

We filter out DSR queries (\x1b[6n) from the captured output because when they're displayed to the user's terminal, that terminal responds - and the response becomes visible garbage.

The Consequence

Without the cursor position query reaching the terminal:

The spinner never learns where it is
It can't move back to overwrite itself
Each spinner frame may appear on a new line, or the spinner may not animate at all

Why This Is Acceptable

The command still works - gh pr list executes correctly and returns results
Actual output is preserved - The PR list, colors, and formatting are intact
Agent context - Spinners are for human feedback during waits; the agent doesn't need visual progress indicators
The alternative is worse - Without filtering, you get ^[[6n^[[38;1R garbage polluting the output

Why Filter the Query, Not the Response?

The Response Problem

When a terminal query like \x1b[6n is displayed, the user's terminal:

Processes the query
Writes its response to stdin (e.g., \x1b[38;1R)

Filtering the response would require:

Monitoring stdin continuously - Responses arrive asynchronously, potentially long after we've returned output to the agent. We'd need to constantly drain stdin throughout the entire session.
Distinguishing responses from user input - If a user types while the agent is running, their keystrokes arrive on stdin too. How do we know \x1b[A is a terminal response vs. the user pressing the up arrow? We risk eating legitimate input.
Racing against echo - By the time the response arrives on stdin, the terminal may have already echoed it to the display. The visible garbage (^[[38;1R) appears because the terminal echoes the response before we can intercept it. Filtering stdin doesn't prevent the visual pollution - the damage is already done.
Complex terminal mode manipulation - Reliably reading stdin without blocking, while preserving terminal state and not corrupting user input, across different platforms, is hard. This is the path the original PR #2245 (fix(terminal): filter terminal query sequences from captured output) went down with flush_stdin(): 700+ lines of complexity, and it did not work reliably without the OSC filtering.

Why Filtering Queries Is Better

Single point of control - Filter in _get_command_output() before output is returned
No response is ever generated - If the query never reaches the display terminal, there's nothing to clean up
No stdin complexity - No terminal modes, no race conditions, no risk of eating user input
Deterministic and testable - Simple regex on captured output

jpshackelford · 2026-03-06T11:39:45Z

I think this is ready, except that we should probably test the CLI built against this version of the SDK to ensure that our approach here doesn't interfere with the TUI.

Perhaps the best course is to open a PR that will build the CLI against this branch of the SDK and recruit some users to use it for a day or two.

This CLI build uses the software-agent-sdk branch from PR #2334 which includes the terminal escape filter fix for tools like gh, npm that use spinner/progress UI.

SDK PR: OpenHands/software-agent-sdk#2334

Co-authored-by: openhands <openhands@all-hands.dev>

jpshackelford · 2026-03-06T12:23:59Z

It looks like testing this in the CLI is blocked until the breaking change in #2133 is dealt with in the CLI, unless we rebase this fix branch on v1.11.5.

jpshackelford · 2026-03-09T17:58:42Z

OpenHands/OpenHands-CLI#587 was merged and should address the blocker on testing with OpenHands-CLI.

enyst · 2026-03-09T18:04:48Z

@OpenHands Do a /codereview-roasted on this PR. Publish your review feedback as review on the PR, using your appropriate event in gh api. (not a comment, a review, you are allowed to review)

openhands-ai · 2026-03-09T18:05:37Z

I'm on it! enyst can track my progress at all-hands.dev

enyst

🔴 Needs improvement

This is aimed at a real bug and the implementation stays pleasantly small, but there are two correctness holes here: the fix only applies to output coming back through TerminalSession, and the filter is stateless so split escape sequences can still leak through incremental updates. Since this also touches terminal/stdout handling, I’d want lightweight eval coverage after those are addressed.

Verdict: not ready as the claimed fix for #2244 yet.

Key insight: sanitizing after the stream has already been split into per-observation chunks is too late; either the sanitization needs carry-over state, or part of the fix has to live at the actual SDK terminal boundary rather than only in the terminal tool output path.

enyst · 2026-03-09T18:09:01Z

openhands-tools/openhands/tools/terminal/terminal/terminal_session.py

+
+        # Filter terminal query sequences that would cause the terminal to
+        # respond when displayed, producing visible garbage
+        command_output = filter_terminal_queries(command_output)


This only sanitizes bytes that flow back through the terminal tool. Issue #2244 also reproduces when the SDK process itself emits terminal queries (the minimal repro in the issue does that directly, and Rich capability detection is another example). Those paths never touch TerminalSession, so this patch doesn’t actually close the full bug it claims to fix. Either narrow the scope to PTY-emitted queries only, or handle the SDK-side leak at the conversation/visualizer boundary too.

enyst · 2026-03-09T18:09:01Z

openhands-tools/openhands/tools/terminal/utils/escape_filter.py

+    # Convert to bytes for regex matching (escape sequences are byte-level)
+    output_bytes = output.encode("utf-8", errors="surrogateescape")
+
+    # Remove each type of query sequence


This filter is stateless, which means a query split across observations survives unchanged. For example, if one update ends with \u001b]11; and the next starts with ?\u0007, neither call matches, but the client still receives the full OSC query once those chunks are rendered in sequence. That’s not a theoretical edge case here because long-running commands are surfaced incrementally. The fix needs carry-over state for incomplete escape sequences, or it needs to run before the output is sliced into deltas.
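
One way to add the carry-over state described here is to hold back any unterminated escape sequence at the end of a chunk and prepend it to the next one. The sketch below assumes byte chunks arrive in order; `StreamingQueryFilter` and both regexes are hypothetical, cover only DSR, DA and OSC queries, and are not taken from the PR.

```python
import re

# Hypothetical patterns for illustration (DSR, DA/DA2, generic OSC query).
_QUERY = re.compile(
    rb"\x1b\[6n"
    rb"|\x1b\[>?0?c"
    rb"|\x1b\]\d+(?:;[^;\x07\x1b]*)?;\?(?:\x07|\x1b\\)"
)
# A trailing ESC, an unfinished CSI, or an unfinished OSC (capped at 64 bytes).
_INCOMPLETE_TAIL = re.compile(
    rb"\x1b(?:\[[0-9;?>]*|\][^\x07\x1b]{0,64}\x1b?)?\Z"
)


class StreamingQueryFilter:
    """Filter query sequences even when they are split across chunks."""

    def __init__(self) -> None:
        self._carry = b""

    def feed(self, chunk: bytes) -> bytes:
        # Re-attach whatever was held back from the previous chunk.
        data = _QUERY.sub(b"", self._carry + chunk)
        tail = _INCOMPLETE_TAIL.search(data)
        if tail:
            # Hold back the unfinished sequence until more bytes arrive.
            self._carry = data[tail.start():]
            return data[:tail.start()]
        self._carry = b""
        return data

    def flush(self) -> bytes:
        """Return any held-back bytes at end of stream."""
        rest, self._carry = self._carry, b""
        return rest


f = StreamingQueryFilter()
print(f.feed(b"Fetching\x1b]11;"))  # b'Fetching'
print(f.feed(b"?\x07 done"))        # b' done'
```

The trade-off is that output ending in a partial sequence is delayed by one update, and `flush()` has to be called when the command finishes so held-back bytes are not lost.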

openhands-ai · 2026-03-09T18:09:41Z

Final summary:

Reviewed PR fix(terminal): filter terminal query sequences from captured output #2334 against main, the PR description, and linked issue #2244.
Submitted a GitHub PR review using the proper COMMENT review event via gh api (not a regular comment).
Left two substantive inline concerns:
1. the fix only sanitizes output in TerminalSession, so it does not fully cover the SDK-side leakage described in #2244
2. the filter is stateless, so escape sequences split across incremental output chunks can still leak through

Checklist:

Performed a roasted code review of the PR
Published feedback as a review, not a comment
Focused on important correctness issues only
Made no code changes
No extraneous local modifications to push

Because I did not change repository files, there was nothing to push to the remote branch.

all-hands-bot reviewed Mar 6, 2026

jpshackelford mentioned this pull request Mar 6, 2026

fix(terminal): filter terminal query sequences from captured output #2245

Open

Merge branch 'main' into fix/terminal-escape-filter-minimal

e167891

enyst mentioned this pull request Mar 6, 2026

Investigation: CLI compatibility and API breakage check for PR #2133 #2343

Closed

enyst requested a review from xingyaoww March 9, 2026 18:03

enyst reviewed Mar 9, 2026
