Builtin read tool panics on UTF-8 character boundary at 2000-byte truncation #21

@zoubo9034

Summary

The builtin read tool panics when byte index 2000 of the target file falls inside a multibyte UTF-8 character, so the truncation point is not a character boundary.

This is reproducible through the Python API with a completely generic local setup. It does not require MCP servers, project files, or private data.

Minimal repro

from pathlib import Path
from tempfile import TemporaryDirectory

from a3s_code import Agent

with TemporaryDirectory(prefix="a3s-read-repro-") as tmp_dir:
    workspace = Path(tmp_dir)
    boundary_file = workspace / "boundary.txt"

    # 1999 ASCII bytes + one 3-byte UTF-8 char + trailing ASCII.
    # This places byte index 2000 inside a multibyte code point.
    boundary_file.write_text("a" * 1999 + "频" + "z" * 20, encoding="utf-8")

    agent = Agent.create("/path/to/your/working/config.hcl")
    session = agent.session(str(workspace), permissive=True)

    session.tool("read", {"file_path": str(boundary_file)})

Steps to reproduce

  1. Create any valid local a3s-code config.
  2. Run the script above.
  3. Observe the builtin read tool panic.

Expected behavior

The read tool should either:

  • return a valid truncated UTF-8 string, or
  • return a normal tool error

but it should not panic the runtime.
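For illustration, the first option (a valid truncated UTF-8 string) could behave like the Python sketch below. This is not the tool's actual code, which lives in Rust in `read.rs`. The helper name `truncate_utf8` and the choice to back off to the previous character boundary are assumptions made for this example.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate text to at most max_bytes of UTF-8 without splitting a character.

    Hypothetical helper: assumes max_bytes is non-negative.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text

    # Continuation bytes look like 0b10xxxxxx. If the cut lands on one,
    # step back until it sits on the first byte of a character.
    cut = max_bytes
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")


# Same content as the repro file: the cut at byte 2000 lands inside "频",
# so the result stops just before that character.
truncated = truncate_utf8("a" * 1999 + "频" + "z" * 20, 2000)
```

With the repro content the result is the 1999 leading `a` characters. The partial character is dropped rather than emitted as invalid bytes.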

Actual behavior

The runtime panics with:

thread '<unnamed>' panicked at .../core/src/tools/builtin/read.rs:99:22:
byte index 2000 is not a char boundary; it is inside '频' (bytes 1999..2002)
...
pyo3_runtime.PanicException: byte index 2000 is not a char boundary; it is inside '频' ...
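The byte range in the panic message can be checked without a3s-code at all. The sketch below rebuilds the repro content in plain Python, locates the bytes occupied by `频`, and shows that cutting the encoded data at byte 2000 leaves an incomplete sequence. The variable names are for illustration only.

```python
content = "a" * 1999 + "频" + "z" * 20
data = content.encode("utf-8")

# Locate the multibyte character in the encoded data.
encoded_char = "频".encode("utf-8")
char_start = data.index(encoded_char)        # index of its first byte
char_end = char_start + len(encoded_char)    # one past its last byte

# Cutting at byte 2000 keeps only part of the character, so the prefix
# is no longer valid UTF-8 and decoding it fails.
try:
    data[:2000].decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False
```

Here `char_start` and `char_end` match the `1999..2002` range reported in the panic, and `cut_is_valid` is `False`.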

Notes

  • This appears to be a byte-slicing vs UTF-8 character-boundary bug in builtin read.
  • The issue is independent of my application code.
  • A temporary workaround on my side is to avoid feeding non-ASCII intermediate files to the builtin read tool.
