fix: UTF-8 panics + refactor: channel supervisor & routes split #193
Open
slysian wants to merge 5 commits into RightNow-AI:main
Replace all unsafe byte-index string slicing (`&s[..N]`) with `str::floor_char_boundary()` across 18 files in the runtime, kernel, api, memory, channels, and cli crates. Byte-index slices panic when the index falls inside a multi-byte UTF-8 character (common with CJK text from QQ/Telegram users). The crash was first observed when web_fetch returned Chinese web content and the truncation index landed inside a 3-byte character.

Affected hot paths:
- context_budget: tool result truncation
- context_overflow: overflow recovery truncation
- compactor: conversation text tail-keeping
- stream_chunker: forced break at max_chunk_chars
- web_fetch: HTTP response body truncation
- kernel: session topic & identity file truncation
- docker_sandbox: stdout/stderr truncation
- tool_runner: command/url logging, canvas_id slicing
- provider_health: error body truncation
- subprocess_sandbox: command logging
- session_repair: injection marker stripping (to_lowercase byte mismatch)
- cron/triggers: error message & content truncation
- session (memory): thinking text truncation
- TUI screens: ID/value display truncation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
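To illustrate the failure mode and the fix, here is a minimal sketch of boundary-safe truncation. The helper name `truncate_at_char_boundary` is hypothetical and does not come from the PR. It uses `str::is_char_boundary()` to do the same rounding-down that `str::floor_char_boundary()` performs, in case the toolchain at hand does not provide the latter.

```rust
/// Hypothetical helper (not from the PR): returns the longest prefix of `s`
/// that is at most `max_bytes` long and ends on a UTF-8 character boundary.
fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    // Walk backwards until the index lands on a character boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

With the input `"你好世界"` (four 3-byte characters), `&s[..4]` would panic because byte 4 is inside the second character, whereas the helper rounds down to byte 3 and returns `"你"`.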
Create `supervisor.rs` with centralized reconnection/backoff infrastructure and refactor 23 channel adapters to use it, eliminating ~10,000 lines of duplicated reconnection logic.

The supervisor module provides:
- `SupervisorConfig`: configurable initial/max backoff (default 1s/60s)
- `run_supervised_loop()`: generic supervised reconnection loop
- `run_supervised_loop_reset_on_connect()`: variant that resets backoff after successful connection
- `DEFAULT_CHANNEL_BUFFER` (256): shared constant replacing hardcoded sizes

Each adapter's inline reconnection loop is extracted into a standalone `async fn` that returns `Result<bool, String>`:
- `Ok(true)` = reconnect (transient failure)
- `Ok(false)` = permanent stop (shutdown or channel closed)
- `Err(msg)` = retry with backoff

Refactored adapters: telegram, discord, slack, irc, mattermost, revolt, matrix, mastodon, zulip, bluesky, twitch, guilded, gotify, gitter, discourse, ntfy, nostr, webex, twist, mumble, nextcloud, reddit, keybase, linkedin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
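The `Result<bool, String>` contract above can be sketched as follows. This is not the PR's code: it is a synchronous stand-in with an injected `sleep` so the delays are observable, whereas the real loop is async. The field names of `SupervisorConfig`, the function name `run_supervised_loop_sketch`, the doubling backoff policy, and the choice to reset the delay on `Ok(true)` are all assumptions made for illustration.

```rust
use std::time::Duration;

/// Assumed shape of the PR's `SupervisorConfig`; the field names are hypothetical.
#[derive(Clone, Copy, Debug)]
struct SupervisorConfig {
    initial_backoff: Duration,
    max_backoff: Duration,
}

impl Default for SupervisorConfig {
    fn default() -> Self {
        // Defaults taken from the description: 1s initial, 60s maximum.
        Self {
            initial_backoff: Duration::from_secs(1),
            max_backoff: Duration::from_secs(60),
        }
    }
}

/// Synchronous stand-in for `run_supervised_loop()`.
/// `connect_once` plays the role of an adapter's extracted `async fn`;
/// `sleep` is injected so the delays can be observed without waiting.
fn run_supervised_loop_sketch<F, S>(cfg: SupervisorConfig, mut connect_once: F, mut sleep: S)
where
    F: FnMut() -> Result<bool, String>,
    S: FnMut(Duration),
{
    let mut backoff = cfg.initial_backoff;
    loop {
        match connect_once() {
            // Permanent stop: shutdown or channel closed.
            Ok(false) => return,
            // Transient failure: reconnect. Assumption: the delay resets here.
            Ok(true) => {
                backoff = cfg.initial_backoff;
                sleep(backoff);
            }
            // Error: wait, then double the delay up to the configured cap
            // (doubling is an assumed policy, not stated in the PR).
            Err(msg) => {
                eprintln!("adapter error: {msg}; retrying in {backoff:?}");
                sleep(backoff);
                backoff = (backoff * 2).min(cfg.max_backoff);
            }
        }
    }
}
```

Centralizing the loop this way means each adapter only has to report an outcome, while the delay policy lives in one place.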
…ules

Break the single routes.rs god file containing 174 handler functions into domain-specific modules under routes/:
- agents.rs (1,551 lines, 24 handlers)
- channels.rs (1,220 lines, 7 handlers)
- hands.rs (668 lines, 11 handlers)
- models.rs (554 lines, 8 handlers)
- skills.rs (489 lines, 9 handlers)
- sessions.rs (384 lines, 10 handlers)
- common.rs (330 lines, shared types/helpers)
- + 22 smaller domain modules

Backward compatibility is maintained via `pub use` re-exports in mod.rs, so server.rs and test files continue referencing `routes::handler_name` without changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
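The re-export pattern can be sketched with inline modules. The handler names `list_agents` and `list_sessions` are hypothetical placeholders, and the real modules live in separate files under routes/ rather than inline; only the `pub use` mechanism is the point here.

```rust
mod routes {
    // Private domain modules; in the PR these are separate files under routes/.
    mod agents {
        /// Hypothetical handler, standing in for a real route handler.
        pub fn list_agents() -> &'static str {
            "agents"
        }
    }

    mod sessions {
        /// Hypothetical handler, standing in for a real route handler.
        pub fn list_sessions() -> &'static str {
            "sessions"
        }
    }

    // Re-export every public item so callers keep writing `routes::handler_name`.
    pub use self::agents::*;
    pub use self::sessions::*;
}
```

Callers such as server.rs can then keep using `routes::list_agents()` without knowing which file the handler moved to.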
Replace 12 instances of `id.to_string()[..8]` with `id.to_string().get(..8).unwrap_or(&id_str)` across kernel.rs (3) and channel_bridge.rs (9). While UUID Display always produces 36-char ASCII strings, making `[..8]` currently safe, `.get()` is defensive against any future Display changes and is consistent with the `floor_char_boundary()` hardening in the rest of the codebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
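As a small sketch of why `.get(..8)` is the defensive choice: `str::get` returns `None` instead of panicking when the range is out of bounds or does not fall on a character boundary. The helper name `short_id` is hypothetical and takes an already-formatted ID string.

```rust
/// Hypothetical helper: the first 8 bytes of an ID string for display,
/// falling back to the whole string when that slice is not valid.
fn short_id(id_str: &str) -> &str {
    // `get` yields None for a too-short string or a mid-character index,
    // where `&id_str[..8]` would panic.
    id_str.get(..8).unwrap_or(id_str)
}
```

For a UUID string the result is its first eight hex digits; for a shorter or non-ASCII string the input is returned unchanged.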
Replace hardcoded Whisper model names and API endpoints with LazyLock statics that read from environment variables at first use:
- GROQ_STT_MODEL (default: whisper-large-v3-turbo)
- GROQ_STT_URL (default: api.groq.com/openai/v1/audio/transcriptions)
- OPENAI_STT_MODEL (default: whisper-1)
- OPENAI_STT_URL (default: api.openai.com/v1/audio/transcriptions)

This allows users to swap STT models or providers without recompiling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
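A minimal sketch of the pattern, assuming `std::sync::LazyLock` from the standard library. The helper `env_or` and the static names are illustrative choices, not taken from the PR; only the environment variable names and default model names come from the description above.

```rust
use std::env;
use std::sync::LazyLock;

/// Hypothetical helper: read an environment variable or fall back to a default.
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

// Each static is initialized once, on first access, and cached afterwards.
static GROQ_STT_MODEL: LazyLock<String> =
    LazyLock::new(|| env_or("GROQ_STT_MODEL", "whisper-large-v3-turbo"));

static OPENAI_STT_MODEL: LazyLock<String> =
    LazyLock::new(|| env_or("OPENAI_STT_MODEL", "whisper-1"));
```

Because the value is read at first use and then cached, changing the environment variable after that point has no effect on the running process.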