fix(opencode): reduce memory usage during prompting with lazy boundary scan and context windowing #18137
Open
BYK wants to merge 1 commit into anomalyco:dev
Conversation
Two optimizations to drastically reduce memory during prompting:

1. `filterCompactedLazy`: probe the newest 50 message infos (1 query, no parts) to detect compaction. If none found, fall back to the original single-pass `filterCompacted(stream())` — avoids 155+ wasted info-only queries for uncompacted sessions. Compacted sessions still use the efficient two-pass scan.
2. Context-window windowing: before calling `toModelMessages`, estimate which messages from the tail fit in the LLM context window using `model.limit.context * 4` chars/token, and only convert those messages to ModelMessage format. For a 7,704-message session where ~200 fit in context, this reduces `toModelMessages` input from 7,704 to ~200 messages — cutting ~300MB of wrapper objects across 4-5 copy layers down to ~10MB.

Also caches the conversation across prompt loop iterations — full reload only after compaction, incremental merge for tool-call steps.
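A rough TypeScript sketch of the probe-then-scan idea described above. All types and helper names here (`MessageInfo`, `hasRecentCompaction`, `findBoundary`) are illustrative stand-ins, not opencode's actual storage API:

```typescript
// Hypothetical info record: message metadata only, no parts loaded.
interface MessageInfo {
  id: string;
  summary?: boolean; // set on compaction-summary messages (assumed flag)
}

// Matches the PR's "probe newest 50 message infos" idea.
const PROBE_LIMIT = 50;

// Phase 0: cheap probe over the newest N infos (single query, no parts).
// If this returns false, the caller falls back to the original
// single-pass filterCompacted(stream()) and no extra queries are wasted.
function hasRecentCompaction(infos: MessageInfo[]): boolean {
  // infos assumed ordered oldest -> newest; check only the tail.
  return infos.slice(-PROBE_LIMIT).some((m) => m.summary === true);
}

// Phase 1 (compacted sessions only): info-only scan newest -> oldest
// to locate the most recent compaction boundary. Parts would then be
// hydrated only for messages at or after this index.
function findBoundary(infos: MessageInfo[]): number {
  for (let i = infos.length - 1; i >= 0; i--) {
    if (infos[i].summary) return i;
  }
  return 0; // no compaction found: the whole history is live
}
```

The point of the two phases is that the expensive step (loading parts) runs only for the small post-boundary slice, while uncompacted sessions pay just one extra metadata query.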
Contributor
Hey! Your PR title doesn't follow the required format. Please update it to start with one of the allowed prefixes. See CONTRIBUTING.md for details.
Issue for this PR
Closes #18136
Type of change
What does this PR do?
Three targeted optimizations to reduce peak RSS during prompting from ~4-8GB down to ~1.2GB:
**1. Lazy compaction boundary scan (`filterCompactedLazy`)**

The prompt loop calls `filterCompacted(stream(sessionID))`, which streams ALL messages newest→oldest, loading parts for every message. For compacted sessions, most of those parts are discarded once the boundary is found.

New approach: probe the newest 50 message infos (1 DB query, no parts). If a compaction summary is detected, use a two-phase scan: an info-only scan to find the boundary, then hydrate parts only for messages after it. If no compaction summary is found, fall back to the original single-pass `filterCompacted(stream())` to avoid wasted info-only queries.

**2. Context-window message windowing**

`toModelMessages` was called with ALL messages (e.g., 7,704 for a long session), creating ModelMessage wrapper objects for every one. These flow through 4-5 copy layers (`toModelMessages` → `convertToModelMessages` → `ProviderTransform.message` → `convertToLanguageModelPrompt`), each creating ~60MB of wrapper objects.

Now the prompt loop estimates which messages from the tail fit in the LLM context window (`model.limit.context` × 4 chars/token) and only passes those to `toModelMessages`. For a 7,704-message session where ~200 fit, this cuts the conversion pipeline from ~300MB to ~10MB.

**3. Prompt loop caching**

The conversation is loaded once before the loop. On normal tool-call iterations, only the latest 200-message page is fetched and merged into the cache. A full reload happens only after compaction.
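The windowing heuristic in item 2 can be sketched roughly like this (the `Msg` shape and function name are hypothetical; the real code operates on opencode's message objects and its actual `model.limit.context` value):

```typescript
// Mirrors the PR's estimate of ~4 characters per token.
const CHARS_PER_TOKEN = 4;

// Hypothetical minimal message shape for the sketch.
interface Msg {
  text: string;
}

// Walk the tail newest -> oldest, accumulating estimated character cost,
// and return only the suffix that fits the model's context budget.
// Everything older than `start` is never converted to ModelMessage form.
function windowForContext(messages: Msg[], contextTokens: number): Msg[] {
  const budgetChars = contextTokens * CHARS_PER_TOKEN;
  let used = 0;
  let start = messages.length;
  for (let i = messages.length - 1; i >= 0; i--) {
    used += messages[i].text.length;
    if (used > budgetChars) break;
    start = i;
  }
  // Never drop the newest message, even if it alone exceeds the estimate.
  if (start === messages.length && messages.length > 0) {
    start = messages.length - 1;
  }
  return messages.slice(start);
}
```

Because the estimate runs on raw text lengths before any conversion, the 4-5 downstream copy layers only ever see the ~200 messages that can actually fit, not the full 7,704-message history.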
How did you verify your code works?
Sampled `/proc/<PID>/status` every 30s during active prompting.

Screenshots / recordings
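A 30-second RSS sampler along these lines takes only a few lines of Node/TypeScript (illustrative, not the exact tooling used for this PR; Linux-only, since it reads procfs and parses the `VmRSS` field):

```typescript
import { readFileSync } from "node:fs";

// Extract VmRSS (resident set size, in kB) from /proc/<pid>/status text.
function parseVmRssKb(statusText: string): number | undefined {
  const m = statusText.match(/^VmRSS:\s+(\d+)\s+kB/m);
  return m ? Number(m[1]) : undefined;
}

// Log the current process's RSS every `intervalMs` (default 30s).
function startSampler(intervalMs = 30_000): ReturnType<typeof setInterval> {
  return setInterval(() => {
    const rss = parseVmRssKb(readFileSync("/proc/self/status", "utf8"));
    console.log(new Date().toISOString(), "VmRSS kB:", rss);
  }, intervalMs);
}
```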
Memory monitoring (30s intervals) after fix:
Checklist