feat(agent): context-aware auto-compaction with retry on context_window_exceeded #3

@Rexopia

Description

Background

Currently, auto_compact_history triggers on message count alone (max_history_messages, default 200). This is a rough proxy for context size: it accounts for neither actual token usage nor the model's context window limit.

Current Behavior

  1. auto_compact_history runs once at the start of each turn, compacting old messages into an LLM-generated summary when message count exceeds the limit
  2. reliable.rs already detects context_window_exceeded errors via is_context_window_exceeded() (matching keywords like "exceeds the context window", "maximum context length", "too many tokens", etc.)
  3. But on detection, it bail!s immediately, with no recovery and no retry
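For reference, the keyword matching in is_context_window_exceeded() might look roughly like the following sketch. The exact marker list in reliable.rs is an assumption based on the examples above:

```rust
// Hypothetical sketch of reliable.rs's is_context_window_exceeded();
// the real keyword list may differ.
fn is_context_window_exceeded(err_msg: &str) -> bool {
    const MARKERS: [&str; 3] = [
        "exceeds the context window",
        "maximum context length",
        "too many tokens",
    ];
    // Case-insensitive substring match against known provider phrasings.
    let msg = err_msg.to_lowercase();
    MARKERS.iter().any(|m| msg.contains(m))
}
```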

Proposed Improvement

Phase 1: Compact-then-retry on context exceeded

In orchestrate.rs, when the LLM call returns a context_window_exceeded error:

  1. Force-run auto_compact_history (even if message count is within limit)
  2. Retry the LLM call
  3. If the retry still fails, surface the error as fatal

This is the minimal useful change — leverage the existing error detection in reliable.rs as a reactive trigger.
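The compact-then-retry flow could be sketched as below. The function and closure signatures are hypothetical illustrations, not the actual orchestrate.rs API:

```rust
// Sketch of Phase 1: on a context_window_exceeded error, force-run
// compaction once and retry; any second failure propagates as fatal.
// All three parameters are stand-ins for the real orchestrator pieces.
fn call_with_compaction_retry<F, D, C>(
    mut call_llm: F,
    is_context_window_exceeded: D,
    mut force_compact: C,
) -> Result<String, String>
where
    F: FnMut() -> Result<String, String>,
    D: Fn(&str) -> bool,
    C: FnMut(),
{
    match call_llm() {
        Ok(resp) => Ok(resp),
        Err(e) if is_context_window_exceeded(&e) => {
            // Force compaction even if message count is within limit,
            // then retry exactly once.
            force_compact();
            call_llm()
        }
        Err(e) => Err(e),
    }
}
```

Retrying only once keeps the failure mode bounded: if a freshly compacted history still overflows the window, no further compaction pass is likely to help.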

Phase 2 (future): Token-based proactive compaction

  • Track cumulative input_tokens from provider responses across turns
  • Store per-model context window sizes (from registry or model metadata)
  • Trigger compaction when input_tokens > context_window * 0.8
  • Requires providers to reliably return token usage (Codex SSE may not always include it)
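The Phase 2 trigger condition is simple enough to sketch directly. The 0.8 threshold comes from the bullet above; how the per-model window size is looked up is left open:

```rust
// Hypothetical Phase 2 check: compact once cumulative input tokens
// pass 80% of the model's context window. Uses integer arithmetic
// (tokens > window * 0.8  <=>  tokens * 10 > window * 8) to avoid
// float comparison edge cases.
fn should_compact(cumulative_input_tokens: u64, context_window: u64) -> bool {
    cumulative_input_tokens * 10 > context_window * 8
}
```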

Reference

  • crewforge-rs/src/agent/orchestrate.rs:274-287 — current fatal error on LLM failure
  • crewforge-rs/src/agent/history.rs:114-172 — auto_compact_history implementation
  • crewforge-rs/src/provider/reliable.rs:82-88 — is_context_window_exceeded detection
  • Upstream (vendor/zeroclaw) uses the same message-count approach, with no token-based compaction either

Related Constants

  • COMPACTION_KEEP_RECENT_MESSAGES = 20
  • COMPACTION_MAX_SOURCE_CHARS = 12,000
  • COMPACTION_MAX_SUMMARY_CHARS = 2,000
  • Token estimation heuristic: ~4 chars/token (from upstream)
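Under the ~4 chars/token heuristic, a minimal estimator might look like the following (a hypothetical helper, not an existing function in the codebase):

```rust
// Rough token estimate at ~4 chars/token, per the upstream heuristic.
// Real tokenizers vary by model, so treat this as an approximation only.
fn estimate_tokens(text: &str) -> usize {
    // Round up so any non-empty string counts as at least one token.
    (text.chars().count() + 3) / 4
}
```

At that rate, COMPACTION_MAX_SOURCE_CHARS = 12,000 corresponds to roughly 3,000 tokens of summarization input.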

Labels: enhancement (New feature or request)