Description
Problem (one or two sentences)
Users currently have only two ways to interact with a running agent: stop it entirely or wait for an approval prompt. There is no way to send guidance or corrections while the agent is actively working — users must abort, rephrase, and restart from scratch even for minor course corrections.
Context (who is affected and when)
This affects all users who give multi-step tasks to Roo Code. Common scenarios:
- The agent starts coding with Jest, but the user wants Vitest — they have to stop and restart
- The agent is analyzing the wrong directory — the user sees it happening but can't redirect
- The user remembers an important detail mid-task (e.g., "use UTF-8 encoding") but has no way to pass it along without interrupting
The more complex the task, the more costly a full restart becomes (lost context, wasted tokens, time).
Desired behavior (conceptual, not technical)
A "Steer" button alongside the existing Stop button. When clicked:
- User types coaching advice (e.g., "Use Vitest instead of Jest")
- The advice is queued — the agent is not interrupted
- At the next API call, the advice is included in the prompt as context
- The agent naturally incorporates the guidance into its next action
- The user sees status feedback: sent → acknowledged → applied
Think of it like a co-pilot giving directions — the car keeps moving, the driver just adjusts course.
Optionally, if the agent shows a plan (Step 1 → Step 2 → Step 3), the user can attach advice to a future step before the agent reaches it (preemptive steering).
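The queue-and-status flow described above could look roughly like the sketch below. It is a minimal illustration only: `AdviceQueue`, `takePending`, `markApplied`, and `statusOf` are hypothetical names chosen for this example, not part of Roo Code or of any published library.

```typescript
// Hypothetical types for illustration; not an existing Roo Code API.
type AdviceStatus = "sent" | "acknowledged" | "applied";

interface Advice {
  id: number;
  text: string;
  status: AdviceStatus;
}

class AdviceQueue {
  private items: Advice[] = [];
  private nextId = 1;

  // Called from the UI: queues advice without touching the running agent.
  enqueue(text: string): Advice {
    const advice: Advice = { id: this.nextId++, text, status: "sent" };
    this.items.push(advice);
    return advice;
  }

  // Called by the agent loop just before the next API request.
  // Picks up everything still in "sent" and marks it "acknowledged".
  takePending(): Advice[] {
    const pending = this.items.filter((a) => a.status === "sent");
    for (const a of pending) a.status = "acknowledged";
    return pending;
  }

  // Called once the request that carried the advice has completed.
  markApplied(id: number): void {
    const advice = this.items.find((a) => a.id === id);
    if (advice && advice.status === "acknowledged") advice.status = "applied";
  }

  // Lets the UI render the sent / acknowledged / applied indicator.
  statusOf(id: number): AdviceStatus | undefined {
    return this.items.find((a) => a.id === id)?.status;
  }
}
```

Because `enqueue` only appends to an in-memory list, the agent's current step is never interrupted; the advice is picked up the next time the loop calls `takePending`.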
Constraints / preferences (optional)
- Must not break the existing Stop functionality
- Should add zero overhead when no advice is pending (no extra tokens sent to LLM)
- Should work regardless of which API provider is configured
- Minimal changes to the core Task loop — ideally 2-3 lines at the injection point
Request checklist
- I've searched existing Issues and Discussions for duplicates
- This describes a specific problem with clear context and impact
Roo Code Task Links (optional)
No response
Acceptance criteria (optional)
Given the agent is executing a multi-step task
When the user clicks "Steer" and types "Use TypeScript strict mode"
Then the advice is queued without interrupting the current step
And the next API request includes the advice as additional context
And the agent acknowledges and incorporates the advice
But the agent is NOT stopped or restarted
Given no steering advice is pending
When the agent makes an API request
Then zero additional tokens are added to the prompt
And there is no performance impact
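One way to make the second scenario checkable is to have the injection step return its input untouched when nothing is queued. The sketch below assumes user content is a list of plain text blocks; `TextBlock` and `withSteering` are hypothetical names, and the header wording is only a placeholder.

```typescript
// Hypothetical content-block shape used only for this sketch.
interface TextBlock {
  type: "text";
  text: string;
}

// Returns the original array untouched when there is no advice,
// so an idle steering feature adds no blocks and no tokens.
function withSteering(content: TextBlock[], advice: string[]): TextBlock[] {
  if (advice.length === 0) return content;
  const text = [
    "User guidance received while you were working:",
    ...advice.map((a) => `- ${a}`),
  ].join("\n");
  // Copy rather than mutate, so callers holding the original array are unaffected.
  return [...content, { type: "text", text }];
}
```

Returning the same array reference on the idle path makes the "zero additional tokens" criterion a simple identity check in a unit test.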
Proposed approach (optional)
I've built a standalone library, @steer-agent/core (published on npm, Apache-2.0), that implements this as middleware. It provides:
- SteeringQueue — queues reactive advice from the user
- PlanTracker — tracks agent plan + attaches preemptive advice to future steps
- SteeringInjector — formats advice into prompt-injectable text
- LoopHook — 2-3 line integration point for any agent loop
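To make the preemptive-advice idea concrete, here is a small sketch of advice keyed by plan step. It does not reproduce the library's API: `PlanAdvice`, `attach`, and `consume` are hypothetical names invented for this illustration, and steps are assumed to be identified by a number.

```typescript
// Hypothetical sketch; names and signatures are not taken from @steer-agent/core.
class PlanAdvice {
  private byStep = new Map<number, string[]>();

  // Attach advice to a plan step the agent has not reached yet.
  attach(step: number, text: string): void {
    const list = this.byStep.get(step) ?? [];
    list.push(text);
    this.byStep.set(step, list);
  }

  // Called when the agent enters a step: returns that step's advice once, then forgets it.
  consume(step: number): string[] {
    const list = this.byStep.get(step) ?? [];
    this.byStep.delete(step);
    return list;
  }
}
```

Deleting the entry on `consume` means a step's advice is injected at most once, even if the loop makes several API calls during that step.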
Integration point in Roo Code (src/core/task/Task.ts, line ~2641):
// Existing code — where environmentDetails is added to finalUserContent
let finalUserContent = [...contentWithoutEnvDetails, { type: "text", text: environmentDetails }]
// ★ Proposed addition (2 lines)
const injection = this.steeringHook?.getInjection(currentStepInfo)
if (injection?.hasAdvice) {
  finalUserContent.push({ type: "text", text: injection.injectionText })
}
The library is fully tested (28 tests passing) and includes an interactive CLI demo for hands-on verification.
Architecture details: ARCHITECTURE.md
Trade-offs / risks (optional)
| Risk | Mitigation |
|---|---|
| Extra tokens per request | Zero when no advice pending; capped at 5 messages (~150 tokens) when active |
| User sends conflicting advice | Injection text instructs the LLM to "use your judgment if it conflicts" |
| Advice arrives too late (step already passed) | TTL-based expiration (5 min default) + status tracking |
| UI complexity | Single "Steer" button with minimal text input — no complex UI needed for MVP |
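
The cap and TTL mitigations from the table can be expressed as a single selection step. This is a sketch under stated assumptions: `QueuedAdvice` and `selectForInjection` are hypothetical names, and keeping the newest messages when the cap is exceeded is a choice made for this example, since the table does not say which ones are dropped.

```typescript
// Hypothetical record shape for this sketch.
interface QueuedAdvice {
  text: string;
  createdAt: number; // milliseconds since epoch
}

const TTL_MS = 5 * 60 * 1000; // 5-minute default from the table above
const MAX_MESSAGES = 5; // message cap from the table above

// Drops expired advice, then keeps only the newest MAX_MESSAGES entries.
// Assumes the queue is ordered oldest to newest.
function selectForInjection(queue: QueuedAdvice[], now: number): QueuedAdvice[] {
  const fresh = queue.filter((a) => now - a.createdAt <= TTL_MS);
  return fresh.slice(-MAX_MESSAGES);
}
```

Passing `now` as a parameter instead of calling `Date.now()` inside the function keeps the expiry logic deterministic and easy to test.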