[ENHANCEMENT] Real-time steering: coach the agent mid-task without stopping it #11802

@shdomi8599

Description

Problem (one or two sentences)

Users currently have only two ways to interact with a running agent: stop it entirely or wait for an approval prompt. There is no way to send guidance or corrections while the agent is actively working — users must abort, rephrase, and restart from scratch even for minor course corrections.

Context (who is affected and when)

This affects all users who give multi-step tasks to Roo Code. Common scenarios:

  • The agent starts coding with Jest, but the user wants Vitest — they have to stop and restart
  • The agent is analyzing the wrong directory — the user sees it happening but can't redirect
  • The user remembers an important detail mid-task (e.g., "use UTF-8 encoding") but has no way to pass it along without interrupting

The more complex the task, the more costly a full restart becomes (lost context, wasted tokens, time).

Desired behavior (conceptual, not technical)

A "Steer" button alongside the existing Stop button. When clicked:

  1. User types coaching advice (e.g., "Use Vitest instead of Jest")
  2. The advice is queued — the agent is not interrupted
  3. At the next API call, the advice is included in the prompt as context
  4. The agent naturally incorporates the guidance into its next action
  5. The user sees status feedback: sent → acknowledged → applied
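The five steps above could be modeled with a small queue that tracks each message through the sent → acknowledged → applied lifecycle. This is only a sketch to make the flow concrete: `AdviceQueue`, `takeForNextRequest`, and `markApplied` are hypothetical names I made up for illustration, not part of Roo Code or any published library.

```typescript
// Hypothetical names throughout; not the API of any existing library.
type AdviceStatus = "sent" | "acknowledged" | "applied";

interface Advice {
  id: number;
  text: string;
  status: AdviceStatus;
}

class AdviceQueue {
  private items: Advice[] = [];
  private nextId = 1;

  // Steps 1-2: the user submits advice; it is queued and the agent keeps running.
  submit(text: string): Advice {
    const advice: Advice = { id: this.nextId++, text, status: "sent" };
    this.items.push(advice);
    return advice;
  }

  // Step 3: called while the next API request is being built. Returns advice
  // not yet included in any request and marks it "acknowledged".
  takeForNextRequest(): Advice[] {
    const pending = this.items.filter((a) => a.status === "sent");
    for (const a of pending) a.status = "acknowledged";
    return pending;
  }

  // Steps 4-5: called once the agent's response to that request was handled.
  markApplied(ids: number[]): void {
    for (const a of this.items) {
      if (ids.indexOf(a.id) !== -1 && a.status === "acknowledged") {
        a.status = "applied";
      }
    }
  }

  // Lets the UI show the current status of one message.
  statusOf(id: number): AdviceStatus | undefined {
    const match = this.items.filter((a) => a.id === id)[0];
    return match ? match.status : undefined;
  }
}
```

Because `takeForNextRequest` only returns messages still in the "sent" state, each message is handed to the request builder once, and nothing in this sketch pauses or cancels the running task.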

Think of it like a co-pilot giving directions — the car keeps moving, the driver just adjusts course.

Optionally, if the agent shows a plan (Step 1 → Step 2 → Step 3), the user can attach advice to a future step before the agent reaches it (preemptive steering).
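Preemptive steering could be sketched as a plan whose steps each carry their own advice list. The `Plan` class and its `attach` / `advance` methods below are hypothetical and exist only to illustrate the idea; in particular, rejecting advice for the current or past steps is my assumption about how "future step" should be interpreted.

```typescript
// Hypothetical sketch; names are illustrative only.
interface PlanStep {
  index: number; // 1-based position in the plan
  title: string;
  advice: string[]; // advice attached before the agent reaches this step
}

class Plan {
  private steps: PlanStep[];
  private current = 1; // step the agent is working on right now

  constructor(titles: string[]) {
    this.steps = titles.map(
      (title, i): PlanStep => ({ index: i + 1, title, advice: [] }),
    );
  }

  // Attach advice to a step the agent has not reached yet.
  // Returns false if the step does not exist or has already started.
  attach(stepIndex: number, text: string): boolean {
    const step = this.steps[stepIndex - 1];
    if (!step || stepIndex <= this.current) return false;
    step.advice.push(text);
    return true;
  }

  // Move to the next step and return the advice waiting for it.
  // Returns an empty list when the plan is already on its last step.
  advance(): string[] {
    if (this.current >= this.steps.length) return [];
    this.current += 1;
    const step = this.steps[this.current - 1];
    return step ? step.advice : [];
  }
}
```

Advice for the step already in progress would go through the regular reactive path instead, so the two mechanisms do not overlap.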

Constraints / preferences (optional)

  • Must not break the existing Stop functionality
  • Should add zero overhead when no advice is pending (no extra tokens sent to LLM)
  • Should work regardless of which API provider is configured
  • Minimal changes to the core Task loop — ideally 2-3 lines at the injection point

Request checklist

  • I've searched existing Issues and Discussions for duplicates
  • This describes a specific problem with clear context and impact

Roo Code Task Links (optional)

No response

Acceptance criteria (optional)

Given the agent is executing a multi-step task
When the user clicks "Steer" and types "Use TypeScript strict mode"
Then the advice is queued without interrupting the current step
And the next API request includes the advice as additional context
And the agent acknowledges and incorporates the advice
But the agent is NOT stopped or restarted

Given no steering advice is pending
When the agent makes an API request
Then zero additional tokens are added to the prompt
And there is no performance impact

Proposed approach (optional)

I've built a standalone library, @steer-agent/core (published on npm, Apache-2.0), that implements this as middleware. It provides:

  • SteeringQueue — queues reactive advice from the user
  • PlanTracker — tracks agent plan + attaches preemptive advice to future steps
  • SteeringInjector — formats advice into prompt-injectable text
  • LoopHook — 2-3 line integration point for any agent loop

Integration point in Roo Code (src/core/task/Task.ts, line ~2641):

// Existing code — where environmentDetails is added to finalUserContent
let finalUserContent = [...contentWithoutEnvDetails, { type: "text", text: environmentDetails }]

// ★ Proposed addition (2 lines)
const injection = this.steeringHook?.getInjection(currentStepInfo)
if (injection?.hasAdvice) {
  finalUserContent.push({ type: "text", text: injection.injectionText })
}
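To show why the snippet above adds nothing to the request when no advice is pending, here is a self-contained sketch of a hook with the same return shape (`hasAdvice`, `injectionText`). Everything else is an assumption for illustration: `SimpleSteeringHook`, `addAdvice`, `buildUserContent`, and the wording of the injected text are hypothetical, and this version omits the `currentStepInfo` argument. It is not the actual @steer-agent/core implementation.

```typescript
// Assumed shapes, inferred only from the snippet above; the real library may differ.
interface Injection {
  hasAdvice: boolean;
  injectionText: string;
}

interface TextBlock {
  type: "text";
  text: string;
}

class SimpleSteeringHook {
  private pending: string[] = [];

  addAdvice(text: string): void {
    this.pending.push(text);
  }

  // Drains pending advice so each message is injected into one request only.
  getInjection(): Injection {
    if (this.pending.length === 0) {
      return { hasAdvice: false, injectionText: "" };
    }
    const lines = this.pending.map((t) => "- " + t);
    this.pending = [];
    return {
      hasAdvice: true,
      injectionText: "User guidance received mid-task:\n" + lines.join("\n"),
    };
  }
}

// Mirrors the proposed integration: append a block only when advice exists.
// The input array is copied, never mutated.
function buildUserContent(
  base: TextBlock[],
  hook?: SimpleSteeringHook,
): TextBlock[] {
  const content = base.slice();
  const injection = hook?.getInjection();
  if (injection?.hasAdvice) {
    content.push({ type: "text", text: injection.injectionText });
  }
  return content;
}
```

With no hook configured, or with an empty queue, `buildUserContent` returns a copy of the base content with no extra block, which is the zero-overhead case from the acceptance criteria.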

The library has 28 passing tests and includes an interactive CLI demo for hands-on verification.

Architecture details: ARCHITECTURE.md

Trade-offs / risks (optional)

| Risk | Mitigation |
| --- | --- |
| Extra tokens per request | Zero when no advice pending; capped at 5 messages (~150 tokens) when active |
| User sends conflicting advice | Injection text instructs the LLM to "use your judgment if it conflicts" |
| Advice arrives too late (step already passed) | TTL-based expiration (5 min default) + status tracking |
| UI complexity | Single "Steer" button with minimal text input; no complex UI needed for MVP |
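The cap and expiry mitigations listed above can be illustrated with a small filter. The two constants come from the numbers in this issue (5 messages, 5 minutes); the function name `selectForInjection`, the `TimedAdvice` shape, and the choice to keep the newest messages when the cap is exceeded are my assumptions for the sketch.

```typescript
// Hypothetical sketch; constants come from the issue text, the rest is assumed.
const MAX_MESSAGES = 5;
const DEFAULT_TTL_MS = 5 * 60 * 1000;

interface TimedAdvice {
  text: string;
  createdAt: number; // milliseconds since epoch
}

// Returns the advice still eligible for injection at time `now`:
// expired entries are dropped first, then only the newest `maxMessages`
// entries are kept. The queue is assumed to be in insertion order.
function selectForInjection(
  queue: TimedAdvice[],
  now: number,
  ttlMs: number = DEFAULT_TTL_MS,
  maxMessages: number = MAX_MESSAGES,
): TimedAdvice[] {
  const fresh = queue.filter((a) => now - a.createdAt < ttlMs);
  return fresh.slice(Math.max(0, fresh.length - maxMessages));
}
```

Passing `now` explicitly, rather than calling `Date.now()` inside the function, keeps the expiry logic deterministic and easy to test.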
