
[3604] Surface mutating tool evidence status in the interactive TUI#3606

Merged
njfio merged 8 commits into master from 3604-tui-mutating-evidence-status-pr
Mar 20, 2026

Conversation

@njfio
Owner

@njfio njfio commented Mar 20, 2026

Closes #3604

Spec: specs/3604/spec.md

What/why:

  • surface mutating tool evidence in the interactive TUI during active build/create turns
  • reset evidence per turn so prior successful writes do not leak into new turns
  • keep non-build and idle turns quiet
  • clear impacted package quality-gate blockers required for merge

Test evidence:

  • cargo test -p tau-tui 3604 -- --nocapture
  • cargo test -p tau-tui
  • cargo clippy -p tau-tui --all-targets --all-features -- -D warnings
  • cargo clippy -p tau-tools --all-targets --all-features -- -D warnings
  • cargo test -p tau-coding-agent regression_spec_3555_c01_run_local_runtime_uses_cli_request_timeout_for_agent -- --nocapture
  • ./scripts/dev/fast-validate.sh --base origin/master

Copilot AI review requested due to automatic review settings March 20, 2026 03:25

@greptile-apps greptile-apps bot left a comment

njfio has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a11f5fbe7

Comment on lines 21 to +25
  let Some(entry) = app.tools.latest_entry() else {
-     return Vec::new();
+     return lines;
  };

- terminal_summary_lines(entry)
+ lines.extend(terminal_summary_lines(entry));

P1: Limit last-tool summary to current turn in build mode

This still appends terminal_summary_lines from app.tools.latest_entry() (global history), so a new build/create turn with no current-tool activity can show both Build status: no mutating evidence yet and Last tool: write from a prior turn. In the context of this change (resetting evidence per turn), that leaks prior mutating evidence back into the active-turn surface and gives contradictory operator guidance. Use current-turn entries for the terminal summary (or suppress it until the current turn has tool events) when build status is shown.
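One way to read this suggestion is sketched below. It is a minimal illustration, not the code in `ui_chat_tool_lines.rs`: the `ToolEntry` fields, the `build_turn_summary_lines` helper, and the "Last tool" line format are assumptions made for the example. The point is only that the "Last tool" line is derived from the current-turn slice, so a turn with no tool activity yet shows the status line alone.

```rust
/// Hypothetical stand-in for the real `ToolEntry`; only the fields this sketch needs.
#[derive(Debug, Clone)]
struct ToolEntry {
    name: String,
    succeeded: bool,
}

/// Builds the summary lines for an active build/create turn.
/// `current_turn` is assumed to hold only the tool entries recorded since the
/// latest user prompt, so nothing from an earlier turn can appear here.
fn build_turn_summary_lines(status_line: &str, current_turn: &[ToolEntry]) -> Vec<String> {
    let mut lines = vec![status_line.to_string()];
    // Only the newest entry of this turn may be reported as "Last tool".
    // With an empty slice the summary is suppressed entirely.
    if let Some(entry) = current_turn.last() {
        let outcome = if entry.succeeded { "ok" } else { "failed" };
        lines.push(format!("Last tool: {} ({})", entry.name, outcome));
    }
    lines
}
```

With an empty current-turn slice the function returns just the status line, which avoids the contradictory pairing of "no mutating evidence yet" with a prior turn's "Last tool: write".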

Comment on lines +97 to +99
pub fn current_turn_tools(&self) -> &[ToolEntry] {
let start = self.current_turn_tool_start.min(self.tools.entries().len());
&self.tools.entries()[start..]

P2: Keep current-turn tool window aligned after truncation

current_turn_tools() slices by a stored vector index, but ToolPanel evicts from the front once history exceeds 200 entries. During long turns, that start index is not adjusted as evictions happen, so early tool events from the same turn are dropped from the slice; a turn with an early successful write/edit can later be misclassified as read-only or missing. The turn boundary needs to be tracked in a truncation-safe way (e.g., monotonic IDs or eviction-aware offset updates).
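The monotonic-ID option mentioned above could look roughly like the following. This is a sketch under assumptions: `BoundedToolLog`, `SeqEntry`, and their methods are hypothetical names, not the real `ToolPanel` API. Each entry carries an ever-increasing sequence number, and the turn boundary is stored as a sequence number rather than a vector index, so front eviction cannot shift it.

```rust
use std::collections::VecDeque;

/// Hypothetical tool entry tagged with a monotonic sequence number.
#[derive(Debug, Clone, PartialEq)]
struct SeqEntry {
    seq: u64,
    name: String,
}

/// Hypothetical bounded log that evicts from the front, like the panel described above.
struct BoundedToolLog {
    entries: VecDeque<SeqEntry>,
    max_entries: usize,
    next_seq: u64,
}

impl BoundedToolLog {
    fn new(max_entries: usize) -> Self {
        Self { entries: VecDeque::new(), max_entries, next_seq: 0 }
    }

    /// Sequence number the next pushed entry will receive; record it at turn start.
    fn next_seq(&self) -> u64 {
        self.next_seq
    }

    fn push(&mut self, name: &str) {
        // Treat a capacity of 0 as 1 so the log always stays bounded.
        let cap = self.max_entries.max(1);
        if self.entries.len() >= cap {
            self.entries.pop_front();
        }
        self.entries.push_back(SeqEntry { seq: self.next_seq, name: name.to_string() });
        self.next_seq += 1;
    }

    /// Entries still in memory that were recorded at or after `turn_start_seq`.
    fn since(&self, turn_start_seq: u64) -> impl Iterator<Item = &SeqEntry> + '_ {
        self.entries.iter().filter(move |e| e.seq >= turn_start_seq)
    }
}
```

Note the limit of this approach: entries that have actually been evicted are still gone. If a single turn can exceed the buffer size, a separate per-turn flag (for example "saw a successful mutation") would be needed to keep the classification stable.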

@njfio njfio merged commit f9cdd98 into master Mar 20, 2026
7 checks passed

Copilot AI left a comment

Pull request overview

Adds a “Build status” banner to the interactive TUI chat surface during active build/create turns, indicating whether the current turn has no successful tool evidence yet, is still read-only, or has confirmed mutating evidence—while resetting evidence per user turn.

Changes:

  • Introduce build_status classification and render it in the chat summary strip (only for non-idle build/create turns).
  • Track “current turn” tool entries in App so prior-turn successful mutations don’t affect the next turn’s status.
  • Add integration-style ratatui render-path tests covering all banner states and the per-turn reset behavior.
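The three banner states described in this overview can be illustrated with a small classifier. This is a sketch, not the contents of `build_status.rs`: the `BuildStatus` variants, the `ToolEvent` type, and the assumption that `write` and `edit` are the mutating tool names are all hypothetical choices for the example.

```rust
/// Hypothetical names for the three banner states described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BuildStatus {
    /// No successful tool call in the current turn yet.
    NoEvidenceYet,
    /// Successful tool calls so far, but none of them mutating.
    ReadOnlySoFar,
    /// At least one successful mutating tool call in the current turn.
    MutatingConfirmed,
}

/// Hypothetical view of a tool event; only the fields the classifier needs.
struct ToolEvent<'a> {
    name: &'a str,
    succeeded: bool,
}

/// Assumption for this sketch: `write` and `edit` are the mutating tools.
fn is_mutating_tool(name: &str) -> bool {
    matches!(name, "write" | "edit")
}

/// Classifies the current turn from its tool events only.
fn classify_turn(events: &[ToolEvent<'_>]) -> BuildStatus {
    let mut saw_success = false;
    // Failed calls never count as evidence, so they are filtered out first.
    for event in events.iter().filter(|e| e.succeeded) {
        if is_mutating_tool(event.name) {
            return BuildStatus::MutatingConfirmed;
        }
        saw_success = true;
    }
    if saw_success {
        BuildStatus::ReadOnlySoFar
    } else {
        BuildStatus::NoEvidenceYet
    }
}
```

Because the classifier sees only the slice it is given, the per-turn reset reduces to passing the current-turn entries rather than the full history.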

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tasks/todo.md | Roadmap status date bump. |
| tasks/tau-vs-ironclaw-gap-list.md | Status snapshot date bump. |
| specs/3604/spec.md | New spec capturing ACs, failure modes, and test evidence. |
| crates/tau-tui/src/interactive/build_status.rs | New build/create prompt + tool-evidence classifier and unit tests. |
| crates/tau-tui/src/interactive/ui_chat_tool_lines.rs | Prepends build-status banner lines to the chat summary tool lines. |
| crates/tau-tui/src/interactive/app.rs | Tracks per-turn tool slice start; exposes current_turn_tools() and latest_user_prompt(). |
| crates/tau-tui/src/interactive/ui_build_status_tests.rs | New render-path tests validating banner states and turn reset. |
| crates/tau-tui/src/interactive/app_commands.rs | Routes user submissions through App::push_message() to trigger per-turn reset. |
| crates/tau-tui/src/interactive/{chat,input,tools}.rs | Adds Default impls for panel structs. |
| crates/tau-tui/src/interactive/mod.rs | Wires new module + tests. |
| crates/tau-tools/src/tools.rs | Small refactors using is_some_and and simplified recursion guard. |
| crates/tau-coding-agent/src/tests/auth_provider/runtime_and_startup.rs | Adds restore_current_dir helper to make tests more robust. |

Comment on lines +98 to +99
let start = self.current_turn_tool_start.min(self.tools.entries().len());
&self.tools.entries()[start..]

Copilot AI Mar 20, 2026

current_turn_tools() slices tools.entries() using current_turn_tool_start captured from tools.total_count(). Since ToolPanel::add_entry() can evict the oldest entries when max_entries is exceeded, indices can shift during a turn, causing current_turn_tool_start to point too far into the truncated buffer and incorrectly drop early tool entries from the current turn (breaking per-turn evidence classification after long sessions). Consider tracking a monotonic tool sequence/turn id (e.g., store an ever-increasing counter on each ToolEntry or in ToolPanel and filter by seq >= turn_start_seq), or adjust the start marker when eviction occurs so it remains aligned with the in-memory entries buffer.

Suggested change
- let start = self.current_turn_tool_start.min(self.tools.entries().len());
- &self.tools.entries()[start..]
+ // Interpret `current_turn_tool_start` as a global sequence number from
+ // `ToolPanel::total_count()`, not as a direct index into the current
+ // in-memory entries buffer. Since `ToolPanel` may evict old entries,
+ // we translate that sequence number into an index relative to the
+ // current sliding window.
+ let entries = self.tools.entries();
+ if entries.is_empty() {
+     return entries;
+ }
+ let total = self.tools.total_count();
+ // Global sequence number of the first entry currently in memory.
+ let first_seq = total.saturating_sub(entries.len());
+ let start_seq = self.current_turn_tool_start;
+ let start_index = if start_seq <= first_seq {
+     // Either all in-memory entries are from the current turn, or
+     // earlier entries from this turn have been evicted. In both
+     // cases, we start from the beginning of the buffer.
+     0
+ } else {
+     start_seq - first_seq
+ };
+ &entries[start_index..]
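The index translation in this suggestion can be checked in isolation with a small free function. The sketch below is an assumption-laden restatement, not code from the PR: `window_start_index` is a hypothetical helper, and it adds an upper clamp so a stale start marker can never index past the buffer.

```rust
/// Translates a turn-start sequence number into an index into the in-memory window.
///
/// * `total_count`: number of entries ever added (monotonic).
/// * `in_memory_len`: number of entries currently retained after eviction.
/// * `turn_start_seq`: value of `total_count` captured when the turn began.
fn window_start_index(total_count: usize, in_memory_len: usize, turn_start_seq: usize) -> usize {
    // Sequence number of the oldest entry still in memory.
    let first_seq = total_count.saturating_sub(in_memory_len);
    // If the turn began before the oldest retained entry, start at 0.
    // The final `min` keeps the result usable as a slice start in all cases.
    turn_start_seq.saturating_sub(first_seq).min(in_memory_len)
}
```

As a worked case using the 200-entry limit mentioned earlier in the thread: if a turn starts at a total count of 150 and 100 more entries arrive, the total is 250, the buffer holds 200, and the oldest retained sequence number is 50. The turn's entries therefore begin at index 100, whereas the untranslated slice would start at index 150 and drop the first 50 entries of the turn.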

Comment on lines +42 to +45
matches!(
word.to_ascii_lowercase().as_str(),
"build" | "create" | "implement" | "make"
)

Copilot AI Mar 20, 2026

is_build_verb() calls to_ascii_lowercase() for every token, allocating a new String each time current_build_status() runs. Since this is invoked during rendering (potentially every tick), this can create avoidable per-frame allocations for long prompts. Consider using eq_ignore_ascii_case() (e.g., word.eq_ignore_ascii_case("build")) or pre-normalizing without allocation to keep render-path overhead predictable.

Suggested change
- matches!(
-     word.to_ascii_lowercase().as_str(),
-     "build" | "create" | "implement" | "make"
- )
+ word.eq_ignore_ascii_case("build")
+     || word.eq_ignore_ascii_case("create")
+     || word.eq_ignore_ascii_case("implement")
+     || word.eq_ignore_ascii_case("make")
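A self-contained variant of the allocation-free check is sketched below. The verb list comes from the snippet above; the `BUILD_VERBS` constant, the `prompt_has_build_verb` helper, and the tokenization rule (split on non-alphanumeric ASCII) are assumptions for illustration, since the thread does not show how the real code splits the prompt.

```rust
/// Verbs taken from the reviewed snippet.
const BUILD_VERBS: [&str; 4] = ["build", "create", "implement", "make"];

/// Case-insensitive match without allocating a lowercased copy of `word`.
fn is_build_verb(word: &str) -> bool {
    BUILD_VERBS.iter().any(|verb| word.eq_ignore_ascii_case(verb))
}

/// Hypothetical prompt check: splits on any non-alphanumeric character and
/// tests each non-empty token. `split` yields borrowed subslices, so the
/// whole scan performs no heap allocation.
fn prompt_has_build_verb(prompt: &str) -> bool {
    prompt
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|token| !token.is_empty())
        .any(is_build_verb)
}
```

Whole-token comparison means words such as "builder" or "rebuild" do not match, which may or may not be the behavior the real classifier wants.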

@njfio njfio deleted the 3604-tui-mutating-evidence-status-pr branch March 20, 2026 11:00

Development

Successfully merging this pull request may close these issues.

Surface mutating tool evidence status in the TUI during build/create turns

2 participants