Analyze a failure cluster and propose one minimal change to the skill.
You receive structured context in this schema:
SKILL SECTION (verbatim — do not paraphrase or summarize)
[exact lines from the relevant section of SKILL.md, full text]
FAILURE CLUSTER
items: [C2, C3]
cases: [2, 4]
root_cause_hypothesis: "..."
CHANGELOG (previous iterations)
[{ iter, type, summary, diff, verdict, delta }]
CHECKLIST
C1: "..." C2: "..." C3: "..."
The SKILL SECTION field always contains verbatim skill content — never a summary.
This is intentional. Summarizing skill content risks introducing semantic drift:
a paraphrase might preserve the rough intent but lose a specific constraint, edge
case, or phrasing that matters. You need the exact text to reason about what to change.
The compact schema is used only for metadata (failure cluster, changelog), where structured data conveys the same information as prose in far fewer tokens.
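For illustration only, the input schema above could be modeled as plain Python dataclasses. This is a minimal sketch, not a required format: the class names, the field types (for example `verdict` as a string and `delta` as a float), and the helper `failing_checklist_text` are assumptions introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FailureCluster:
    items: list[str]               # checklist item ids, e.g. ["C2", "C3"]
    cases: list[int]               # failing case numbers, e.g. [2, 4]
    root_cause_hypothesis: str


@dataclass
class ChangelogEntry:
    iter: int
    type: str
    summary: str
    diff: str
    verdict: str                   # assumed to be a short label
    delta: float                   # assumed to be a numeric score change


@dataclass
class Context:
    skill_section: str             # verbatim text, never a summary
    failure_cluster: FailureCluster
    changelog: list[ChangelogEntry] = field(default_factory=list)
    checklist: dict[str, str] = field(default_factory=dict)


def failing_checklist_text(ctx: Context) -> dict[str, str]:
    """Return the checklist text for only the items in the failure cluster."""
    return {
        item: ctx.checklist[item]
        for item in ctx.failure_cluster.items
        if item in ctx.checklist
    }
```

Keeping `skill_section` as a single raw string mirrors the rule that the skill text is passed through verbatim, while the metadata gets structured fields.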
Start directly with ROOT CAUSE — no preamble.
ROOT CAUSE
{2–3 sentences: why the current text produces these failures.
Be specific — reference the exact lines or phrases at fault.}
PROPOSED CHANGE
type: {add_instruction | clarify_step | add_example | add_edge_case | restructure | remove_ambiguity}
location: {section header and approximate line range}
expected: {which checklist items this fixes and why}
risk: {which currently-passing cases might be affected, if any}
DIFF
--- a/SKILL.md
+++ b/SKILL.md
@@ -{start},{count} +{start},{count} @@
{context line}
-{removed}
+{added}
{context line}
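As a rough illustration of the output format above, a response could be checked mechanically along these lines. This is a sketch under assumptions: the function name `check_response` and the exact wording of the problem messages are hypothetical, and it only checks the section order and the `type:` value, nothing else.

```python
ALLOWED_TYPES = {
    "add_instruction", "clarify_step", "add_example",
    "add_edge_case", "restructure", "remove_ambiguity",
}
SECTIONS = ["ROOT CAUSE", "PROPOSED CHANGE", "DIFF"]


def check_response(text: str) -> list[str]:
    """Return a list of format problems; an empty list means none were found."""
    problems = []
    lines = text.strip().splitlines()

    # The response must open with the ROOT CAUSE header, with no preamble.
    if not lines or lines[0].rstrip() != "ROOT CAUSE":
        problems.append("response must start with ROOT CAUSE")

    # Each section header must appear exactly once, in the expected order.
    # Diff context lines start with a space, so they never match a header.
    found = [ln.rstrip() for ln in lines if ln.rstrip() in SECTIONS]
    if found != SECTIONS:
        problems.append("sections must be ROOT CAUSE, PROPOSED CHANGE, DIFF, in order")

    # Exactly one type: line, with one of the allowed values.
    types = [ln.split(":", 1)[1].strip() for ln in lines if ln.startswith("type:")]
    if len(types) != 1 or types[0] not in ALLOWED_TYPES:
        problems.append("type: must be one of the allowed change types")

    return problems
```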
One change only. A single, targeted diff. Not a rewrite.
Minimal diff. Change as few lines as necessary. The smaller the diff, the easier it is to isolate whether it helped.
Fix the root cause, not each symptom. If three failures share a root cause, one change addressing that root cause is better than three separate patches.
No regression. Before proposing, consider whether the change could break
cases that currently pass. If the risk is non-zero, state it in the risk: field.
No speculation. Only propose changes you're confident will improve the failing checklist items. Don't make opportunistic improvements to other parts of the skill.
Before proposing, scan the CHANGELOG. If a similar change was already tried and reverted:
- State in ROOT CAUSE why that approach didn't work
- Propose a meaningfully different angle — different location, different type
- Don't propose the same diff twice
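The changelog scan described above can be sketched as a simple similarity check. This assumes each changelog entry is a dict with the keys listed in the schema, that a reverted change carries the verdict string "reverted" (the actual verdict vocabulary is not specified here), and that a character-level similarity threshold of 0.8 is a reasonable stand-in for "similar". The function name `similar_reverted` is hypothetical.

```python
from difflib import SequenceMatcher


def similar_reverted(changelog, proposed_diff, threshold=0.8):
    """Return reverted changelog entries whose diff closely resembles the proposal."""
    hits = []
    for entry in changelog:
        # Only previously reverted attempts count as evidence against a proposal.
        if entry.get("verdict") != "reverted":
            continue
        ratio = SequenceMatcher(None, entry.get("diff", ""), proposed_diff).ratio()
        if ratio >= threshold:
            hits.append(entry)
    return hits
```

A non-empty result is a signal to explain in ROOT CAUSE why the earlier attempt failed and to pick a different location or change type. Text similarity is only a heuristic: two diffs can differ in wording and still make the same change.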
Use standard unified diff format.
@@ lines must reference real line numbers from the skill section provided.
Context lines (unchanged) are prefixed with a space.
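To make the diff rules above concrete, here is a small sketch using Python's standard `difflib` to produce a unified diff and to check that a hunk header stays inside the section's line range. The helper names `make_diff` and `hunk_within` are hypothetical, and `difflib` numbers lines relative to the list it is given, so a section that does not start at line 1 of SKILL.md would need its offset added.

```python
import difflib
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def make_diff(old_lines, new_lines, path="SKILL.md"):
    """Build a unified diff with one line of context around each change."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=f"a/{path}", tofile=f"b/{path}",
        lineterm="", n=1,
    ))


def hunk_within(header, first_line, last_line):
    """Check that the old-file range of a hunk header lies in [first_line, last_line]."""
    m = HUNK_RE.match(header)
    if not m:
        return False
    start = int(m.group(1))
    count = int(m.group(2)) if m.group(2) is not None else 1
    if count == 0:
        # Pure insertion: start names the line after which text is added.
        return first_line - 1 <= start <= last_line
    return first_line <= start and start + count - 1 <= last_line


old = ["one", "two", "three", "four", "five"]
new = ["one", "two", "THREE", "four", "five"]
for line in make_diff(old, new):
    print(line)
```

In the printed diff, the unchanged lines `two` and `four` appear with a leading space, the removed line with `-`, and the added line with `+`.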