diff --git a/.agents/skills/memory-manager/SKILL.md b/.agents/skills/memory-manager/SKILL.md
index 5656232..d677e28 100644
--- a/.agents/skills/memory-manager/SKILL.md
+++ b/.agents/skills/memory-manager/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: memory-manager
-description: Manage long-term AI R&D memory with retrieval, writeback, promotion, and shared export candidates. Use when preserving run state, maintaining working todo lists, reusing prior debugging/research knowledge, recording outcomes, or preparing post-task shared-memory publication.
+description: Manage long-term AI R&D memory with retrieval, writeback, promotion, and shared export candidates. Use when preserving run state, recovering prior Memory after compaction/summary steps, maintaining working todo lists, reusing prior debugging/research knowledge, recording outcomes, or preparing post-task shared-memory publication.
 ---
 
 # Memory Manager
@@ -50,10 +50,14 @@ Todo granularity should be task-level (small stages/subtasks), not command-level
 Retrieve early when useful, but do not block execution:
 
 1. Query by `project`, `task_type`, `error_signature` first.
-2. Add tags and FTS when exact filters miss.
-3. Prefer `active` procedures/insights when confidence is similar.
-4. Flag stale entries with low confidence.
-5. If retrieval is low-yield and task is time-sensitive, continue with search/deep research directly.
+2. Upgrade retrieval from optional to mandatory before continuing when either of these triggers is present:
+   - you are modifying `memory-manager` or another Memory-related skill/instruction
+   - a status, state, or context file contains compaction markers such as `Compact`, `压缩`, `Summary`, or similar summary/compression techniques
+3. In mandatory-retrieval cases, read prior Memory first and treat the result as required context recovery rather than a best-effort lookup.
+4. Add tags and FTS when exact filters miss.
+5. Prefer `active` procedures/insights when confidence is similar.
+6. Flag stale entries with low confidence.
+7. If retrieval is low-yield and task is time-sensitive, continue with search/deep research directly only when the mandatory-retrieval triggers are absent.
 
 ## Shared Retrieval Policy
 
@@ -170,7 +174,17 @@ If execution becomes repetitive or confused:
 
 1. Rebuild working state from action and observation logs.
 2. Run targeted retrieval by project/task/error signature.
-3. Publish compact state summary before continuing.
+3. If drift followed a compaction step or summary-style recovery, read prior Memory before publishing or trusting a compact state summary.
+4. Publish compact state summary before continuing.
+
+## Compaction Recovery Policy
+
+When context may have been compressed:
+
+1. Inspect available status/state/context files for markers such as `Compact`, `压缩`, `Summary`, or equivalent summary/compression techniques.
+2. If any marker is present, call `memory-manager` to read prior Memory before editing instructions, planning next actions, or resuming execution.
+3. If prior Memory cannot be read, treat that as an active blocker because key context may be missing.
+4. Record the compaction trigger and retrieval result in working state or the next stage report.
 
 ## Promotion Policy
 
diff --git a/.agents/skills/research-workflow/SKILL.md b/.agents/skills/research-workflow/SKILL.md
index 19ab4b0..2a1ce88 100644
--- a/.agents/skills/research-workflow/SKILL.md
+++ b/.agents/skills/research-workflow/SKILL.md
@@ -88,29 +88,33 @@ Use these in combination:
    - stage transition
    - replan
    - significant error or new error signature
+   - the current task modifies `memory-manager` or another Memory-related skill/policy
+   - state/context files show compaction markers such as `Compact`, `压缩`, `Summary`, or equivalent summary/compression techniques
    - memory auto-compression/summarization completed
    - before high-resource action
    - before final answer/report handoff
-4. Periodic `working` memory refresh is required when either holds:
+4. In memory-skill-edit or compaction cases, call `memory-manager` to read prior Memory before planning, editing, or resuming execution.
+5. Periodic `working` memory refresh is required when either holds:
    - at least 15 minutes since last memory operation
    - at least 3 execution cycles since last memory operation
-5. Command-gap fallback: if 5 consecutive commands/actions finish without a memory update, force one concise `working` refresh.
-6. Cooldown: no more than one non-forced memory operation per cycle.
-7. Avoid per-command memory writes; batch observations into one delta update.
-8. Use search/deep research directly when topic is time-sensitive, new, or currently blocked.
-9. If project-local memory retrieval is low-yield, shared-memory retrieval may query the configured local shared repo as a read-only source.
-10. Do not sync the shared repo on every cycle; prefer the current local checkout and sync only on explicit gap handling or before export.
-11. For open-ended research/scoping requests, run deep research before giving decomposition or roadmap recommendations.
-11.1 For mid-run new research requests, run deep research re-entry before further execution.
-12. For unknown errors, use this branch:
+6. Command-gap fallback: if 5 consecutive commands/actions finish without a memory update, force one concise `working` refresh.
+7. Cooldown: no more than one non-forced memory operation per cycle.
+8. Avoid per-command memory writes; batch observations into one delta update.
+9. Use search/deep research directly when topic is time-sensitive, new, or currently blocked.
+10. If project-local memory retrieval is low-yield, shared-memory retrieval may query the configured local shared repo as a read-only source.
+11. Do not sync the shared repo on every cycle; prefer the current local checkout and sync only on explicit gap handling or before export.
+12. For open-ended research/scoping requests, run deep research before giving decomposition or roadmap recommendations.
+12.1 For mid-run new research requests, run deep research re-entry before further execution.
+13. For unknown errors, use this branch:
    - local evidence triage (logs, stack trace, recent changes)
    - shared-memory retrieval when reusable SOPs or prior debug cases are likely relevant
    - targeted search
    - deep research (debug-investigation) if still unresolved
    - minimal fix validation
-13. If skipping memory due to cooldown or low-value delta, record reason in the stage report.
-14. If intake information is missing, trigger `human-checkpoint` before deep research or planning.
-15. If deep research was used for open-ended scoping, hand off to `research-plan` to convert findings into an execution-ready plan. Skip only if the user explicitly opts out.
+14. If compaction is detected, treat missing memory retrieval as a workflow violation and recover by reading prior Memory before continuing.
+15. If skipping memory due to cooldown or low-value delta outside the memory-skill-edit or compaction cases, record reason in the stage report.
+16. If intake information is missing, trigger `human-checkpoint` before deep research or planning.
+17. If deep research was used for open-ended scoping, hand off to `research-plan` to convert findings into an execution-ready plan. Skip only if the user explicitly opts out.
 
 ## Replanning Policy
 
diff --git a/AGENTS.md b/AGENTS.md
index 07dea60..9e99965 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -6,17 +6,18 @@ This workspace is for AI research and development tasks (reproduction, debugging
 1. Start each non-trivial research task with `run-governor`, but do not initialize `run_id` paths before explicit user confirmation of both `mode` and execution target (`local|remote`).
 2. Use `research-workflow` as the default orchestration loop.
 3. Use `memory-manager` to maintain working todo state and long-term memory.
-4. Trigger `human-checkpoint` using mode-aware policy, always for major safety risks and shared-memory publication.
-5. Use `experiment-execution` only for actual run execution.
-6. Use `project-context` to collect and persist per-project private runtime context before experiments or report/eval execution.
-7. Use `deep-research` for deep external investigation and evidence synthesis, including early-stage project scoping when a user wants to write a research study or paper on a topic, unless the user is explicitly asking for a paper-writing deliverable right now.
-8. Use `research-plan` when the user asks for a proposal, roadmap, ablation/evaluation plan, study design, or pre-implementation research decomposition.
-9. After open-ended scoping in `deep-research`, hand off findings into `research-plan` by default; skip only if the user explicitly opts out.
-10. Use `paper-writing` only when the user explicitly asks for a paper-writing deliverable such as drafting or revising a paper, section, or rebuttal. Do not use it for topic scoping, literature investigation, feasibility analysis, experiment design, or experiment execution.
-11. Base conclusions on evidence only (command outputs, metrics, logs, and file diffs).
-12. Prefer small, reversible, verifiable steps over broad speculative changes.
-13. Follow `REPO_CONVENTIONS.md` for artifact placement and commit hygiene.
-14. If a run was initialized before confirmation, stop and run violation recovery: acknowledge, ask whether to keep/clean artifacts, and wait for explicit reconfirmation before continuing.
+4. If you modify `memory-manager` or any Memory-related skill, or detect compaction markers in state/context files such as `Compact`, `压缩`, `Summary`, or similar summary/compression techniques, invoke `memory-manager` to read prior Memory before continuing so key context is not dropped.
+5. Trigger `human-checkpoint` using mode-aware policy, always for major safety risks and shared-memory publication.
+6. Use `experiment-execution` only for actual run execution.
+7. Use `project-context` to collect and persist per-project private runtime context before experiments or report/eval execution.
+8. Use `deep-research` for deep external investigation and evidence synthesis, including early-stage project scoping when a user wants to write a research study or paper on a topic, unless the user is explicitly asking for a paper-writing deliverable right now.
+9. Use `research-plan` when the user asks for a proposal, roadmap, ablation/evaluation plan, study design, or pre-implementation research decomposition.
+10. After open-ended scoping in `deep-research`, hand off findings into `research-plan` by default; skip only if the user explicitly opts out.
+11. Use `paper-writing` only when the user explicitly asks for a paper-writing deliverable such as drafting or revising a paper, section, or rebuttal. Do not use it for topic scoping, literature investigation, feasibility analysis, experiment design, or experiment execution.
+12. Base conclusions on evidence only (command outputs, metrics, logs, and file diffs).
+13. Prefer small, reversible, verifiable steps over broad speculative changes.
+14. Follow `REPO_CONVENTIONS.md` for artifact placement and commit hygiene.
+15. If a run was initialized before confirmation, stop and run violation recovery: acknowledge, ask whether to keep/clean artifacts, and wait for explicit reconfirmation before continuing.
 
 ## Memory Invocation Guardrails (Balanced)
 1. `memory-manager` is mandatory for non-trivial runs, but only as a control-plane step, not per command.