dgenio · Mar 9, 2026
diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
@@ -0,0 +1,68 @@
+# ChainWeaver — Claude Instructions
+
+Canonical source of truth: [AGENTS.md](/AGENTS.md) and
+[docs/agent-context/](/docs/agent-context/).
+
+Read AGENTS.md before starting any task. It contains the repo map,
+invariants, entry points, common tasks, validation commands, and
+documentation map that routes to deeper guidance.
+
+---
+
+## Explore before acting
+
+- Read the canonical docs for the topic area before writing code.
+- Inspect the files you plan to change. Do not assume structure from memory.
+- Check [architecture.md](/docs/agent-context/architecture.md) for design
+  traps and reserved module names before creating or renaming files.
+- Do not infer repo-wide rules from a single local example.
+
+## Implement safely
+
+- Preserve invariants. The three executor rules (no LLM, no network I/O,
+  no randomness in `executor.py`) are non-negotiable. See
+  [invariants.md](/docs/agent-context/invariants.md).
+- Use authoritative commands exactly as listed in
+  [AGENTS.md § Validation commands](/AGENTS.md#7-validation-commands).
+  Do not substitute alternative flags, paths, or invocations.
+- Follow the conventions in canonical docs. Do not invent new patterns.
+- Do not "clean up" or "simplify" code that looks unusual without first
+  checking [architecture.md § Design traps](/docs/agent-context/architecture.md#design-traps).
+
+## Validate before completing
+
+- Run all four validation commands and confirm they pass.
+- Check whether your change triggers a doc update. Consult the governance
+  triggers in [workflows.md](/docs/agent-context/workflows.md#documentation-governance-triggers).
+- Walk [review-checklist.md](/docs/agent-context/review-checklist.md) before
+  marking work done.
+- Verify that docstrings match actual behavior, not intended behavior.
+
+## Handle contradictions
+
+- If canonical docs contradict each other, flag the conflict explicitly.
+  Do not silently pick one side.
+- If code contradicts canonical docs, trust the docs for conventions and
+  the code for runtime behavior. Flag the gap.
+- If an older or duplicate document disagrees with AGENTS.md or
+  `docs/agent-context/`, prefer AGENTS.md.
+- Fix small contradictions in the same PR. Open an issue for large ones.
+
+## Capture lessons
+
+- If you discover a recurring failure pattern during work, note it as a
+  candidate lesson.
+- A candidate lesson is provisional. Do not promote it into durable docs
+  based on a single observation.
+- A lesson is promotable when it is reusable, decision-shaping, and durable
+  — not just a one-off incident.
+- Promotion order: canonical docs first (`lessons-learned.md`), then
+  projections. See the criteria in
+  [lessons-learned.md](/docs/agent-context/lessons-learned.md#promotion-criteria).
+
+## Update order
+
+1. Update canonical shared docs (`AGENTS.md`, `docs/agent-context/`) first.
+2. Update tool-specific projections (this file, `.github/copilot-instructions.md`) second.
+3. If a Claude-specific rule starts to look shared and durable, promote it
+   into canonical docs and simplify it here.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -1,98 +1,46 @@
 # Copilot Instructions — ChainWeaver
 
-These instructions apply to all Copilot interactions in this repository.
-For full architecture context and decision rationale, see [AGENTS.md](/AGENTS.md).
-
-## Language & runtime
-
-- Python 3.10+ (target version for all code)
-- `from __future__ import annotations` at the top of every module
-- Type annotations on all function signatures (this is a `py.typed` package)
-
-## Code style
-
-- Formatter: `ruff format` (line length 99, double quotes, trailing commas)
-- Linter: `ruff check` with rule sets: E, W, F, I, UP, B, SIM, RUF
-- Import order: `isort`-compatible via Ruff's `I` rules (known first-party: `chainweaver`)
-- Naming: snake_case for functions/variables, PascalCase for classes
-- Docstrings: Google style (Args/Returns/Raises sections)
-
-## Architecture rules
-
-- All data models use `pydantic.BaseModel` (pydantic v2 API)
-- All exceptions inherit from `ChainWeaverError` (in `chainweaver/exceptions.py`)
-- All public symbols must be listed in `chainweaver/__init__.py` `__all__`
-- `executor.py` is deterministic — no LLM calls, no network I/O, no randomness
-- Tool functions: `fn(validated_input: BaseModel) -> dict[str, Any]`
-
-## Project layout
-
-```
-chainweaver/          → Package source (all modules use `from __future__ import annotations`)
-  __init__.py         → Public API surface; all exports listed in __all__
-  tools.py            → Tool class: named callable with Pydantic input/output schemas
-  flow.py             → FlowStep + Flow: ordered step definitions (Pydantic models)
-  registry.py         → FlowRegistry: in-memory catalogue of named flows
-  executor.py         → FlowExecutor: sequential, LLM-free runner (main entry point)
-  exceptions.py       → Typed exception hierarchy (all inherit ChainWeaverError)
-  log_utils.py        → Structured per-step logging utilities
-pyproject.toml        → Ruff, mypy, pytest config (source of truth for tool settings)
-tests/                → pytest test suite
-  conftest.py         → Shared fixtures (tools, flows, executors)
-  helpers.py          → Shared Pydantic schemas and tool functions
-examples/             → Runnable usage examples
-.github/workflows/    → CI (ci.yml) and publish (publish.yml) pipelines
-```
-
-## Testing
-
-- Framework: `pytest` (no unittest)
-- Test files: `tests/test_*.py`
-- Use `@pytest.fixture()` for shared objects (tools, flows, executors)
-- Shared schemas and helper functions live in `tests/helpers.py`
-- Test both success and error paths
-- Assertions: use plain `assert` (pytest rewrites them), not `self.assertEqual`
-- No mocking of internal ChainWeaver classes unless testing integration boundaries
-
-## Validation commands (run before every commit/PR)
-
-```bash
-# Install with dev dependencies
-pip install -e ".[dev]"
-
-# Lint
-ruff check chainweaver/ tests/ examples/
-
-# Check formatting
-ruff format --check chainweaver/ tests/ examples/
-
-# Type check
-python -m mypy chainweaver/
-
-# Run tests
-python -m pytest tests/ -v
-```
-
-Always run all four checks. CI runs lint + format + mypy on Python 3.10 only;
-tests run across Python 3.10, 3.11, 3.12, 3.13.
-
-## PR conventions
-
-- One logical change per PR
-- PR title: imperative mood (e.g., "Add retry logic to executor")
-- If you change architecture (add/remove/rename modules), update AGENTS.md and the project layout in this file in the same PR
-- If you change coding conventions, update this file in the same PR
-
-## Anti-patterns (never generate these)
-
-- Do NOT add LLM/AI client calls to `executor.py`
-- Do NOT use `unittest.TestCase` — use plain pytest functions/classes
-- Do NOT import from `chainweaver` internals using relative paths outside the package
-- Do NOT add dependencies without updating `pyproject.toml` `[project.dependencies]`
-- Do NOT commit secrets, API keys, or credentials
-
-## Trust these instructions
-
-These instructions are tested and aligned with CI. Only search for additional
-context if the information here is incomplete or found to be in error.
-For architecture decisions and rationale, see [AGENTS.md](/AGENTS.md).
+> Thin review-oriented layer. Canonical source of truth: [AGENTS.md](/AGENTS.md)
+> and [docs/agent-context/](/docs/agent-context/).
+
+---
+
+## Review-critical rules
+
+- Review code and agent-facing docs together. If a PR changes behavior,
+  invariants, architecture, or workflows, the corresponding docs must be
+  updated in the same PR.
+- Invariants take priority over cleanup, simplification, or local refactors.
+  See [AGENTS.md § Core invariants](/AGENTS.md#4-core-invariants).
+- Do not invent conventions. All coding style, naming, workflow, and testing
+  rules are grounded in [AGENTS.md](/AGENTS.md) and
+  [docs/agent-context/](/docs/agent-context/). If guidance is missing, surface
+  the gap — do not guess.
+- Use authoritative commands exactly as written in
+  [AGENTS.md § Validation commands](/AGENTS.md#7-validation-commands). Do not
+  substitute alternative flags, paths, or invocations.
+- If you find a contradiction or stale content in any doc, flag it explicitly.
+  Do not silently work around it.
+
+## Executor guardrails
+
+`executor.py` has three hard invariants — no LLM calls, no network I/O, no
+randomness. These are non-negotiable. See
+[invariants.md](/docs/agent-context/invariants.md#hard-executor-invariants).
+
+## Vocabulary
+
+| Use | Never use |
+|-----|-----------|
+| **flow** | chain, pipeline |
+| **tool** | function, action (when referring to a `Tool` instance) |
+
+## Where to find guidance
+
+| Topic | Canonical file |
+|-------|----------------|
+| Architecture, boundaries, design traps | [architecture.md](/docs/agent-context/architecture.md) |
+| Commands, CI, code style, testing, PR rules | [workflows.md](/docs/agent-context/workflows.md) |
+| Hard rules, forbidden patterns | [invariants.md](/docs/agent-context/invariants.md) |
+| Recurring mistake patterns | [lessons-learned.md](/docs/agent-context/lessons-learned.md) |
+| Definition-of-done, review gates | [review-checklist.md](/docs/agent-context/review-checklist.md) |
diff --git a/.github/instructions/chainweaver.instructions.md b/.github/instructions/chainweaver.instructions.md
@@ -0,0 +1,15 @@
+---
+applyTo: "chainweaver/**"
+---
+# ChainWeaver package — design traps
+
+Do not "fix" these without a solution for the underlying constraint.
+See [architecture.md § Design traps](/docs/agent-context/architecture.md#design-traps)
+for full context.
+
+- `StepRecord` and `ExecutionResult` are `dataclass`, not Pydantic. They carry
+  `Exception` instances. Do not convert them.
+- `log_utils.py` was renamed from `logging.py` to avoid stdlib shadowing. Do
+  not rename it back.
+- Weaver Stack: do not add agent-kernel or weaver-spec imports to `executor.py`.
+  `KernelBackedExecutor` goes in a separate class.
diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md
@@ -0,0 +1,27 @@
+---
+applyTo: "tests/**"
+---
+# Tests
+
+## File boundary
+
+- `tests/helpers.py` — shared Pydantic schemas and tool functions.
+- `tests/conftest.py` — pytest fixtures that compose objects from `helpers.py`.
+
+Do not merge these files. Do not put schemas in `conftest.py` or fixtures in
+`helpers.py`.
+
+## Framework rules
+
+- pytest only. No `unittest.TestCase`, no `self.assertEqual`.
+- Plain `assert` statements (pytest rewrites them).
+- No mocking of internal ChainWeaver classes unless testing integration boundaries.
+- Test both success and failure/error paths.
+
+## Organization
+
+- Unit tests grouped by module (`test_{module}.py`).
+- Integration tests grouped by scenario.
+- Test classes grouped by scenario (e.g., `TestSuccessfulExecution`).
+
+See [workflows.md § Testing conventions](/docs/agent-context/workflows.md#testing-conventions).
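
Reviewer note: the three executor rules stated in this diff (no LLM calls, no network I/O, no randomness in `executor.py`) can be spot-checked during review with a static import scan. The snippet below is a minimal sketch, not part of ChainWeaver and not one of the authoritative validation commands in AGENTS.md. The function name `find_forbidden_imports` and the deny-list `FORBIDDEN_MODULES` are hypothetical, and the module names in the list are illustrative assumptions.

```python
import ast

# Hypothetical deny-list of top-level module names. These are illustrative
# assumptions, not a list taken from the ChainWeaver docs.
FORBIDDEN_MODULES = {
    "random", "secrets",                               # randomness
    "socket", "urllib", "http", "requests", "httpx",   # network I/O
    "openai", "anthropic",                             # LLM clients
}


def find_forbidden_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module name) for each deny-listed absolute import."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            # Compare only the top-level package, so "urllib.request" matches "urllib".
            if name.split(".")[0] in FORBIDDEN_MODULES:
                hits.append((node.lineno, name))
    return sorted(hits)


sample = "import random\nfrom urllib.request import urlopen\nimport json\n"
print(find_forbidden_imports(sample))  # [(1, 'random'), (2, 'urllib.request')]
```

A scan like this only sees static import statements. It misses dynamic imports and calls routed through other modules, so an empty result is a hint for the reviewer and does not prove the invariants hold.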