feat(loop): evolution, critique, OAuth auth, and documentation #14

Merged: electronicBlacksmith merged 5 commits into main from feat/loop-evolution-and-docs on Apr 7, 2026

Conversation

@electronicBlacksmith
Owner

Summary

Consolidates PRs #9, #10, and #13 into a single clean branch from main.

Supersedes #9, #10, #13.

Test plan

  • 978 tests pass, 0 new failures (2 pre-existing init.test.ts env-specific failures on main)
  • bun run lint clean
  • bun run typecheck clean
  • trigger-auth tests pass locally (inline server eliminates CI race)
  • judge-activation tests pass with OAuth token in env

…oop ticks

Loop ticks now use Phantom's full intelligence stack instead of running blind:

Phase 1 - Memory context injection: cached once at loop start from the goal,
injected into every tick prompt via TickPromptOptions. Cleared on finalize,
rebuilt on resume.
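
The Phase 1 lifecycle (build once, inject on every tick, clear on finalize, rebuild on resume) could look roughly like the sketch below. Only the name TickPromptOptions comes from this PR; the `memoryContext` field, the `LoopMemoryCache` class, and the `MemoryLookup` type are hypothetical, and the lookup is kept synchronous for brevity even though a real memory query is presumably async.

```typescript
// Hypothetical sketch: only the name TickPromptOptions appears in the PR text.
interface TickPromptOptions {
  memoryContext?: string; // assumed field name
}

// Stand-in for the real memory query, kept synchronous for brevity.
type MemoryLookup = (goal: string) => string;

class LoopMemoryCache {
  private cached: string | undefined = undefined;
  private lookup: MemoryLookup;

  constructor(lookup: MemoryLookup) {
    this.lookup = lookup;
  }

  // Called once at loop start, and again on resume.
  build(goal: string): void {
    this.cached = this.lookup(goal);
  }

  // Called for every tick; never triggers a new memory lookup.
  tickOptions(): TickPromptOptions {
    return this.cached === undefined ? {} : { memoryContext: this.cached };
  }

  // Called on finalize so stale context cannot leak into a later loop.
  clear(): void {
    this.cached = undefined;
  }
}
```

The point of the design is that the cost of the memory query is paid once per loop (or once per resume) rather than once per tick.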

Phase 2 - Post-loop evolution and consolidation: bounded transcript
accumulation (first tick + rolling 10 summaries + last tick), SessionData
synthesis in finalize(), fire-and-forget evolution pipeline and LLM/heuristic
memory consolidation with cost-cap guards matching the interactive path.
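
As an illustration of the bounded accumulation described in Phase 2 (first tick + rolling 10 summaries + last tick), here is a minimal sketch. The `BoundedTranscript` shape and the `recordTick` helper are hypothetical and are not the recordTranscript/clamp functions in this branch.

```typescript
// Hypothetical sketch of "first tick + rolling 10 summaries + last tick".
const MAX_ROLLING_SUMMARIES = 10;

interface BoundedTranscript {
  firstTick: string | undefined; // full text of the first tick, kept forever
  rollingSummaries: string[];    // summaries of the most recent ticks only
  lastTick: string | undefined;  // full text of the most recent tick
}

function createTranscript(): BoundedTranscript {
  return { firstTick: undefined, rollingSummaries: [], lastTick: undefined };
}

function recordTick(t: BoundedTranscript, fullText: string, summary: string): void {
  if (t.firstTick === undefined) {
    t.firstTick = fullText;
  }
  t.lastTick = fullText;
  t.rollingSummaries.push(summary);
  // Drop the oldest summary once the window is full, so memory stays bounded
  // no matter how many ticks the loop runs.
  if (t.rollingSummaries.length > MAX_ROLLING_SUMMARIES) {
    t.rollingSummaries.shift();
  }
}
```

With this shape, a loop of any length holds at most two full tick texts and ten summaries at finalize time.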

Phase 3 - Mid-loop critique checkpoints: optional checkpoint_interval param
lets the agent request Sonnet 4.6 review every N ticks. Guard requires
evolution enabled, LLM judges active, and cost cap not exceeded. Critique
is awaited before next tick to avoid race conditions.
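
The Phase 3 guard could be expressed as a single predicate, sketched below. All names here (`CritiqueGuardInput`, `shouldRunCritique`, the USD fields) are hypothetical; only the checkpoint_interval parameter and the three guard conditions come from the description above. Treating a spend exactly equal to the cap as "not exceeded" is an assumption.

```typescript
// Hypothetical sketch of the mid-loop critique guard.
interface CritiqueGuardInput {
  tick: number;                 // 1-based tick counter
  checkpointInterval?: number;  // mirrors the optional checkpoint_interval param
  evolutionEnabled: boolean;
  llmJudgesActive: boolean;
  spentUsd: number;             // assumed unit for cost tracking
  costCapUsd: number;
}

function shouldRunCritique(input: CritiqueGuardInput): boolean {
  const n = input.checkpointInterval;
  // No interval (or a nonsensical one) means the agent did not ask for critique.
  if (n === undefined || !Number.isInteger(n) || n <= 0) return false;
  // Only every Nth tick is a checkpoint.
  if (input.tick <= 0 || input.tick % n !== 0) return false;
  // Assumption: spending exactly the cap still counts as "not exceeded".
  const underCap = input.spentUsd <= input.costCapUsd;
  return input.evolutionEnabled && input.llmJudgesActive && underCap;
}
```

A caller would then `await` the critique when the predicate returns true and only afterwards start the next tick, which is what keeps the critique from racing with tick state.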

Closes #8

- Decouple postLoopDeps so evolution and memory run independently
  (evolution works when memory is down and vice versa)
- Skip mid-loop critique on terminal ticks to avoid wasted Sonnet calls
- Track judge cost on failure paths via JudgeParseError carrying usage data
- Extract recordTranscript/clamp from runner.ts to post-loop.ts (292 < 300 lines)
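
To illustrate the failure-path cost tracking in the list above, here is a minimal sketch of an error type that carries usage data so the caller can still charge for a judge call whose output failed to parse. Only the JudgeParseError name comes from this PR; the `JudgeUsage` shape, `parseVerdict`, `judgeWithCostTracking`, and the ledger object are hypothetical.

```typescript
// Hypothetical sketch: an error that carries usage so failed calls are still billed.
interface JudgeUsage {
  costUsd: number; // assumed field; real usage data likely includes token counts
}

class JudgeParseError extends Error {
  readonly usage: JudgeUsage;

  constructor(message: string, usage: JudgeUsage) {
    super(message);
    // Keep instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "JudgeParseError";
    this.usage = usage;
  }
}

function parseVerdict(raw: string, usage: JudgeUsage): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new JudgeParseError("judge output is not valid JSON", usage);
  }
  const verdict = (parsed as { verdict?: unknown } | null)?.verdict;
  if (typeof verdict !== "string") {
    throw new JudgeParseError("judge output has no string verdict", usage);
  }
  return verdict;
}

function judgeWithCostTracking(
  raw: string,
  usage: JudgeUsage,
  ledger: { totalUsd: number },
): string | undefined {
  try {
    const verdict = parseVerdict(raw, usage);
    ledger.totalUsd += usage.costUsd; // success path
    return verdict;
  } catch (err) {
    if (err instanceof JudgeParseError) {
      ledger.totalUsd += err.usage.costUsd; // failure path is charged too
      return undefined;
    }
    throw err;
  }
}
```

Without the usage payload on the error, a run of unparseable judge responses would spend money that the cost cap never sees.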

resolveJudgeMode() and judge client now check ANTHROPIC_AUTH_TOKEN and
CLAUDE_CODE_OAUTH_TOKEN in addition to ANTHROPIC_API_KEY. Enables LLM
judges on Max subscription deployments using OAuth bearer tokens.
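
The credential check described above might reduce to something like the following sketch. The three environment variable names come from this PR; the `hasJudgeCredentials` helper and the `Env` type are hypothetical, and the sketch deliberately says nothing about precedence between the variables or about what resolveJudgeMode() actually returns.

```typescript
// Hypothetical sketch: judges can activate if any supported credential is set.
type Env = Record<string, string | undefined>;

const JUDGE_CREDENTIAL_VARS = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_AUTH_TOKEN",
  "CLAUDE_CODE_OAUTH_TOKEN",
] as const;

function hasJudgeCredentials(env: Env): boolean {
  // Assumption: an empty or whitespace-only value counts as "not set".
  return JUDGE_CREDENTIAL_VARS.some((name) => (env[name] ?? "").trim() !== "");
}
```

Taking the environment as a parameter instead of reading a global keeps the check easy to exercise with every credential combination.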

Documentation covers MCP tool parameters, state file contract, tick lifecycle,
Slack integration, mid-loop critique, post-loop evolution pipeline,
memory context injection, and tips for writing effective goals.

Closes #12

trigger-auth: use inline Bun.serve instead of startServer to avoid
module-level globals and disk I/O that can race across test files.

judge-activation: save/restore ANTHROPIC_AUTH_TOKEN and
CLAUDE_CODE_OAUTH_TOKEN alongside ANTHROPIC_API_KEY so tests that
expect "no credentials" actually clear all auth env vars.
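
The judge-activation fix amounts to saving, clearing, and restoring all three auth variables around a test. A minimal sketch of that pattern follows; the `withClearedAuthEnv` helper and the `EnvMap` type are hypothetical and are not taken from the test file itself.

```typescript
// Hypothetical sketch: run a callback with every auth variable cleared,
// then restore the original values even if the callback throws.
type EnvMap = Record<string, string | undefined>;

const AUTH_VARS = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_AUTH_TOKEN",
  "CLAUDE_CODE_OAUTH_TOKEN",
] as const;

function withClearedAuthEnv<T>(env: EnvMap, fn: () => T): T {
  const saved: Record<string, string | undefined> = {};
  for (const name of AUTH_VARS) {
    saved[name] = env[name];
    delete env[name];
  }
  try {
    return fn();
  } finally {
    for (const name of AUTH_VARS) {
      const value = saved[name];
      if (value === undefined) {
        delete env[name]; // was unset before, so leave it unset
      } else {
        env[name] = value;
      }
    }
  }
}
```

In a real test the `env` argument would presumably be `process.env`, so a developer machine with an OAuth token exported still exercises the "no credentials" branch.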

electronicBlacksmith force-pushed the feat/loop-evolution-and-docs branch from d23acce to f30a3b2 on April 7, 2026 00:19
electronicBlacksmith merged commit d84a40a into main on Apr 7, 2026 (1 check passed)
electronicBlacksmith deleted the feat/loop-evolution-and-docs branch on April 7, 2026 00:21