feat(loop): evolution, critique, OAuth auth, and documentation#14
Merged
electronicBlacksmith merged 5 commits intomainfrom Apr 7, 2026
Merged
feat(loop): evolution, critique, OAuth auth, and documentation#14electronicBlacksmith merged 5 commits intomainfrom
electronicBlacksmith merged 5 commits intomainfrom
Conversation
…oop ticks Loop ticks now use Phantom's full intelligence stack instead of running blind: Phase 1 - Memory context injection: cached once at loop start from the goal, injected into every tick prompt via TickPromptOptions. Cleared on finalize, rebuilt on resume. Phase 2 - Post-loop evolution and consolidation: bounded transcript accumulation (first tick + rolling 10 summaries + last tick), SessionData synthesis in finalize(), fire-and-forget evolution pipeline and LLM/heuristic memory consolidation with cost-cap guards matching the interactive path. Phase 3 - Mid-loop critique checkpoints: optional checkpoint_interval param lets the agent request Sonnet 4.6 review every N ticks. Guard requires evolution enabled, LLM judges active, and cost cap not exceeded. Critique is awaited before next tick to avoid race conditions. Closes #8
- Decouple postLoopDeps so evolution and memory run independently (evolution works when memory is down and vice versa) - Skip mid-loop critique on terminal ticks to avoid wasted Sonnet calls - Track judge cost on failure paths via JudgeParseError carrying usage data - Extract recordTranscript/clamp from runner.ts to post-loop.ts (292 < 300 lines)
resolveJudgeMode() and judge client now check ANTHROPIC_AUTH_TOKEN and CLAUDE_CODE_OAUTH_TOKEN in addition to ANTHROPIC_API_KEY. Enables LLM judges on Max subscription deployments using OAuth bearer tokens.
Covers MCP tool parameters, state file contract, tick lifecycle, Slack integration, mid-loop critique, post-loop evolution pipeline, memory context injection, and tips for writing effective goals. Closes #12
This was referenced Apr 6, 2026
1aa1928 to
d23acce
Compare
trigger-auth: use inline Bun.serve instead of startServer to avoid module-level globals and disk I/O that can race across test files. judge-activation: save/restore ANTHROPIC_AUTH_TOKEN and CLAUDE_CODE_OAUTH_TOKEN alongside ANTHROPIC_API_KEY so tests that expect "no credentials" actually clear all auth env vars.
d23acce to
f30a3b2
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Consolidates PRs #9, #10, and #13 into a single clean branch from main.
resolveJudgeMode()andisJudgeAvailable()now checkANTHROPIC_AUTH_TOKENandCLAUDE_CODE_OAUTH_TOKENalongsideANTHROPIC_API_KEYdocs/loop.mdcovering MCP tool params, state file contract, tick lifecycle, Slack integration, critique, post-loop pipeline, and memory contextBun.serveto avoid module-level global races; judge-activation tests save/restore all auth env varsSupersedes #9, #10, #13.
Test plan
bun run lintcleanbun run typecheckclean