write eval results to results.json for structured agent consumption#79

Closed
mvanhorn wants to merge 1 commit into karpathy:master from mvanhorn:osc/64-structured-results-json

Conversation

@mvanhorn commented Mar 9, 2026

Fixes #64

Writes a results.json file after evaluation, containing the same metrics already printed to stdout. This gives agents a structured, parseable results channel instead of grepping free-form stdout in run.log.

  • train.py: writes results.json after the final summary (5 lines, uses stdlib json)
  • program.md: instructs the agent to read results.json first and fall back to grep

Existing stdout output is unchanged.
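
A minimal sketch of the two behaviors described above, not the PR's actual diff. The function names, the metric keys (`val_loss`, `num_steps`), and the `key: value` log line format are all assumptions made for illustration; only the file names results.json and run.log come from the description.

```python
import json
import re
from pathlib import Path


def write_results(metrics, path="results.json"):
    """Write the final metrics as JSON (hypothetical helper for train.py)."""
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)


def read_results(results_path="results.json", log_path="run.log"):
    """Prefer results.json; fall back to scanning the log (hypothetical helper)."""
    try:
        return json.loads(Path(results_path).read_text())
    except (OSError, json.JSONDecodeError):
        pass  # missing or corrupt file: fall through to the log

    # Fallback: accept only lines of the assumed form "name: <number>".
    pattern = re.compile(r"^(\w+):\s+(-?\d+(?:\.\d+)?)\s*$")
    metrics = {}
    try:
        lines = Path(log_path).read_text().splitlines()
    except OSError:
        return metrics
    for line in lines:
        match = pattern.match(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics
```

In this sketch the fallback only accepts lines whose value is strictly numeric, so free-form text in the log is ignored rather than passed on to the agent.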

This contribution was developed with AI assistance (Claude Code).

The agent loop currently reads results by grepping stdout from run.log,
which mixes trusted metrics with arbitrary training output. Writing a
structured results.json gives agents a reliable, parseable results
channel. Existing stdout output is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@mvanhorn (Author) commented

Closing in favor of #331, which rebases this cleanly on current main and adds crash diagnostic improvements.

@mvanhorn closed this Mar 19, 2026

Development

Successfully merging this pull request may close these issues.

Indirect prompt injection via training output fed back to agent