Refactor: Simplify checkpoint path and Job-based TUI history #83
FL4TLiN3 merged 4 commits into epic/job-concept
- Simplify checkpoint storage path to jobs/{jobId}/checkpoints/{id}.json
- Remove timestamp from checkpoint filename
- Update --resume-from to strictly require --continue-job
- Change TUI history from Run-based to Job-based display
- Show jobId and checkpointId in TUI for easier CLI usage
Closes #81
Closes #82
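As a rough illustration of the simplified layout, the sketch below builds a checkpoint path from a job ID and a checkpoint ID only. The function name `sketchCheckpointPath` and the `baseDir` parameter are assumptions made for this example; they are not the PR's actual `getCheckpointPath` signature.

```typescript
// Hypothetical helper (not the PR's API): builds the job-level checkpoint path.
// The new layout needs only the job ID and the checkpoint ID; no run ID,
// timestamp, or step number is part of the filename.
function sketchCheckpointPath(
  baseDir: string,
  jobId: string,
  checkpointId: string,
): string {
  return `${baseDir}/jobs/${jobId}/checkpoints/${checkpointId}.json`;
}

// Example: a checkpoint can be located directly from the two IDs.
const examplePath = sketchCheckpointPath("perstack", "job-1", "cp-1");
```

Because the filename no longer embeds a timestamp, a checkpoint can be located from its ID alone without scanning a directory for a matching suffix.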
packages/runtime/src/runtime.ts
Outdated
})
job = {
  ...job,
  totalSteps: job.totalSteps + runResultCheckpoint.stepNumber,
Bug: Job totalSteps double-counts steps across iterations
The totalSteps calculation incorrectly adds runResultCheckpoint.stepNumber to job.totalSteps on each iteration of the while loop. Since stepNumber is cumulative within a job (preserved across delegations and incremented by createNextStepCheckpoint), this causes double-counting. For example, if a run completes at step 3, then delegates and completes at step 5, totalSteps becomes 0 + 3 + 5 = 8 instead of the correct 5. The calculation should either set totalSteps directly to runResultCheckpoint.stepNumber or compute the delta between initial and result step numbers.
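The arithmetic in this comment can be reproduced with a small sketch. It assumes, as the comment states, that `stepNumber` is cumulative within a job; the function names are hypothetical and the loop only stands in for the runtime's while loop.

```typescript
// Cumulative step numbers reported at the end of each run iteration,
// taken from the example in the review comment (3, then 5).
const cumulativeStepNumbers: number[] = [3, 5];

// Buggy variant: adds the cumulative value on every iteration.
function totalStepsBySumming(stepNumbers: number[]): number {
  let totalSteps = 0;
  for (const stepNumber of stepNumbers) {
    totalSteps = totalSteps + stepNumber;
  }
  return totalSteps;
}

// Fix 1: assign the latest cumulative value directly.
function totalStepsByAssigning(stepNumbers: number[]): number {
  let totalSteps = 0;
  for (const stepNumber of stepNumbers) {
    totalSteps = stepNumber;
  }
  return totalSteps;
}

// Fix 2: add only the delta since the previous checkpoint.
function totalStepsByDelta(stepNumbers: number[]): number {
  let totalSteps = 0;
  let previous = 0;
  for (const stepNumber of stepNumbers) {
    totalSteps += stepNumber - previous;
    previous = stepNumber;
  }
  return totalSteps;
}

console.log(totalStepsBySumming(cumulativeStepNumbers)); // 8 (double-counted)
console.log(totalStepsByAssigning(cumulativeStepNumbers)); // 5
console.log(totalStepsByDelta(cumulativeStepNumbers)); // 5
```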
packages/runtime/src/runtime.ts
Outdated
job = {
  ...job,
  totalSteps: job.totalSteps + runResultCheckpoint.stepNumber,
  usage: sumUsage(job.usage, runResultCheckpoint.usage),
Bug: Job usage double-counts tokens across iterations
The usage calculation has the same double-counting issue as totalSteps. The checkpoint's usage is cumulative (accumulated via sumUsage in the state machine and preserved when createNextStepCheckpoint spreads the previous checkpoint). Adding runResultCheckpoint.usage to job.usage on each iteration causes token counts to be double-counted during delegation or continuation. This would result in inflated inputTokens, outputTokens, and totalTokens values in the job's usage tracking.
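The same effect on token usage can be sketched as follows. The `Usage` shape uses the three field names mentioned in the comment, but the type itself, the `addUsage` helper, and the sample numbers are assumptions for illustration; `addUsage` is a local stand-in, not the runtime's `sumUsage`.

```typescript
// Assumed shape, based only on the field names in the review comment.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

const emptyUsage: Usage = { inputTokens: 0, outputTokens: 0, totalTokens: 0 };

// Local stand-in for a field-wise sum of two usage records.
function addUsage(a: Usage, b: Usage): Usage {
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
    totalTokens: a.totalTokens + b.totalTokens,
  };
}

// Buggy variant: sums checkpoint usage that is already cumulative.
function jobUsageBySumming(cumulative: Usage[]): Usage {
  let usage = emptyUsage;
  for (const checkpointUsage of cumulative) {
    usage = addUsage(usage, checkpointUsage);
  }
  return usage;
}

// Fixed variant: the latest cumulative usage is the job's usage.
function jobUsageByAssigning(cumulative: Usage[]): Usage {
  let usage = emptyUsage;
  for (const checkpointUsage of cumulative) {
    usage = checkpointUsage;
  }
  return usage;
}

// Made-up cumulative usage after two run iterations in one job.
const sampleUsage: Usage[] = [
  { inputTokens: 100, outputTokens: 20, totalTokens: 120 },
  { inputTokens: 250, outputTokens: 60, totalTokens: 310 },
];

console.log(jobUsageBySumming(sampleUsage).totalTokens); // 430 (inflated)
console.log(jobUsageByAssigning(sampleUsage).totalTokens); // 310
```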
- Add getAllJobs() to job-store.ts
- Add getAllRuns() to run-setting-store.ts
- Add getCheckpointsByJobId(), getEventsByRun(), getEventContents() to default-store.ts
- Export new functions from runtime index
- Simplify run-manager.ts to delegate to runtime functions
stepNumber and usage in checkpoints are cumulative within a Job, so directly assign instead of summing to avoid double-counting.
Summary
- Simplify checkpoint storage path to jobs/{jobId}/checkpoints/{id}.json (removed timestamp from filename)
- --resume-from now strictly requires --continue-job <jobId>
- Show jobId and checkpointId in TUI for easier CLI usage
Changes
Checkpoint Storage
- Path changed from jobs/{jobId}/runs/{runId}/checkpoint-{timestamp}-{step}-{id}.json to jobs/{jobId}/checkpoints/{id}.json
CLI
- --resume-from <checkpointId> requires --continue-job <jobId>
TUI
- History row: expertKey - {totalSteps} steps ({jobId}) (startedAt)
- Checkpoint row: Step {stepNumber} ({checkpointId})
Closes #81
Closes #82
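The CLI rule above amounts to a simple option check. The sketch below is a minimal version under assumed names: `ResumeOptions` and `validateResumeOptions` are hypothetical and do not reflect how `resolveRunContext` is actually written.

```typescript
// Hypothetical parsed CLI options; the real option object may differ.
interface ResumeOptions {
  continueJob?: string; // value of --continue-job <jobId>
  resumeFrom?: string; // value of --resume-from <checkpointId>
}

// Rejects --resume-from when no job is given, since a checkpoint is
// now addressed as (jobId, checkpointId).
function validateResumeOptions(options: ResumeOptions): void {
  if (options.resumeFrom !== undefined && options.continueJob === undefined) {
    throw new Error(
      "--resume-from <checkpointId> requires --continue-job <jobId>",
    );
  }
}

validateResumeOptions({ continueJob: "job-1", resumeFrom: "cp-1" }); // ok
validateResumeOptions({ continueJob: "job-1" }); // ok
```

Passing only `resumeFrom` would throw, which mirrors the strict requirement described in the CLI section.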
Note
Switch TUI history to Jobs and simplify checkpoint storage to job-level files; enforce that --resume-from requires --continue-job, with runtime/store APIs and docs/tests updated.
- Checkpoints stored at perstack/jobs/<jobId>/checkpoints/<checkpointId>.json (remove timestamp/run folder for checkpoints).
- … (job.json per job) with helpers to create/retrieve/list and track status/usage; update run() to persist job lifecycle and usage on stop/complete.
- Add getCheckpointPath, getCheckpointsByJobId, getEventsByRun, getEventContents, getAllJobs, getAllRuns.
- … executeStateMachine storeCheckpoint signature (drops timestamp).
- resolveRunContext now requires --continue-job when using --resume-from; uses new getCheckpointById(jobId, checkpointId) and latest checkpoint by job.
- run-manager refactored to use runtime-provided getters and new checkpoint/job APIs.
- start command TUI wiring updated to load jobs, checkpoints, and events via new APIs.
- … (JobHistoryItem); update components, state, actions, and types accordingly.
- Checkpoint row shows Step {stepNumber} ({checkpointId}); history row shows {expertKey} - {totalSteps} steps ({jobId}).
- … --resume-from requires --continue-job and to reflect job/checkpoint model.
- … --resume-from.
Written by Cursor Bugbot for commit 7b8b47f.
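The two TUI row formats quoted above can be expressed as plain string templates. In this sketch the interfaces `JobRowData` and `CheckpointRowData` and both formatter names are assumptions for illustration; the PR's own `JobHistoryItem` type is not reproduced here.

```typescript
// Hypothetical row data; field names follow the placeholders in the PR text.
interface JobRowData {
  expertKey: string;
  totalSteps: number;
  jobId: string;
}

interface CheckpointRowData {
  stepNumber: number;
  checkpointId: string;
}

// History row: {expertKey} - {totalSteps} steps ({jobId})
function formatJobRow(job: JobRowData): string {
  return `${job.expertKey} - ${job.totalSteps} steps (${job.jobId})`;
}

// Checkpoint row: Step {stepNumber} ({checkpointId})
function formatCheckpointRow(checkpoint: CheckpointRowData): string {
  return `Step ${checkpoint.stepNumber} (${checkpoint.checkpointId})`;
}
```

Showing both IDs in the rows means a user can copy them straight into `--continue-job` and `--resume-from`.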