[Stars](https://github.com/wimi321/task-bundle/stargazers)
[License](./LICENSE)

Turn AI coding runs into portable, replayable, benchmark-ready task bundles.

The missing middle layer between raw chat logs and heavyweight benchmark platforms.

Package a task once, inspect it later, compare tools on the same starting point, and generate benchmark-style reports from real artifacts.

Task Bundle is a TypeScript + Node.js CLI for teams building agents, evals, coding benchmarks, and reproducible AI workflows.

Why people star it:
- turn one AI coding run into a clean, shareable directory instead of a screenshot, transcript, or loose patch
- compare Codex, Claude Code, Cursor, or internal agents with real metadata, hashes, and outcome fields
- generate benchmark-style reports from a folder of bundles without building a full evaluation platform first
- keep replay grounded in re-execution and comparison, not token-by-token theater

If you've ever wanted a format between "a raw chat log" and "a full benchmark platform", this project is that missing middle layer.

- [docs/bundle-format.zh-CN.md](./docs/bundle-format.zh-CN.md)
- [docs/design-decisions.md](./docs/design-decisions.md)
- [docs/replay-contract.md](./docs/replay-contract.md)
- [docs/branding.md](./docs/branding.md)

## Five-Minute Demo
