Merged
30 commits
4fe2a62
Add: Parallel tool call support in @perstack/core
FL4TLiN3 Dec 8, 2025
4a890bd
Add: Parallel tool call execution in @perstack/runtime
FL4TLiN3 Dec 8, 2025
6bb5ea5
Update: TUI and CLI for parallel tool calls
FL4TLiN3 Dec 8, 2025
7dfc6b1
Docs: Update runtime documentation for parallel tool calls
FL4TLiN3 Dec 8, 2025
c6dd0a1
Docs: Update testing.mdx for parallel tool calls
FL4TLiN3 Dec 8, 2025
0815871
Add: Mixed tool call support (MCP + Delegate + Interactive)
FL4TLiN3 Dec 8, 2025
88cff44
Docs: Update runtime documentation for mixed tool calls
FL4TLiN3 Dec 8, 2025
32638f9
Fix: Mixed tool call state propagation
FL4TLiN3 Dec 8, 2025
98a641b
Test: Update tests for mixed tool call support
FL4TLiN3 Dec 8, 2025
189fece
Update: TUI to handle mixed tool call events
FL4TLiN3 Dec 8, 2025
bbd6c0a
Add: E2E test framework for mixed tool calls
FL4TLiN3 Dec 8, 2025
c4740fd
Add: E2E test scenarios (parallel-mcp, delegate-chain, continue-resume)
FL4TLiN3 Dec 8, 2025
997fdb9
Refactor: Migrate E2E tests to vitest for parallel execution
FL4TLiN3 Dec 8, 2025
29e6ae6
Refactor: Clean up E2E test code and remove unused functions
FL4TLiN3 Dec 8, 2025
5007675
Add: tsconfig.json for e2e tests
FL4TLiN3 Dec 8, 2025
70178ec
Update: Add paths mapping to e2e tsconfig for package dependencies
FL4TLiN3 Dec 8, 2025
67a90d9
Add: CLI E2E tests and remove manual E2E.md
FL4TLiN3 Dec 8, 2025
b938c4c
Refactor: Split CLI tests by command and remove help tests
FL4TLiN3 Dec 8, 2025
beb28c3
Add: Missing CLI error case tests for run, unpublish, status
FL4TLiN3 Dec 8, 2025
f376ae7
Docs: Add E2E README and update E2E test instructions
FL4TLiN3 Dec 8, 2025
f1555bd
Chore: Exclude e2e directory from knip check
FL4TLiN3 Dec 8, 2025
b80927d
Fix: Keep current tool call in pendingToolCalls for delegate/interactive
FL4TLiN3 Dec 8, 2025
9a65b0a
Update: Add FileInlinePart to ToolResultPart contents
FL4TLiN3 Dec 8, 2025
abeb0f6
Refactor: Unify special tool handling for parallel execution
FL4TLiN3 Dec 8, 2025
47a90b5
Test: Update tests for unified special tool handling
FL4TLiN3 Dec 8, 2025
46d9ed8
Test: Add E2E test for special tools parallel execution
FL4TLiN3 Dec 8, 2025
325e082
Chore: Apply code formatting
FL4TLiN3 Dec 8, 2025
886ca47
Update: E2E test for special tools with PDF and image
FL4TLiN3 Dec 8, 2025
712fbb7
Fix: Set toolResults in finishAllToolCalls and include fileInlinePart
FL4TLiN3 Dec 8, 2025
7e6f685
Test: Improve unit test coverage for states
FL4TLiN3 Dec 8, 2025
32 changes: 32 additions & 0 deletions .changeset/parallel-tool-calls.md
@@ -0,0 +1,32 @@
---
"@perstack/core": patch
"@perstack/runtime": patch
"@perstack/api-client": patch
"@perstack/base": patch
"@perstack/tui": patch
"perstack": patch
---

Add parallel tool call support and mixed tool call handling

Features:

- Process all tool calls from a single LLM response instead of only the first one
- MCP tools execute in parallel using `Promise.all`
- Support mixed tool calls (MCP + Delegate + Interactive in same response)
- Process tools in priority order: MCP → Delegate → Interactive
- Preserve partial results across checkpoint boundaries
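A minimal sketch of the processing order described in the feature list, assuming each tool call carries a `kind` tag. The `ToolCall`, `ToolResult`, `McpExecutor`, and `processToolCalls` names are hypothetical and are not taken from `@perstack/core` or `@perstack/runtime`; only the use of `Promise.all` and the MCP → Delegate → Interactive order come from the changeset.

```typescript
// Hypothetical shapes for illustration only; the real types live in @perstack/core.
type ToolKind = "mcp" | "delegate" | "interactive"

interface ToolCall {
  id: string
  toolName: string
  kind: ToolKind
}

interface ToolResult {
  id: string
  toolName: string
  output: string
}

// Assumed executor signature: resolves one MCP tool call to its result.
type McpExecutor = (call: ToolCall) => Promise<ToolResult>

// Lower number means earlier processing: MCP, then Delegate, then Interactive.
const PRIORITY: Record<ToolKind, number> = { mcp: 0, delegate: 1, interactive: 2 }

async function processToolCalls(calls: ToolCall[], execute: McpExecutor) {
  const mcpCalls = calls.filter((c) => c.kind === "mcp")

  // All MCP calls are started together; Promise.all returns results in input order.
  const toolResults = await Promise.all(mcpCalls.map((c) => execute(c)))

  // Non-MCP calls remain pending, sorted by priority. filter() returns a copy,
  // so the caller's array is not mutated by sort().
  const pendingToolCalls = calls
    .filter((c) => c.kind !== "mcp")
    .sort((a, b) => PRIORITY[a.kind] - PRIORITY[b.kind])

  return { toolResults, pendingToolCalls }
}
```

Note that `Promise.all` rejects as soon as any one call rejects, so an executor in this sketch would need to convert tool failures into error results if the other results should be kept.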

Schema Changes:

- `Step.toolCall` → `Step.toolCalls` (array)
- `Step.toolResult` → `Step.toolResults` (array)
- Add `Step.pendingToolCalls` for tracking unprocessed tool calls
- Add `Checkpoint.pendingToolCalls` and `Checkpoint.partialToolResults` for resume
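One way to picture how the two new checkpoint fields could work together on resume is sketched below. Only the field names `pendingToolCalls` and `partialToolResults` come from the changeset; the `recordResults` helper, the `id` field, and the rule that a call leaves the pending list once it has a result are assumptions, not the runtime's actual behavior.

```typescript
// Hypothetical minimal shapes; the real schema is defined in @perstack/core.
interface ToolCall {
  id: string
  toolName: string
}

interface ToolResult {
  id: string
  toolName: string
}

interface Checkpoint {
  pendingToolCalls?: ToolCall[]
  partialToolResults?: ToolResult[]
}

// Fold newly finished results into a checkpoint without mutating it.
// Assumption: a call is removed from pendingToolCalls once a result with the same id exists.
function recordResults(checkpoint: Checkpoint, finished: ToolResult[]): Checkpoint {
  const done = new Set(finished.map((r) => r.id))
  return {
    pendingToolCalls: (checkpoint.pendingToolCalls ?? []).filter((c) => !done.has(c.id)),
    partialToolResults: [...(checkpoint.partialToolResults ?? []), ...finished],
  }
}
```

Under these assumptions, a run is ready to hand all results back to the model once `pendingToolCalls` is empty.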

Event Changes:

- `callTool` → `callTools`
- `resolveToolResult` → `resolveToolResults`
- Add `resumeToolCalls` and `finishAllToolCalls` events

10 changes: 5 additions & 5 deletions AGENTS.md
@@ -416,7 +416,7 @@ Key points:
## Testing

- **Unit tests:** Vitest (`*.test.ts` files), run with `pnpm test`
-- **E2E tests:** Manual testing by following `E2E.md` — agent should read and execute the procedures
+- **E2E tests:** Vitest (`e2e/*.test.ts` files), run with `pnpm test:e2e`
- **Coverage:** V8 provider, lcov output

### Unit Test Scope
@@ -523,11 +523,11 @@ pnpm build # Build all packages

### E2E Testing (MANDATORY)

-After build passes, run E2E tests by following `E2E.md`:
+After build passes, run E2E tests:

```bash
-pnpm build # Must build first
-# Then run E2E tests as documented in E2E.md
+pnpm build # Must build first
+pnpm test:e2e # Run E2E tests
```

**E2E tests must pass before pushing.** This catches runtime issues that unit tests miss.
@@ -599,5 +599,5 @@ pick = ["attemptCompletion", "think"]
- [ ] `pnpm check-deps` passes
- [ ] `pnpm reset && pnpm test` passes
- [ ] `pnpm build` passes
-- [ ] E2E tests pass (follow `E2E.md`)
+- [ ] `pnpm test:e2e` passes
- [ ] Versioning rules in `CONTRIBUTING.md` are followed
14 changes: 11 additions & 3 deletions CONTRIBUTING.md
@@ -78,7 +78,8 @@ pnpm build
git checkout -b feature/your-feature
# ... edit code ...
pnpm changeset
-pnpm typecheck && pnpm test
+pnpm typecheck && pnpm test && pnpm build
+pnpm test:e2e # Run E2E tests
git commit -m "feat: your changes"
```

@@ -195,6 +196,7 @@ pnpm changeset
pnpm typecheck # Must pass
pnpm test # Must pass
pnpm build # Must succeed
+pnpm test:e2e # Run E2E tests
```

### 4. Commit and Push
@@ -428,8 +430,13 @@ Perstack uses a two-stage release workflow powered by [changesets/action](https:
- Updated `CHANGELOG.md` with PR links and author attribution

**Stage 2: Publish**
-1. Review and merge "Version Packages" PR
-2. Release workflow automatically:
+1. Review "Version Packages" PR
+2. **Run E2E tests locally before merging:**
+   ```bash
+   pnpm build && pnpm test:e2e
+   ```
+3. Merge "Version Packages" PR
+4. Release workflow automatically:
- Publishes packages to npm
- Creates git tags
- Creates GitHub Releases
@@ -571,6 +578,7 @@ Before requesting review, ensure:
- [ ] Changeset created with appropriate version bump
- [ ] All tests pass (`pnpm test`)
- [ ] Types check across all packages (`pnpm typecheck`)
+- [ ] E2E tests pass (`pnpm test:e2e`)
- [ ] Documentation updated (README, JSDoc, CHANGELOG via changeset)
- [ ] Migration guide included (for breaking changes)
- [ ] No unintended version sync issues
113 changes: 0 additions & 113 deletions E2E.md

This file was deleted.

4 changes: 2 additions & 2 deletions docs/content/making-experts/testing.mdx
@@ -70,8 +70,8 @@ import { run } from "@perstack/runtime"
const result = await run(params, {
  // Mock eventListener for assertions
  eventListener: (event) => {
-    if (event.type === "callTool") {
-      expect(event.toolCall.name).toBe("expectedTool")
+    if (event.type === "callTools") {
+      expect(event.toolCalls[0].toolName).toBe("expectedTool")
    }
  }
})
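Because a single `callTools` event can now carry several tool calls, an assertion on index `0` only covers the first one. The sketch below collects every tool name instead. The `RunEvent` shape and the `collectToolNames` helper are hypothetical; only `type`, `toolCalls`, and `toolName` are taken from the updated snippet.

```typescript
// Hypothetical event shape; only the fields used in the testing.mdx example are modeled.
interface RunEvent {
  type: string
  toolCalls?: { toolName: string }[]
}

// Gather the tool names from every callTools event, in emission order.
function collectToolNames(events: RunEvent[]): string[] {
  const names: string[] = []
  for (const event of events) {
    if (event.type !== "callTools") continue
    for (const call of event.toolCalls ?? []) {
      names.push(call.toolName)
    }
  }
  return names
}
```

In a test, the `eventListener` could push each event into an array, and the assertion could then read `expect(collectToolNames(events)).toContain("expectedTool")`, which does not depend on the order in which the model emitted parallel calls.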
120 changes: 120 additions & 0 deletions e2e/README.md
@@ -0,0 +1,120 @@
# E2E Tests

End-to-end tests for Perstack CLI and runtime.

## Prerequisites

```bash
pnpm build
```

## Running Tests

```bash
# Run all E2E tests (parallel execution)
pnpm test:e2e

# Run specific test file
pnpm test:e2e -- run.test.ts

# Run tests matching pattern
pnpm test:e2e -- --testNamePattern "publish"
```

## Test Structure

```
e2e/
├── lib/                        # Test utilities
│   ├── runner.ts               # CLI and Expert execution
│   ├── event-parser.ts         # Runtime event parsing
│   └── assertions.ts           # Custom assertions
├── experts/                    # Expert definitions for tests
│   ├── mixed-tools.toml        # MCP + Delegate + Interactive
│   ├── parallel-mcp.toml       # Parallel MCP calls
│   ├── delegate-chain.toml     # Delegation chain
│   └── continue-resume.toml    # Continue/resume functionality
├── run.test.ts                 # CLI run command
├── publish.test.ts             # CLI publish command
├── unpublish.test.ts           # CLI unpublish command
├── tag.test.ts                 # CLI tag command
├── status.test.ts              # CLI status command
├── mixed-tools.test.ts         # Mixed tool calls (MCP + Delegate + Interactive)
├── parallel-mcp.test.ts        # Parallel MCP tool execution
├── delegate-chain.test.ts      # Expert delegation chain
└── continue-resume.test.ts     # --continue-run and --resume-from
```

## Test Categories

### CLI Commands

Tests for CLI argument validation and error handling.

| File | Tests | Coverage |
|------|-------|----------|
| run.test.ts | 4 | Missing args, nonexistent expert, invalid config |
| publish.test.ts | 4 | dry-run success, nonexistent expert, config errors |
| unpublish.test.ts | 2 | Missing version, missing --force |
| tag.test.ts | 2 | Missing version, missing tags |
| status.test.ts | 3 | Missing version/status, invalid status |

### Runtime Features

Tests for parallel tool calls, delegation, and state management.

| File | Tests | Coverage |
|------|-------|----------|
| mixed-tools.test.ts | 4 | MCP + Delegate + Interactive in single response |
| parallel-mcp.test.ts | 3 | Parallel MCP tool execution |
| delegate-chain.test.ts | 3 | Multi-level delegation |
| continue-resume.test.ts | 4 | --continue-run, --resume-from |

## Writing Tests

### CLI Command Tests

```typescript
import { describe, expect, it } from "vitest"
import { runCli } from "./lib/runner.js"

describe("CLI command", () => {
  it("should fail with invalid args", async () => {
    const result = await runCli(["command", "invalid-arg"])
    expect(result.exitCode).toBe(1)
  })
})
```

### Runtime Tests

```typescript
import { beforeAll, describe, expect, it } from "vitest"
import { assertEventSequenceContains } from "./lib/assertions.js"
import { type RunResult, runExpert } from "./lib/runner.js"

describe("Runtime feature", () => {
  let result: RunResult

  beforeAll(async () => {
    result = await runExpert("expert-key", "query", {
      configPath: "./e2e/experts/your-expert.toml",
      timeout: 180000,
    })
  }, 200000)

  it("should emit expected events", () => {
    expect(
      assertEventSequenceContains(result.events, ["startRun", "completeRun"]).passed,
    ).toBe(true)
  })
})
```
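The body of `assertEventSequenceContains` lives in `e2e/lib/assertions.ts` and is not shown in this diff. As a rough mental model, a helper of this kind could be an in-order subsequence check over event types, as sketched below. The function name `assertEventSequenceContainsSketch`, the `message` field, and the choice to allow other events between the expected ones are all assumptions; only the `.passed` flag and the `(events, expectedTypes)` call shape appear in the example above.

```typescript
// Hypothetical shapes; the real helper may differ.
interface RunEvent {
  type: string
}

interface AssertionResult {
  passed: boolean
  message: string
}

// Passes when the expected types occur in order, with any number of other events in between.
function assertEventSequenceContainsSketch(
  events: RunEvent[],
  expected: string[],
): AssertionResult {
  let next = 0 // index of the next expected type still to be found
  for (const event of events) {
    if (next < expected.length && event.type === expected[next]) {
      next++
    }
  }
  const passed = next === expected.length
  return {
    passed,
    message: passed
      ? "sequence found"
      : `missing "${expected[next]}" after ${next} matched event type(s)`,
  }
}
```

Returning a result object instead of throwing keeps the helper independent of the test framework, which matches how the example wraps it in `expect(...).toBe(true)`.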

## Notes

- Tests run in parallel via vitest
- Runtime tests require API keys (set in `.env.local`)
- TUI-based commands (`start`) are excluded from E2E tests
- API-calling tests (actual publish, unpublish) require registry access and are not included
