
Commit 5497aa9

add mcp docs

1 parent e7c52e5 commit 5497aa9

2 files changed: +126 -7 lines


integrations/introduction.mdx

Lines changed: 1 addition & 1 deletion

```diff
@@ -18,7 +18,7 @@ Besides the Python and TypeScript SDKs, there are a few other ways to get ZeroEv
   icon="plug"
   href="/integrations/mcp"
 >
-  Connect ZeroEval to AI agents via Model Context Protocol (coming soon)
+  Connect AI agents to ZeroEval via the Model Context Protocol
 </Card>
 <Card
   title="CLI"
```

integrations/mcp.mdx

Lines changed: 125 additions & 6 deletions
```diff
@@ -1,12 +1,131 @@
 ---
 title: "MCP"
-description: "Model Context Protocol server for ZeroEval"
+description: "Connect AI agents to ZeroEval via the Model Context Protocol"
 ---

-<Note>
-MCP integration is coming soon.
-</Note>

-We're building an MCP server so AI agents can query traces, manage prompts, and run evaluations through ZeroEval without leaving the agent context.

-Want to know when it ships? Email [founders@zeroeval.com](mailto:founders@zeroeval.com).
```

The rest of the hunk is new content:

The ZeroEval MCP server lets AI agents inspect traces, manage judges and prompts, submit feedback, run optimizations, and deploy to production, all without leaving the agent context. It speaks the [Model Context Protocol](https://modelcontextprotocol.io), so any MCP-compatible client (Cursor, Claude Code, Windsurf, etc.) can connect directly.

## Setup

The fastest way to get started is to point your MCP client at the hosted server. No installation required.

### Cursor

Add this to your Cursor MCP settings (`.cursor/mcp.json`):

```json
{
  "mcpServers": {
    "zeroeval": {
      "url": "https://mcp.zeroeval.com/mcp",
      "headers": {
        "Authorization": "Bearer <your-project-api-key>"
      }
    }
  }
}
```

### Claude Code

```bash
claude mcp add zeroeval --transport http https://mcp.zeroeval.com/mcp \
  --header "Authorization: Bearer <your-project-api-key>"
```

### Other MCP clients

Any client that supports HTTP transport works. Set the server URL to `https://mcp.zeroeval.com/mcp` and pass your project API key in the `Authorization: Bearer <key>` header.
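
If you are wiring a client up by hand, the first message over the HTTP transport is a standard JSON-RPC `initialize` request. A sketch (the `clientInfo` values are illustrative, and the `protocolVersion` your client negotiates may differ):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "my-agent", "version": "0.1.0" }
  }
}
```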

<Tip>
Get your project API key from the [ZeroEval dashboard](https://app.zeroeval.com) under **Settings → API Keys**.
</Tip>

## Resources

The server exposes two MCP resources for introspection:

| URI | Description |
|-----|-------------|
| `config://server-context` | Redacted server config: auth mode, base URL, project scope, and feature flags |
| `docs://capabilities` | Canonical tool and resource inventory with annotations and output contract summary |
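
Clients fetch these with a standard MCP `resources/read` request, for example:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "resources/read",
  "params": { "uri": "docs://capabilities" }
}
```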

## Tools

### Read tools

Read tools are safe to call at any time. They do not modify state.

| Tool | Description |
|------|-------------|
| `list-traces` | List recent traces |
| `get-trace` | Get a trace with its spans |
| `list-judges` | List all judges |
| `get-judge` | Get judge details and linkage state |
| `list-judge-evaluations` | List evaluations from a judge |
| `get-judge-criteria` | Get scoring criteria for a judge |
| `list-prompts` | List all prompts |
| `get-prompt` | Get a prompt at a specific version or tag |
| `list-prompt-versions` | List all versions of a prompt |
| `list-optimization-runs` | List optimization runs for a task |
| `get-optimization-run` | Get run details with candidate prompt and metrics |
| `get-project-summary` | High-level project monitoring summary |
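
Agents invoke any of these through the standard MCP `tools/call` method. A sketch for `list-traces` (the `limit` argument is illustrative; check `docs://capabilities` for each tool's actual schema):

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "list-traces",
    "arguments": { "limit": 20 }
  }
}
```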

### Write tools

All write tools require `confirm: true` in the request and are annotated with `destructiveHint: true` so MCP clients can prompt for user approval before calling.

| Tool | Description |
|------|-------------|
| `create-judge` | Create a new judge |
| `link-judge-to-prompt` | Link a judge to a prompt |
| `unlink-judge-from-prompt` | Remove a judge's prompt link |
| `create-judge-feedback` | Submit feedback on a judge evaluation |
| `create-prompt-feedback` | Submit feedback on a prompt completion |
| `start-prompt-optimization` | Start a prompt optimization run |
| `start-judge-optimization` | Start a judge optimization run |
| `cancel-optimization-run` | Cancel a running optimization |
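
The `confirm` flag rides alongside the tool's own arguments. A sketch for `cancel-optimization-run` (the `run_id` argument name is illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "cancel-optimization-run",
    "arguments": {
      "run_id": "<run-id>",
      "confirm": true
    }
  }
}
```

Since write tools require `confirm: true`, a call without it is refused rather than applied.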

### Deploy

Production deploys always require two steps:

1. **Preview:** Call `preview-optimization-deploy` with the run ID. This verifies the run succeeded, summarizes the candidate vs current production, and returns a time-limited confirmation receipt.
2. **Deploy:** Call `deploy-optimization-run` with `confirm: true` and the receipt from preview. The server re-reads current state and rejects the deploy if anything drifted since the preview.

| Tool | Description |
|------|-------------|
| `preview-optimization-deploy` | Preview what deploying a run would do (read-only) |
| `deploy-optimization-run` | Deploy a succeeded run to production (requires receipt + confirm) |
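
The second step might look like this (argument names other than `confirm` are illustrative; the receipt value comes from the preview response):

```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "deploy-optimization-run",
    "arguments": {
      "run_id": "<run-id>",
      "receipt": "<receipt-from-preview>",
      "confirm": true
    }
  }
}
```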

### Proposal tools

Proposal tools are read-only helpers that gather evidence or prepare the exact next mutating call without executing it.

| Tool | Description |
|------|-------------|
| `investigate-prompt-issues` | Gather evidence about prompt state and recommend next steps |
| `investigate-judge-issues` | Gather evidence about judge state and recommend next steps |
| `prepare-prompt-optimization` | Propose the exact `start-prompt-optimization` call to make |
| `prepare-judge-optimization` | Propose the exact `start-judge-optimization` call to make |

## Good to know

- **Single-project scope.** Each MCP connection is tied to one ZeroEval project. To work with a different project, use a different API key.
- **Optimization prerequisites.** Prompts must have been used with `ze.prompt()` before optimization is available. Judges need a linked tuning task.
- **Proposal tools are read-only.** The `investigate-*` and `prepare-*` tools never mutate state. They recommend the next tool call for the agent to confirm and execute.

<CardGroup cols={2}>
  <Card title="Tracing quickstart" icon="rocket" href="/tracing/quickstart">
    Get your first trace in under 5 minutes
  </Card>
  <Card title="Prompt setup" icon="wrench" href="/autotune/setup">
    Add ze.prompt() to your codebase
  </Card>
  <Card title="Judges" icon="gavel" href="/judges/introduction">
    How calibrated judges evaluate your production traffic
  </Card>
  <Card title="CLI" icon="terminal" href="/integrations/cli">
    Manage traces, prompts, and judges from your terminal
  </Card>
</CardGroup>
