Problem: Your AI agent depends on external MCP servers you don't control. When those servers change their tool definitions (rename parameters, remove tools, add required fields), your agent breaks silently.
Solution: EvalView's MCP contract testing captures a snapshot of a server's tool definitions and diffs against it on every CI run. If the interface changed, you know immediately — before running your full test suite.
Detect when external MCP servers change their interface before your agent breaks.
When you use MCP servers you don't own (Scenario 2), the server can change its tool definitions at any time: rename parameters, remove tools, add required fields. Your agent tests pass today and fail tomorrow — not because your code changed, but because the server did.
MCP contract testing captures a snapshot of a server's tool definitions and diffs against it on every CI run. If the interface changed, you know immediately — before running your full test suite.
This mirrors EvalView's golden baseline system:
- Golden traces detect when your agent's behavior drifts
- MCP contracts detect when an external server's interface drifts

Snapshot a server's tool definitions:

    evalview mcp snapshot "npx:@modelcontextprotocol/server-github" --name server-github

Output:

    Snapshot saved: .evalview/contracts/server-github.contract.json
    Tools discovered: 8
      - create_issue
      - list_issues
      - create_pull_request
      - ...

Check the live server against the saved contract:

    evalview mcp check server-github

If the server changed:

    CONTRACT_DRIFT - 2 breaking change(s)
      REMOVED: create_pull_request - tool 'create_pull_request' no longer available
      CHANGED: list_issues - new required parameter 'owner'

Gate a test run on contracts:

    evalview run tests/ --contracts --fail-on "REGRESSION,CONTRACT_DRIFT"

The --contracts flag checks all saved contracts before running tests. If any contract drifted, the run aborts immediately, since there is no point testing against a broken interface.
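
The gating rule above can be sketched in a few lines of Python. This only illustrates how a comma-separated --fail-on list might be matched against the statuses a run produces. The function name `should_fail`, the set-of-strings representation, and the `PASSED` placeholder status are assumptions for the example, not EvalView internals.

```python
def should_fail(observed_statuses, fail_on):
    """Return True if any observed status appears in the --fail-on list.

    Hypothetical helper: `fail_on` is the raw comma-separated string,
    for example "REGRESSION,CONTRACT_DRIFT".
    """
    # Split on commas, trim whitespace, and ignore empty entries.
    wanted = {s.strip().upper() for s in fail_on.split(",") if s.strip()}
    return any(status.upper() in wanted for status in observed_statuses)


if __name__ == "__main__":
    # "PASSED" is a placeholder status used only for this illustration.
    print(should_fail({"CONTRACT_DRIFT"}, "REGRESSION,CONTRACT_DRIFT"))
    print(should_fail({"PASSED"}, "REGRESSION,CONTRACT_DRIFT"))
```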
Capture tool definitions from an MCP server.

    evalview mcp snapshot <endpoint> --name <server-name> [--notes "..."] [--timeout 30]

| Argument | Description |
|---|---|
| endpoint | MCP server endpoint (e.g., npx:@modelcontextprotocol/server-github) |
| --name | Human-readable identifier for this contract (required) |
| --notes | Optional notes about this snapshot |
| --timeout | Connection timeout in seconds (default: 30) |

Supports all MCP transport types:
- stdio: "npx:@modelcontextprotocol/server-filesystem /tmp"
- HTTP: "http://localhost:8080"
- Command: "stdio:python my_server.py"

Compare current server interface against a saved contract.

    evalview mcp check <name> [--endpoint <override>] [--timeout 30]

| Argument | Description |
|---|---|
| name | Contract name (from --name in snapshot) |
| --endpoint | Override endpoint (default: use endpoint from snapshot) |

Exit codes:
- 0: No breaking changes
- 1: Breaking changes detected (CONTRACT_DRIFT)
- 2: Could not connect to server

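
A script that wraps the check can branch on these exit codes. The sketch below assumes the `evalview` CLI is installed and on PATH. The helper names `describe_exit` and `check_contract` are made up for this example.

```python
import subprocess

# Exit codes as documented for `evalview mcp check`.
EXIT_MEANINGS = {
    0: "no breaking changes",
    1: "breaking changes detected (CONTRACT_DRIFT)",
    2: "could not connect to server",
}


def describe_exit(code):
    """Translate an exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def check_contract(name, timeout=30):
    """Run `evalview mcp check <name>` and return (exit code, message).

    Assumes the `evalview` CLI is installed and on PATH.
    """
    result = subprocess.run(
        ["evalview", "mcp", "check", name, "--timeout", str(timeout)],
        capture_output=True,
        text=True,
    )
    return result.returncode, describe_exit(result.returncode)
```

Treating exit code 2 separately from 1 lets a pipeline retry on connection problems instead of reporting drift that may not exist.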
List all saved contracts.

    evalview mcp list

Show full details of a contract including all tool schemas.

    evalview mcp show <name>

Remove a contract.

    evalview mcp delete <name> [--force]

The --contracts flag adds a pre-flight check to any test run:

    evalview run tests/ --contracts

This checks all contracts in .evalview/contracts/ before running tests.
Combine with --fail-on CONTRACT_DRIFT to fail CI on drift:

    evalview run tests/ --contracts --fail-on "REGRESSION,CONTRACT_DRIFT"

Or use --strict (now includes CONTRACT_DRIFT):

    evalview run tests/ --contracts --strict

GitHub Actions step:

    - name: Run EvalView
      uses: hidai25/eval-view@v0.6.1
      with:
        diff: true
        contracts: true
        fail-on: 'REGRESSION,CONTRACT_DRIFT'

Breaking changes:

| Change | Example |
|---|---|
| Tool removed | create_pull_request no longer exists |
| Required parameter added | New required param owner on list_issues |
| Parameter removed | repo param no longer accepted |
| Parameter type changed | limit changed from string to integer |
| Parameter became required | owner was optional, now required |

Non-breaking changes:

| Change | Example |
|---|---|
| New tool added | merge_pull_request now available |
| Optional parameter added | New optional param labels on create_issue |
| Description changed | Tool description updated |
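
The classification in the tables above can be expressed as a small diff over two lists of tool definitions. The following is a minimal sketch of that idea, not EvalView's actual diff logic. The function name `diff_tools` and the message wording are assumptions. It expects each tool as a dict with `name`, `description`, and `inputSchema` keys, as in the contract file format.

```python
def diff_tools(old_tools, new_tools):
    """Compare two lists of tool definitions.

    Returns (breaking, non_breaking): two lists of human-readable strings.
    """
    old = {t["name"]: t for t in old_tools}
    new = {t["name"]: t for t in new_tools}
    breaking, non_breaking = [], []

    # Whole tools removed or added.
    for name in sorted(old.keys() - new.keys()):
        breaking.append(f"REMOVED: {name}")
    for name in sorted(new.keys() - old.keys()):
        non_breaking.append(f"ADDED: {name}")

    # Tools present in both snapshots: compare their parameters.
    for name in sorted(old.keys() & new.keys()):
        o_schema = old[name].get("inputSchema", {})
        n_schema = new[name].get("inputSchema", {})
        o_props = o_schema.get("properties", {})
        n_props = n_schema.get("properties", {})
        o_req = set(o_schema.get("required", []))
        n_req = set(n_schema.get("required", []))

        for p in sorted(o_props.keys() - n_props.keys()):
            breaking.append(f"CHANGED: {name} - parameter '{p}' removed")
        for p in sorted(n_props.keys() - o_props.keys()):
            if p in n_req:
                breaking.append(f"CHANGED: {name} - new required parameter '{p}'")
            else:
                non_breaking.append(f"CHANGED: {name} - new optional parameter '{p}'")
        for p in sorted(o_props.keys() & n_props.keys()):
            o_type, n_type = o_props[p].get("type"), n_props[p].get("type")
            if o_type != n_type:
                breaking.append(
                    f"CHANGED: {name} - parameter '{p}' type changed "
                    f"from {o_type} to {n_type}"
                )
            if p in n_req and p not in o_req:
                breaking.append(f"CHANGED: {name} - parameter '{p}' became required")

        if old[name].get("description") != new[name].get("description"):
            non_breaking.append(f"CHANGED: {name} - description changed")

    return breaking, non_breaking
```

Anything that can make an existing call fail lands in the first list. Changes that leave existing calls valid land in the second.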
Contracts are stored as JSON in .evalview/contracts/:

    {
      "metadata": {
        "server_name": "server-github",
        "endpoint": "npx:@modelcontextprotocol/server-github",
        "snapshot_at": "2026-02-07T10:30:00",
        "protocol_version": "2024-11-05",
        "tool_count": 8,
        "schema_hash": "a1b2c3d4e5f67890"
      },
      "tools": [
        {
          "name": "create_issue",
          "description": "Create a new issue in a GitHub repository",
          "inputSchema": {
            "type": "object",
            "properties": {
              "repo": { "type": "string" },
              "title": { "type": "string" },
              "body": { "type": "string" }
            },
            "required": ["repo", "title"]
          }
        }
      ]
    }

Commit these files to your repo so CI can use them.
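
Because contracts are plain JSON, they are easy to inspect from your own scripts. The sketch below lists the required parameters of each tool in a contract, using a trimmed copy of the example above. The helper names `load_contract` and `required_params` are hypothetical and not part of EvalView.

```python
import json
from pathlib import Path

# Trimmed copy of the example contract above; real files live in .evalview/contracts/.
EXAMPLE_CONTRACT = """
{
  "metadata": {"server_name": "server-github", "tool_count": 8},
  "tools": [
    {
      "name": "create_issue",
      "inputSchema": {
        "type": "object",
        "properties": {
          "repo": {"type": "string"},
          "title": {"type": "string"},
          "body": {"type": "string"}
        },
        "required": ["repo", "title"]
      }
    }
  ]
}
"""


def load_contract(path):
    """Read a contract file from disk (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def required_params(contract):
    """Map each tool name to the list of parameters it requires."""
    return {
        tool["name"]: list(tool.get("inputSchema", {}).get("required", []))
        for tool in contract.get("tools", [])
    }


if __name__ == "__main__":
    contract = json.loads(EXAMPLE_CONTRACT)
    for tool, params in required_params(contract).items():
        print(f"{tool}: requires {', '.join(params) or 'nothing'}")
```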
- Snapshot after verifying: Run your tests first, confirm everything works, then snapshot. The contract represents a known-good interface.
- Refresh periodically: If a contract is more than 30 days old, evalview mcp check will warn you. Re-snapshot to accept intentional changes.
- One contract per server: Name contracts after the server, not the tools: server-github, not create-issue-tool.
- Commit contracts: Store .evalview/contracts/ in git. They're small JSON files and CI needs them.
- Check before testing: Use --contracts on evalview run so drift is caught before wasting time on tests that will fail anyway.

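
The 30-day refresh rule can also be checked in your own tooling by reading `snapshot_at` from a contract's metadata. This is a rough sketch that assumes the timestamp is a naive ISO 8601 string in the same timezone as the machine doing the check. `is_stale` is a hypothetical helper, not an EvalView function.

```python
from datetime import datetime, timedelta

# Age threshold taken from the "Refresh periodically" guideline above.
MAX_AGE = timedelta(days=30)


def is_stale(snapshot_at, now=None, max_age=MAX_AGE):
    """Return True if the snapshot timestamp is older than max_age.

    Assumes `snapshot_at` is a naive ISO 8601 string, e.g. "2026-02-07T10:30:00".
    """
    taken = datetime.fromisoformat(snapshot_at)
    if now is None:
        now = datetime.now()
    return now - taken > max_age


if __name__ == "__main__":
    print(is_stale("2026-02-07T10:30:00", now=datetime(2026, 3, 15)))
    print(is_stale("2026-02-07T10:30:00", now=datetime(2026, 2, 20)))
```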
- Golden Traces (Regression Detection): Detect behavioral drift in your agent
- CI/CD Integration: Run contract checks in CI
- CLI Reference: Full command reference for evalview mcp
- Framework Support: Supported agent frameworks