Commit 0804902

sjarmak and claude committed
fix: add github.com/ prefix-stripping note to all MCP-unique instructions
MCP agent was writing repo identifiers with the full Sourcegraph URL prefix (e.g., github.com/sg-benchmarks/kubernetes-client-go) instead of the oracle format (sg-benchmarks/kubernetes-client-go). This caused ccx-config-trace-010 to score 0.0 on MCP despite correctly identifying the right repo and file.

Added explicit **Note** to all 11 instructions with structured repo fields: "Sourcegraph MCP tools return repo names with a github.com/ prefix. Strip this prefix in your answer."

Confirmed root cause via trajectory inspection of ccx-config-trace-010 MCP run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 34b8d29 commit 0804902
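
The mismatch described in the commit message can be illustrated with a minimal sketch. The helper name `strip_host_prefix` and the exact string comparison are assumptions for illustration only; the commit changes instruction text and does not show the oracle's code.

```python
HOST_PREFIX = "github.com/"  # prefix the commit message says MCP tools return


def strip_host_prefix(repo: str, prefix: str = HOST_PREFIX) -> str:
    """Return the repo identifier without a leading host prefix (hypothetical helper)."""
    return repo[len(prefix):] if repo.startswith(prefix) else repo


# Identifiers taken from the commit message.
mcp_name = "github.com/sg-benchmarks/kubernetes-client-go"
oracle_name = "sg-benchmarks/kubernetes-client-go"

# Assumed exact string comparison: the prefixed form does not match.
print(mcp_name == oracle_name)                     # False
print(strip_host_prefix(mcp_name) == oracle_name)  # True
```

An identifier that has no prefix passes through unchanged, so applying the helper twice is harmless.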

11 files changed, +13 -0 lines changed

benchmarks/ccb_mcp_crossorg/ccx-crossorg-061/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use exact repo identifiers as they appear in Sourcegraph. The oracle expects entries for `kubernetes/kubernetes` and `grafana/grafana`. The `repo` field must match these exactly.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 ## Evaluation

benchmarks/ccb_mcp_crossrepo_tracing/ccx-config-trace-010/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ Create a file at `/workspace/answer.json` with your findings in the following st
 ```

 **Important**: The local `/workspace/client-go` directory contains the `kubernetes/client-go` source, but in Sourcegraph it is indexed as `sg-benchmarks/kubernetes-client-go`. Use `sg-benchmarks/kubernetes-client-go` as the `repo` value in your answer — the oracle checks for this exact identifier.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 Your answer is evaluated against a closed-world oracle — the exact repo, path, and symbol name matter.

benchmarks/ccb_mcp_crossrepo_tracing/ccx-dep-trace-001/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ Create a file at `/workspace/answer.json` with your findings in the following st
 ```

 **Important**: Use `"repo": "sg-benchmarks/kubernetes-client-go"` exactly — this is the canonical repo identifier used by the evaluation oracle. The local checkout at `/workspace/client-go` corresponds to this repo.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 Include only the `files` field. Your answer is evaluated against a closed-world oracle — completeness matters.

benchmarks/ccb_mcp_crossrepo_tracing/ccx-dep-trace-004/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ Create a file at `/workspace/answer.json` with your findings in the following st
 **Important**: Use exact repo identifiers as they appear in the oracle:
 - For Grafana: `"repo": "grafana/grafana"`
 - For Loki: `"repo": "sg-benchmarks/grafana-loki"`
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 The local checkout at `/workspace/loki` corresponds to `sg-benchmarks/grafana-loki`.

benchmarks/ccb_mcp_incident/ccx-incident-031/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -60,6 +60,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use `etcd-io/etcd` as the exact `repo` identifier in your answer. The oracle checks for files `server/storage/mvcc/kvstore.go` and `server/storage/mvcc/kvstore_txn.go` in `etcd-io/etcd`. Do not cite vendored copies in `kubernetes/kubernetes`.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 ## Evaluation

benchmarks/ccb_mcp_onboarding/ccx-explore-042-ds/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use exact repo identifiers as they appear in Sourcegraph. The oracle expects `repo` values of `numpy/numpy` (array layer), `pandas-dev/pandas` (data structure layer), and `scipy/scipy` (scientific computation layer). The `repo` field must match these exactly.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 The `chain` should contain at least 3 steps representing the 3 layers described above.

benchmarks/ccb_mcp_onboarding/ccx-onboard-041/instruction.md

Lines changed: 3 additions & 0 deletions
@@ -40,6 +40,9 @@ Create a file at `/workspace/answer.json` with your findings:
 List all files that contain `from scipy.stats import`. Your answer is evaluated against
 a closed-world oracle — completeness matters.

+**Important**: Use exact repo identifiers as they appear in Sourcegraph. The oracle expects `repo` values of `pandas-dev/pandas`. The `repo` field must match exactly.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.
+
 ## Evaluation

 Your answer will be scored on:

benchmarks/ccb_mcp_onboarding/ccx-onboard-050-ds/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -47,6 +47,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use exact repo identifiers as they appear in Sourcegraph. The oracle expects `repo` values of `sg-benchmarks/kubernetes-client-go` (client layer), `kubernetes/kubernetes` (API server layer), and `etcd-io/etcd` (storage layer). The `repo` field must match these exactly.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 The `chain` should contain at least 3 steps representing the 3 layers described above.

benchmarks/ccb_mcp_platform/ccx-explore-091-ds/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use exact repo identifiers as they appear in Sourcegraph. The oracle expects entries for `sg-benchmarks/kubernetes-api` (API type definitions) and `sg-benchmarks/kubernetes-client-go` (client examples and docs). The `repo` field must match these exactly.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 The `files` list should include at least 3 files across 2+ repos that together define
 the canonical service deployment pattern.

benchmarks/ccb_mcp_security/ccx-vuln-remed-011/instruction.md

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ Create a file at `/workspace/answer.json` with your findings:
 ```

 **Important**: Use exact repo identifiers as they appear in Sourcegraph. The repos to search are `nodejs/node`, `sg-benchmarks/expressjs-express`, `sg-benchmarks/lodash`, and `sg-benchmarks/prisma-prisma`. Note: the local `/workspace/express` directory maps to `sg-benchmarks/expressjs-express` in Sourcegraph — use `sg-benchmarks/expressjs-express` as the `repo` value in your answer.
+**Note**: Sourcegraph MCP tools return repo names with a `github.com/` prefix (e.g., `github.com/sg-benchmarks/kubernetes-client-go`). Strip this prefix in your answer — use `sg-benchmarks/kubernetes-client-go`, NOT `github.com/sg-benchmarks/kubernetes-client-go`.

 Include only entries where `cookie` appears under `"dependencies"` (not `"devDependencies"`
 or `"scripts"`). Your answer is evaluated against a closed-world oracle — completeness matters.
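
The filter in the last instruction (count `cookie` only when it appears under `"dependencies"`) could be checked with a sketch like the one below. The function name `has_runtime_dependency` and both sample manifests are hypothetical and are not taken from any benchmark repo; only the field names come from the instruction text.

```python
import json


def has_runtime_dependency(package_json_text: str, name: str = "cookie") -> bool:
    """True only if `name` is a key of the top-level "dependencies" object."""
    manifest = json.loads(package_json_text)
    deps = manifest.get("dependencies", {})
    return isinstance(deps, dict) and name in deps


# Hypothetical manifests with placeholder versions.
runtime = '{"dependencies": {"cookie": "x.y.z"}}'
dev_only = '{"devDependencies": {"cookie": "x.y.z"}, "scripts": {"test": "cookie"}}'

print(has_runtime_dependency(runtime))   # True
print(has_runtime_dependency(dev_only))  # False
```

Parsing the manifest instead of searching the raw text avoids counting a match that appears only in `"devDependencies"` or `"scripts"`.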
