Commit 474a99a

sjarmak and claude committed
feat: US-012 - Starter tasks: Category A cross-repo dependency tracing (3 tasks)
Adds 3 complete MCP-unique benchmark tasks in ccb_mcp_crossrepo_tracing:

- CCX-dep-trace-001: Blast radius — find all Go files in kubernetes-client-go's dynamic/ tree that import k8s.io/apimachinery/pkg/runtime. Oracle: 8 verified files. Eval: file_set_match. Fixture: kubernetes-ecosystem.
- CCX-dep-trace-004: API call chain — trace Grafana→Loki HTTP query path from LokiAPI (grafana/grafana) to ParseInstantQuery (grafana-loki). Eval: dependency_chain + provenance. Fixture: grafana-observability.
- CCX-config-trace-010: Symbol resolution — find where rest.Config is authoritatively defined (sg-benchmarks/kubernetes-client-go rest/config.go). Eval: symbol_resolution. Fixture: kubernetes-ecosystem.

All 3 tasks pass validity gate (gold=1.0, empty=0.0). Oracle answers curated via Sourcegraph keyword_search with verified file paths. No oracle leakage in instruction.md. Registered in configs/selected_mcp_unique_tasks.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 4417858 commit 474a99a

27 files changed: +2410 -1 lines changed

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+FROM ubuntu:22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Base tools
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    ca-certificates \
+    curl \
+    python3 \
+    golang-go \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /workspace
+
+# Clone local checkout repos (baseline config: agent has local access to these)
+RUN git clone --depth 1 --branch v1.32.0 https://github.com/kubernetes/kubernetes /workspace/kubernetes
+
+# Initialize git identity for agent commits
+RUN git config --global user.email "agent@example.com" && \
+    git config --global user.name "Agent" && \
+    git config --global safe.directory '*'
+
+# Create log directories
+RUN mkdir -p /logs/agent /logs/verifier
+
+ENTRYPOINT []
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# CCX-config-trace-010 — sg_only variant
+# No local repo clone — agent uses Sourcegraph MCP exclusively for code access.
+# The verifier restores the full repo from /repo_full/ before scoring.
+
+FROM ubuntu:22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    ca-certificates \
+    python3 \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /workspace
+
+# Empty workspace — agent discovers code via MCP tools only
+RUN git init && \
+    git config user.email "agent@example.com" && \
+    git config user.name "Agent" && \
+    git config --global safe.directory '*'
+
+# Create log directories
+RUN mkdir -p /logs/agent /logs/verifier
+
+# Mark sg_only mode — verifiers and eval scripts check this flag
+RUN touch /tmp/.sg_only_mode
+
+ENTRYPOINT []
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# Stack Trace Symbol Resolution: rest.Config
+
+## Your Task
+
+A Kubernetes developer is debugging a production issue and encounters the following in a stack trace:
+
+```
+goroutine 1 [running]:
+k8s.io/client-go/rest.(*Config).DeepCopyInto(...)
+    vendor/k8s.io/client-go/rest/config.go:87
+```
+
+The developer only has access to the main `kubernetes/kubernetes` repository locally.
+They need to find where `rest.Config` is actually defined (the authoritative source),
+not just a vendored copy.
+
+**Specific question**: Find the repository and file path where the `Config` struct is
+**defined** (not vendored) in the `rest` package of `k8s.io/client-go`. What is the
+exact Go package import path?
+
+## Context
+
+You are working on a codebase task involving symbol resolution across Kubernetes ecosystem repos.
+The `kubernetes/kubernetes` repository vendors many dependencies in its `staging/` or `vendor/`
+directories, but the authoritative source lives in separate repositories accessible via MCP tools.
+
+## Available Resources
+
+The local `/workspace/` directory contains: kubernetes/kubernetes.
+
+**Note:** Additional repositories are accessible via Sourcegraph MCP tools:
+- `sg-benchmarks/kubernetes-client-go` (go-client-library)
+- `sg-benchmarks/kubernetes-api` (api-type-definitions)
+- `etcd-io/etcd` (distributed-kv-store)
+
+## Output Format
+
+Create a file at `/workspace/answer.json` with your findings in the following structure:
+
+```json
+{
+  "symbols": [
+    {"repo": "org/repo-name", "path": "relative/path/to/file.go", "symbol": "SymbolName"}
+  ],
+  "text": "Explanation of where Config is defined, the package import path, and why this is the authoritative source."
+}
+```
+
+Your answer is evaluated against a closed-world oracle — the exact repo, path, and symbol name matter.
+
+## Evaluation
+
+Your answer will be scored on:
+- **Symbol resolution**: Did you find the correct repo, file, and symbol name for the `Config` struct definition?
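As an illustration of the output format the instructions describe, here is a minimal Python sketch that writes a file with the `symbols` and `text` keys. The helper name `write_answer` and the field check are assumptions made for this example; the commit does not prescribe how an agent produces the file, only its structure.

```python
import json
from pathlib import Path


def write_answer(path, symbols, text):
    """Write an answer file with the `symbols` and `text` keys the task asks for.

    `write_answer` is a hypothetical helper, not part of the benchmark.
    """
    for entry in symbols:
        # Each entry needs the three fields named in the Output Format section.
        missing = {"repo", "path", "symbol"} - entry.keys()
        if missing:
            raise ValueError(f"symbol entry is missing fields: {sorted(missing)}")
    payload = {"symbols": symbols, "text": text}
    Path(path).write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return payload
```

An agent would call it as `write_answer("/workspace/answer.json", [{"repo": "org/repo-name", "path": "relative/path/to/file.go", "symbol": "SymbolName"}], "explanation")`, substituting the values it actually found.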
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+version = "1.0"
+
+[metadata]
+name = "CCX-config-trace-010"
+description = "Stack trace symbol resolution across repos"
+license = "Apache-2.0"
+
+[task]
+id = "CCX-config-trace-010"
+repo = "kubernetes/kubernetes"
+category = "cross-repo-config-trace"
+language = "go"
+difficulty = "medium"
+time_limit_sec = 900
+mcp_suite = "ccb_mcp_crossrepo_tracing"
+use_case_id = 10
+repo_set_id = "kubernetes-ecosystem"
+mcp_unique = true
+
+[verification]
+type = "eval"
+command = "bash /tests/eval.sh"
+
+reward_type = "score"
+description = "Stack trace symbol resolution across repos"
+
+[environment]
+build_timeout_sec = 600.0
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+#!/bin/bash
+# eval.sh — MCP-unique benchmark evaluator for CCX-config-trace-010
+# Exit-code-first (SWE-Factory pattern):
+#   exit 0 — agent produced useful output (composite score > 0)
+#   exit 1 — total failure (composite score == 0 or missing answer)
+#
+# Writes /logs/verifier/reward.txt with the composite score [0.0, 1.0]
+
+set -euo pipefail
+
+TASK_ID="CCX-config-trace-010"
+ANSWER_PATH="/workspace/answer.json"
+TASK_SPEC_PATH="/tests/task_spec.json"
+ORACLE_CHECKS="/tests/oracle_checks.py"
+REWARD_PATH="/logs/verifier/reward.txt"
+
+mkdir -p /logs/verifier
+
+echo "=== CCX-config-trace-010 evaluator ==="
+echo "Task spec: $TASK_SPEC_PATH"
+echo "Answer: $ANSWER_PATH"
+echo ""
+
+# sg_only mode guard: restore full repo if verifier wrapper exists
+if [ -f /tmp/.sg_only_mode ] && [ -f /tests/sgonly_verifier_wrapper.sh ]; then
+    echo "sg_only mode: sourcing verifier wrapper..."
+    source /tests/sgonly_verifier_wrapper.sh
+fi
+
+# Verify answer file exists
+if [ ! -f "$ANSWER_PATH" ]; then
+    echo "ERROR: answer.json not found at $ANSWER_PATH"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+# Validate answer is valid JSON
+if ! python3 -c "import json; json.load(open('$ANSWER_PATH'))" 2>/dev/null; then
+    echo "ERROR: answer.json is not valid JSON"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo "answer.json found and valid JSON"
+
+# Run oracle checks
+if [ ! -f "$ORACLE_CHECKS" ]; then
+    echo "ERROR: oracle_checks.py not found at $ORACLE_CHECKS"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo "Running oracle checks..."
+SCORE=$(python3 "$ORACLE_CHECKS" --answer "$ANSWER_PATH" --spec "$TASK_SPEC_PATH" --verbose 2>&1 | tee /dev/stderr | tail -1)
+
+# Validate score is a number
+if ! echo "$SCORE" | python3 -c "import sys; float(sys.stdin.read().strip())" 2>/dev/null; then
+    echo "ERROR: oracle_checks.py did not return a valid score: $SCORE"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo ""
+echo "Composite score: $SCORE"
+echo "$SCORE" > "$REWARD_PATH"
+
+# Exit based on score (SWE-Factory exit-code-first pattern)
+python3 -c "import sys; sys.exit(0 if float('$SCORE') > 0 else 1)"
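The score handling at the end of the script (take the last line of the checker output, confirm it parses as a number, then derive the exit code) can be sketched in Python as follows. The function names `parse_score` and `exit_code_for` are hypothetical. Skipping blank lines, clamping to [0.0, 1.0], and rejecting non-finite values are additions in this sketch; the shell script itself only checks that `float()` accepts the last line.

```python
import math


def parse_score(output: str) -> float:
    """Return the score from the last non-empty line of `output`, or 0.0 if unusable."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return 0.0
    try:
        score = float(lines[-1])
    except ValueError:
        return 0.0
    # Assumption of this sketch: treat nan/inf as a failed run.
    if not math.isfinite(score):
        return 0.0
    # Clamp to the [0.0, 1.0] range that reward.txt is documented to hold.
    return min(1.0, max(0.0, score))


def exit_code_for(score: float) -> int:
    """Exit-code-first pattern: 0 when the score is positive, 1 otherwise."""
    return 0 if score > 0 else 1
```

With verbose checker output such as `"checking symbols...\n0.75\n"`, `parse_score` yields `0.75` and `exit_code_for` yields `0`; unparseable output maps to a score of `0.0` and an exit code of `1`.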
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+{
+  "symbols": [
+    {"repo": "sg-benchmarks/kubernetes-client-go", "path": "rest/config.go", "symbol": "Config"}
+  ],
+  "text": "The rest.Config struct is authoritatively defined in sg-benchmarks/kubernetes-client-go at rest/config.go. The Go package import path is k8s.io/client-go/rest. The kubernetes/kubernetes repository vendors this code under staging/src/k8s.io/client-go/rest/, but the canonical source is in the kubernetes-client-go repository.",
+  "_metadata": {
+    "oracle_type": "symbol_resolution",
+    "discovery_method": "sourcegraph_keyword_search",
+    "query": "repo:^github.com/sg-benchmarks/kubernetes-client-go$ 'type Config struct' file:rest/config.go"
+  }
+}
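The `oracle_checks.py` script that consumes this oracle is not shown in this part of the diff, so the following is only a sketch of how a `symbol_resolution` check could compare an answer against it. It assumes exact matching on (repo, path, symbol) triples and a recall-style score; the function name and the scoring formula are hypothetical and may differ from the real implementation.

```python
def symbol_resolution_score(answer: dict, oracle: dict) -> float:
    """Fraction of oracle (repo, path, symbol) triples that appear in the answer."""

    def triples(doc):
        found = set()
        for entry in doc.get("symbols") or []:
            if not isinstance(entry, dict):
                continue
            key = (entry.get("repo"), entry.get("path"), entry.get("symbol"))
            # Ignore entries with missing or non-string fields.
            if all(isinstance(part, str) for part in key):
                found.add(key)
        return found

    expected = triples(oracle)
    if not expected:
        return 0.0
    return len(expected & triples(answer)) / len(expected)
```

Under these assumptions the scorer reproduces the validity gate from the commit message: scoring the oracle against itself gives 1.0, and an empty answer gives 0.0. An answer that points at the copy inside `kubernetes/kubernetes` instead of `sg-benchmarks/kubernetes-client-go` also scores 0.0, because the repo field does not match.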
