
feat: add coaching MCP tool for personalized engineering guidance#64

Merged
ccf merged 4 commits into main from feat/coaching-mcp-tool on Mar 6, 2026

Conversation

ccf (Owner) commented Mar 6, 2026

Summary

New coaching MCP tool that synthesizes multiple analytics into a single, prioritized coaching brief. Engineers call @primer coaching in Claude Code and get a personalized report: what's working, what's slowing them down, and what to try next.

What it returns

## Your Primer Coaching Brief

**Status**: 47 sessions · 68% success rate · Leverage: 52.3 · Effectiveness: 71.0

### What's slowing you down
- **permission_denied** (12 occurrences) — Update your tool's allowed directories
- **context_limit** (8 occurrences) — Break tasks into smaller sessions

### Where you could level up
- You use 1 model. Try Haiku for quick lookups — 3-5x cheaper.
- You haven't used orchestration tools. Engineers who delegate have higher success rates.

### Top recommendations
1. Update `.claude/settings.json` allowed directories
2. Try `Task:explore` for codebase research
3. Add a CLAUDE.md to your main project

Architecture

  • coaching_service.py (new) — orchestrates calls to 5 existing services (overview, maturity, friction, personalized_tips, config_optimization), then synthesizes into a CoachingBrief
  • GET /api/v1/analytics/coaching — API endpoint, engineer-scoped (rejects admin keys)
  • coaching MCP tool — 6th tool in the sidecar, formats API response as markdown

The coaching service is a pure synthesis layer — no new analytics computation, just smart combination of existing data with friction-specific fix suggestions and skill gap detection.
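As an illustrative sketch of that synthesis step (the names `FRICTION_FIXES` and `build_friction_section` are hypothetical, not the real `coaching_service` implementation), mapping friction counts onto fix suggestions might look like:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: FRICTION_FIXES and build_friction_section are
# invented names, not the actual coaching_service code.

FRICTION_FIXES = {
    "permission_denied": "Update your tool's allowed directories",
    "context_limit": "Break tasks into smaller sessions",
}

@dataclass
class CoachingSection:
    title: str
    items: list[str] = field(default_factory=list)

def build_friction_section(friction_counts: dict[str, int]) -> CoachingSection:
    # Most frequent friction type first, each mapped to a concrete fix.
    items = [
        f"**{kind}** ({count} occurrences) — "
        f"{FRICTION_FIXES.get(kind, 'Review recent sessions')}"
        for kind, count in sorted(friction_counts.items(), key=lambda kv: -kv[1])
    ]
    return CoachingSection(title="What's slowing you down", items=items)

section = build_friction_section({"context_limit": 8, "permission_denied": 12})
print(section.items[0])
```

No analytics are recomputed here; the section is assembled purely from counts the existing friction service already produced.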

Tests (4 new)

  • Admin key rejected with helpful error
  • Empty state (no sessions)
  • Populated state (sessions with tools, facets, friction)
  • Custom days parameter
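To show the shape of the scoping rule the first test exercises (the `admin-` key-prefix convention below is invented for illustration; the real check lives in the endpoint's auth layer):

```python
# Hypothetical sketch of the engineer-only scoping check; the
# "admin-" key prefix is an assumption made for this example.

def require_engineer_key(api_key: str) -> None:
    # Coaching briefs are personal, so admin keys are rejected outright
    # with a hint about which kind of key to use instead.
    if api_key.startswith("admin-"):
        raise ValueError(
            "The coaching brief is engineer-scoped; "
            "use an engineer API key instead of an admin key."
        )

def check_rejects_admin() -> bool:
    try:
        require_engineer_key("admin-abc123")
    except ValueError as exc:
        return "engineer" in str(exc)
    return False

print(check_rejects_admin())
```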

Test plan

  • 586 backend tests pass (4 new)
  • 173 frontend tests pass
  • ruff check and ruff format clean
  • All pre-commit hooks pass

🤖 Generated with Claude Code


Note: Medium Risk

Adds a new engineer-scoped analytics endpoint and synthesis service that aggregates multiple existing analytics calls; the main risk is correctness/performance of the combined queries and ensuring auth scoping stays strict (admin keys are explicitly rejected).

Overview
Adds a new personalized coaching brief surfaced end-to-end: a CoachingBrief/CoachingSection response schema, a new GET /api/v1/analytics/coaching endpoint (engineer-scoped; returns 400 for admin keys), and a new coaching_service.get_coaching_brief that synthesizes overview, maturity, friction, personalized tips, and config optimization into three prioritized sections.

Exposes this via the sidecar as a new MCP coaching tool (primer_coaching) that calls the new endpoint and renders the response as markdown, and adds backend tests covering auth rejection, empty-state, populated sessions, and the days query param.
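As a sketch of that rendering step (the response field names `status`, `sections`, `title`, and `items` are inferred from the example brief above, not confirmed against the actual schema), the sidecar's markdown formatting might look like:

```python
# Purely a sketch: field names are inferred from the example brief,
# not the real CoachingBrief schema.

def render_brief(brief: dict) -> str:
    lines = ["## Your Primer Coaching Brief", "", f"**Status**: {brief['status']}"]
    for section in brief["sections"]:
        lines.append("")
        lines.append(f"### {section['title']}")
        lines.extend(f"- {item}" for item in section["items"])
    return "\n".join(lines)

markdown = render_brief({
    "status": "47 sessions · 68% success rate",
    "sections": [
        {"title": "Top recommendations",
         "items": ["Add a CLAUDE.md to your main project"]},
    ],
})
print(markdown)
```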

Written by Cursor Bugbot for commit 4608385.

New `coaching` MCP tool that synthesizes multiple data sources into
a single, prioritized coaching brief — what's working, what's not,
and what to try next.

Backend:
- New coaching_service.py orchestrates calls to overview, maturity,
  friction, personalized_tips, and config_optimization services
- Returns CoachingBrief with 3 sections: friction hotspots, skill
  opportunities, and top recommendations
- Friction section includes specific fix suggestions per friction type
- Skills section detects model diversity gaps, orchestration gaps,
  and cache efficiency issues
- GET /api/v1/analytics/coaching endpoint (engineer-scoped)

MCP:
- New `coaching` tool in sidecar (6th tool)
- Formats API response as readable markdown
- Usage: @primer coaching or @primer coaching days=7

Tests:
- test_coaching_requires_engineer_key (admin rejected)
- test_coaching_empty (no sessions)
- test_coaching_with_sessions (populated sections)
- test_coaching_days_param

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Truthiness check on leverage_score skips zero values
    • Changed if profile and profile.leverage_score: to if profile and profile.leverage_score is not None: so a valid score of 0.0 is no longer skipped.

Preview (54a29252c3):
diff --git a/src/primer/server/services/coaching_service.py b/src/primer/server/services/coaching_service.py
--- a/src/primer/server/services/coaching_service.py
+++ b/src/primer/server/services/coaching_service.py
@@ -23,7 +23,7 @@
     parts = [f"{overview.total_sessions} sessions"]
     if overview.success_rate is not None:
         parts.append(f"{_fmt_pct(overview.success_rate)} success rate")
-    if profile and profile.leverage_score:
+    if profile and profile.leverage_score is not None:
         parts.append(f"Leverage: {profile.leverage_score:.1f}")
     if profile and profile.effectiveness_score is not None:
         parts.append(f"Effectiveness: {profile.effectiveness_score:.1f}")

Truthiness check skipped a valid score of 0.0. Now uses `is not None`
consistently for both leverage and effectiveness scores.
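A minimal repro of the bug the diff fixes (the `Profile` stub and `status_parts` helper below are invented for illustration): a valid leverage score of exactly 0.0 is falsy, so a truthiness check silently drops it.

```python
# Minimal repro: 0.0 is falsy, so the truthiness check silently drops
# a valid leverage score of zero. Profile/status_parts are stubs.

class Profile:
    def __init__(self, leverage_score):
        self.leverage_score = leverage_score

def status_parts(profile):
    parts = []
    if profile and profile.leverage_score:  # buggy: skips 0.0
        parts.append("buggy-check kept it")
    if profile and profile.leverage_score is not None:  # fixed
        parts.append("none-check kept it")
    return parts

print(status_parts(Profile(0.0)))   # only the `is not None` branch fires
print(status_parts(Profile(52.3)))  # both branches fire
```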

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Ambiguous percentage formatting produces wrong output near boundary
    • Removed the ambiguous value <= 1.0 heuristic and now always multiply by 100, since all callers pass 0–1 ratios.

Preview (3b493d2927):
diff --git a/src/primer/server/services/coaching_service.py b/src/primer/server/services/coaching_service.py
--- a/src/primer/server/services/coaching_service.py
+++ b/src/primer/server/services/coaching_service.py
@@ -16,7 +16,7 @@
 def _fmt_pct(value: float | None) -> str:
     if value is None:
         return "N/A"
-    return f"{value * 100:.0f}%" if value <= 1.0 else f"{value:.0f}%"
+    return f"{value * 100:.0f}%"
 
 
 def _build_status(overview, profile) -> str:

Always multiply by 100 since success_rate is always a 0-1 ratio.
The `<= 1.0` heuristic was fragile near the boundary due to
floating-point error.
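The fragility is easy to demonstrate (the helper names below mirror the diff but are reconstructed for this example): a 0-1 ratio nudged just past 1.0 by floating-point error fell into the wrong branch of the old heuristic.

```python
import math

# Repro of the boundary fragility: a 0-1 ratio nudged just past 1.0 by
# floating-point error hit the wrong branch of the old heuristic.
# fmt_pct_old/fmt_pct_new are reconstructed from the diff above.

def fmt_pct_old(value: float) -> str:
    return f"{value * 100:.0f}%" if value <= 1.0 else f"{value:.0f}%"

def fmt_pct_new(value: float) -> str:
    return f"{value * 100:.0f}%"

ratio = math.nextafter(1.0, 2.0)  # smallest float strictly above 1.0
print(fmt_pct_old(ratio))  # "1%" -- treated as an already-scaled percent
print(fmt_pct_new(ratio))  # "100%"
```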

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Misleading fallback message when profile data is absent
    • Added a separate fallback branch when profile is None that shows an insufficient-data message instead of the misleading 'well-diversified' congratulation.
  • ✅ Fixed: Grammar error when model count is zero
    • Changed the condition from model_count <= 1 to model_count == 1 so that the zero case is excluded, avoiding both the grammar error and the misleading advice.

Preview (551178f7c1):
diff --git a/src/primer/server/services/coaching_service.py b/src/primer/server/services/coaching_service.py
--- a/src/primer/server/services/coaching_service.py
+++ b/src/primer/server/services/coaching_service.py
@@ -63,9 +63,9 @@
 def _build_skills_section(profile, tips) -> CoachingSection:
     items = []
     if profile:
-        if profile.model_count <= 1:
+        if profile.model_count == 1:
             items.append(
-                f"You use {profile.model_count} model. "
+                "You use 1 model. "
                 "Try cheaper models for simple tasks — "
                 "Haiku is 3-5x cheaper for lookups and quick edits."
             )
@@ -88,10 +88,16 @@
             items.append(tip.description)
 
     if not items:
-        items.append(
-            "Your tool usage is well-diversified. "
-            "Look at the Growth page for shared patterns from top performers."
-        )
+        if profile is None:
+            items.append(
+                "Not enough session data to assess tool usage yet. "
+                "Keep using AI tools and check back later."
+            )
+        else:
+            items.append(
+                "Your tool usage is well-diversified. "
+                "Look at the Growth page for shared patterns from top performers."
+            )
     return CoachingSection(title="Where you could level up", items=items)

- model_count == 1 (not <= 1) to avoid "You use 0 model" message
- Distinct fallback when profile is None (insufficient data) vs
  when profile exists but no gaps found (well-diversified)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ccf merged commit a56898b into main on Mar 6, 2026
6 checks passed