Skip to content

Conversation

@yrobla
Copy link
Contributor

@yrobla yrobla commented Oct 27, 2025

This will be the entrypoint for the vmcp component

See https://github.com/stacklok/stacklok-epics/issues/153

@codecov
Copy link

codecov bot commented Oct 27, 2025

Codecov Report

❌ Patch coverage is 0% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.10%. Comparing base (c7f9274) to head (79cb7bc).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
cmd/vmcp/app/commands.go 0.00% 60 Missing ⚠️
cmd/vmcp/main.go 0.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2340      +/-   ##
==========================================
- Coverage   54.27%   54.10%   -0.18%     
==========================================
  Files         242      244       +2     
  Lines       23446    23513      +67     
==========================================
- Hits        12725    12721       -4     
- Misses       9506     9583      +77     
+ Partials     1215     1209       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@yrobla yrobla requested review from JAORMX and Copilot October 27, 2025 16:28
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces the foundational infrastructure for a new standalone binary called vmcp (Virtual MCP Server), which will serve as an aggregation proxy for multiple MCP (Model Context Protocol) servers. The implementation focuses on establishing the binary structure, CLI commands, and build/CI configuration without the full functional implementation.

Key Changes:

  • Added new vmcp binary with CLI structure (serve, validate, version commands)
  • Integrated build tasks and container image builds for the new binary
  • Extended CI/CD pipeline to build and publish vmcp container images

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
cmd/vmcp/main.go Entry point for the vmcp binary with signal handling
cmd/vmcp/app/commands.go CLI command structure with placeholder implementations
cmd/vmcp/README.md Comprehensive documentation for vmcp usage and configuration
examples/vmcp-config.yaml Example configuration file demonstrating vmcp settings
docs/vmcp-implementation-summary.md Implementation status and architecture documentation
Taskfile.yml Build tasks for vmcp binary and container image
.github/workflows/image-build-and-publish.yml CI/CD pipeline job for building and publishing vmcp images
.github/ko-ci.yml ko build configuration for vmcp

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@JAORMX
Copy link
Collaborator

JAORMX commented Oct 29, 2025

@claude please review this taking into account its relation to docs/proposals/THV-2106-virtual-mcp-server.md

@claude
Copy link
Contributor

claude bot commented Oct 29, 2025

Claude finished @JAORMX's task —— View job


Code Review: vmcp Binary Implementation

I've reviewed this PR in context of the proposal document docs/proposals/THV-2106-virtual-mcp-server.md. This is a foundational PR that sets up the binary structure for the Virtual MCP Server (vmcp) feature. Below is my comprehensive review.


Summary

This PR successfully establishes the basic infrastructure for the vmcp binary including CLI structure, build tasks, and CI/CD integration. However, there are several critical misalignments between the implementation and the proposal that need to be addressed.


Critical Issues

1. Configuration Schema Mismatch (High Priority)

Issue: The example configuration in examples/vmcp-config.yaml and documentation in cmd/vmcp/README.md do NOT match the proposal specification.

Current Implementation (lines 1-105 in examples/vmcp-config.yaml):

server:
  address: "localhost:8080"
  transport: "http"

middleware:
  auth:
    provider: "local"
  authz:
    enabled: true

backends:
  - name: "filesystem-server"
    type: "container"
    tool_prefix: "fs_"

Proposal Specification (lines 314-387 in proposal):

# Virtual MCP metadata
name: "engineering-vmcp"
group: "engineering-team"  # Reference to ToolHive group

# ===== INCOMING AUTHENTICATION =====
incoming_auth:
  type: oidc
  oidc:
    issuer: "..."
    client_id: "..."
    audience: "vmcp"

# ===== OUTGOING AUTHENTICATION =====
outgoing_auth:
  source: inline
  default:
    type: pass_through
  backends:
    github:
      type: token_exchange
      token_exchange:
        token_url: "..."
        audience: "github-api"

# ===== TOOL AGGREGATION =====
aggregation:
  conflict_resolution: prefix
  conflict_resolution_config:
    prefix_format: "{workload}_"

Why This Matters:

  • The proposal uses group-based backend management where vmcp references a ToolHive group and discovers workloads automatically (lines 64-68 in proposal)
  • The proposal has two distinct authentication boundaries: incoming (client → vmcp) and outgoing (vmcp → backends) with token exchange support (lines 136-149)
  • The current implementation treats backends as standalone configurations rather than group-based discovery
  • This is NOT just documentation—it's a fundamental architectural difference

Recommendation:

  • Update examples/vmcp-config.yaml to match the proposal schema exactly
  • Update cmd/vmcp/README.md to reflect the group-based architecture and dual authentication model
  • Consider whether this initial PR should use a simplified "stub" config that's clearly marked as incomplete, rather than an incompatible config

Related Review Comment: JAORMX noted "This doesn't really match the configuration we've defined" on cmd/vmcp/README.md


2. Architecture Documentation Misalignment

Issue: The README.md describes an architecture that differs from the proposal.

Current README (cmd/vmcp/README.md:163-188):

┌─────────────┐
│  MCP Client │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────┐
│     Virtual MCP Server (vmcp)   │
│  ┌───────────────────────────┐  │
│  │   Middleware Chain        │  │
│  │  - Auth                   │  │
│  │  - Authz                  │  │
│  │  - Audit                  │  │
│  │  - Telemetry              │  │
│  └───────────────────────────┘  │
│  ┌───────────────────────────┐  │
│  │   Router / Aggregator     │  │
│  └───────────────────────────┘  │
└────┬─────────┬─────────┬────────┘
     │         │         │
     ▼         ▼         ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Backend │ │ Backend │ │ Backend │
│ MCP 1   │ │ MCP 2   │ │ MCP 3   │
└─────────┘ └─────────┘ └─────────┘

Proposal Architecture (lines 27-57 in proposal):

┌──────────┐
│ Client   │ Token: aud=vmcp
└────┬─────┘
     │
     ▼
┌────────────────────────────┐
│ Virtual MCP Server         │
│  - Incoming Auth           │ ← Validates aud=vmcp
│  - Request Router          │
│  - Backend Token Manager   │ ← Exchanges tokens per backend
│  - Token Cache             │
└──┬─────────┬──────────┬───┘
   │         │          │
   ▼         ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│ GitHub │ │ Jira   │ │ Slack  │
│ MCP    │ │ MCP    │ │ MCP    │
│ Recv:  │ │ Recv:  │ │ Recv:  │
│ token  │ │ token  │ │ token  │
│ with   │ │ with   │ │ with   │
│ aud=   │ │ aud=   │ │ static │
│ github │ │ jira   │ │ token  │
│ -api   │ │ -api   │ │        │
└────────┘ └────────┘ └────────┘

Key Differences:

  1. Proposal emphasizes group-based workload discovery (not static backend list)
  2. Proposal includes Backend Token Manager and Token Cache as core components
  3. Proposal shows per-backend token exchange with different audiences
  4. Proposal clearly separates incoming auth (single boundary) from outgoing auth (per-backend)

Recommendation: Update the architecture diagram in README.md to match the proposal, or clearly mark it as "Phase 1 Architecture (Simplified)" if you're deferring token exchange to a later PR.


3. Documentation File Persistence Question

Issue: The file docs/vmcp-implementation-summary.md (mentioned in PR description but not in changed files list) may not be necessary yet.

Related Review Comment: JAORMX noted "there's probably no need to persist this just yet" on docs/vmcp-implementation-summary.md

Recommendation:

  • If this file exists, consider removing it from this PR since there's no actual implementation yet
  • Or move its content into the proposal document as an "Implementation Status" section

Code Quality Issues

4. Signal Handler Lacks Graceful Cleanup (Medium Priority)

Issue: Signal handler in cmd/vmcp/main.go:19-20 exits immediately without cleanup.

Current Code (cmd/vmcp/main.go:19-20):

ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM, syscall.SIGQUIT)
defer cancel()

Comparison to thv (cmd/thv/main.go:62-71):

func setupSignalHandler() {
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM, syscall.SIGQUIT)

	go func() {
		<-sigCh
		logger.Debugf("Received signal, cleaning up lock files...")
		lockfile.CleanupAllLocks()
		os.Exit(0)
	}()
}

Why This Matters: When vmcp is actually implemented, it will need to:

  • Close connections to backend MCP servers
  • Flush audit logs
  • Persist token cache (if using persistent cache)
  • Stop background health checks

Related Review Comment: Copilot reviewer noted "The signal handler logs a shutdown message but exits immediately without allowing cleanup"

Recommendation:

  • For this initial PR: Add a TODO comment noting that cleanup logic will be added when server implementation is added
  • Future PR: Implement proper cleanup pattern like thv uses

5. Command Structure is Solid

Positive Feedback: The command structure in cmd/vmcp/app/commands.go is well-designed:

  • ✅ Follows existing ToolHive patterns from cmd/thv/app/commands.go
  • ✅ Includes appropriate subcommands: serve, validate, version
  • ✅ Has clear help text and error messages
  • ✅ Uses TODO comments to mark unimplemented functionality

The version command pattern matches thv and the build configuration properly injects version info via ldflags.


Build and CI/CD Review

6. Build Infrastructure is Complete and Correct

Positive Feedback: The build and CI/CD changes are comprehensive and follow existing patterns:

Taskfile.yml Changes (lines 173-203):

  • build-vmcp: Builds vmcp binary with proper ldflags
  • install-vmcp: Installs to GOPATH/bin
  • build-vmcp-image: Builds container image with ko
  • build-all-images: Updated to include vmcp
  • ✅ Proper version injection using git tags

GitHub Workflow (.github/workflows/image-build-and-publish.yml):

  • ✅ New job vmcp-image-build-and-publish with proper structure
  • ✅ Multi-platform builds (linux/amd64, linux/arm64)
  • ✅ Proper versioning strategy (semver with -dev suffix for main branch)
  • ✅ Cosign image signing
  • ✅ Proper GHCR repository naming: ghcr.io/stacklok/toolhive/vmcp

ko Configuration (.github/ko-ci.yml):

  • ✅ New build config for vmcp with proper ldflags
  • ✅ Consistent with other binaries (thv, thv-operator, thv-proxyrunner)

Minor Issue: Some whitespace-only changes in the workflow file (lines with trailing whitespace removed). This is fine but not strictly necessary.


Test Coverage

7. Zero Test Coverage is Expected for Initial PR

Observation: Codecov reports 0% coverage for 67 new lines (all in cmd/vmcp/).

Analysis: This is acceptable for an initial binary skeleton PR because:

  • The commands are just stubs returning "not yet implemented"
  • No actual business logic to test yet
  • Testing infrastructure will be added when pkg/vmcp/ is implemented

Recommendation: Add a TODO comment in the PR description or a follow-up issue to track test coverage for the actual implementation.


Alignment with Proposal

8. Phase 1 Scope Question

Question: What is the intended scope of this PR vs. the full Phase 1 MVP?

Proposal Phase 1 Goals (lines 1066-1076):

  • Aggregate multiple MCP servers from a group
  • Tool conflict resolution (prefix strategy)
  • Basic request routing
  • Pass-through authentication
  • Zero changes to existing MCP servers

This PR Delivers:

  • ✅ Binary structure
  • ✅ Build and deployment infrastructure
  • ⚠️ Configuration schema (but doesn't match proposal)
  • ❌ No implementation yet (expected)

Recommendation:

  • Clarify in the PR description whether this is meant to be "Phase 0: Binary Setup" separate from "Phase 1: MVP Implementation"
  • Or update the configuration to be a simplified Phase 1 config that can be extended later

Additional Observations

9. Backend Type Confusion

Issue: The README describes three backend types (cmd/vmcp/README.md:72-102):

  1. Container (ToolHive-managed)
  2. Stdio (direct communication)
  3. SSE (HTTP-based)

But the proposal clearly states (lines 64-68):

Virtual MCP references a ToolHive group and automatically discovers all workloads within it:

  • Leverages existing groups.Manager and workloads.Manager
  • Dynamic workload discovery via ListWorkloadsInGroup()
  • Inherits workload configurations from group

This means: vmcp should NOT have a backends list with container/stdio/sse configurations. It should have a group reference and discover workloads from the group.

Recommendation: Align the README with the group-based architecture from the proposal.


10. Comparison Table is Useful

Positive Feedback: The comparison table in README.md (cmd/vmcp/README.md:215-224) is helpful:

Feature thv vmcp
Purpose Run individual MCP servers Aggregate multiple MCP servers
Architecture Single server per instance Multiple backends per instance
Configuration RunConfig format vMCP config format
Use Case Development, testing Production, multi-server deployments
Middleware Per-server Global + per-backend overrides

However, the "Architecture" row should be updated to say "Multiple workloads from a group" to align with the proposal.


Summary of Recommendations

Must Fix Before Merge

  1. Update configuration schema in examples/vmcp-config.yaml to match proposal (group-based, dual auth boundaries)
  2. Update README.md to reflect group-based architecture and remove references to standalone backend types
  3. Address or remove docs/vmcp-implementation-summary.md if it exists

Should Fix Before Merge

  1. Update architecture diagram in README.md to match proposal
  2. Add TODO comment about signal handler cleanup for future implementation

Nice to Have

  1. Add PR description clarification about this being "Phase 0: Binary Setup" vs "Phase 1: MVP"
  2. Add follow-up issue to track test coverage for actual implementation

Verdict

This PR provides solid infrastructure for the vmcp binary (build tasks, CI/CD, command structure) but has critical documentation and configuration misalignments with the proposal. The code quality is good and follows ToolHive patterns, but the configuration schema and README need to be updated to match the group-based, dual-authentication architecture described in the proposal.

Recommendation: Request changes to align configuration and documentation with proposal before merging.


|

@yrobla yrobla requested a review from JAORMX October 29, 2025 10:07
JAORMX
JAORMX previously approved these changes Oct 29, 2025
@yrobla yrobla merged commit 591f19b into main Oct 29, 2025
27 checks passed
@yrobla yrobla deleted the feat/vmcp_binary branch October 29, 2025 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants