Add Archi MCP server, authenticated external MCP support, and Mattermost SSO/RBAC integration #543

Draft
hassan11196 wants to merge 31 commits into archi-physics:main from hassan11196:mcp-with-on-demand-tokens
Conversation

@hassan11196
Collaborator

@hassan11196 hassan11196 commented Mar 31, 2026

Summary

This PR turns the chat service into a first-class MCP server and adds end-to-end OAuth support for MCP clients and SSO-protected upstream MCP servers. It also significantly expands Mattermost integration with RBAC-aware auth, persistent conversation continuity, and webhook + polling execution paths.

What changed

1) Built-in MCP server in chat service

  • Adds a built-in SSE MCP endpoint at /mcp/sse (opt-in via services.mcp_server.enabled).
  • Exposes MCP tools directly from archi internals (query, docs/metadata/content search, chunks, deployment info, agents, health).
  • Adds full MCP/OAuth endpoints and metadata:
    • /.well-known/oauth-authorization-server
    • /.well-known/oauth-protected-resource
    • /mcp/oauth/register
    • /mcp/oauth/authorize
    • /mcp/oauth/token
    • /mcp/auth, /mcp/auth/regenerate (manual token page)
  • Supports progress notifications and threaded dispatch for MCP tool calls.

2) OAuth + token lifecycle for MCP access

  • Introduces PKCE-based authorization flow for MCP clients.
  • Adds persistent token storage for:
    • Web SSO tokens (sso_tokens)
    • Per-server MCP OAuth client registrations (mcp_oauth_clients)
    • Per-user/per-server MCP OAuth tokens (mcp_oauth_tokens)
    • Manual MCP bearer tokens (mcp_tokens)
    • Authorization codes (mcp_auth_codes)
  • Adds MCPOAuthService and SSOTokenService for discovery, registration, token exchange, refresh, and storage.
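The PKCE check at the core of this flow can be sketched as follows. This is a minimal illustration of the S256 method from RFC 7636, not the code in this PR; the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    # 32 random bytes encode to a 43-character URL-safe string
    # (RFC 7636 allows verifiers of 43 to 128 characters).
    return secrets.token_urlsafe(32)


def s256_challenge(code_verifier: str) -> str:
    # code_challenge = BASE64URL(SHA256(ASCII(code_verifier))), without '=' padding.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(code_verifier: str, stored_challenge: str) -> bool:
    # Recompute the challenge from the verifier sent to the token endpoint
    # and compare it in constant time with the one stored at authorize time.
    return hmac.compare_digest(s256_challenge(code_verifier), stored_challenge)
```

The client sends the challenge to the authorize endpoint and the verifier to the token endpoint; the server only issues a token when the two match.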

3) On-demand MCP tool auth for users

  • Agent MCP tool initialization now accepts user_id and can rebuild MCP tools per request when user context is present.
  • SSO-auth MCP servers are skipped when no valid user token is available.
  • Optional CERN CA bundle support for MCP HTTP clients.
  • Adds a guard to stay within OpenAI's limit of 128 tools per request.
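One way such a guard could look is sketched below. The policy shown (keep built-in tools first, then fill the remaining slots with MCP tools) is an assumption for illustration; the PR text does not say which tools are dropped, and `cap_tools` is a hypothetical name.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_TOOLS_PER_REQUEST = 128  # limit cited in the PR description


def cap_tools(
    built_in: Sequence[T],
    mcp_tools: Sequence[T],
    limit: int = MAX_TOOLS_PER_REQUEST,
) -> List[T]:
    # Assumed policy: built-in tools take priority, MCP tools fill what is left.
    kept = list(built_in)[:limit]
    remaining = limit - len(kept)
    kept.extend(list(mcp_tools)[:remaining])
    return kept
```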

4) Mattermost integration overhaul

  • Refactors Mattermost integration into reusable components:
    • MattermostClient
    • ThreadContextManager
    • MattermostEventHandler
    • MattermostAuthManager
    • MattermostTokenService
  • Supports webhook and polling modes concurrently.
  • Adds Mattermost SSO login/callback endpoints and role mapping.
  • Adds RBAC gate mattermost:access and thread-safe Mattermost request context propagation.
  • Persists Mattermost conversation continuity into shared conversation metadata/messages so conversations can appear in web chat history.

5) Config/template/docs updates

  • Adds/extends config surfaces for:
    • services.mcp_server
    • MCP server sso_auth per-server toggle
    • Mattermost auth/session settings and service ports
  • Compose/template updates include env and cert passthroughs required by new flows.
  • Documentation updated for MCP built-in server and expanded Mattermost setup/auth/RBAC guidance.

6) Data/schema and SQL changes

  • Adds SQL and schema support for:
    • Mattermost token/session storage
    • MCP OAuth registrations/tokens/codes/manual tokens
    • Cross-source conversation lookup/list/load helpers
  • Adds conversation metadata fields and query paths to include Mattermost-originated conversations.

7) Reliability and UX improvements

  • Adds retry/backoff to remote catalog requests.
  • Adds short-term caching for service alert banners.
  • Improves chat sidebar rendering to visually identify Mattermost-originated conversations.

Testing

  • Adds tests/unit/test_mcp_sse_tools.py covering MCP SSE tool behavior and formatting.
  • Existing MCP/Mattermost and chat flows were exercised during iterative branch development and bug-fix commits.

Notes / migration considerations

  • Deployments using auth-enabled MCP functionality require the new DB tables in init.sql (or equivalent migration) before use.
  • To expose the built-in MCP server, set services.mcp_server.enabled: true and provide a reachable public URL where needed.
  • For SSO-protected upstream MCP servers, set per-server sso_auth: true and complete OAuth authorization for users.

hassan11196 and others added 29 commits March 16, 2026 02:24
Introduces archi_mcp/, a standalone Model Context Protocol server that
exposes archi's RAG capabilities as MCP tools for VS Code, Cursor, and
other MCP-compatible AI assistants.

Tools exposed:
  - archi_query             ask a question via the active RAG pipeline
  - archi_list_documents    browse the indexed knowledge base
  - archi_get_document_content  read a specific indexed document
  - archi_get_deployment_info   show active pipeline/model/retrieval config
  - archi_list_agents       list available agent specs
  - archi_health            verify the deployment is reachable

The server speaks MCP to its clients over the stdio transport and reaches a
running archi chat service over HTTP; no archi internals are imported.
Configuration is via ARCHI_URL, ARCHI_API_KEY, and ARCHI_TIMEOUT environment variables.

pyproject.toml:
  - adds [project.optional-dependencies] mcp = ["mcp>=1.0.0", ...]
  - registers archi-mcp CLI entry point
  - includes archi_mcp package in setuptools find

archi_mcp/README.md covers VS Code (.vscode/mcp.json), Cursor
(~/.cursor/mcp.json), and generic stdio client setup.

- Add services.mcp_server block to base-config.yaml template
  (url, api_key, timeout; url defaults to chat_app hostname+port)
- Add --config flag to archi-mcp CLI to read settings from a
  rendered archi config file; env vars still take precedence
- Rewrite archi_mcp/README.md with server setup, archi config
  snippet, and VS Code / Cursor client setup instructions

Adds /mcp/sse and /mcp/messages routes directly to the Flask chat app
so MCP clients can connect with just a URL — no local archi-mcp command
or pip install required.

  VS Code (.vscode/mcp.json):
    { "servers": { "archi": { "type": "sse", "url": "http://localhost:7861/mcp/sse" } } }

  Cursor (~/.cursor/mcp.json):
    { "mcpServers": { "archi": { "url": "http://localhost:7861/mcp/sse" } } }

The SSE transport (JSON-RPC 2.0 over Server-Sent Events) is implemented
natively in Flask using thread-safe queues — no Starlette or extra
dependencies needed. Tool handlers call archi internals directly inside
the same process (chat wrapper, data viewer, agent spec loader).
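The queue-based SSE pattern described above can be sketched without Flask as a plain generator. This is an illustrative sketch, not the contents of mcp_sse.py; the registry and function names are hypothetical. It assumes the MCP HTTP+SSE convention of an initial `endpoint` event followed by `message` events.

```python
import json
import queue
from typing import Any, Dict, Iterator

# Hypothetical registry: session_id -> queue of outgoing JSON-RPC messages.
_sessions: Dict[str, "queue.Queue[Any]"] = {}
_CLOSE = object()  # sentinel that ends a stream


def format_sse(event: str, data: str) -> str:
    # One Server-Sent Event frame: event name, data line, blank line.
    return f"event: {event}\ndata: {data}\n\n"


def post_message(session_id: str, payload: Dict[str, Any]) -> bool:
    # Called from the POST handler thread; returns False for unknown sessions.
    q = _sessions.get(session_id)
    if q is None:
        return False
    q.put(payload)
    return True


def close_session(session_id: str) -> None:
    q = _sessions.pop(session_id, None)
    if q is not None:
        q.put(_CLOSE)


def event_stream(session_id: str, endpoint_url: str) -> Iterator[str]:
    # Register the session, announce where to POST, then relay queued messages.
    q: "queue.Queue[Any]" = queue.Queue()
    _sessions[session_id] = q
    yield format_sse("endpoint", endpoint_url)
    while True:
        item = q.get()  # blocks until another thread enqueues a message
        if item is _CLOSE:
            break
        yield format_sse("message", json.dumps(item))
```

In a Flask app a generator like this would typically be returned as a streaming response with the `text/event-stream` MIME type, with the POST route calling `post_message` from its own thread.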

Files changed:
- src/interfaces/chat_app/mcp_sse.py  (new) — SSE transport + 6 tools
- src/interfaces/chat_app/app.py      — register_mcp_sse() call in FlaskAppWrapper
- archi_mcp/README.md                 — document HTTP+SSE as the recommended option

The built-in /mcp/sse endpoint on the chat service makes the separate
archi-mcp CLI redundant. Clients now connect with just a URL.

- Delete archi_mcp/ package (server, client, README, entry point)
- Remove [project.optional-dependencies].mcp from pyproject.toml
- Remove archi-mcp entry point script from pyproject.toml
- Remove archi_mcp* from package discovery

Users now visit /mcp/auth (browser) after SSO login to get a long-lived
bearer token.  The token is stored in the new mcp_tokens PostgreSQL table.

VS Code / Cursor MCP configs must include the token as an Authorization
header.  The /mcp/sse and /mcp/messages endpoints enforce token validation
when auth is enabled; unauthenticated clients receive a 401 JSON response
with a login_url pointing to /mcp/auth.

Changes:
- init.sql: add mcp_tokens table (token, user_id, last_used_at, expires_at)
- mcp_sse.py: bearer-token validation (_validate_mcp_token), auth guard on
  both SSE and messages endpoints, updated session registry to carry user_id
- app.py: register /mcp/auth (GET) and /mcp/auth/regenerate (POST) routes,
  token DB helpers (_get_mcp_token, _create_mcp_token, _rotate_mcp_token),
  sso_callback now honours session['sso_next'] for post-login redirects
- templates/mcp_auth.html: token display page with VS Code / Cursor snippets
  and token rotation UI

Implements the standard OAuth2 authorization code flow with PKCE so that
MCP clients (Claude Desktop, VS Code, etc.) can authenticate automatically
without manual token copy-paste.

New endpoints:
  GET  /.well-known/oauth-authorization-server  – RFC 8414 discovery
  GET  /authorize                               – PKCE authorization (redirects to SSO if needed)
  POST /token                                   – code → bearer token exchange

New DB table: mcp_auth_codes (short-lived, single-use PKCE codes).

- Security: use atomic UPDATE...RETURNING to prevent auth-code replay attacks
- Security: validate redirect_uri in /token matches the one from /authorize
- Efficiency: inline token fetch/create inside existing DB connection (1 conn instead of 2-3)
- Efficiency: opportunistically delete expired mcp_auth_codes rows on each token exchange
- Bug fix: use urlparse/urlunparse for redirect_uri assembly (handles trailing ? edge case)
- Cleanup: move secrets/hashlib/base64/urlencode to module-level imports
- Cleanup: remove dead variable challenge_method
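The single-use guarantee behind the replay fix can be illustrated with an in-memory SQLite stand-in. This is a sketch under assumptions: the column names are hypothetical, and it uses the affected-row count of a conditional UPDATE instead of PostgreSQL's UPDATE...RETURNING, which would fold the follow-up SELECT into the same statement.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    # Hypothetical, simplified schema; the real mcp_auth_codes table is not shown here.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mcp_auth_codes ("
        "code TEXT PRIMARY KEY, user_id TEXT NOT NULL, used INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def consume_code(conn: sqlite3.Connection, code: str) -> Optional[str]:
    # The WHERE clause checks "unused" and the SET marks "used" in one statement,
    # so two concurrent exchanges cannot both succeed for the same code.
    cur = conn.execute(
        "UPDATE mcp_auth_codes SET used = 1 WHERE code = ? AND used = 0",
        (code,),
    )
    if cur.rowcount != 1:
        conn.rollback()
        return None  # unknown code, or already redeemed (replay)
    row = conn.execute(
        "SELECT user_id FROM mcp_auth_codes WHERE code = ?", (code,)
    ).fetchone()
    conn.commit()
    return row[0]
```

A second exchange of the same code returns `None`, which the token endpoint would translate into an error response.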
…in page

When an MCP client hits /authorize and the user isn't logged in, directly
invoke self.oauth.sso.authorize_redirect() — the same call the login handler
uses — instead of redirecting to /login?method=sso as an intermediate step.
The existing sso_next session key still brings the user back to /authorize
after the SSO callback completes, so the rest of the PKCE flow is unchanged.

ChatWrapper.__call__ expects message as [["User", content]] (matching the
JS client's history.slice(-1) format). _tool_query was passing a bare
string, causing `sender, content = tuple(message[0])` to fail with
"not enough values to unpack" since message[0] was a single character.
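The failure is easy to reproduce in isolation. The sketch below mirrors the quoted unpacking line; `unpack_last_message` and `as_history` are hypothetical helper names, not functions from the codebase.

```python
from typing import List, Sequence, Tuple


def unpack_last_message(message: Sequence) -> Tuple[str, str]:
    # Same unpacking as the line quoted in the commit message.
    sender, content = tuple(message[0])
    return sender, content


def as_history(text: str) -> List[List[str]]:
    # Wrap a bare string in the [["User", content]] shape the wrapper expects.
    return [["User", text]]
```

With a bare string, `message[0]` is a single character, so `tuple(message[0])` has one element and the two-name unpacking raises `ValueError`; wrapping the text with `as_history` first avoids that.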
- mcp_auth.html: add two new tabs alongside VS Code and Cursor
  - Claude Desktop: shows claude_desktop_config.json snippet for macOS/Windows
  - Claude Code: shows `claude mcp add` CLI command + .mcp.json project config
- mcp_sse.py: update module docstring with Claude Desktop and Claude Code examples

Auth is now handled via SSO-issued bearer tokens (mcp_tokens table) and
the OAuth2 PKCE flow. The static api_key field was never read by any code.

…on_metadata

user_id was extracted from the bearer token and stored in the session, but
was never passed through _dispatch → _call_tool → _tool_query → wrapper.chat,
causing conversation_metadata.user_id to always be NULL for MCP requests.

Thread user_id from session_entry through the full call chain so it reaches
create_conversation() and is written to the DB.

…gistration

- Relocate /authorize → /mcp/oauth/authorize and /token → /mcp/oauth/token
- Add /mcp/oauth/register (RFC 7591 dynamic client registration)
- Update /.well-known/oauth-authorization-server metadata to point to new paths
  and advertise registration_endpoint
- Add mcp_oauth_clients table to init.sql to persist registered clients
- Update mcp_auth.html: IDE config snippets no longer include hardcoded tokens;
  clients discover OAuth via well-known and handle auth automatically on first use.
  Manual bearer token moved to an Advanced collapsible section for legacy clients.

- Default services.mcp_server.enabled to false in base-config.yaml
- Read the flag in ChatApp and only register /mcp/* routes when enabled
- Move /mcp/auth and OAuth endpoints inside the mcp_enabled guard

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My

When an MCP client includes _meta.progressToken in a tools/call request
for archi_query, the server now streams intermediate status events over
the existing SSE connection using the MCP notifications/progress protocol:

  - thinking_start / thinking_end → "Thinking…" / "Thought: <preview>"
  - tool_start → "Calling <tool>(<args>)"
  - tool_output → "Got result from <tool>"
  - chunk → "Generating answer…"

This lets MCP hosts (VS Code, Cursor, Claude Desktop, Claude Code) show
live status while archi is working instead of blocking silently.

Clients that omit progressToken continue to use the existing single
blocking wrapper.chat() call, so backwards compatibility is preserved.
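The event-to-status mapping listed above, wrapped in an MCP progress notification, might look like the sketch below. The payload keys (`preview`, `tool`, `args`) and both function names are hypothetical, and the optional `message` field is assumed to follow the 2025-03-26 revision of the MCP specification.

```python
from typing import Any, Dict, Optional, Union


def status_text(event_type: str, payload: Dict[str, Any]) -> Optional[str]:
    # Translate a pipeline stream event into the status lines listed in the commit.
    if event_type == "thinking_start":
        return "Thinking…"
    if event_type == "thinking_end":
        return f"Thought: {payload.get('preview', '')}"
    if event_type == "tool_start":
        return f"Calling {payload.get('tool')}({payload.get('args', '')})"
    if event_type == "tool_output":
        return f"Got result from {payload.get('tool')}"
    if event_type == "chunk":
        return "Generating answer…"
    return None  # events without a status line are not forwarded


def progress_notification(
    progress_token: Union[str, int], progress: int, message: str
) -> Dict[str, Any]:
    # JSON-RPC notification: no "id" field, so the client sends no reply.
    return {
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": {
            "progressToken": progress_token,
            "progress": progress,
            "message": message,
        },
    }
```

A caller would emit one such notification per event over the open SSE stream, and only when the request carried a progressToken.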
Replace the two-path approach (stream vs invoke depending on progressToken)
with a single path that always calls wrapper.chat.stream(). notify() calls
are gated on whether a progressToken was provided, so progress events are
still only sent when the client requests them.

This ensures MCP responses use the same pipeline.stream() code path as the
web app, giving identical tool-call behaviour and model/provider selection.

chunk events carry the full text so far, not a new token. Appending each
one caused the response to repeat itself for every chunk emitted. Fix by
overwriting a single string on each chunk and preferring
final.response.answer (clean PipelineOutput) as the canonical answer.
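The difference between appending and overwriting cumulative chunks is shown in this small sketch. `final_answer` is a hypothetical helper; the `canonical` argument stands in for the final pipeline answer that the commit prefers when it is available.

```python
from typing import Iterable, Optional


def final_answer(chunks: Iterable[str], canonical: Optional[str] = None) -> str:
    latest = ""
    for text_so_far in chunks:
        # Each chunk already contains everything generated so far,
        # so keep only the newest one instead of concatenating.
        latest = text_so_far
    # Prefer the canonical final answer when the pipeline provides one.
    return canonical if canonical is not None else latest
```

For chunks `["The", "The answer", "The answer is 42"]`, joining them yields `"TheThe answerThe answer is 42"`, while `final_answer` returns `"The answer is 42"`.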
VS Code MCP extension (2025-03-26 spec) fetches this endpoint first to
discover which authorization server protects the resource. Without it the
client logs 'Failed to fetch resource metadata from all attempted URLs'
and cannot complete the OAuth PKCE flow, resulting in a 401 on /mcp/sse.

The new endpoint returns the resource URI and points to the same-origin
authorization server, matching what .well-known/oauth-authorization-server
already advertises.

The MCP spec requires an absolute URL in the 'endpoint' SSE event.
Sending a relative path (/mcp/messages?...) caused VS Code's MCP client
to fail resolving it, so the POST for 'initialize' never reached the
server — resulting in the 'Waiting for initialize' loop.

Also captures request.host_url before the generator (generators run
outside request context) and adds INFO/WARNING logs so session mismatches
and incoming method calls are visible in server logs.

…event

Two bugs caused 'Waiting for initialize' on authenticated deployments but not localhost:

1. /mcp/messages re-checked the Bearer token (auth_enabled=True).
   VS Code sends the token for the initial SSE connection but not for
   subsequent POST messages to the dynamically-discovered endpoint URL.
   The session_id is already sufficient proof of identity — it was only
   issued to the client that passed auth on /mcp/sse.

2. Behind a reverse proxy, request.host_url returns the internal address
   (e.g. http://127.0.0.1:PORT/) so the absolute endpoint URL pointed
   somewhere unreachable. Now uses X-Forwarded-Proto / X-Forwarded-Host
   headers when present, falling back to request.scheme / request.host.

…oint event

Behind a reverse proxy, X-Forwarded-Proto is often not set, so the
endpoint SSE event was advertising http:// instead of https://, causing
VS Code to POST to the wrong URL.

services.mcp_server.url is already in the config template for exactly
this purpose ('Public URL of the chat service that MCP clients will
connect to'). Pass it through register_mcp_sse(public_url=...) and use
it as the base for the /mcp/messages?session_id=... endpoint URL.

Priority order for URL resolution:
  1. public_url from config  (explicit, most reliable)
  2. X-Forwarded-Proto / X-Forwarded-Host headers  (proxy sets these)
  3. request.scheme / request.host  (direct / localhost fallback)
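The priority order above can be expressed as a small pure function. This is a sketch, not the helper from the PR: `resolve_base_url` and `messages_endpoint` are hypothetical names, and headers are passed as a plain dict, so lookups here are case-sensitive, unlike a real request's header object.

```python
from typing import Mapping, Optional


def resolve_base_url(
    public_url: Optional[str],
    headers: Mapping[str, str],
    scheme: str,
    host: str,
) -> str:
    # 1. Explicit public URL from config wins.
    if public_url:
        return public_url.rstrip("/")
    # 2. Proxy-supplied headers, each falling back to
    # 3. the values seen directly on the request.
    proto = headers.get("X-Forwarded-Proto") or scheme
    fwd_host = headers.get("X-Forwarded-Host") or host
    return f"{proto}://{fwd_host}"


def messages_endpoint(base_url: str, session_id: str) -> str:
    # Absolute URL advertised in the 'endpoint' SSE event.
    return f"{base_url}/mcp/messages?session_id={session_id}"
```

Because forwarded headers are client-controllable unless a trusted proxy overwrites them, the explicit config value is the safest source when it is set.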
…e comments

- /mcp/auth, OAuth metadata, and OAuth protected-resource endpoints now
  use _mcp_public_base_url() which reads services.mcp_server.url from
  config, falling back to X-Forwarded-* headers then request.host.
  This fixes https → http downgrade behind a reverse proxy.

- Removed redundant/verbose comments throughout mcp_sse.py; trimmed
  docstrings to essentials. No behavioral changes beyond the URL fix.

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My

…render

get_active_banner_alerts() opened a new psycopg2 connection on every
Flask template render (every page load), adding measurable latency.
Cache the result for 30 seconds; invalidate immediately on create/delete
so alert managers still see changes right away.
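A 30-second cache with explicit invalidation can be sketched as below. `TTLCache` is a hypothetical class for illustration, not the code in this PR; the loader stands in for the database query, and the injectable clock exists only to make the behaviour testable.

```python
import threading
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Caches one loader result for ttl_seconds; invalidate() forces a reload."""

    def __init__(
        self,
        loader: Callable[[], Any],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        with self._lock:
            now = self._clock()
            if self._expires_at is None or now >= self._expires_at:
                # Cache miss or expired entry: run the expensive query once.
                self._value = self._loader()
                self._expires_at = now + self._ttl
            return self._value

    def invalidate(self) -> None:
        # Call after create/delete so the next get() sees fresh data.
        with self._lock:
            self._expires_at = None
```

Holding the lock while the loader runs keeps concurrent requests from issuing duplicate queries, at the cost of briefly serialising them on a miss.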
- Add `config_name` and `client_timeout` as optional input parameters to
  the `archi_query` tool schema, matching what the UI sends
- Default `client_timeout` to 18000000ms (5 hours) instead of 120s so
  long-running queries don't time out prematurely
- Convert `client_timeout` from milliseconds (UI convention) to seconds
  before passing to wrapper.chat(), consistent with how app.py handles it
- Pass `config_name` (e.g. 'comp_ops') through to the chat call instead
  of always using None (active config)

https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My

…tests for MCP SSE tools

- Updated `_parse_metadata_query` to handle malformed queries gracefully by using fallback tokenization.
- Removed unnecessary list conversion in `iter_files` calls for performance optimization.
- Introduced comprehensive unit tests for MCP SSE tools, covering various functionalities including document listing, metadata searching, and agent specifications.

@hassan11196 hassan11196 force-pushed the mcp-with-on-demand-tokens branch from ab53fda to eaa6b61 Compare March 31, 2026 10:36

- Implement SSO and OAuth2 support for MCP servers with on-demand token management
- Add Mattermost SSO OAuth2 auth (mattermost_auth.py, mattermost_token_service.py)
- Add Mattermost RBAC context (rbac/mattermost_context.py, permission_enum.py)
- Extend check_tool_permission() to honour Mattermost context (tools/base.py)
- Add local-file retry logic from Mattermost branch (tools/local_files.py)
- Expand Mattermost service with SSO/auth config (mattermost.py, service_mattermost.py)
- Add mattermost_tokens DB table to init.sql
- Add PG env vars, ports, and data-volume fix to mattermost service in base-compose.yaml
- Add port/auth/sso subtree to mattermost in base-config.yaml
- Implement MattermostClient for REST API interactions
- Add SSL certificate support and enhance user ID resolution for Mattermost auth
- Add mcp_servers_config column to static_config table

@hassan11196 hassan11196 force-pushed the mcp-with-on-demand-tokens branch from eaa6b61 to 4a4710c Compare March 31, 2026 10:42
@hassan11196 hassan11196 changed the title from "Mcp with on demand tokens" to "Add Archi MCP server, authenticated external MCP support, and Mattermost SSO/RBAC integration" Mar 31, 2026

1 participant