Add Archi MCP server, authenticated external MCP support, and Mattermost SSO/RBAC integration #543
Draft
hassan11196 wants to merge 31 commits into archi-physics:main
Introduces archi_mcp/, a standalone Model Context Protocol server that exposes archi's RAG capabilities as MCP tools for VS Code, Cursor, and other MCP-compatible AI assistants.
Tools exposed:
- archi_query: ask a question via the active RAG pipeline
- archi_list_documents: browse the indexed knowledge base
- archi_get_document_content: read a specific indexed document
- archi_get_deployment_info: show active pipeline/model/retrieval config
- archi_list_agents: list available agent specs
- archi_health: verify the deployment is reachable
The server talks to MCP clients over the stdio transport and connects to a running archi chat service over HTTP; no archi internals are imported. Configuration is via the ARCHI_URL, ARCHI_API_KEY, and ARCHI_TIMEOUT environment variables.
pyproject.toml:
- adds [project.optional-dependencies] mcp = ["mcp>=1.0.0", ...]
- registers archi-mcp CLI entry point
- includes archi_mcp package in setuptools find
archi_mcp/README.md covers VS Code (.vscode/mcp.json), Cursor (~/.cursor/mcp.json), and generic stdio client setup.
- Add services.mcp_server block to base-config.yaml template (url, api_key, timeout; url defaults to chat_app hostname+port)
- Add --config flag to archi-mcp CLI to read settings from a rendered archi config file; env vars still take precedence
- Rewrite archi_mcp/README.md with server setup, archi config snippet, and VS Code / Cursor client setup instructions
Adds /mcp/sse and /mcp/messages routes directly to the Flask chat app
so MCP clients can connect with just a URL — no local archi-mcp command
or pip install required.
VS Code (.vscode/mcp.json):
{ "servers": { "archi": { "type": "sse", "url": "http://localhost:7861/mcp/sse" } } }
Cursor (~/.cursor/mcp.json):
{ "mcpServers": { "archi": { "url": "http://localhost:7861/mcp/sse" } } }
The SSE transport (JSON-RPC 2.0 over Server-Sent Events) is implemented
natively in Flask using thread-safe queues — no Starlette or extra
dependencies needed. Tool handlers call archi internals directly inside
the same process (chat wrapper, data viewer, agent spec loader).
Files changed:
- src/interfaces/chat_app/mcp_sse.py (new) — SSE transport + 6 tools
- src/interfaces/chat_app/app.py — register_mcp_sse() call in FlaskAppWrapper
- archi_mcp/README.md — document HTTP+SSE as the recommended option
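The queue-based SSE transport described in this commit can be sketched without Flask. This is a minimal illustration under stated assumptions, not the code in mcp_sse.py: the names SESSIONS, open_session, post_message, and sse_stream are hypothetical, as are the 15-second keepalive and the max_events parameter (added only so the generator can terminate in a demo).

```python
import json
import queue
import threading
import uuid

# Hypothetical in-memory session registry: session_id -> outbound queue.
SESSIONS = {}
SESSIONS_LOCK = threading.Lock()


def format_sse(event, data):
    """Serialize one Server-Sent Event frame."""
    return f"event: {event}\ndata: {data}\n\n"


def open_session():
    """Create a session and the queue that feeds its SSE stream."""
    session_id = uuid.uuid4().hex
    with SESSIONS_LOCK:
        SESSIONS[session_id] = queue.Queue()
    return session_id


def post_message(session_id, response):
    """Called by the POST handler: enqueue a JSON-RPC response for the stream."""
    with SESSIONS_LOCK:
        q = SESSIONS.get(session_id)
    if q is None:
        return False  # unknown session
    q.put(response)
    return True


def sse_stream(session_id, base_url, max_events=None):
    """Generator that a web framework could return as a text/event-stream body."""
    # The first frame tells the client where to POST its JSON-RPC messages.
    yield format_sse("endpoint", f"{base_url}/mcp/messages?session_id={session_id}")
    q = SESSIONS[session_id]
    sent = 0
    while max_events is None or sent < max_events:
        try:
            message = q.get(timeout=15)
        except queue.Empty:
            yield ": keepalive\n\n"  # SSE comment line keeps an idle connection open
            continue
        yield format_sse("message", json.dumps(message))
        sent += 1
```

The POST handler and the streaming generator run on different threads, which is why the only shared state is a lock-guarded dict of queue.Queue objects.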
The built-in /mcp/sse endpoint on the chat service makes the separate archi-mcp CLI redundant. Clients now connect with just a URL.
- Delete archi_mcp/ package (server, client, README, entry point)
- Remove [project.optional-dependencies].mcp from pyproject.toml
- Remove archi-mcp entry point script from pyproject.toml
- Remove archi_mcp* from package discovery
Users now visit /mcp/auth (browser) after SSO login to get a long-lived bearer token. The token is stored in the new mcp_tokens PostgreSQL table. VS Code / Cursor MCP configs must include the token as an Authorization header. The /mcp/sse and /mcp/messages endpoints enforce token validation when auth is enabled; unauthenticated clients receive a 401 JSON response with a login_url pointing to /mcp/auth.
Changes:
- init.sql: add mcp_tokens table (token, user_id, last_used_at, expires_at)
- mcp_sse.py: bearer-token validation (_validate_mcp_token), auth guard on both SSE and messages endpoints, updated session registry to carry user_id
- app.py: register /mcp/auth (GET) and /mcp/auth/regenerate (POST) routes, token DB helpers (_get_mcp_token, _create_mcp_token, _rotate_mcp_token), sso_callback now honours session['sso_next'] for post-login redirects
- templates/mcp_auth.html: token display page with VS Code / Cursor snippets and token rotation UI
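The bearer-token guard described above might look roughly like the following. It is a sketch only: the TOKENS dict is a hypothetical stand-in for the mcp_tokens table, validate_bearer and auth_guard are illustrative names (the PR's own helper is _validate_mcp_token, whose body is not shown here), and the "error" key in the 401 body is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for the mcp_tokens table: token -> row.
TOKENS = {
    "tok-valid": {
        "user_id": "alice",
        "expires_at": datetime.now(timezone.utc) + timedelta(days=30),
    },
    "tok-expired": {
        "user_id": "bob",
        "expires_at": datetime.now(timezone.utc) - timedelta(days=1),
    },
}


def validate_bearer(authorization_header, now=None):
    """Return the user_id for a valid 'Bearer <token>' header, else None."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    row = TOKENS.get(token.strip())
    if row is None:
        return None
    now = now or datetime.now(timezone.utc)
    if row["expires_at"] is not None and row["expires_at"] <= now:
        return None  # expired tokens are rejected
    return row["user_id"]


def auth_guard(authorization_header, base_url):
    """Return (status, body): 200 with the user, or 401 with a login_url."""
    user_id = validate_bearer(authorization_header)
    if user_id is None:
        return 401, {"error": "unauthorized", "login_url": f"{base_url}/mcp/auth"}
    return 200, {"user_id": user_id}
```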
Implements the standard OAuth2 authorization code flow with PKCE so that MCP clients (Claude Desktop, VS Code, etc.) can authenticate automatically without manual token copy-paste.
New endpoints:
- GET /.well-known/oauth-authorization-server – RFC 8414 discovery
- GET /authorize – PKCE authorization (redirects to SSO if needed)
- POST /token – code → bearer token exchange
New DB table: mcp_auth_codes (short-lived, single-use PKCE codes).
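For readers unfamiliar with PKCE, the S256 check that a /token endpoint performs can be sketched with the standard library alone. The function names below are hypothetical and this is not the PR's implementation; it only illustrates the RFC 7636 transformation (base64url of the SHA-256 digest, without padding).

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier():
    """Client side: random URL-safe verifier (43 characters for 32 bytes)."""
    return secrets.token_urlsafe(32)


def s256_challenge(code_verifier):
    """code_challenge = BASE64URL(SHA256(code_verifier)), '=' padding removed."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(stored_challenge, code_verifier):
    """Server side, at token exchange: compare in constant time."""
    return hmac.compare_digest(stored_challenge, s256_challenge(code_verifier))
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.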
- Security: use atomic UPDATE...RETURNING to prevent auth-code replay attacks
- Security: validate redirect_uri in /token matches the one from /authorize
- Efficiency: inline token fetch/create inside existing DB connection (1 conn instead of 2-3)
- Efficiency: opportunistically delete expired mcp_auth_codes rows on each token exchange
- Bug fix: use urlparse/urlunparse for redirect_uri assembly (handles trailing ? edge case)
- Cleanup: move secrets/hashlib/base64/urlencode to module-level imports
- Cleanup: remove dead variable challenge_method
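The redirect_uri fix in the list above is easy to illustrate. The helper below is a hypothetical sketch of the urlparse/urlunparse approach, not the PR's code; it appends the authorization code and state while preserving any query string the client already supplied.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_redirect(redirect_uri, params):
    """Append params to redirect_uri, keeping any existing query parameters."""
    parts = urlparse(redirect_uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunparse(parts._replace(query=urlencode(query)))
```

With naive string concatenation, a redirect_uri that already ends in "?" would produce "??code=..."; parsing and reassembling the URL avoids that and also handles URIs that carry their own parameters.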
…in page
When an MCP client hits /authorize and the user isn't logged in, directly invoke self.oauth.sso.authorize_redirect() (the same call the login handler uses) instead of redirecting to /login?method=sso as an intermediate step. The existing sso_next session key still brings the user back to /authorize after the SSO callback completes, so the rest of the PKCE flow is unchanged.
ChatWrapper.__call__ expects message as [["User", content]] (matching the JS client's history.slice(-1) format). _tool_query was passing a bare string, causing `sender, content = tuple(message[0])` to fail with "not enough values to unpack" since message[0] was a single character.
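The failure mode is easy to reproduce in isolation. In this sketch, unpack_last_message mirrors only the unpacking line quoted above and as_history is a hypothetical helper; neither is taken from the archi code base.

```python
def unpack_last_message(message):
    """Mirror the quoted unpacking: message is a list of [sender, content]."""
    sender, content = tuple(message[0])
    return sender, content


def as_history(text, sender="User"):
    """Wrap a bare string in the [[sender, content]] shape the wrapper expects."""
    return [[sender, text]]


def demo():
    """Return the outcome for the wrapped and the bare-string inputs."""
    ok = unpack_last_message(as_history("What is archi?"))
    try:
        unpack_last_message("What is archi?")  # message[0] is the character "W"
        failed = False
    except ValueError:
        failed = True
    return ok, failed
```

Indexing a string yields a one-character string, so tuple(message[0]) has a single element and the two-name unpacking raises ValueError.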
- mcp_auth.html: add two new tabs alongside VS Code and Cursor
  - Claude Desktop: shows claude_desktop_config.json snippet for macOS/Windows
  - Claude Code: shows `claude mcp add` CLI command + .mcp.json project config
- mcp_sse.py: update module docstring with Claude Desktop and Claude Code examples
Auth is now handled via SSO-issued bearer tokens (mcp_tokens table) and the OAuth2 PKCE flow. The static api_key field was never read by any code.
…on_metadata
user_id was extracted from the bearer token and stored in the session, but was never passed through _dispatch → _call_tool → _tool_query → wrapper.chat, causing conversation_metadata.user_id to always be NULL for MCP requests. Thread user_id from session_entry through the full call chain so it reaches create_conversation() and is written to the DB.
…gistration
- Relocate /authorize → /mcp/oauth/authorize and /token → /mcp/oauth/token
- Add /mcp/oauth/register (RFC 7591 dynamic client registration)
- Update /.well-known/oauth-authorization-server metadata to point to new paths and advertise registration_endpoint
- Add mcp_oauth_clients table to init.sql to persist registered clients
- Update mcp_auth.html: IDE config snippets no longer include hardcoded tokens; clients discover OAuth via well-known and handle auth automatically on first use. Manual bearer token moved to an Advanced collapsible section for legacy clients.
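After the relocation, the discovery document would advertise the new paths. The sketch below uses the RFC 8414 field names; the function name is hypothetical, and the values listed for response types, grant types, and challenge methods are assumptions rather than what the PR actually returns.

```python
def authorization_server_metadata(base_url):
    """Build an RFC 8414 style metadata document for the relocated endpoints."""
    base = base_url.rstrip("/")
    return {
        "issuer": base,
        "authorization_endpoint": f"{base}/mcp/oauth/authorize",
        "token_endpoint": f"{base}/mcp/oauth/token",
        "registration_endpoint": f"{base}/mcp/oauth/register",
        # Assumed values: authorization code flow with S256 PKCE only.
        "response_types_supported": ["code"],
        "grant_types_supported": ["authorization_code"],
        "code_challenge_methods_supported": ["S256"],
    }
```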
- Default services.mcp_server.enabled to false in base-config.yaml
- Read the flag in ChatApp and only register /mcp/* routes when enabled
- Move /mcp/auth and OAuth endpoints inside the mcp_enabled guard
https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
When an MCP client includes _meta.progressToken in a tools/call request for archi_query, the server now streams intermediate status events over the existing SSE connection using the MCP notifications/progress protocol:
- thinking_start / thinking_end → "Thinking…" / "Thought: <preview>"
- tool_start → "Calling <tool>(<args>)"
- tool_output → "Got result from <tool>"
- chunk → "Generating answer…"
This lets MCP hosts (VS Code, Cursor, Claude Desktop, Claude Code) show live status while archi is working instead of blocking silently. Clients that omit progressToken continue to use the existing single blocking wrapper.chat() call, so backwards compatibility is preserved.
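The event-to-status mapping above, and the gating on progressToken, can be sketched as two small functions. The function names and the payload keys ("preview", "tool", "args") are hypothetical; the notification shape follows the MCP notifications/progress message, with the optional message field assumed to be supported by the client.

```python
def status_for(event_type, payload):
    """Translate one pipeline event into a status line (None means skip)."""
    if event_type == "thinking_start":
        return "Thinking…"
    if event_type == "thinking_end":
        return f"Thought: {payload.get('preview', '')}"
    if event_type == "tool_start":
        return f"Calling {payload.get('tool')}({payload.get('args', '')})"
    if event_type == "tool_output":
        return f"Got result from {payload.get('tool')}"
    if event_type == "chunk":
        return "Generating answer…"
    return None


def progress_notification(progress_token, progress, message):
    """Build the JSON-RPC notification; None when the client sent no token."""
    if progress_token is None:
        return None
    return {
        "jsonrpc": "2.0",
        "method": "notifications/progress",
        "params": {
            "progressToken": progress_token,
            "progress": progress,
            "message": message,
        },
    }
```

Because a notification has no id, the client never replies to it; the final tools/call result still arrives as an ordinary JSON-RPC response.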
Replace the two-path approach (stream vs invoke depending on progressToken) with a single path that always calls wrapper.chat.stream(). notify() calls are gated on whether a progressToken was provided, so progress events are still only sent when the client requests them. This ensures MCP responses use the same pipeline.stream() code path as the web app, giving identical tool-call behaviour and model/provider selection.
chunk events carry the full text so far, not a new token. Appending each one caused the response to repeat itself for every chunk emitted. Fix by overwriting a single string on each chunk and preferring final.response.answer (clean PipelineOutput) as the canonical answer.
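The difference between appending and overwriting is shown below with a made-up event stream. The event dict shape ("type", "content", "answer") is an assumption for illustration and is not the actual pipeline event format.

```python
def collect_answer_buggy(events):
    """Old behaviour: append every chunk, which repeats the cumulative text."""
    answer = ""
    for event in events:
        if event["type"] == "chunk":
            answer += event["content"]
    return answer


def collect_answer(events):
    """Fixed behaviour: keep only the latest chunk, prefer the final answer."""
    answer = ""
    for event in events:
        if event["type"] == "chunk":
            answer = event["content"]  # each chunk already holds the full text so far
        elif event["type"] == "final":
            answer = event.get("answer") or answer
    return answer


EVENTS = [
    {"type": "chunk", "content": "Hel"},
    {"type": "chunk", "content": "Hello"},
    {"type": "chunk", "content": "Hello wor"},
    {"type": "final", "answer": "Hello world"},
]
```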
VS Code MCP extension (2025-03-26 spec) fetches this endpoint first to discover which authorization server protects the resource. Without it the client logs 'Failed to fetch resource metadata from all attempted URLs' and cannot complete the OAuth PKCE flow, resulting in a 401 on /mcp/sse. The new endpoint returns the resource URI and points to the same-origin authorization server, matching what .well-known/oauth-authorization-server already advertises.
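A protected-resource metadata document of the kind described here is small. The sketch below uses the RFC 9728 field names "resource" and "authorization_servers"; the function name is hypothetical, and the choice of /mcp/sse as the resource URI is an assumption, since the commit message does not state the exact value.

```python
def protected_resource_metadata(base_url):
    """Point clients at the same-origin authorization server."""
    base = base_url.rstrip("/")
    return {
        "resource": f"{base}/mcp/sse",      # assumed resource URI
        "authorization_servers": [base],    # same origin as the chat service
    }
```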
The MCP spec requires an absolute URL in the 'endpoint' SSE event. Sending a relative path (/mcp/messages?...) caused VS Code's MCP client to fail resolving it, so the POST for 'initialize' never reached the server — resulting in the 'Waiting for initialize' loop. Also captures request.host_url before the generator (generators run outside request context) and adds INFO/WARNING logs so session mismatches and incoming method calls are visible in server logs.
…event
Two bugs caused 'Waiting for initialize' on authenticated deployments but not localhost:
1. /mcp/messages re-checked the Bearer token (auth_enabled=True). VS Code sends the token for the initial SSE connection but not for subsequent POST messages to the dynamically-discovered endpoint URL. The session_id is already sufficient proof of identity: it was only issued to the client that passed auth on /mcp/sse.
2. Behind a reverse proxy, request.host_url returns the internal address (e.g. http://127.0.0.1:PORT/) so the absolute endpoint URL pointed somewhere unreachable. Now uses X-Forwarded-Proto / X-Forwarded-Host headers when present, falling back to request.scheme / request.host.
…oint event
Behind a reverse proxy, X-Forwarded-Proto is often not set, so the
endpoint SSE event was advertising http:// instead of https://, causing
VS Code to POST to the wrong URL.
services.mcp_server.url is already in the config template for exactly
this purpose ('Public URL of the chat service that MCP clients will
connect to'). Pass it through register_mcp_sse(public_url=...) and use
it as the base for the /mcp/messages?session_id=... endpoint URL.
Priority order for URL resolution:
1. public_url from config (explicit, most reliable)
2. X-Forwarded-Proto / X-Forwarded-Host headers (proxy sets these)
3. request.scheme / request.host (direct / localhost fallback)
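The priority order above can be expressed as a single function. This is a sketch with a hypothetical name and signature; it takes a plain dict of headers (so lookups are case-sensitive here, unlike Flask's request.headers), and archi.example.org in the usage note is a placeholder host.

```python
def resolve_base_url(public_url, headers, scheme, host):
    """Pick the externally reachable base URL in the documented priority order."""
    # 1. Explicit public URL from config wins.
    if public_url:
        return public_url.rstrip("/")
    # 2. Proxy-supplied headers, each falling back to the direct request value.
    fwd_proto = headers.get("X-Forwarded-Proto")
    fwd_host = headers.get("X-Forwarded-Host")
    if fwd_proto or fwd_host:
        return f"{fwd_proto or scheme}://{fwd_host or host}"
    # 3. Direct connection, e.g. localhost during development.
    return f"{scheme}://{host}"
```

For example, resolve_base_url(None, {"X-Forwarded-Host": "archi.example.org"}, "http", "127.0.0.1:7861") still yields an http:// URL, which is the case that motivated reading the public URL from config first.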
…e comments
- /mcp/auth, OAuth metadata, and OAuth protected-resource endpoints now use _mcp_public_base_url() which reads services.mcp_server.url from config, falling back to X-Forwarded-* headers then request.host. This fixes https → http downgrade behind a reverse proxy.
- Removed redundant/verbose comments throughout mcp_sse.py; trimmed docstrings to essentials. No behavioral changes beyond the URL fix.
https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
…render
get_active_banner_alerts() opened a new psycopg2 connection on every Flask template render (every page load), adding measurable latency. Cache the result for 30 seconds; invalidate immediately on create/delete so alert managers still see changes right away.
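A cache with a 30-second lifetime and explicit invalidation could be sketched as follows. The TTLCache class and its injectable clock are hypothetical and not taken from the PR; the loader stands in for the database query behind get_active_banner_alerts().

```python
import threading
import time


class TTLCache:
    """Cache one loader result for ttl_seconds; invalidate() forces a reload."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = None  # None means "nothing cached"

    def get(self):
        with self._lock:
            now = self._clock()
            if self._expires_at is None or now >= self._expires_at:
                self._value = self._loader()  # the expensive call, e.g. a DB query
                self._expires_at = now + self._ttl
            return self._value

    def invalidate(self):
        """Call after create/delete so the next get() sees fresh data."""
        with self._lock:
            self._expires_at = None
```

Using a monotonic clock avoids surprises if the wall clock is adjusted while the service is running.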
- Add `config_name` and `client_timeout` as optional input parameters to the `archi_query` tool schema, matching what the UI sends
- Default `client_timeout` to 18000000ms (5 hours) instead of 120s so long-running queries don't time out prematurely
- Convert `client_timeout` from milliseconds (UI convention) to seconds before passing to wrapper.chat(), consistent with how app.py handles it
- Pass `config_name` (e.g. 'comp_ops') through to the chat call instead of always using None (active config)
https://claude.ai/code/session_01XTALCGRDaVpNPmqFbRD8My
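The unit conversion is simple but worth spelling out, since the default is given in milliseconds: 18000000 ms / 1000 = 18000 s, and 18000 s / 3600 = 5 h. The helper name below is hypothetical.

```python
DEFAULT_CLIENT_TIMEOUT_MS = 18_000_000  # 5 hours, the default named above


def timeout_seconds(client_timeout_ms=None):
    """Convert the UI's millisecond timeout to the seconds the chat call takes."""
    if client_timeout_ms is None:
        client_timeout_ms = DEFAULT_CLIENT_TIMEOUT_MS
    return client_timeout_ms / 1000.0
```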
…tests for MCP SSE tools
- Updated `_parse_metadata_query` to handle malformed queries gracefully by using fallback tokenization.
- Removed unnecessary list conversion in `iter_files` calls for performance optimization.
- Introduced comprehensive unit tests for MCP SSE tools, covering various functionalities including document listing, metadata searching, and agent specifications.
ab53fda to eaa6b61
- Implement SSO and OAuth2 support for MCP servers with on-demand token management
- Add Mattermost SSO OAuth2 auth (mattermost_auth.py, mattermost_token_service.py)
- Add Mattermost RBAC context (rbac/mattermost_context.py, permission_enum.py)
- Extend check_tool_permission() to honour Mattermost context (tools/base.py)
- Add local-file retry logic from Mattermost branch (tools/local_files.py)
- Expand Mattermost service with SSO/auth config (mattermost.py, service_mattermost.py)
- Add mattermost_tokens DB table to init.sql
- Add PG env vars, ports, and data-volume fix to mattermost service in base-compose.yaml
- Add port/auth/sso subtree to mattermost in base-config.yaml
- Implement MattermostClient for REST API interactions
- Add SSL certificate support and enhance user ID resolution for Mattermost auth
- Add mcp_servers_config column to static_config table
eaa6b61 to 4a4710c
Summary
This PR turns the chat service into a first-class MCP server and adds end-to-end OAuth support for MCP clients and SSO-protected upstream MCP servers. It also significantly expands Mattermost integration with RBAC-aware auth, persistent conversation continuity, and webhook + polling execution paths.
What changed
1) Built-in MCP server in chat service
- /mcp/sse (opt-in via services.mcp_server.enabled).
- /.well-known/oauth-authorization-server
- /.well-known/oauth-protected-resource
- /mcp/oauth/register
- /mcp/oauth/authorize
- /mcp/oauth/token
- /mcp/auth, /mcp/auth/regenerate (manual token page)
2) OAuth + token lifecycle for MCP access
- … (sso_tokens)
- … (mcp_oauth_clients)
- … (mcp_oauth_tokens)
- … (mcp_tokens)
- … (mcp_auth_codes)
- … MCPOAuthService and SSOTokenService for discovery, registration, token exchange, refresh, and storage.
3) On-demand MCP tool auth for users
- … user_id and can rebuild MCP tools per request when user context is present.
4) Mattermost integration overhaul
- MattermostClient
- ThreadContextManager
- MattermostEventHandler
- MattermostAuthManager
- MattermostTokenService
- … mattermost:access and thread-safe Mattermost request context propagation.
5) Config/template/docs updates
- … services.mcp_server
- … sso_auth per-server toggle
6) Data/schema and SQL changes
7) Reliability and UX improvements
Testing
- … tests/unit/test_mcp_sse_tools.py covering MCP SSE tool behavior and formatting.
Notes / migration considerations
- … init.sql (or equivalent migration) before use.
- … services.mcp_server.enabled: true and provide a reachable public URL where needed.
- … sso_auth: true and complete OAuth authorization for users.