@akkomar commented Jan 8, 2026

@edugfilho This is a PoC, and I'd like to get an early opinion. It implements a simple MCP server for Glean Dictionary. It's not fully compliant with https://github.com/modelcontextprotocol, but it supports JSON-RPC over POST, which is enough for Claude Code and other tools. Available tools: list_apps, get_app, search_metrics, get_metric, get_ping.
If we decide to productionize this, we might want to use the TypeScript SDK. Continuing with a simple JavaScript implementation like this one is probably fine too.
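
For reviewers less familiar with the protocol, here is a rough sketch of what "JSON-RPC over POST" dispatch can look like. It is not the code in this PR: the `handleRpc` function and the stub tool registry are hypothetical, and only the `tools/list` / `tools/call` method names and the JSON-RPC error codes come from the public specs.

```typescript
// Sketch only, not the PR's implementation. Handlers below are hypothetical stubs.
type RpcId = number | string | null;

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: RpcId;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: RpcId; result: unknown }
  | { jsonrpc: "2.0"; id: RpcId; error: { code: number; message: string } };

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Stub registry; a real server would back these with Glean Dictionary data.
const tools = new Map<string, ToolHandler>([
  ["list_apps", () => [{ app_name: "firefox_desktop" }]],
  ["get_app", (args) => ({ app_name: args.app_name })],
]);

function handleRpc(req: JsonRpcRequest): JsonRpcResponse {
  const { id, method, params } = req;

  if (method === "tools/list") {
    const listed = Array.from(tools.keys()).map((name) => ({ name }));
    return { jsonrpc: "2.0", id, result: { tools: listed } };
  }

  if (method === "tools/call") {
    const name = params?.name ?? "";
    const handler = tools.get(name);
    if (!handler) {
      // -32602 is the JSON-RPC "Invalid params" code.
      return { jsonrpc: "2.0", id, error: { code: -32602, message: `Unknown tool: ${name}` } };
    }
    const data = handler(params?.arguments ?? {});
    // MCP tool results carry a content array; the payload is serialized as text here.
    return {
      jsonrpc: "2.0",
      id,
      result: { content: [{ type: "text", text: JSON.stringify(data) }] },
    };
  }

  // -32601 is the JSON-RPC "Method not found" code.
  return { jsonrpc: "2.0", id, error: { code: -32601, message: `Method not found: ${method}` } };
}
```

A dispatcher this small is part of why staying with plain JavaScript seems viable; the SDK would mostly buy us full spec compliance.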

Why?

While testing SQL generation with AI agents (e.g. in https://github.com/akkomar/mozdata-claude-plugin), we found that giving the agent access to Glean metadata is useful.

Glean Dictionary is a Single Page Application with metrics loaded dynamically via JavaScript. This means that AI web fetchers only see an empty HTML shell, not the actual content. The two workarounds so far have been:

  1. Use https://probeinfo.telemetry.mozilla.org/ directly (which requires you to know the API in order to write instructions, and doesn't contain annotations)
  2. Use a custom MCP server that does the same thing: https://github.com/irrationalagent/glean_mcp, built by @irrationalagent (also with no access to annotations)

Given that it's low effort, I think it would make sense to have an official MCP endpoint in Glean Dictionary with support for all the available telemetry metadata.

Local testing

First start the server:

npm install
npx netlify dev

Then add it to Claude:

claude mcp add --transport http glean-dictionary-local http://localhost:8888/mcp
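
The endpoint can also be poked at without Claude. Below is a minimal sketch of a client that POSTs a `tools/call` request; `callTool` and `FetchLike` are hypothetical names, and the request shape follows the MCP spec, which this PoC may or may not accept exactly as written.

```typescript
// Minimal fetch-like signature so the sketch can run against a fake in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

let nextId = 1;

// Hypothetical helper: sends one JSON-RPC "tools/call" request and returns its result.
async function callTool(
  fetchImpl: FetchLike,
  endpoint: string,
  name: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  const res = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json", Accept: "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: nextId++,
      method: "tools/call",
      params: { name, arguments: args },
    }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);

  const reply = await res.json();
  // JSON-RPC reports failures in an "error" member rather than via HTTP status.
  if (reply.error) {
    throw new Error(`JSON-RPC error ${reply.error.code}: ${reply.error.message}`);
  }
  return reply.result;
}

// Usage against the local dev server started above (Node 18+ ships a global fetch):
// const apps = await callTool(fetch, "http://localhost:8888/mcp", "list_apps", {});
```

Passing `fetch` in as a parameter is only there to keep the sketch testable offline.
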
Then in Claude we can search metrics:
❯ What metrics does Firefox Desktop collect about search?

⏺ I'll search for search-related metrics in Firefox Desktop using the Glean dictionary.

⏺ glean-dictionary-local - search_metrics (MCP)(app_name: "firefox_desktop", query: "search", limit: 50)
  ⎿  {
       "metrics": [
         {
     … +534 lines (ctrl+o to expand)

⏺ Let me get more search metrics to provide a complete picture.

⏺ glean-dictionary-local - search_metrics (MCP)(app_name: "firefox_desktop", query: "search", limit: 50, offset: 50)
  ⎿  {
       "metrics": [
         {
     … +530 lines (ctrl+o to expand)

⏺ glean-dictionary-local - search_metrics (MCP)(app_name: "firefox_desktop", query: "serp", limit: 50)
  ⎿  {
       "metrics": [
         {
     … +425 lines (ctrl+o to expand)

⏺ Firefox Desktop collects extensive search telemetry. Here's a summary organized by category:

  1. Search Access Point (SAP) Counts

  Tracks where users initiate searches from:
  - sap.counts - Event recording when a search is initiated from any search access point
  - sap.search_form_counts - Visits to search engine homepages via SAP
  - Entry points tracked: urlbar, searchbar, context menu, about:home, about:newtab, webextensions, system (command line)

  2. SERP (Search Engine Results Page) Telemetry

  - serp.impression - When a SERP is shown to a user
  - serp.engagement - User actions on a SERP
  - serp.abandonment - When user leaves SERP without engaging
  - serp.ad_impression - When ads are detected on a SERP
  - serp.categorization - High-level topic categorization (sports, travel, etc.)

  3. Ad Tracking (by search origin)

  browser.search.withads.* - Counts SERPs displaying ads
  browser.search.adclicks.* - Clicks on ads on SERPs
  browser.search.content.* - In-content search counts

  All broken down by origin: about_home, about_newtab, urlbar, searchbar, contextmenu, webextension, system, etc.

  4. Default Engine Metrics

  - search.engine.default.display_name - Default engine name
  - search.engine.default.engine_id - Engine telemetry ID
  - search.engine.default.provider_id - Engine provider identifier
  - search.engine.default.partner_code - Partner code if applicable
  - search.engine.default.changed - Event when default engine changes
  - Separate metrics exist for private browsing engine (search.engine.private.*)

  5. Newtab Search Metrics

  - newtab.search.issued - Search initiated from newtab
  - newtab.search.enabled - Whether search input is enabled
  - newtab.search.ad.click / newtab.search.ad.impression - Ad interactions from newtab searches
  - newtab.trending_search_* - Trending search widget interactions

  6. Add-on Search Detection

  - addons.search_detection.etld_change_* - Detects when search queries are redirected by extensions

  7. Activity Stream Search Events

  - activity_stream.event_search - User interactions on Activity Stream
  - activity_stream.event_search_handoff - Search handoff events

  8. UI/Navigation Metrics

  - browser.engagement.navigation.searchbar - URI loads from searchbar
  - browser.engagement.navigation.urlbar_searchmode - Loads from urlbar search mode
  - urlbar.persistedsearchterms.* - Search term persistence in urlbar

  The key format for many metrics is <provider>:<tagged|organic> or <provider>:[tagged|tagged-follow-on|organic]:[code|other|none], indicating the search provider and whether it was a tagged (partner) or organic search.
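
The label format Claude describes at the end of that summary is the kind of detail a downstream query or script has to handle. A small sketch of parsing it is shown below, assuming the format is exactly as stated in the transcript; `parseSearchLabel` is a hypothetical helper, and the third segment is treated as an opaque string since the transcript does not enumerate real codes.

```typescript
type SearchKind = "tagged" | "tagged-follow-on" | "organic";

interface SearchLabel {
  provider: string;
  kind: SearchKind;
  code?: string; // third segment, present only in the three-part form
}

const KINDS: readonly string[] = ["tagged", "tagged-follow-on", "organic"];

// Hypothetical helper: parses "<provider>:<kind>" or "<provider>:<kind>:<code>".
// Returns null for anything that does not match either form.
function parseSearchLabel(label: string): SearchLabel | null {
  const parts = label.split(":");
  if (parts.length < 2 || parts.length > 3) return null;

  const [provider, kind, code] = parts;
  if (!provider || kind === undefined || !KINDS.includes(kind)) return null;

  const parsed: SearchLabel = { provider, kind: kind as SearchKind };
  if (code !== undefined) parsed.code = code;
  return parsed;
}

// Example labels are made up for illustration.
const twoPart = parseSearchLabel("google:organic");
const threePart = parseSearchLabel("google:tagged-follow-on:none");
```

The parser is deliberately lenient: it accepts `tagged-follow-on` in the two-part form as well, even though the summary only lists it for the three-part form.
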
We can also access ping metadata, including [annotations](https://github.com/mozilla/glean-annotations/blob/main/annotations/firefox_desktop/pings/messaging-system/README.md):
❯ show me information about messaging-system ping from firefox desktop

⏺ glean-dictionary-local - get_ping (MCP)(app_name: "firefox_desktop", ping_name: "messaging-system")
  ⎿  ⚠ Large MCP response (~12.7k tokens), this can fill up context quickly
  ⎿  {
       "app_name": "firefox_desktop",
       "name": "messaging-system",
     … +479 lines (ctrl+o to expand)

⏺ Messaging System Ping - Firefox Desktop

  The messaging-system ping captures single events triggered by Firefox's messaging system, including About:Welcome, ASRouter (Activity Stream Router), and other messaging surfaces.

  Key Characteristics
  ┌────────────────────┬──────────────────────────────────────────────────────┐
  │      Property      │                        Value                         │
  ├────────────────────┼──────────────────────────────────────────────────────┤
  │ Includes client_id │ No (uses separate messaging_system.client_id)        │
  ├────────────────────┼──────────────────────────────────────────────────────┤
  │ Send if empty      │ No                                                   │
  ├────────────────────┼──────────────────────────────────────────────────────┤
  │ Bug                │ https://bugzilla.mozilla.org/show_bug.cgi?id=1825863 │
  └────────────────────┴──────────────────────────────────────────────────────┘
  Important Note on client_id

  This ping uses messaging_system_client_id (usually a copy of Legacy Telemetry client ID), not the Glean client ID. When joining with other tables, use metrics.uuid.legacy_telemetry_client_id.

  ---
  Metrics Included (36 total)

  Core Event Data:
  - messaging_system.event - Type of event (IMPRESSION, CLICK_BUTTON, SESSION_END, etc.)
  - messaging_system.action - Action type (IMPRESSION or DISMISS)
  - messaging_system.message_id - Message identifier in Activity Stream Router
  - messaging_system.source - Interaction source (frecent_links, newtab, CFR)
  - messaging_system.ping_type - Event type (cfr, onboarding)

  Context Information:
  - messaging_system.page - Page context (about:newtab, about:home, about:welcome, unknown)
  - messaging_system.event_context - Stringified JSON with additional context
  - messaging_system.event_page, event_reason, event_source
  - messaging_system.event_screen_id, event_screen_index, event_screen_family, event_screen_initials

  Attribution Data:
  - messaging_system.attribution.source, .medium, .campaign, .content
  - messaging_system.attribution.experiment, .variation, .ua
  - messaging_system.attribution.dlsource, .dltoken, .msstoresignedin

  Identifiers:
  - messaging_system.client_id - Telemetry client ID (may be canary ID near data upload pref changes)
  - messaging_system.browser_session_id - Legacy Telemetry browser session ID
  - messaging_system.impression_id - Unique impression identifier
  - messaging_system.bucket_id - Shared name for grouped messages
  - messaging_system.locale - App locale (BCP47)

  CFR-specific:
  - messaging_system.cfr_action - Contextual Feature Recommender action

  ---
  Message Types Captured

  1. CFR (Contextual Feature Recommendations) - Doorhanger recommendations
  2. Moments Pages - Homepage override messages
  3. InfoBar - Messages between toolbars and content
  4. Spotlight - Modal dialog messages
  5. Toast Notifications - OS-level notifications
  6. Feature Callouts - In-page callout messages
  7. Onboarding (about:welcome) - First-run experience screens
  8. Toolbar Badges - Browser toolbar notifications

  Common Event Types
  ┌────────────────────────────┬──────────────────────────┐
  │           Event            │       Description        │
  ├────────────────────────────┼──────────────────────────┤
  │ IMPRESSION                 │ Message was displayed    │
  ├────────────────────────────┼──────────────────────────┤
  │ CLICK / CLICK_BUTTON       │ User clicked             │
  ├────────────────────────────┼──────────────────────────┤
  │ DISMISS                    │ User dismissed           │
  ├────────────────────────────┼──────────────────────────┤
  │ INSTALL / PIN / BLOCK      │ CFR-specific actions     │
  ├────────────────────────────┼──────────────────────────┤
  │ SESSION_END                │ Onboarding session ended │
  ├────────────────────────────┼──────────────────────────┤
  │ TARGETING_EXPRESSION_ERROR │ JEXL targeting error     │
  └────────────────────────────┴──────────────────────────┘
  Privacy Handling

  - Release channel: Excludes client_id and uses bucket_id instead of message_id
  - Private browsing: Same as release channel
  - Prerelease/experiments: Includes full client_id and message_id
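
In the first transcript, Claude pages through search_metrics using limit and offset. For a non-agent client, the same loop could be sketched as follows. Two things here are assumptions rather than confirmed behaviour of this PR: that the response has a top-level `metrics` array (it does in the truncated output above), and that a page shorter than `limit` means there are no more results. `collectMetrics` and `SearchFn` are hypothetical names.

```typescript
interface MetricsPage {
  metrics: unknown[];
}

// Any function that performs one search_metrics call, e.g. a wrapper around an MCP client.
type SearchFn = (args: {
  app_name: string;
  query: string;
  limit: number;
  offset: number;
}) => Promise<MetricsPage>;

// Hypothetical helper: fetches pages until a short page is seen or maxPages is reached.
async function collectMetrics(
  search: SearchFn,
  appName: string,
  query: string,
  limit = 50,
  maxPages = 10,
): Promise<unknown[]> {
  const all: unknown[] = [];
  for (let page = 0; page < maxPages; page++) {
    const { metrics } = await search({
      app_name: appName,
      query,
      limit,
      offset: page * limit,
    });
    all.push(...metrics);
    // Assumed end-of-results signal: the server returned fewer items than requested.
    if (metrics.length < limit) break;
  }
  return all;
}
```

The `maxPages` cap is there because responses are large (the get_ping call above already triggers a ~12.7k token warning), so an unbounded loop could fill an agent's context quickly.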

@akkomar requested a review from @edugfilho on January 8, 2026 13:03