Skip to content

feature: telemetry adapter abstraction (TelemetryPort + NoopAdapter + factory)#77

Open
jamby77 wants to merge 14 commits intomasterfrom
feature/71-telemetry
Open

feature: telemetry adapter abstraction (TelemetryPort + NoopAdapter + factory)#77
jamby77 wants to merge 14 commits intomasterfrom
feature/71-telemetry

Conversation

@jamby77
Copy link
Copy Markdown
Collaborator

@jamby77 jamby77 commented Apr 2, 2026

Summary

  • Introduces TelemetryPort interface with capture(), identify(), shutdown() — matching the StoragePort pattern
  • NoopTelemetryAdapter for opt-out and misconfiguration fallback
  • TelemetryClientFactory reads TELEMETRY_PROVIDER and BETTERDB_TELEMETRY from NestJS ConfigService
  • Refactors UsageTelemetryService to inject TELEMETRY_CLIENT token instead of owning HTTP logic
  • TelemetryModule wires factory via useFactory provider, calls shutdown() on destroy
  • Adds TELEMETRY_PROVIDER, POSTHOG_API_KEY, POSTHOG_HOST to env schema
  • BETTERDB_TELEMETRY=false → factory returns NoopAdapter regardless of provider
  • Misconfiguration logs warning and falls back to NoopAdapter (never crashes)

Test plan

  • Unit tests for NoopTelemetryAdapter (3 tests)
  • Unit tests for TelemetryClientFactory (4 tests — default, noop, opt-out, missing key)
  • Integration test: NestJS DI wiring, service → adapter delegation (2 tests)
  • pnpm build passes
  • Manual: app boots normally with no telemetry env vars set

Closes #71


Note

Medium Risk
Introduces new telemetry plumbing across both API and web, including new third-party PostHog dependencies and runtime configuration that affects startup flow and outbound network calls. Risk is moderate due to DI wiring changes and potential misconfiguration/opt-out behavior impacting telemetry and app readiness gating.

Overview
Adds a telemetry adapter abstraction in the API via TelemetryPort, with provider selection (noop/http/posthog) through a new TelemetryClientFactory, and wires it globally in TelemetryModule (including graceful shutdown() on module destroy).

Refactors server-side telemetry to route all usage/heartbeat events through the injected TELEMETRY_CLIENT (instead of direct HTTP), adds GET /telemetry/config for the frontend, and tightens event validation using shared FRONTEND_TELEMETRY_EVENTS + TelemetryEventDto.

On the web side, introduces useTelemetry() to fetch /telemetry/config and select either an API-backed client or a PostHog JS client; existing trackers now call client.capture(...), and ServerStartupGuard waits for telemetry config resolution before mounting the app. Updates env/schema and examples with TELEMETRY_PROVIDER/PostHog settings, adds posthog-node/posthog-js, and includes comprehensive unit/integration tests around adapter behavior and fallbacks.

Written by Cursor Bugbot for commit 43d82a2. This will update automatically on new commits. Configure here.

…r, and factory

Introduce provider-agnostic telemetry pattern matching the existing
StoragePort/StorageClientFactory architecture. UsageTelemetryService now
delegates to an injected TELEMETRY_CLIENT token instead of owning HTTP
logic directly. Factory reads config from NestJS ConfigService and
returns NoopAdapter when telemetry is disabled or misconfigured.

Closes #71
@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

jamby77 added 2 commits April 2, 2026 09:43
…orrect types

Add HttpTelemetryClientAdapter and PosthogTelemetryClientAdapter stubs
implementing TelemetryPort. Factory now returns the correct adapter type
per provider config. Real implementations in #72 and #73.
@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

- Guard sendEvent against empty instanceId (no licenseService case)
- Fix payload spread order: base properties override caller payload
- Use NestJS Logger instead of console.warn in factory and module
- Add PostHog stub warning when API key is set but adapter not implemented
- Catch shutdown errors in onModuleDestroy (non-fatal teardown)
- Handle both boolean and string 'false' for BETTERDB_TELEMETRY
- Warn and fall back to noop when ENTITLEMENT_URL path is invalid
- Add factory tests for http, posthog, unknown provider, and URL fallback
- Restructure integration tests: guard behavior + identity lifecycle
@claude

This comment was marked as outdated.

@claude

This comment was marked as outdated.

@KIvanow
Copy link
Copy Markdown
Member

KIvanow commented Apr 2, 2026

@cursor review

feature: implement PosthogTelemetryClientAdapter with posthog-node

Thin wrapper around posthog-node: capture(), identify(), and shutdown()
delegate directly to the PostHog client. Remove stub warning from
factory since adapter is now real.

Closes #73
* feature: implement HttpTelemetryClientAdapter with fetch transport

Replace stub with real HTTP adapter: fire-and-forget POST to telemetry
URL with 5s timeout, silent error handling. identify() and shutdown()
are no-ops.

Closes #72

* fix: track in-flight requests, abort on shutdown, improve error test

- capture() tracks pending AbortControllers, clears on completion
- shutdown() aborts all in-flight requests
- Error swallowing test now drains microtask queue for proper assertion

* fix: clear timers directly on shutdown to prevent delayed process exit

* chore: allow underscore-prefixed unused vars in eslint config

Add argsIgnorePattern and varsIgnorePattern for _ prefix to
@typescript-eslint/no-unused-vars. Fixes lint errors on interface
implementations with intentionally unused parameters.
version: this.version,
tier: this.tier,
deploymentMode: this.deploymentMode,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unprotected identify call can crash app during startup

Low Severity

The telemetryClient.identify() call in onModuleInit is not wrapped in try-catch, unlike the telemetryClient.capture() call in sendEvent which explicitly catches exceptions with the comment "telemetry must never crash the app." If identify throws (e.g., the PostHog SDK encounters a serialization or initialization error), the exception propagates through onModuleInit, causing the NestJS app to fail during startup. Wrapping identify (and trackAppStart) in the same try-catch pattern used by sendEvent would close this gap.

Additional Locations (1)
Fix in Cursor Fix in Web

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea in which cases this might happen?

Also, how does PostHog SDK behaves when it can't connect to the internet (i.e. if it is in a closed network)? Something similar

# docker-compose.yml
networks:
  local_only:
    internal: true

services:
  valkey:
    image: valkey/valkey
    networks: [local_only]

  monitor:
    image: betterdb/monitor
    networks: [local_only]

Should be enough for testing it

…ation (#84)

* feature: add GET /telemetry/config endpoint for frontend runtime configuration

Returns instanceId, telemetryEnabled, provider, and optional posthog
fields. Frontend uses this at runtime to initialize the correct
telemetry client without build-time env vars.

Closes #74

* fix: handle both boolean and string 'false' for BETTERDB_TELEMETRY in config endpoint

* refactor: improve readability and type clarity in TelemetryController

* feature: extract telemetry event types to shared, add DTO with class-validator

Move frontend/backend telemetry event constants and types to
@betterdb/shared for frontend type safety. Add TelemetryEventDto
with class-validator @isin and @isObject for NestJS validation.
Refactor controller to use DTO, switch statement, and private handlers.

* fix: remove posthog API key and host from telemetry config endpoint

* fix: add default case to event switch to throw on unhandled event types

* feature: frontend TelemetryConfigProvider + ApiTelemetryClient + hook refactor (#85)

* feature: frontend TelemetryConfigProvider, ApiTelemetryClient, and hook refactor

Add TelemetryClient interface, NoopTelemetryClient, ApiTelemetryClient.
TelemetryConfigProvider fetches GET /telemetry/config on mount, selects
the correct client, and exposes it via useTelemetry() context hook.
Falls back to ApiTelemetryClient on config fetch failure.

Refactor useNavigationTracker and useIdleTracker to use useTelemetry()
instead of direct fetchApi calls. useConnection telemetry left as-is
since it creates context at a level above the provider.

Closes #75

* fix: remove nonexistent 'api' provider case, use 'http' consistently

* chore: move useTelemetry hook to hooks/ directory

* chore: replace TelemetryConfigProvider with singleton useTelemetry hook

Remove the context provider — config loading now lives in useTelemetry
hook with a module-level singleton. Config is fetched once, cached, and
shared across all consumers. Hook returns { client, ready }.
No provider wrapping needed in App.

* chore: simplify useTelemetry to async function with module-level promise

* feature: block app render until telemetry client is ready in ServerStartupGuard

* chore: use TanStack Query for telemetry config fetching in useTelemetry

* chore: use 30min stale time and default retry for telemetry config query

* feature: frontend PosthogTelemetryClient with posthog-js (#88)

* feature: frontend PosthogTelemetryClient with posthog-js

Thin wrapper around posthog-js: capture() maps page_view to native
$pageview, identify() and shutdown() delegate directly. Wired into
useTelemetry hook — activated when backend returns provider=posthog
and VITE_POSTHOG_API_KEY is set at build time. Falls back to
ApiTelemetryClient when key is missing.

Closes #76

* chore: add telemetry env vars to .env.example

* chore: add frontend telemetry env vars to .env.example

* fix: set Vite envDir to monorepo root so .env vars are loaded

* refactor: update PostHog client to use instance-based API, adjust env vars and tests

Switch from the global `posthog` instance to an instance-based approach with `PostHog`. Updated env vars for clarity (`VITE_POSTHOG_*` to `VITE_PUBLIC_POSTHOG_*`). Refactored `useTelemetry` to manage lifecycle via `useEffect` and updated tests to mock the new client structure.

* fix: store posthog.init() instance, use || for empty string host fallback

- Use returned PostHog instance from init() instead of global
- Use || instead of ?? so empty string host falls back to default
- Remove debug console.log

* fix: use module-level singleton for telemetry client to prevent per-component reset

* chore: remove unused backend telemetry type exports from shared package

* chore: remove NoopTelemetryClient from frontend, use ApiTelemetryClient as fallback

* fix: call identify with instanceId when creating PostHog frontend client
…#90)

* feature: route LicenseService heartbeat through TelemetryPort adapter (#79)

Inject TELEMETRY_CLIENT into LicenseService as @optional() and delegate
heartbeat telemetry_ping events through the adapter via capture() instead
of direct HTTP. Version info continues via validateLicense() calls to
the entitlement URL. sendStartupError() remains unchanged.

* fix: make TelemetryModule @global() so TELEMETRY_CLIENT is injectable across modules
@jamby77 jamby77 requested a review from KIvanow April 2, 2026 16:23
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.


constructor(apiKey: string, host?: string) {
this.client = posthog.init(apiKey, {
api_host: host || 'https://us.i.posthog.com',
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frontend PostHog host defaults to US, backend to EU

Medium Severity

The frontend PosthogTelemetryClient hardcodes a fallback host of https://us.i.posthog.com, while the .env.example and backend configuration default to https://eu.i.posthog.com. When VITE_PUBLIC_POSTHOG_HOST is not explicitly set, the frontend sends telemetry data to the US region while the backend sends to the EU region. This is a data sovereignty issue for EU-based deployments.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Telemetry adapter abstraction: TelemetryPort + NoopAdapter + factory

2 participants