191 changes: 191 additions & 0 deletions packages/enricher/README.md
@@ -0,0 +1,191 @@
# @posthog/enricher

Detect and enrich PostHog SDK usage in source code. Uses tree-sitter AST analysis to find `capture()` calls, feature flag checks, `init()` calls, and variant branches across JavaScript, TypeScript, Python, Go, and Ruby.

## Quick start

```typescript
import { PostHogEnricher } from "@posthog/enricher";

const enricher = new PostHogEnricher();
await enricher.initialize("/path/to/grammars");

const result = await enricher.parse(sourceCode, "typescript");

result.events; // [{ name: "purchase", line: 5, dynamic: false }]
result.flagChecks; // [{ method: "getFeatureFlag", flagKey: "new-checkout", line: 8 }]
result.flagKeys; // ["new-checkout"]
result.eventNames; // ["purchase"]
result.toList(); // [{ type: "event", line: 5, name: "purchase", method: "capture" }, ...]
```

## Enriching from the PostHog API

Let the enricher fetch everything it needs based on what `parse()` found (feature flags, experiments, event definitions, and event volume and unique-user stats):

```typescript
const result = await enricher.parse(sourceCode, "typescript");
const enriched = await result.enrichFromApi({
  apiKey: "phx_...",
  host: "https://us.posthog.com",
  projectId: 12345,
});

// Flags with staleness, rollout, experiment info
enriched.enrichedFlags;
// [{ flagKey: "new-checkout", flagType: "boolean", staleness: "fully_rolled_out",
// rollout: 100, experiment: { name: "Checkout v2", ... }, ... }]

// Events with definition, volume, unique users
enriched.enrichedEvents;
// [{ eventName: "purchase", verified: true, lastSeenAt: "2025-04-01",
Comment on lines +35 to +41
Contributor

[the tiniest nitpick ever lol feel free to ignore]

i would prefer enriched.flags and enriched.events on this interface 🙈

// tags: ["revenue"], stats: { volume: 12500, uniqueUsers: 3200 }, ... }]

// Flat list combining both
enriched.toList();
// [{ type: "event", name: "purchase", verified: true, volume: 12500, ... },
// { type: "flag", name: "new-checkout", flagType: "boolean", staleness: "fully_rolled_out", ... }]

// Source code with inline annotation comments
enriched.toComments();
Contributor

this is fire

// // [PostHog] Event: "purchase" (verified) — 12,500 events — 3,200 users
// posthog.capture("purchase", { amount: 99 });
//
// // [PostHog] Flag: "new-checkout" — boolean — 100% rolled out — STALE (fully_rolled_out)
// const flag = posthog.getFeatureFlag("new-checkout");
```
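
Once flags are enriched, a common follow-up is to list the ones marked stale. The snippet below is a minimal sketch under stated assumptions: `FlagSummary` is a local stand-in for the few `EnrichedFlag` fields it reads, `staleFlagKeys` is a hypothetical helper that is not exported by this package, and the `beta-search` flag is made-up sample data.

```typescript
// Local stand-in for the subset of EnrichedFlag fields used here (assumption).
interface FlagSummary {
  flagKey: string;
  staleness: string | null;
  rollout: number | null;
}

// Hypothetical helper: keys of all flags that carry a staleness reason.
function staleFlagKeys(flags: FlagSummary[]): string[] {
  return flags.filter((f) => f.staleness !== null).map((f) => f.flagKey);
}

// Made-up sample data shaped like the enrichedFlags output shown above.
const sampleFlags: FlagSummary[] = [
  { flagKey: "new-checkout", staleness: "fully_rolled_out", rollout: 100 },
  { flagKey: "beta-search", staleness: null, rollout: 25 },
];

const staleKeys = staleFlagKeys(sampleFlags); // ["new-checkout"]
```

In real use the input would presumably be `enriched.enrichedFlags`, which has the same `flagKey` and `staleness` fields.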

## Supported languages

| Language | ID | Capture | Flags | Init | Variants |
|---|---|---|---|---|---|
| JavaScript | `javascript` | yes | yes | yes | yes |
| TypeScript | `typescript` | yes | yes | yes | yes |
| JSX | `javascriptreact` | yes | yes | yes | yes |
| TSX | `typescriptreact` | yes | yes | yes | yes |
| Python | `python` | yes | yes | yes | yes |
| Go | `go` | yes | yes | yes | yes |
| Ruby | `ruby` | yes | yes | yes | yes |
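
Callers that start from a file path need to pick one of the IDs above before calling `parse()`. The following is a small sketch of one way to do that. `languageIdForFile` and the extension map are hypothetical and not part of this package; the mapping from extensions to IDs is an assumption based on common file naming.

```typescript
// Hypothetical mapping from file extension to the language IDs in the table.
const LANGUAGE_BY_EXTENSION: Record<string, string> = {
  ".js": "javascript",
  ".ts": "typescript",
  ".jsx": "javascriptreact",
  ".tsx": "typescriptreact",
  ".py": "python",
  ".go": "go",
  ".rb": "ruby",
};

// Returns a language ID, or undefined when the extension is not recognized.
function languageIdForFile(fileName: string): string | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return undefined; // no extension, or a bare dotfile name
  return LANGUAGE_BY_EXTENSION[fileName.slice(dot).toLowerCase()];
}
```

Returning `undefined` for unknown extensions lets the caller skip unsupported files instead of passing an unknown ID to the parser.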

## API reference

### `PostHogEnricher`

Main entry point. Owns the tree-sitter parser lifecycle.

```typescript
const enricher = new PostHogEnricher();
await enricher.initialize(wasmDir);
const result = await enricher.parse(source, languageId);
enricher.dispose();
```

### `ParseResult`

Returned by `enricher.parse()`. Contains all detected PostHog SDK usage.

| Property / Method | Type | Description |
|---|---|---|
| `calls` | `PostHogCall[]` | All detected SDK method calls |
| `initCalls` | `PostHogInitCall[]` | `posthog.init()` and constructor calls |
| `flagAssignments` | `FlagAssignment[]` | Flag result variable assignments |
| `variantBranches` | `VariantBranch[]` | If/switch branches on flag values |
| `functions` | `FunctionInfo[]` | Function definitions in the file |
| `events` | `CapturedEvent[]` | Capture calls only |
| `flagChecks` | `FlagCheck[]` | Flag method calls only |
| `flagKeys` | `string[]` | Unique flag keys |
| `eventNames` | `string[]` | Unique event names |
| `toList()` | `ListItem[]` | Flat sorted list of all SDK usage |
| `enrichFromApi(config)` | `Promise<EnrichedResult>` | Fetch from PostHog API and enrich |
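
As an illustration of consuming the flat list, the sketch below counts items per `type`. `UsageItem` is a local stand-in that assumes only the `type`, `line`, and `name` fields shown in the Quick start output, and `countByType` is a hypothetical helper rather than part of the package.

```typescript
// Local stand-in for the ListItem fields used here (assumption).
interface UsageItem {
  type: string;
  line: number;
  name: string;
}

// Hypothetical helper: number of detected items per type ("event", "flag", ...).
function countByType(items: UsageItem[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const item of items) {
    counts.set(item.type, (counts.get(item.type) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample data shaped like the toList() output in the Quick start.
const usageCounts = countByType([
  { type: "event", line: 5, name: "purchase" },
  { type: "flag", line: 8, name: "new-checkout" },
  { type: "event", line: 12, name: "purchase" },
]);
```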

### `EnrichedResult`

Returned by `enrich()` or `enrichFromApi()`. Detection combined with PostHog context.

| Property / Method | Type | Description |
|---|---|---|
| `enrichedFlags` | `EnrichedFlag[]` | Flags grouped by key with type, staleness, rollout, experiment |
| `enrichedEvents` | `EnrichedEvent[]` | Events grouped by name with definition, stats, tags |
| `toList()` | `EnrichedListItem[]` | Flat list with all metadata |
| `toComments()` | `string` | Source code with inline annotation comments |

### `EnricherApiConfig`

```typescript
interface EnricherApiConfig {
  apiKey: string;
  host: string; // e.g. "https://us.posthog.com"
  projectId: number;
}
```

### `EnrichedFlag`

```typescript
interface EnrichedFlag {
  flagKey: string;
  flagType: "boolean" | "multivariate" | "remote_config";
  staleness: StalenessReason | null;
  rollout: number | null;
  variants: { key: string; rollout_percentage: number }[];
  flag: FeatureFlag | undefined;
  experiment: Experiment | undefined;
  occurrences: FlagCheck[];
}
```

### `EnrichedEvent`

```typescript
interface EnrichedEvent {
  eventName: string;
  verified: boolean;
  lastSeenAt: string | null;
  tags: string[];
  stats: { volume?: number; uniqueUsers?: number } | undefined;
  definition: EventDefinition | undefined;
  occurrences: CapturedEvent[];
}
```
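
Because `stats` and its fields are optional, consumers have to handle missing values. The sketch below ranks events by volume and treats missing stats as zero. `EventSummary` is a local stand-in for the `EnrichedEvent` fields it uses, `byVolumeDescending` is a hypothetical helper, and the `page_scrolled` event is made-up sample data.

```typescript
// Local stand-in for the subset of EnrichedEvent fields used here (assumption).
interface EventSummary {
  eventName: string;
  verified: boolean;
  stats: { volume?: number; uniqueUsers?: number } | undefined;
}

// Hypothetical helper: copy of the events sorted by volume, highest first.
// Events without stats or without a volume are treated as volume 0.
function byVolumeDescending(events: EventSummary[]): EventSummary[] {
  return [...events].sort(
    (a, b) => (b.stats?.volume ?? 0) - (a.stats?.volume ?? 0),
  );
}

const rankedEvents = byVolumeDescending([
  { eventName: "page_scrolled", verified: false, stats: undefined },
  { eventName: "purchase", verified: true, stats: { volume: 12500, uniqueUsers: 3200 } },
]);
```

Copying the array before sorting leaves the original `enrichedEvents` order untouched.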

## Detection API

The lower-level detection API is also exported for direct use (this is the same API used by the PostHog VS Code extension):

```typescript
import { PostHogDetector } from "@posthog/enricher";

const detector = new PostHogDetector();
await detector.initialize(wasmDir);

const calls = await detector.findPostHogCalls(source, "typescript");
const initCalls = await detector.findInitCalls(source, "typescript");
const branches = await detector.findVariantBranches(source, "typescript");
const assignments = await detector.findFlagAssignments(source, "typescript");
const functions = await detector.findFunctions(source, "typescript");

detector.dispose();
```

### Flag classification utilities

```typescript
import { classifyFlagType, classifyStaleness } from "@posthog/enricher";

classifyFlagType(flag); // "boolean" | "multivariate" | "remote_config"
classifyStaleness(key, flag, experiments, opts); // StalenessReason | null
```

## Logging

Warnings are silenced by default. To receive them:

```typescript
import { setLogger } from "@posthog/enricher";

setLogger({ warn: console.warn });
```

## Setup

The package requires pre-built tree-sitter WASM grammar files. Run `pnpm fetch-grammars` to build them, or place pre-built `.wasm` files in the `grammars/` directory.
12 changes: 0 additions & 12 deletions packages/enricher/package.json
@@ -7,18 +7,6 @@
     ".": {
       "types": "./dist/index.d.ts",
       "import": "./dist/index.js"
-    },
-    "./classification": {
-      "types": "./dist/flag-classification.d.ts",
-      "import": "./dist/flag-classification.js"
-    },
-    "./stale-flags": {
-      "types": "./dist/stale-flags.d.ts",
-      "import": "./dist/stale-flags.js"
-    },
-    "./types": {
-      "types": "./dist/types.d.ts",
-      "import": "./dist/types.js"
     }
   },
   "scripts": {
96 changes: 96 additions & 0 deletions packages/enricher/src/comment-formatter.ts
@@ -0,0 +1,96 @@
import type { EnrichedEvent, EnrichedFlag, EnrichedListItem } from "./types.js";

function commentPrefix(languageId: string): string {
  if (languageId === "python" || languageId === "ruby") {
    return "#";
  }
  return "//";
}

function formatFlagComment(flag: EnrichedFlag): string {
  const parts: string[] = [`Flag: "${flag.flagKey}"`];

  if (flag.flag) {
    parts.push(flag.flagType);
    if (flag.rollout !== null) {
      parts.push(`${flag.rollout}% rolled out`);
    }
    if (flag.experiment) {
      const status = flag.experiment.end_date ? "complete" : "running";
      parts.push(`Experiment: "${flag.experiment.name}" (${status})`);
    }
    if (flag.staleness) {
      parts.push(`STALE (${flag.staleness})`);
Contributor

lazy-web: what are the possible values for flag.staleness?

just wondering if it provides anything significantly different from the text "STALE", and if not, perhaps not worth feeding to the LLM

    }
  }

  return parts.join(" \u2014 ");
}

function formatEventComment(event: EnrichedEvent): string {
  const parts: string[] = [`Event: "${event.eventName}"`];
  if (event.verified) {
    parts.push("(verified)");
  }
  if (event.stats?.volume !== undefined) {
    parts.push(`${event.stats.volume.toLocaleString()} events`);
  }
  if (event.stats?.uniqueUsers !== undefined) {
    parts.push(`${event.stats.uniqueUsers.toLocaleString()} users`);
  }
  if (event.definition?.description) {
    parts.push(event.definition.description);
  }
  return parts.join(" \u2014 ");
}

export function formatComments(
  source: string,
  languageId: string,
  items: EnrichedListItem[],
  enrichedFlags: Map<string, EnrichedFlag>,
  enrichedEvents: Map<string, EnrichedEvent>,
): string {
  const prefix = commentPrefix(languageId);
  const lines = source.split("\n");
  const sorted = [...items].sort((a, b) => a.line - b.line);

  let offset = 0;
  // One comment per original source line — if multiple detections share a line,
  // only the first (by sort order) gets an annotation to keep output readable.
Comment on lines +59 to +60
Contributor

thoughts on removing this restriction?

probably an edge case (i'd be impressed to see a single line of code with a large number of posthog events/flags 😛), but might be preferable to display everything, esp if the target audience is an agent?

(context for example: in the diff viewer, i'd like to add these things as token hovers, so we'd have full control over the UX and there's no issue with multiple things coming from a single src line)

  const annotatedLines = new Set<number>();

  for (const item of sorted) {
    const targetLine = item.line + offset;
    if (annotatedLines.has(item.line)) {
      continue;
    }
    annotatedLines.add(item.line);

    let comment: string | null = null;

    if (item.type === "flag") {
      const flag = enrichedFlags.get(item.name);
      if (flag) {
        comment = `${prefix} [PostHog] ${formatFlagComment(flag)}`;
      }
    } else if (item.type === "event") {
      const event = enrichedEvents.get(item.name);
      if (event) {
        comment = `${prefix} [PostHog] ${formatEventComment(event)}`;
      } else if (item.detail) {
        comment = `${prefix} [PostHog] Event: ${item.detail}`;
      }
    } else if (item.type === "init") {
      comment = `${prefix} [PostHog] Init: token "${item.name}"`;
    }

    if (comment) {
      const indent = lines[targetLine]?.match(/^(\s*)/)?.[1] ?? "";
      lines.splice(targetLine, 0, `${indent}${comment}`);
      offset++;
    }
  }

  return lines.join("\n");
}
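
The review comment on lines +59 to +60 asks whether the one-annotation-per-line restriction could be dropped. The sketch below shows one way that might look; it is not the PR's code. `Annotation` and `annotateAll` are hypothetical names, the comment text is assumed to be already formatted, and `line` is assumed to be a 0-based index into the source lines.

```typescript
// Hypothetical, simplified item: a pre-formatted comment for a source line.
interface Annotation {
  line: number; // 0-based index into the original source lines (assumption)
  text: string;
}

// Inserts every annotation above its target line, including several
// annotations that share the same line.
function annotateAll(
  source: string,
  prefix: string,
  annotations: Annotation[],
): string {
  const lines = source.split("\n");
  // Array.prototype.sort is stable, so same-line annotations keep input order.
  const sorted = [...annotations].sort((a, b) => a.line - b.line);

  let offset = 0; // number of comment lines inserted so far
  for (const { line, text } of sorted) {
    const target = line + offset;
    // Reuse the indentation of the code line being annotated.
    const indent = lines[target]?.match(/^(\s*)/)?.[1] ?? "";
    lines.splice(target, 0, `${indent}${prefix} ${text}`);
    offset++;
  }
  return lines.join("\n");
}
```

Because `offset` grows with every insertion, a second annotation for the same line lands directly below the first and still above the code line, so no de-duplication set is needed.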