…fold (TDD RED) - Add MAX_SSE_CONNECTIONS constant (default 1000, configurable via SSE_MAX_CONNECTIONS env var) - Return 503 + Retry-After: 30 header when limit is reached, before ReadableStream construction - Create src/lib/vector/__tests__/catalog.test.ts with getVectorCatalog singleton tests (RED until Task 2)
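A minimal sketch of the connection cap described in this commit, assuming the route keeps a count of active SSE connections. Only the 1000 default, the `SSE_MAX_CONNECTIONS` variable, the 503 status and the `Retry-After: 30` header come from the commit message; `resolveMaxConnections`, `checkSseCapacity` and the `Rejection` shape are hypothetical names used for illustration.

```typescript
// Sketch only: function and type names here are hypothetical.
interface Rejection {
  status: 503;
  headers: Record<string, string>;
}

// Parse the SSE_MAX_CONNECTIONS value, falling back to the default of 1000
// when it is unset, non-numeric, or not a positive integer.
function resolveMaxConnections(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 1000;
}

// Returns a rejection descriptor when the limit is reached, or null when the
// caller may go on to construct the ReadableStream.
function checkSseCapacity(active: number, max: number): Rejection | null {
  if (active >= max) {
    return { status: 503, headers: { "Retry-After": "30" } };
  }
  return null;
}
```

In a route handler the check would presumably run first, so that a rejected request never allocates a stream.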
- Delete evaluateAndDeliverAlerts function and its call from heartbeat route - Remove dead imports: evaluateAlerts, deliverSingleWebhook, deliverToChannels, trackWebhookDelivery - Update test to assert evaluateAlerts is NOT called (PERF-01 traceability)
- Annotates existing "keepalive removes dead connections" test with PERF-02 marker - Confirms ghost connection eviction within 30s keepalive interval is already covered
…g() (PERF-04) - Replace eager export const VECTOR_CATALOG with lazy _catalog singleton - Add getVectorCatalog() function: builds catalog on first access, returns same reference on repeat calls - Update findComponentDef() to call getVectorCatalog() internally - Update component-palette.tsx: import + 3 usages migrated to getVectorCatalog() - Update library/shared-components/new/page.tsx: import + 2 usages migrated to getVectorCatalog() - All 4 catalog tests pass (singleton reference equality, findComponentDef lookup)
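The lazy singleton in this commit can be sketched roughly as follows. The `ComponentDef` shape, the catalog contents and `buildCount` are assumptions made for the example; only `_catalog`, `getVectorCatalog()` and `findComponentDef()` are named in the commit message.

```typescript
// Hypothetical catalog entry shape; the real definitions are not shown here.
interface ComponentDef {
  type: string;
  kind: "source" | "transform" | "sink";
}

let buildCount = 0; // exists only to make the laziness observable
let _catalog: ComponentDef[] | null = null;

// Stand-in for the expensive catalog construction.
function buildCatalog(): ComponentDef[] {
  buildCount += 1;
  return [
    { type: "file", kind: "source" },
    { type: "remap", kind: "transform" },
    { type: "console", kind: "sink" },
  ];
}

// Builds the catalog on first access and returns the same reference afterwards.
function getVectorCatalog(): ComponentDef[] {
  if (_catalog === null) {
    _catalog = buildCatalog();
  }
  return _catalog;
}

// Lookup goes through the accessor, so it never touches an unbuilt catalog.
function findComponentDef(type: string): ComponentDef | undefined {
  return getVectorCatalog().find((def) => def.type === type);
}
```

Compared with an eager `export const`, nothing is built at module load, which is the point of the change for modules that import the file without using the catalog.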
- Add NodeGroup model with criteria, labelTemplate, requiredLabels JSON fields - Add parentId self-reference to PipelineGroup (GroupChildren relation) - Remove PipelineGroup unique(environmentId, name) constraint - Add @@index([parentId]) to PipelineGroup for efficient child queries - Add nodeGroups NodeGroup[] relation to Environment model - Create migration 20260326400000_phase2_fleet_organization - Regenerate Prisma client with NodeGroup model
- Create nodeGroupRouter with list, create, update, delete operations - All mutations use withTeamAccess(ADMIN) authorization - Audit logging via withAudit for created/updated/deleted events - Unique name validation per environment with CONFLICT error - NOT_FOUND errors for missing groups on update/delete - Register nodeGroupRouter in appRouter as trpc.nodeGroup.* - 12 unit tests covering all CRUD behaviors including error cases
…ent + tests - Add labelCompliant field to fleet.list response (NODE-02) - Queries all NodeGroup requiredLabels for the environment - Sets labelCompliant=true when node has all required label keys - Vacuously compliant when no NodeGroups have required labels - Add NODE-03 label template auto-assignment in enrollment route - After node creation, finds matching NodeGroups by criteria - Merges labelTemplate fields from matching groups into node labels - Non-fatal: enrollment succeeds even if template application fails - Add 3 new fleet.list label compliance tests - Add 3 enrollment auto-assignment unit tests (match, non-match, empty)
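A rough sketch of the compliance check and template merge described above. It assumes that labels are flat string maps, that a group matches when every criteria key/value pair is present on the node (so empty criteria match every node), and that template values overwrite existing keys; none of these details are confirmed by the commit message, and `NodeGroupLike` is a hypothetical type.

```typescript
type Labels = Record<string, string>;

// Hypothetical projection of the NodeGroup JSON fields.
interface NodeGroupLike {
  criteria: Labels;
  labelTemplate: Labels;
  requiredLabels: string[];
}

const has = (labels: Labels, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(labels, key);

// Assumption: every criteria pair must match; empty criteria match all nodes.
function nodeMatchesGroup(nodeLabels: Labels, criteria: Labels): boolean {
  return Object.keys(criteria).every((key) => nodeLabels[key] === criteria[key]);
}

// Vacuously true when no group declares required labels.
function isLabelCompliant(nodeLabels: Labels, groups: NodeGroupLike[]): boolean {
  return groups.every((group) =>
    group.requiredLabels.every((key) => has(nodeLabels, key)),
  );
}

// Merge labelTemplate fields from every matching group into the node labels.
// Assumption: template values win over labels the node already carries.
function applyLabelTemplates(nodeLabels: Labels, groups: NodeGroupLike[]): Labels {
  let merged: Labels = { ...nodeLabels };
  for (const group of groups) {
    if (nodeMatchesGroup(nodeLabels, group.criteria)) {
      merged = { ...merged, ...group.labelTemplate };
    }
  }
  return merged;
}
```

Because the enrollment step is described as non-fatal, a caller would presumably wrap `applyLabelTemplates` in a try/catch and continue with the unmerged labels on failure.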
…pth guard - Add parentId to create/update input schemas - Replace findUnique compound key check with findFirst for application-layer uniqueness per (environmentId, name, parentId) - Add depth guard: rejects nesting beyond 3 levels (BAD_REQUEST) - Update list to include children count in _count - Update update to support parentId changes with depth enforcement - Add 11 new tests covering nesting, depth guard, and duplicate name scenarios
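The depth guard could look roughly like the sketch below, which walks the `parentId` chain in memory. `GroupRow`, `depthOf` and `canCreateUnder` are hypothetical names; the real router presumably resolves ancestors through Prisma and throws a BAD_REQUEST error rather than returning a boolean.

```typescript
// Hypothetical minimal projection of a PipelineGroup row.
interface GroupRow {
  id: string;
  parentId: string | null;
}

const MAX_DEPTH = 3; // nesting beyond 3 levels is rejected

// Depth of an existing group: 1 for a root group, +1 for each ancestor.
function depthOf(id: string, byId: Map<string, GroupRow>): number {
  const seen = new Set<string>();
  let depth = 0;
  let current: string | null = id;
  while (current !== null) {
    if (seen.has(current)) throw new Error("cycle in group hierarchy");
    seen.add(current);
    const row: GroupRow | undefined = byId.get(current);
    if (!row) throw new Error("unknown group " + current);
    depth += 1;
    current = row.parentId;
  }
  return depth;
}

// A new group may be created under parentId only if it would sit at depth <= 3.
function canCreateUnder(parentId: string | null, byId: Map<string, GroupRow>): boolean {
  const parentDepth = parentId === null ? 0 : depthOf(parentId, byId);
  return parentDepth + 1 <= MAX_DEPTH;
}
```

Moving an existing group via `update` would additionally need to account for the height of the subtree being moved, which this sketch leaves out.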
…e router
- bulkAddTags: validates tags against team.availableTags before loop, deduplicates via Set, handles partial failures, max 100 pipelines
- bulkRemoveTags: filters specified tags from each pipeline, handles partial failures, max 100 pipelines
- Both procedures return { results, total, succeeded } summary
- 11 tests covering all behaviors including partial failures, deduplication, and validation
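As an illustration of the bulk-tag behavior listed above, here is a small in-memory sketch. The `Map` stands in for the database, and the `BulkResult` field names (`pipelineId`, `success`, `error`) are assumptions; only the `{ results, total, succeeded }` summary, the validate-before-loop order, Set-based deduplication and the 100-pipeline cap come from the commit message.

```typescript
// Hypothetical per-pipeline result shape.
interface BulkResult {
  pipelineId: string;
  success: boolean;
  error?: string;
}

interface BulkSummary {
  results: BulkResult[];
  total: number;
  succeeded: number;
}

const MAX_BULK_PIPELINES = 100;

function bulkAddTags(
  pipelines: Map<string, string[]>, // pipelineId -> current tags (DB stand-in)
  pipelineIds: string[],
  tags: string[],
  availableTags: string[],
): BulkSummary {
  if (pipelineIds.length > MAX_BULK_PIPELINES) {
    throw new Error("At most " + MAX_BULK_PIPELINES + " pipelines per call");
  }
  // Validate once, before the loop, so an invalid tag fails the whole call.
  const invalid = tags.filter((tag) => !availableTags.includes(tag));
  if (invalid.length > 0) {
    throw new Error("Unknown tags: " + invalid.join(", "));
  }
  // Each pipeline succeeds or fails on its own (partial failure).
  const results = pipelineIds.map((id): BulkResult => {
    const current = pipelines.get(id);
    if (!current) {
      return { pipelineId: id, success: false, error: "Pipeline not found" };
    }
    pipelines.set(id, Array.from(new Set([...current, ...tags]))); // dedupe
    return { pipelineId: id, success: true };
  });
  return {
    results,
    total: results.length,
    succeeded: results.filter((r) => r.success).length,
  };
}
```

`bulkRemoveTags` would follow the same shape, filtering the given tags out of each pipeline instead of merging them in.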
- Create NodeGroupManagement card component with full CRUD (list/create/update/delete) - Key-value pair editor for criteria and label template fields - Tag input for required labels - Warning banner when criteria is empty (matches all enrolling nodes) - Delete confirmation via ConfirmDialog - Add NodeGroupManagement section to fleet-settings.tsx - Add Non-compliant amber badge to fleet node list when labelCompliant === false - Fix pre-existing rawNodes useMemo dependency warning in fleet page
- Add "Node groups" section with field reference table and enrollment hint - Add "Label compliance" section explaining Non-compliant badge behavior
…e-to-group menu - Create PipelineGroupTree component with recursive collapsible tree, expand/collapse, folder icons, colored dots, pipeline counts - Export buildGroupTree and buildBreadcrumbs helpers for reuse in pipelines page - Add parent group selector to ManageGroupsDialog create form (filters eligible parents to depth < 3) - Integrate PipelineGroupTree as sidebar in pipelines page with group selection - Add breadcrumb navigation above pipeline list when a group is selected - Replace flat move-to-group dropdown with recursive nested hierarchy via renderGroupMenuItems
- Add bulkAddTags and bulkRemoveTags mutations using the Plan 02 tRPC endpoints - Show tag selection dialog for each operation (checkbox list when team has availableTags, text input otherwise) - Loading toast during mutation via toast.loading, dismissed on settle - Partial failure display reuses existing resultSummary dialog pattern - Separate dialogs per plan decision: "Separate add-tags and remove-tags operations"
…seline - NodeGroup Prisma model + PipelineGroup parentId migration - NodeGroup tRPC router with CRUD + enrollment auto-assignment - Fleet label compliance, node group management UI - Pipeline group tree, bulk tags, nested groups
…ed nodeMatchesGroup util - Extract nodeMatchesGroup to src/lib/node-group-utils.ts (shared util) - Update enrollment route to use shared util instead of inline logic - Add groupHealthStats procedure: per-group onlineCount/alertCount/complianceRate/totalNodes in 3 parallel queries - Add nodesInGroup procedure: per-node drill-down sorted by status (worst first) with cpuLoad and labelCompliant - Synthetic '__ungrouped__' entry for nodes matching no group criteria - 27 tests passing: 15 for new procedures + 12 existing tests unchanged
# Conflicts: # .planning/STATE.md # src/app/api/agent/enroll/route.ts # src/server/routers/__tests__/node-group.test.ts # src/server/routers/node-group.ts
…and filter toolbar - Add FleetHealthDashboard: group-level summary cards with polling (30s) - Add NodeGroupHealthCard: collapsible with online/alert/compliance metrics - Add NodeGroupDetailTable: per-node drill-down with status, CPU, last seen, compliance - Add FleetHealthToolbar: group filter, label filter, compliance toggle pills - Wire URL query param state (group, label, compliance) for shareable links - Add Health tab to fleet page navigation - Create /fleet/health route page
- Document group summary cards (online, alerts, compliance metrics) - Document drill-down per-node table and sort order - Document group/label/compliance filter toolbar with URL param sharing - Document Ungrouped card behavior
- New page at operations/outbound-webhooks.md covering setup, payload format, signature verification, retry schedule, delivery history, and endpoint management - Add to docs/public/SUMMARY.md nav
… tRPC router - Add PromotionRequest Prisma model with PENDING/APPROVED/DEPLOYED/REJECTED/CANCELLED statuses - Add migration 20260327000000_add_promotion_request with FK constraints and indexes - Add relation fields to Pipeline, Environment, and User models - Create promotion-service.ts: preflightSecrets, executePromotion, generateDiffPreview - Create promotionRouter with 7 procedures: preflight, diffPreview, initiate, approve, reject, cancel, history - Wire approval workflow: self-review guard, atomic updateMany race prevention - executePromotion preserves SECRET[name] refs (no transformConfig stripping) - fires promotion_completed outbound webhook after execute - Register promotionRouter on appRouter as "promotion" - Add PromotionRequest team resolution in withTeamAccess middleware
…l paths - 22 tests across preflight, diffPreview, initiate, approve, reject, cancel, history, SECRET refs - Tests: preflight blocks when secrets missing, passes when all present, passes with no refs - Tests: initiate creates PENDING (approval required), auto-executes (no approval), same-env guard, cross-team guard, name collision, missing secrets - Tests: approve self-review blocked, atomic race guard, succeeds for different user - Tests: reject sets REJECTED with note, cancel only allows promoter - Tests: history ordered by createdAt desc with take 20 - Tests: clone preserves SECRET refs (no stripping), diffPreview shows env var placeholders
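The pre-flight behavior these tests describe (blocks when secrets are missing, passes when all are present or when there are no references) might be sketched as below. The function name, its signature and the regular expression for `SECRET[name]` references are assumptions; the real `preflightSecrets` in promotion-service.ts is not shown on this page and presumably reads the target environment's secrets from the database.

```typescript
interface PreflightResult {
  canProceed: boolean;
  missing: string[];
}

// Hypothetical signature: takes the pipeline config text and the secret names
// that already exist in the target environment.
function preflightSecretsSketch(
  configYaml: string,
  targetSecretNames: string[],
): PreflightResult {
  const pattern = /SECRET\[([^\]]+)\]/g; // assumed reference syntax
  const referenced: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(configYaml)) !== null) {
    if (referenced.indexOf(match[1]) === -1) {
      referenced.push(match[1]);
    }
  }
  const missing = referenced.filter((name) => targetSecretNames.indexOf(name) === -1);
  return { canProceed: missing.length === 0, missing };
}
```

The references themselves are left untouched in the config, in line with the note that promotion preserves `SECRET[name]` refs rather than stripping them.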
…wizard - 5-step state machine: target -> preflight -> diff -> confirm -> result - Step 2: preflight check with missing secrets list, blocks promotion if canProceed=false - Step 2: name collision warning with amber alert - Step 3: ConfigDiff showing source vs target YAML with env var substitution note - Step 4: fires promotion.initiate mutation with spinner - Step 5: pending-approval (Clock) vs auto-deployed (CheckCircle) result messages - Invalidates pipeline.list and promotion.history query caches on success - Component export name and props interface unchanged - pipelines/page.tsx unaffected
…docs - PromotionHistory component queries promotion.history, renders table with source env, target env, promoted by, date, and status badge (DEPLOYED=default, PENDING/APPROVED=secondary, REJECTED=destructive, CANCELLED=outline) - Returns null when no promotion history to avoid empty section clutter - Rendered at bottom of pipeline editor layout after logs panel - Docs: added Cross-Environment Promotion section covering workflow, approval, secret pre-flight validation, and promotion history
… v1 endpoints - Install @asteasolutions/zod-to-openapi 8.5.0 (Zod v4 compatible) - Create generateOpenAPISpec() covering all 16 REST v1 operations - BearerAuth security scheme registered; every operation references it - Schemas match exact wire shapes from route handlers (dates as ISO strings) - TDD: 7 tests verify structure, security, request/response schemas
…ion script - Add GET /api/v1/openapi.json (public, no auth) with CORS headers - Add OPTIONS preflight handler for CORS - Create scripts/generate-openapi.ts that writes public/openapi.json - Add generate:openapi script to package.json (tsx scripts/generate-openapi.ts) - Spec generates 12 paths / 16 operations
- Register CookieAuth security scheme (apiKey in cookie)
- Add 15 tRPC procedures: pipeline.list/get/create/update/delete,
deploy.agent/undeploy, fleet.list/get, environment.list,
secret.list/create, alert.listRules, serviceAccount.list/create
- Queries map to GET with ?input= SuperJSON param, mutations to POST
with {"json": <input>} body
- All tRPC ops tagged "tRPC" and secured with CookieAuth
- 8 new TDD tests verify tRPC paths, methods, tags, security, count
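To make the query/mutation mapping above concrete, here is a hedged sketch that builds the HTTP request for a tRPC procedure. The `/api/trpc` base path in the usage note and the `buildTrpcRequest` helper are assumptions, and the sketch covers only plain JSON inputs (SuperJSON adds a `meta` key for values such as dates, which is omitted here).

```typescript
interface HttpRequestSketch {
  method: "GET" | "POST";
  url: string;
  body?: string;
}

// Hypothetical helper: queries become GET with ?input=, mutations become POST.
function buildTrpcRequest(
  baseUrl: string,
  procedure: string,
  kind: "query" | "mutation",
  input: unknown,
): HttpRequestSketch {
  // SuperJSON envelope for a plain JSON input: {"json": <input>}
  const envelope = JSON.stringify({ json: input });
  if (kind === "query") {
    return {
      method: "GET",
      url: baseUrl + "/" + procedure + "?input=" + encodeURIComponent(envelope),
    };
  }
  return { method: "POST", url: baseUrl + "/" + procedure, body: envelope };
}
```

For example, `buildTrpcRequest("/api/trpc", "pipeline.list", "query", { environmentId: "env_1" })` yields a GET whose `input` parameter is the URL-encoded envelope, while a `secret.create` mutation would carry the same envelope as its POST body.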
…ction - generate-openapi.ts: log REST v1 vs tRPC operation counts separately, add duplicate operationId and empty-path validation checks - docs/public/reference/api.md: add OpenAPI Specification section with fetch/import/client-generation instructions and surface comparison table
…ps promotion - Install @octokit/rest 22.0.1 for GitHub API interactions - Add prUrl and prNumber fields to PromotionRequest model - Update gitOpsMode comment to document "promotion" as valid value - Add AWAITING_PR_MERGE and DEPLOYING to status comment - Create migration 20260327100000_add_gitops_promotion_fields
…sing @octokit/rest - Implements createPromotionPR() that creates branch, commits YAML, opens PR - Parses owner/repo from both HTTPS and SSH GitHub URL formats - Embeds promotion request ID in PR body for merge webhook correlation - Branch name includes requestId prefix to prevent collision - Unit tests covering all PR creation steps and URL parsing (14 tests)
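The owner/repo parsing mentioned here could be done along these lines. This is a sketch assuming github.com URLs only; `parseGitHubRepo` is a hypothetical name, and the real helper may accept other hosts or formats.

```typescript
// Accepts https://github.com/<owner>/<repo>[.git] and
// git@github.com:<owner>/<repo>[.git]; returns null for anything else.
function parseGitHubRepo(url: string): { owner: string; repo: string } | null {
  const pattern =
    /^(?:https:\/\/github\.com\/|git@github\.com:)([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/;
  const match = url.trim().match(pattern);
  if (!match) {
    return null;
  }
  return { owner: match[1], repo: match[2] };
}
```

Returning `null` instead of throwing lets the caller decide how to report a misconfigured `gitRepoUrl`.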
- Load gitOpsMode, gitRepoUrl, gitToken, gitBranch from target environment - When gitOpsMode=promotion: generate pipeline YAML, call createPromotionPR, update PromotionRequest with prUrl/prNumber/AWAITING_PR_MERGE status - Existing UI path (Phase 5) unchanged when gitOpsMode != promotion - Add 4 new tests: AWAITING_PR_MERGE return, prUrl/prNumber update, fallthrough to UI path for off and push modes (26 total tests pass)
- Handle X-GitHub-Event header: ping returns pong, pull_request routes to merge handler - Update HMAC lookup to include both bidirectional and promotion gitOpsMode environments - PR merge handler: checks action=closed AND merged=true (not-merged PRs ignored) - Extracts VF promotion ID from PR body comment - Atomic updateMany AWAITING_PR_MERGE->DEPLOYING for idempotency (GitHub retry safe) - Calls executePromotion with original promoter as audit actor - 11 unit tests covering merge, ignore cases, idempotency, HMAC validation
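The merge handling above can be illustrated with an in-memory sketch. The HTML-comment marker format, `claimForDeploy` and the `Map` store are assumptions: the actual marker embedded in the PR body is not shown here, and the real handler uses a Prisma `updateMany` filtered on the current status and checks the affected row count.

```typescript
type PromotionStatus = "AWAITING_PR_MERGE" | "DEPLOYING" | "DEPLOYED";

interface PullRequestEvent {
  action: string;
  pull_request: { merged: boolean; body: string | null };
}

// Hypothetical marker format for the promotion ID embedded in the PR body.
const PROMOTION_MARKER = /<!--\s*vf-promotion-id:\s*([A-Za-z0-9_-]+)\s*-->/;

function extractPromotionId(body: string | null): string | null {
  const match = body ? body.match(PROMOTION_MARKER) : null;
  return match ? match[1] : null;
}

// Stand-in for an updateMany filtered on status AWAITING_PR_MERGE:
// returns the number of rows moved to DEPLOYING (0 or 1).
function claimForDeploy(store: Map<string, PromotionStatus>, id: string): number {
  if (store.get(id) !== "AWAITING_PR_MERGE") return 0;
  store.set(id, "DEPLOYING");
  return 1;
}

function handlePullRequest(
  event: PullRequestEvent,
  store: Map<string, PromotionStatus>,
): "ignored" | "claimed" | "already_claimed" {
  // Only closed AND merged PRs trigger a deploy; closed-unmerged is ignored.
  if (event.action !== "closed" || !event.pull_request.merged) return "ignored";
  const id = extractPromotionId(event.pull_request.body);
  if (id === null) return "ignored";
  // A redelivered webhook finds the row already past AWAITING_PR_MERGE.
  return claimForDeploy(store, id) === 1 ? "claimed" : "already_claimed";
}
```

Only the `"claimed"` branch would go on to call `executePromotion`, which is what makes GitHub retries safe.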
- Extend gitOpsMode Zod enum to include "promotion" value - Auto-generate webhook secret when switching to "promotion" mode (same as bidirectional) - Clear webhook secret when switching away from webhook-based modes
…uide - Add "Promotion (PR-based)" option to gitOpsMode dropdown - Show inline step-by-step setup guide when promotion mode is selected - Display webhook URL and one-time webhook secret with copy buttons - Guide explains GitHub webhook configuration: pull_request events, payload URL, secret - Fix handleSave type cast to accept "promotion" value
- Add 07-03-PLAN.md describing GitOps promotion mode UI implementation - Add 07-03-SUMMARY.md with implementation details and decisions - Update STATE.md: advance plan counter, record metrics, add decisions - Update ROADMAP.md: phase 07 progress updated
These are local planning artifacts that should not be committed. The .planning/ directory is already in .gitignore.
Greptile Summary
This PR delivers 7 phases of enterprise-scale features: fleet node groups with label enforcement, a fleet health dashboard, outbound webhooks (Standard-Webhooks signed), cross-environment pipeline promotion (UI + GitOps PR flow), an OpenAPI 3.1 spec, and several performance improvements. The scope is large (~11k lines) but well-structured: each feature has its own router, service, and tests. Three issues need attention before merge.
Confidence Score: 3/5
Not safe to merge as-is: cross-team node data exposure is a live security bug, and the msgId mismatch breaks the Standard-Webhooks contract on day one.
Three confirmed P1 issues: a cross-team data exposure vulnerability in the node-group router, a correctness bug in outbound webhook ID tracking that will manifest on every first delivery, and a missing audit trail on the test-delivery mutation. Additionally, withTeamAccess resolution from requestId needs explicit verification. The rest of the code is well-written and structurally sound.
src/server/routers/node-group.ts (cross-team exposure), src/server/services/outbound-webhook.ts (msgId mismatch), src/server/routers/webhook-endpoint.ts (missing audit), src/server/routers/promotion.ts (withTeamAccess field name)
Sequence Diagram
sequenceDiagram
participant UI as Browser
participant tRPC as tRPC Router
participant Octokit as GitHub API
participant GHWebhook as /api/webhooks/git
participant PromSvc as promotion-service
participant OutWH as outbound-webhook
Note over UI,OutWH: Phase 5/7 – Pipeline Promotion (GitOps path)
UI->>tRPC: promotion.initiate(pipelineId, targetEnvId)
tRPC->>PromSvc: preflightSecrets()
PromSvc-->>tRPC: {canProceed, missing[]}
tRPC->>Octokit: createPromotionPR(encryptedToken, yaml)
Octokit-->>tRPC: {prNumber, prUrl}
tRPC-->>UI: {status: AWAITING_PR_MERGE, prUrl}
Note over Octokit,GHWebhook: Developer merges PR on GitHub
Octokit->>GHWebhook: POST pull_request (merged)
GHWebhook->>GHWebhook: HMAC verify + parse requestId from PR body
GHWebhook->>GHWebhook: atomic updateMany(AWAITING_PR_MERGE→DEPLOYING)
GHWebhook->>PromSvc: executePromotion(requestId, promotedById)
PromSvc-->>GHWebhook: {pipelineId, pipelineName}
GHWebhook->>OutWH: fireOutboundWebhooks(promotion_completed)
GHWebhook-->>Octokit: {deployed: true}
Note over UI,OutWH: Phase 4 – Outbound Webhooks (event delivery)
PromSvc->>OutWH: fireOutboundWebhooks(metric, teamId, payload)
OutWH->>OutWH: dispatchWithTracking() — creates WebhookDelivery record
OutWH->>OutWH: deliverOutboundWebhook() — HMAC sign + POST
Note right of OutWH: ⚠ msgId in DB ≠ webhook-id header
OutWH-->>PromSvc: success / retryable / dead_letter
- Fix cross-team node data exposure in nodesInGroup by scoping group lookup to input.environmentId (IDOR prevention) - Fix msgId mismatch between WebhookDelivery record and webhook-id HTTP header by passing msgId through to deliverOutboundWebhook - Add missing withAudit middleware to testDelivery mutation
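The msgId fix matters because, under the Standard Webhooks convention, the message ID is part of the signed content and is also sent as the `webhook-id` header, so the stored delivery record and the header must carry the same value. A minimal sketch, assuming a base64-encoded secret (without the `whsec_` prefix) and Node's `crypto` module; `signDelivery` and `SignedDelivery` are hypothetical names, not the PR's actual functions.

```typescript
import { createHmac } from "node:crypto";

interface SignedDelivery {
  msgId: string; // the value that would also be stored on the delivery record
  body: string;
  headers: Record<string, string>;
}

// The caller generates msgId once and passes it in, so the record and the
// webhook-id header cannot diverge.
function signDelivery(
  msgId: string,
  secretBase64: string,
  payload: unknown,
  timestampSec: number,
): SignedDelivery {
  const body = JSON.stringify(payload);
  // Signed content per the Standard Webhooks scheme: "<id>.<timestamp>.<body>"
  const signature = createHmac("sha256", Buffer.from(secretBase64, "base64"))
    .update(msgId + "." + timestampSec + "." + body)
    .digest("base64");
  return {
    msgId,
    body,
    headers: {
      "webhook-id": msgId,
      "webhook-timestamp": String(timestampSec),
      "webhook-signature": "v1," + signature,
    },
  };
}
```

If the sender generated a second ID inside the delivery function, receivers that log or deduplicate on `webhook-id` would see a value that never appears in the delivery history.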
Summary
Enterprise-scale features for VectorFlow — enabling corporate platform teams to manage hundreds of pipelines across multi-environment fleets of 100+ nodes.
7 phases, 31 requirements, 19 plans executed:
Stats: 69 files changed, ~11,000 lines added, 935 tests passing (up from 792)
Test plan
- pnpm test: all 935 tests pass
- pnpm lint: no new errors
- pnpm build: production build succeeds
- /fleet/health with node group cards
- /settings/webhooks
- /api/v1/openapi.json