Merged
57 commits
d9d3957
feat(gitops): add schema for multi-provider git sync
TerrifiedBug Mar 28, 2026
4785a02
feat: add configChecksum to NodePipelineStatus and drift alert metrics
TerrifiedBug Mar 28, 2026
85eb0ba
feat(gitops): add GitProvider interface and GitHub implementation
TerrifiedBug Mar 28, 2026
da86507
feat(api-v1): add rate limiting middleware and permission constants
TerrifiedBug Mar 28, 2026
ad52a2e
feat: accept configChecksum in heartbeat schema and batch upsert
TerrifiedBug Mar 28, 2026
c7054a5
feat(gitops): add GitLab provider implementation
TerrifiedBug Mar 28, 2026
1eb92d3
feat: add server-side cursor pagination to pipeline.list()
TerrifiedBug Mar 28, 2026
3cb8aaa
feat(gitops): add Bitbucket provider implementation
TerrifiedBug Mar 28, 2026
e48b640
feat: store and report config checksum in Go agent heartbeat
TerrifiedBug Mar 28, 2026
19c7a84
feat: wire pipeline list page to server-side pagination
TerrifiedBug Mar 28, 2026
6a4bb1e
refactor(gitops): webhook handler uses GitProvider abstraction
TerrifiedBug Mar 28, 2026
b992c04
feat: add FilterPreset model for saved filter presets
TerrifiedBug Mar 28, 2026
94d858d
feat(api-v1): add pipeline lifecycle endpoints (Phase 1)
TerrifiedBug Mar 28, 2026
97ea852
refactor(gitops): promotion service uses GitProvider abstraction
TerrifiedBug Mar 28, 2026
dd9a49b
feat: add version drift and config drift detection metrics
TerrifiedBug Mar 28, 2026
de9dda3
feat: add FilterPreset tRPC router with CRUD operations
TerrifiedBug Mar 28, 2026
36da3fb
feat: add config_drift metric evaluation in per-node alert evaluator
TerrifiedBug Mar 28, 2026
b3f23b8
feat(gitops): add git sync retry service with exponential backoff
TerrifiedBug Mar 28, 2026
0584a2e
feat(api-v1): add fleet and monitoring endpoints (Phase 2)
TerrifiedBug Mar 28, 2026
d9a4241
test: add version_drift evaluation tests for FleetAlertService
TerrifiedBug Mar 28, 2026
85df65f
feat(gitops): decouple pipeline name from git filename via gitPath
TerrifiedBug Mar 28, 2026
1612d0f
feat: add drift metrics to client-safe alert constants
TerrifiedBug Mar 28, 2026
26aacb7
feat: add Version Drift and Config Drift alert rule templates
TerrifiedBug Mar 28, 2026
9c50f4e
feat: add filter preset UI components and integrate with toolbars
TerrifiedBug Mar 28, 2026
12e7b2d
feat(api-v1): add advanced operations endpoints (Phase 3)
TerrifiedBug Mar 28, 2026
3e77092
feat(gitops): add git sync tRPC router for status, jobs, and retries
TerrifiedBug Mar 28, 2026
6319109
feat: add drift indicators and overall compliance to node group cards
TerrifiedBug Mar 28, 2026
567fe03
feat: add fleet.matrixSummary procedure for node aggregate cards
TerrifiedBug Mar 28, 2026
59849a0
feat(gitops): add git sync status UI to environment settings
TerrifiedBug Mar 28, 2026
a8c6c3f
feat: add version drift warning icon to deployment matrix cells
TerrifiedBug Mar 28, 2026
79173cc
feat: add NodeSummaryCards component for fleet matrix overview
TerrifiedBug Mar 28, 2026
016a935
feat(gitops): show sync failure warning badge on environment list
TerrifiedBug Mar 28, 2026
ad57e68
feat: redesign fleet matrix with node summary cards and filtered view
TerrifiedBug Mar 28, 2026
40e3a16
feat(api-v1): register 25 new endpoints in OpenAPI spec, bump to v2.0.0
TerrifiedBug Mar 28, 2026
9d42e66
feat: add drift KPI card to fleet overview dashboard
TerrifiedBug Mar 28, 2026
99fb8a0
feat(gitops): multi-provider support in Git Integration UI
TerrifiedBug Mar 28, 2026
f8daa3e
feat: add row density toggle to pipeline list toolbar
TerrifiedBug Mar 28, 2026
fe49ed8
test: verify configChecksum flows through heartbeat to batch upsert
TerrifiedBug Mar 28, 2026
e13e99a
test(gitops): add integration tests for provider detection and registry
TerrifiedBug Mar 28, 2026
ed0160a
docs(gitops): update docs for multi-provider support and sync reliabi…
TerrifiedBug Mar 28, 2026
66020d7
feat: add keyboard navigation to pipeline list table
TerrifiedBug Mar 28, 2026
f74ea0f
fix: add drift query mocks to node-group test groupHealthStats suite
TerrifiedBug Mar 28, 2026
b2bd1f5
feat: auto-apply default filter preset on page load
TerrifiedBug Mar 28, 2026
ef62b5a
feat: production polish — log viewer, pipeline editor, alerts, settin…
TerrifiedBug Mar 28, 2026
00fb733
test: add scale test for fleet.matrixSummary (200 pipelines x 10 nodes)
TerrifiedBug Mar 28, 2026
5e9c89d
fix(gitops): update existing tests for multi-provider webhook handler
TerrifiedBug Mar 28, 2026
a3f0564
fix: use const for configDriftCount that is never reassigned
TerrifiedBug Mar 28, 2026
47063e1
fix(gitops): resolve lint warnings in bitbucket provider and webhook …
TerrifiedBug Mar 28, 2026
24bf061
fix(gitops): pass gitPath to git sync functions in retry service and …
TerrifiedBug Mar 28, 2026
4beb4a9
fix(api-v1): add missing audit logs, date validation, rate limiter cl…
TerrifiedBug Mar 28, 2026
810ed80
fix: resolve hook ordering bug, add canvas search highlighting, write…
TerrifiedBug Mar 28, 2026
ac1cd6a
fix: wire up server-side filters, matrix UX, and filter preset safety
TerrifiedBug Mar 28, 2026
9bcdcf2
merge: scale ux — server-side pagination, matrix redesign, saved filters
TerrifiedBug Mar 28, 2026
cbb1673
merge: api v1 — 25 new REST endpoints, rate limiting, expanded permis…
TerrifiedBug Mar 28, 2026
1f4ed07
merge: gitops — multi-provider support, retry, approval gate, sync st…
TerrifiedBug Mar 28, 2026
ee8912a
merge: production polish — log viewer, editor, alerts, settings, dash…
TerrifiedBug Mar 28, 2026
7b219c0
fix: resolve 11 TypeScript errors across enterprise scale features
TerrifiedBug Mar 28, 2026
4 changes: 4 additions & 0 deletions agent/internal/agent/agent.go
@@ -165,11 +165,15 @@ func (a *Agent) pollAndApply() {
 			slog.Info("starting pipeline", "name", action.Name, "version", action.Version)
 			if err := a.supervisor.Start(action.PipelineID, action.ConfigPath, action.Version, action.LogLevel, action.Secrets); err != nil {
 				slog.Error("failed to start pipeline", "pipeline", action.PipelineID, "error", err)
+			} else if action.Checksum != "" {
+				a.supervisor.SetConfigChecksum(action.PipelineID, action.Checksum)
 			}
 		case ActionRestart:
 			slog.Info("restarting pipeline", "name", action.Name, "version", action.Version, "reason", "config changed")
 			if err := a.supervisor.Restart(action.PipelineID, action.ConfigPath, action.Version, action.LogLevel, action.Secrets); err != nil {
 				slog.Error("failed to restart pipeline", "pipeline", action.PipelineID, "error", err)
+			} else if action.Checksum != "" {
+				a.supervisor.SetConfigChecksum(action.PipelineID, action.Checksum)
 			}
 		case ActionStop:
 			slog.Info("stopping pipeline", "pipeline", action.PipelineID, "reason", "removed from config")
3 changes: 3 additions & 0 deletions agent/internal/agent/heartbeat.go
@@ -76,6 +76,9 @@ func buildHeartbeat(sup *supervisor.Supervisor, vectorVersion string, deployment
 			}
 		}

+		// Include config checksum from last applied config
+		ps.ConfigChecksum = s.ConfigChecksum
+
 		// Include recent stdout/stderr lines (max 100 per heartbeat)
 		logs := sup.GetRecentLogs(s.PipelineID)
 		if len(logs) > 100 {
3 changes: 3 additions & 0 deletions agent/internal/agent/poller.go
@@ -59,6 +59,7 @@ type PipelineAction struct {
 	ConfigPath string
 	LogLevel   string
 	Secrets    map[string]string
+	Checksum   string
 }

 // Poll fetches config from VectorFlow and returns actions to take.
@@ -117,6 +118,7 @@ func (p *poller) Poll() ([]PipelineAction, error) {
 				ConfigPath: configPath,
 				LogLevel:   pc.LogLevel,
 				Secrets:    pc.Secrets,
+				Checksum:   pc.Checksum,
 			})
 		} else if prev.checksum != pc.Checksum {
 			// Config changed — rewrite and restart
@@ -132,6 +134,7 @@ func (p *poller) Poll() ([]PipelineAction, error) {
 				ConfigPath: configPath,
 				LogLevel:   pc.LogLevel,
 				Secrets:    pc.Secrets,
+				Checksum:   pc.Checksum,
 			})
 		} else if prev.version != pc.Version {
 			// Version bumped but config unchanged — update version without restart
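The branch order visible in these hunks (unknown pipeline, then checksum change, then version-only change) can be read as a small decision function. The following is a minimal sketch of that ordering only; the `action` constants, the `appliedState` type, and `decide` are hypothetical names for illustration and are not taken from the repository.

```go
package main

// action is a hypothetical stand-in for the agent's action kinds.
type action string

const (
	actionStart         action = "start"          // pipeline not seen before
	actionRestart       action = "restart"        // config checksum changed
	actionUpdateVersion action = "update-version" // version bumped, config identical
	actionNone          action = "none"           // nothing to do
)

// appliedState is what a poller could remember about the last applied config.
type appliedState struct {
	checksum string
	version  int
}

// decide mirrors the branch order shown in the diff: a nil prev means the
// pipeline is new; otherwise a checksum change wins over a version change.
func decide(prev *appliedState, checksum string, version int) action {
	switch {
	case prev == nil:
		return actionStart
	case prev.checksum != checksum:
		return actionRestart
	case prev.version != version:
		return actionUpdateVersion
	default:
		return actionNone
	}
}
```

Checking the checksum before the version means a config change always restarts the process, even when the version number happens to be unchanged.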
1 change: 1 addition & 0 deletions agent/internal/client/client.go
@@ -167,6 +167,7 @@ type PipelineStatus struct {
 	ComponentMetrics []ComponentMetric `json:"componentMetrics,omitempty"`
 	Utilization      float64           `json:"utilization,omitempty"`
 	RecentLogs       []string          `json:"recentLogs,omitempty"`
+	ConfigChecksum   string            `json:"configChecksum,omitempty"`
 }

 // ComponentMetric holds per-component metrics for editor node overlays.
53 changes: 32 additions & 21 deletions agent/internal/supervisor/supervisor.go
@@ -14,20 +14,21 @@ import (
 )

 type ProcessInfo struct {
-	PipelineID  string
-	Version     int
-	PID         int
-	Status      string // RUNNING, STARTING, STOPPED, CRASHED
-	StartedAt   time.Time
-	MetricsPort int
-	APIPort     int
-	LogLevel    string
-	Secrets     map[string]string
-	cmd         *exec.Cmd
-	configPath  string
-	restarts    int
-	done        chan struct{}
-	logBuf      *logbuf.RingBuffer
+	PipelineID     string
+	Version        int
+	PID            int
+	Status         string // RUNNING, STARTING, STOPPED, CRASHED
+	StartedAt      time.Time
+	MetricsPort    int
+	APIPort        int
+	LogLevel       string
+	Secrets        map[string]string
+	ConfigChecksum string
+	cmd            *exec.Cmd
+	configPath     string
+	restarts       int
+	done           chan struct{}
+	logBuf         *logbuf.RingBuffer
 }

 type Supervisor struct {
@@ -239,18 +240,28 @@ func (s *Supervisor) Statuses() []ProcessInfo {
 	var result []ProcessInfo
 	for _, info := range s.processes {
 		result = append(result, ProcessInfo{
-			PipelineID:  info.PipelineID,
-			Version:     info.Version,
-			PID:         info.PID,
-			Status:      info.Status,
-			StartedAt:   info.StartedAt,
-			MetricsPort: info.MetricsPort,
-			APIPort:     info.APIPort,
+			PipelineID:     info.PipelineID,
+			Version:        info.Version,
+			PID:            info.PID,
+			Status:         info.Status,
+			StartedAt:      info.StartedAt,
+			MetricsPort:    info.MetricsPort,
+			APIPort:        info.APIPort,
+			ConfigChecksum: info.ConfigChecksum,
 		})
 	}
 	return result
 }

+// SetConfigChecksum stores the config checksum applied for a pipeline.
+func (s *Supervisor) SetConfigChecksum(pipelineID, checksum string) {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+	if info, ok := s.processes[pipelineID]; ok {
+		info.ConfigChecksum = checksum
+	}
+}
+
 // GetRecentLogs returns and clears the recent log lines for a pipeline.
 func (s *Supervisor) GetRecentLogs(pipelineID string) []string {
 	s.mu.Lock()
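The `SetConfigChecksum` and `Statuses` changes in this file rely on a mutex-guarded map plus copy-on-read snapshots. A stripped-down sketch of that pattern follows, assuming the map holds pointers (which the in-place assignment in the diff suggests). The `registry` and `procInfo` types and their methods are hypothetical stand-ins, not the real `Supervisor` API.

```go
package main

import "sync"

// procInfo is a reduced, hypothetical version of ProcessInfo.
type procInfo struct {
	PipelineID     string
	ConfigChecksum string
}

// registry guards a map of pointers with a mutex, like the supervisor does.
type registry struct {
	mu        sync.Mutex
	processes map[string]*procInfo
}

func newRegistry() *registry {
	return &registry{processes: make(map[string]*procInfo)}
}

// add registers a pipeline with an empty checksum.
func (r *registry) add(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.processes[id] = &procInfo{PipelineID: id}
}

// setConfigChecksum records the checksum for a known pipeline. Unknown IDs
// are ignored; the bool is an addition in this sketch so callers can tell.
func (r *registry) setConfigChecksum(id, checksum string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	info, ok := r.processes[id]
	if ok {
		info.ConfigChecksum = checksum
	}
	return ok
}

// statuses returns value copies, so callers cannot mutate shared state.
func (r *registry) statuses() []procInfo {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]procInfo, 0, len(r.processes))
	for _, info := range r.processes {
		out = append(out, *info)
	}
	return out
}
```

Because the checksum is only written after a successful start or restart in `agent.go`, a heartbeat built from such a snapshot reports the last config that was actually applied rather than the last one that was requested.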
80 changes: 62 additions & 18 deletions docs/public/operations/gitops.md
@@ -2,15 +2,28 @@

 VectorFlow supports **pipeline-as-code** workflows where pipeline configurations are stored in a Git repository and kept in sync between VectorFlow and your version control system.

+## Supported Git Providers
+
+VectorFlow supports the following Git hosting providers:
+
+| Provider | Webhook Verification | API Operations |
+|----------|---------------------|----------------|
+| **GitHub** | HMAC-SHA256 (`X-Hub-Signature-256`) | Contents API, Pulls API |
+| **GitLab** | Shared secret (`X-Gitlab-Token`) | Repository Files API, Merge Requests API |
+| **Bitbucket** | HMAC-SHA256 (`X-Hub-Signature`) | Source API, Pullrequests API |
+
+The provider is **auto-detected** from the repository URL domain (e.g., `github.com`, `gitlab.com`, `bitbucket.org`). For self-hosted instances (e.g., `gitlab.internal.corp`), you can explicitly set the provider in the Git Integration settings.
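As an illustration of the auto-detection rule described above, here is a minimal Go sketch that maps a repository URL host to a provider and falls back to an explicit setting for self-hosted instances. The function name `detectProvider`, the provider strings, and the error message are assumptions for this example; the actual implementation in the PR is not shown in this diff.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// detectProvider returns the explicit override when one is set; otherwise it
// guesses the provider from the repository URL host. Hosts it does not
// recognise (for example self-hosted GitLab) need the explicit setting.
func detectProvider(repoURL, override string) (string, error) {
	if override != "" {
		return override, nil
	}
	u, err := url.Parse(repoURL)
	if err != nil {
		return "", fmt.Errorf("invalid repository URL: %w", err)
	}
	switch strings.ToLower(u.Hostname()) {
	case "github.com":
		return "github", nil
	case "gitlab.com":
		return "gitlab", nil
	case "bitbucket.org":
		return "bitbucket", nil
	}
	return "", fmt.Errorf("cannot detect provider for host %q; select one explicitly", u.Hostname())
}
```

With this sketch, `detectProvider("https://github.com/org/pipeline-configs.git", "")` resolves to `github`, while `https://gitlab.internal.corp/...` returns an error unless the override is set to `gitlab`.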

 ## Modes

-Each environment can operate in one of three GitOps modes:
+Each environment can operate in one of four GitOps modes:

 | Mode | Direction | Description |
 |------|-----------|-------------|
 | **Off** | -- | Git integration is disabled (default). |
 | **Push Only** | VectorFlow -> Git | Pipeline YAML is committed to the repo whenever you deploy or delete a pipeline. The repo serves as an audit trail. |
-| **Bi-directional** | VectorFlow <-> Git | In addition to push, a webhook from GitHub triggers VectorFlow to import changed YAML files automatically. |
+| **Bi-directional** | VectorFlow <-> Git | In addition to push, a webhook from your Git provider triggers VectorFlow to import changed YAML files automatically. |
+| **Promotion** | VectorFlow -> Git -> VectorFlow | Promoting a pipeline creates a pull request (or merge request). Merging it automatically deploys the promoted config. |

 ## Setting up Push Only

@@ -23,6 +36,7 @@ On the environment detail page, fill in the **Git Integration** card:
 - **Repository URL** -- HTTPS URL of the target repo (e.g., `https://github.com/org/pipeline-configs.git`)
 - **Branch** -- The branch to push to (default: `main`)
 - **Access Token** -- A personal access token with write access
+- **Git Provider** -- Leave as "Auto-detect" for hosted providers, or explicitly select for self-hosted instances
 {% endstep %}
 {% step %}
 ### Set GitOps Mode to Push Only
@@ -37,7 +51,7 @@ Click **Save**. You can verify connectivity with **Test Connection** before savi
 From this point forward, every pipeline deploy writes the generated YAML to `{environment-name}/{pipeline-name}.yaml` in the configured repository, and every pipeline deletion removes the file.

 {% hint style="info" %}
-Git sync is a post-deploy side effect. If the Git push fails, the pipeline deploy still succeeds -- you will see a warning in the VectorFlow logs.
+Git sync is a post-deploy side effect. If the Git push fails, the pipeline deploy still succeeds -- VectorFlow automatically queues the failed sync for retry (up to 3 attempts with exponential backoff). You can monitor sync status in the **Git Sync Status** section on the environment page.
 {% endhint %}

 ## Setting up Bi-directional GitOps
@@ -56,44 +70,74 @@ Select **Bi-directional** from the **GitOps Mode** dropdown and click **Save**.
 {% step %}
 ### Copy the webhook details
 After saving, the card shows:
-- **Webhook URL** -- The endpoint GitHub should send push events to.
-- **Webhook Secret** -- The HMAC secret for signature verification.
+- **Webhook URL** -- The endpoint your Git provider should send push events to.
+- **Webhook Secret** -- The secret for signature verification.
 {% endstep %}
 {% step %}
-### Create a GitHub Webhook
-In your GitHub repository, go to **Settings > Webhooks > Add webhook** and enter:
-- **Payload URL** -- Paste the Webhook URL from VectorFlow.
-- **Content type** -- Select `application/json`.
-- **Secret** -- Paste the Webhook Secret from VectorFlow.
-- **Events** -- Select **Just the push event**.
-
-Click **Add webhook**.
+### Create a Webhook in your Git provider
 {% endstep %}
 {% endstepper %}

 {% tabs %}
 {% tab title="GitHub" %}
-Navigate to your repository on GitHub, then go to **Settings > Webhooks > Add webhook**. Fill in the Payload URL, select `application/json`, paste the secret, and choose the push event.
+In your GitHub repository, go to **Settings > Webhooks > Add webhook**:
+- **Payload URL** -- Paste the Webhook URL from VectorFlow.
+- **Content type** -- Select `application/json`.
+- **Secret** -- Paste the Webhook Secret from VectorFlow.
+- **Events** -- Select **Just the push event** (and **Pull requests** if using Promotion mode).
 {% endtab %}
 {% tab title="GitLab" %}
-GitLab uses a different header (`X-Gitlab-Token`) for secret verification. GitLab support is not yet available -- contact the team if you need it.
+In your GitLab project, go to **Settings > Webhooks > Add new webhook**:
+- **URL** -- Paste the Webhook URL from VectorFlow.
+- **Secret token** -- Paste the Webhook Secret from VectorFlow.
+- **Trigger** -- Check **Push events** (and **Merge request events** if using Promotion mode).
 {% endtab %}
+{% tab title="Bitbucket" %}
+In your Bitbucket repository, go to **Repository settings > Webhooks > Add webhook**:
+- **URL** -- Paste the Webhook URL from VectorFlow.
+- **Secret** -- Paste the Webhook Secret from VectorFlow.
+- **Triggers** -- Select **Repository push** (and **Pull request merged/declined** if using Promotion mode).
+{% endtab %}
 {% endtabs %}

 ## How the import works

 When a push event arrives:

-1. VectorFlow verifies the HMAC signature using the webhook secret.
+1. VectorFlow verifies the webhook signature using the appropriate method for the Git provider.
 2. It checks that the push targets the configured branch.
-3. For each added or modified `.yaml` / `.yml` file in the push, it fetches the file content via the GitHub API.
+3. For each added or modified `.yaml` / `.yml` file in the push, it fetches the file content via the provider's API.
 4. The pipeline name is derived from the filename (e.g., `production/my-pipeline.yaml` becomes `my-pipeline`).
-5. If a pipeline with that name already exists in the environment, its graph is replaced. Otherwise, a new pipeline is created.
+5. If a pipeline with a matching `gitPath` or name already exists in the environment, its graph is replaced. Otherwise, a new pipeline is created.
+6. If the environment has **Require Deploy Approval** enabled, imported pipelines are saved as drafts with a pending deploy request instead of being deployed immediately.
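Steps 1 and 4 of the import flow above can be sketched in Go as follows. This is an illustration only: it assumes the `sha256=<hex>` header format that GitHub uses for `X-Hub-Signature-256`, a plain shared-token comparison for GitLab's `X-Gitlab-Token`, and hypothetical helper names (`sign`, `verifyHMAC`, `verifyToken`, `pipelineNameFromPath`) that do not come from the VectorFlow codebase.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"path"
	"strings"
)

// sign builds a "sha256=<hex>" signature header value for a request body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verifyHMAC checks an HMAC-SHA256 signature header in constant time.
func verifyHMAC(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, "sha256=") {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, "sha256="))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

// verifyToken compares a shared-secret header value in constant time.
func verifyToken(secret, header string) bool {
	return subtle.ConstantTimeCompare([]byte(secret), []byte(header)) == 1
}

// pipelineNameFromPath derives the pipeline name from a changed file path,
// e.g. "production/my-pipeline.yaml" -> "my-pipeline". Non-YAML files are
// reported with ok == false so the caller can skip them.
func pipelineNameFromPath(filePath string) (name string, ok bool) {
	ext := path.Ext(filePath)
	if ext != ".yaml" && ext != ".yml" {
		return "", false
	}
	return strings.TrimSuffix(path.Base(filePath), ext), true
}
```

Both comparisons use constant-time functions so that a mismatch does not leak how many leading bytes of the secret or signature were correct.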

 {% hint style="warning" %}
 Bi-directional mode means the Git repository is the source of truth. Any manual edits made in the VectorFlow UI may be overwritten on the next push to the repository. The pipeline editor shows a banner to remind users of this.
 {% endhint %}

+## Pipeline Name / Filename Decoupling
+
+VectorFlow uses a stable `gitPath` field to track the file path in Git for each pipeline. This means:
+
+- **Renaming a pipeline** in VectorFlow does not change its filename in Git. The original path is preserved.
+- **First sync** automatically assigns a `gitPath` based on the environment and pipeline name slugs.
+- **Webhook imports** match files by `gitPath` first, then by name as a fallback.
+
+This prevents broken sync when pipelines are renamed after initial setup.
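The matching order described above (by `gitPath` first, then by name) can be sketched as a two-pass lookup. The `pipeline` struct and `findPipeline` function are hypothetical and only illustrate the precedence; they are not the PR's actual data model.

```go
package main

// pipeline is a reduced, hypothetical view of a stored pipeline.
type pipeline struct {
	Name    string
	GitPath string // empty until the first sync assigns it
}

// findPipeline returns the pipeline that owns filePath. A gitPath match wins;
// if none exists, it falls back to the name derived from the filename.
// It returns nil when the file should create a new pipeline.
func findPipeline(pipelines []pipeline, filePath, derivedName string) *pipeline {
	for i := range pipelines {
		if pipelines[i].GitPath != "" && pipelines[i].GitPath == filePath {
			return &pipelines[i]
		}
	}
	for i := range pipelines {
		if pipelines[i].Name == derivedName {
			return &pipelines[i]
		}
	}
	return nil
}
```

Running the `gitPath` pass to completion before trying names is what keeps a renamed pipeline attached to its original file, even if another pipeline later takes the old name.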

## Sync Status and Retries

The **Git Sync Status** section on the environment detail page shows:

- **Health badge** -- Green (healthy), yellow (pending retries), or red (failed).
- **Last successful sync** timestamp.
- **Recent sync jobs** with status, attempt count, and per-job retry buttons.
- **Import errors** from webhook events (YAML parse failures, invalid filenames, etc.).

Failed sync operations are automatically retried up to 3 times with exponential backoff (30 seconds, 2 minutes, 10 minutes). After all retries are exhausted, a `git_sync_failed` alert is fired (if subscribed). You can also manually retry failed jobs from the UI.
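The retry schedule quoted above (30 seconds, 2 minutes, 10 minutes) can be modelled as a fixed delay table. The sketch below assumes the three retries come after one initial attempt, which the text does not state explicitly, and uses hypothetical names (`retryDelays`, `nextRetry`, `syncWithRetry`, `noSleep`, `errTransient`). The real retry service in the PR is a queued job system and may differ.

```go
package main

import (
	"errors"
	"time"
)

// retryDelays is the schedule quoted in the docs, indexed by retry number.
var retryDelays = []time.Duration{30 * time.Second, 2 * time.Minute, 10 * time.Minute}

// errTransient is a placeholder failure for examples and tests.
var errTransient = errors.New("git sync failed")

// nextRetry returns the delay before the given retry (1-based) and whether
// that retry is still within the schedule.
func nextRetry(retry int) (time.Duration, bool) {
	if retry < 1 || retry > len(retryDelays) {
		return 0, false
	}
	return retryDelays[retry-1], true
}

// noSleep skips waiting; useful for dry runs and tests.
func noSleep(time.Duration) {}

// syncWithRetry runs op once and retries it on failure following the
// schedule. It returns the number of attempts made and the last error.
// A non-nil error means all retries were exhausted, the point at which the
// docs say a git_sync_failed alert is fired.
func syncWithRetry(op func() error, sleep func(time.Duration)) (int, error) {
	err := op()
	attempts := 1
	for retry := 1; err != nil; retry++ {
		delay, ok := nextRetry(retry)
		if !ok {
			break
		}
		sleep(delay)
		err = op()
		attempts++
	}
	return attempts, err
}
```

Passing `time.Sleep` as the `sleep` argument gives the real waits; injecting the sleeper keeps the schedule testable without waiting over twelve minutes.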

+The environment list page shows a warning badge when an environment has unresolved sync failures.
+
 ## File layout

 VectorFlow expects pipeline YAML files to follow the standard Vector configuration format:
4 changes: 4 additions & 0 deletions package.json
@@ -23,6 +23,7 @@
"@prisma/client": "^7.4.2",
"@prisma/client-runtime-utils": "^7.4.2",
"@tanstack/react-query": "^5.90.21",
"@tanstack/react-virtual": "^3.13.23",
"@trpc/client": "^11.8.0",
"@trpc/server": "^11.8.0",
"@trpc/tanstack-react-query": "^11.8.0",
@@ -74,6 +75,8 @@
"@asteasolutions/zod-to-openapi": "^8.5.0",
"@next/bundle-analyzer": "^16.2.1",
"@tailwindcss/postcss": "^4",
"@testing-library/jest-dom": "^6.9.1",
"@testing-library/react": "^16.3.2",
"@types/bcryptjs": "^3.0.0",
"@types/dagre": "^0.7.54",
"@types/js-yaml": "^4.0.9",
@@ -84,6 +87,7 @@
"@types/react-dom": "^19",
"eslint": "^9",
"eslint-config-next": "16.1.6",
"jsdom": "^29.0.1",
"monaco-editor": "^0.55.1",
"prisma": "^7.4.2",
"shadcn": "^3.8.5",