feat: notification channels for alerts #37
Greptile Summary
This PR introduces a full notification channels system as a first-class replacement for legacy webhooks, adding Slack (Block Kit), Email (nodemailer/SMTP), PagerDuty (Events API v2), and generic HMAC-signed Webhook delivery drivers alongside a complete tRPC CRUD API and a polished channel management UI.
Confidence Score: 4/5
Sequence Diagram
sequenceDiagram
participant Agent
participant Heartbeat as /api/agent/heartbeat
participant AlertEval as alert-evaluator
participant LegacyWH as deliverWebhooks
participant Channels as deliverToChannels
participant DB as PostgreSQL
Agent->>Heartbeat: POST metrics/statuses
Heartbeat->>AlertEval: evaluateAlerts()
AlertEval-->>Heartbeat: alerts[] to deliver
loop for each alert
Heartbeat->>LegacyWH: await deliverWebhooks(envId, payload)
LegacyWH->>DB: findMany AlertWebhook where envId
LegacyWH-->>Heartbeat: done (awaited)
Heartbeat->>Channels: deliverToChannels(envId, ruleId, payload) [fire-and-forget]
Note over Channels: Resolves channel routing
Channels->>DB: findMany AlertRuleChannel where ruleId
alt linked channels exist
Channels->>DB: use enabled linked channels only
else no links
Channels->>DB: findMany NotificationChannel where envId+enabled
end
Channels->>Channels: dispatch to slack/email/pagerduty/webhook drivers
end
Heartbeat-->>Agent: 200 OK (before channel delivery completes)
Last reviewed commit: 1aa7c05
Add NotificationChannel and AlertRuleChannel models to Prisma schema with corresponding SQL migration. Create channel delivery service with drivers for Slack (Block Kit), Email (nodemailer/SMTP), PagerDuty (Events API v2), and generic Webhook (HMAC-signed). Install nodemailer dependency.
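The generic Webhook driver is described as HMAC-signed, but the signing scheme is not shown here. The following is a minimal sketch of one common approach, assuming HMAC-SHA256 over the raw request body with a hex digest. The helper names `signPayload` and `verifySignature` are hypothetical and not taken from this PR.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: sign the exact string that will be sent as the request body.
export function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Hypothetical receiver-side check: recompute the signature and compare in constant time.
export function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Signing the serialized body rather than the parsed object matters: the receiver has to hash the same bytes the sender hashed, so the JSON should be serialized once and reused for both the signature and the request.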
Add listChannels, createChannel, updateChannel, deleteChannel, and testChannel tRPC procedures. Update createRule and updateRule to accept optional channelIds for linking alert rules to specific channels. Update withTeamAccess and audit middleware to resolve NotificationChannel entities.
Call deliverToChannels alongside existing deliverWebhooks in the heartbeat route for each fired/resolved alert. Legacy webhook delivery is preserved for backward compatibility.
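As a rough illustration of the delivery order in the sequence diagram (legacy webhooks awaited, channel delivery fire-and-forget), the loop might look like the sketch below. The `Deliverers` interface, the `AlertPayload` shape, and the `deliverAlerts` wrapper are assumptions made so the example is self-contained; only the names `deliverWebhooks` and `deliverToChannels` come from the PR.

```typescript
// Assumed payload shape, for illustration only.
type AlertPayload = { ruleId: string; status: "firing" | "resolved"; message: string };

// The two delivery functions are injected so the sketch runs without a database.
interface Deliverers {
  deliverWebhooks(envId: string, payload: AlertPayload): Promise<void>;
  deliverToChannels(envId: string, ruleId: string, payload: AlertPayload): Promise<void>;
}

export async function deliverAlerts(
  envId: string,
  alerts: AlertPayload[],
  deliverers: Deliverers,
  onError: (error: unknown) => void = console.error,
): Promise<void> {
  for (const alert of alerts) {
    // Legacy webhooks are awaited, as in the diagram.
    await deliverers.deliverWebhooks(envId, alert);
    // Channel delivery is not awaited; the catch keeps a failure from
    // surfacing as an unhandled promise rejection.
    void deliverers.deliverToChannels(envId, alert.ruleId, alert).catch(onError);
  }
}
```

The trade-off is that the heartbeat response no longer reflects whether channel delivery succeeded, so failures have to be visible through logging or some other side channel.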
Replace the standalone Webhooks section with a full Notification Channels section supporting Slack, Email, PagerDuty, and Webhook types. Each type has a dedicated config form. Channels can be tested, toggled, edited, and deleted. Alert rule create/edit dialogs now include a multi-select for notification channels. Legacy webhooks section is preserved but only shown when legacy webhooks exist.
Cover Slack, Email, PagerDuty, and Webhook channel setup with type-specific examples in tabbed format. Document channel routing, legacy webhook migration path, and updated alert rule configuration.
- Validate channelIds belong to the rule's environment before creating AlertRuleChannel records (prevents cross-environment channel linking)
- Wrap channel link replacement in updateRule in a $transaction for atomicity
- Add withAudit middleware to testChannel mutation
- Fix deliverToChannels fallback: when channels are explicitly linked but all disabled, do not fall back to all environment channels
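The routing rule and the environment check described above can be sketched as two pure functions. This is an illustration under assumptions, not the PR's code: the `Channel` shape and the function names `resolveTargetChannels` and `findForeignChannelIds` are hypothetical, and the real implementation queries Prisma rather than taking arrays.

```typescript
// Assumed minimal channel shape, for illustration only.
interface Channel {
  id: string;
  environmentId: string;
  enabled: boolean;
}

// If a rule has explicit links, deliver only to the enabled linked channels,
// even when that leaves nothing to deliver to. Fall back to all enabled
// environment channels only when the rule has no links at all.
export function resolveTargetChannels(linked: Channel[], environmentChannels: Channel[]): Channel[] {
  if (linked.length > 0) {
    return linked.filter((channel) => channel.enabled);
  }
  return environmentChannels.filter((channel) => channel.enabled);
}

// Returns the requested IDs that do not belong to the rule's environment;
// a non-empty result means the link request should be rejected.
export function findForeignChannelIds(channelIds: string[], environmentChannels: Channel[]): string[] {
  const allowed = new Set(environmentChannels.map((channel) => channel.id));
  return channelIds.filter((id) => !allowed.has(id));
}
```

Keeping the "linked but all disabled" case separate from the "no links" case is what prevents a disabled channel from silently widening delivery to every channel in the environment.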
…uard for SMTP hosts
- Rebase onto main to pick up DashboardView, PipelineSli, node labels, nodeSelector, and OIDC group sync fields added in recent PRs
- Fix TS error: use storeKey instead of undefined componentKey variable in LiveTailPanel prop
- Restructure updateChannel validation to run required-field checks AFTER sensitive-field preservation, preventing empty config payloads from creating broken channels
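The ordering fix in updateChannel (preserve sensitive fields first, then check required fields) could look roughly like the sketch below. Only `integrationKey` is named in this PR's text; every other field name, the two lookup tables, and the function name `mergeChannelConfig` are illustrative assumptions.

```typescript
type ChannelType = "slack" | "email" | "pagerduty" | "webhook";
type ChannelConfig = Record<string, string | undefined>;

// Assumption: field names other than integrationKey are placeholders.
const SENSITIVE_FIELDS: Record<ChannelType, string[]> = {
  slack: [],
  email: ["smtpPassword"],
  pagerduty: ["integrationKey"],
  webhook: ["secret"],
};

const REQUIRED_FIELDS: Record<ChannelType, string[]> = {
  slack: ["webhookUrl"],
  email: ["smtpHost"],
  pagerduty: ["integrationKey"],
  webhook: ["url"],
};

export function mergeChannelConfig(
  type: ChannelType,
  existing: ChannelConfig,
  incoming: ChannelConfig,
): ChannelConfig {
  const merged: ChannelConfig = { ...incoming };
  // Step 1: a blank sensitive field means "keep the stored value",
  // because the API redacts these fields in responses.
  for (const key of SENSITIVE_FIELDS[type]) {
    if (!merged[key] && existing[key]) merged[key] = existing[key];
  }
  // Step 2: only now check required fields, so a preserved secret counts
  // as present while a genuinely empty config is still rejected.
  const missing = REQUIRED_FIELDS[type].filter((key) => !merged[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required fields: ${missing.join(", ")}`);
  }
  return merged;
}
```

Running the checks in the opposite order would reject every edit that leaves a redacted field blank, which is the failure mode the commit describes.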
1b9e587 to a015226
The API redacts integrationKey in responses, but the client-side form validation required it to be non-empty. This made it impossible to edit PagerDuty channels. Skip the integrationKey requirement when editing (the server already preserves the existing key) and show a placeholder hint so users know they can leave it blank.
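On the client side, the relaxed rule amounts to making the key mandatory only on create. A minimal sketch, assuming a plain validation function; the name `validatePagerDutyForm` and the error string are hypothetical and not taken from the PR.

```typescript
// Returns an error message, or null when the form values are acceptable.
export function validatePagerDutyForm(
  values: { integrationKey: string },
  isEditing: boolean,
): string | null {
  // When editing, a blank key means "keep the existing key"; the server preserves it.
  if (!isEditing && values.integrationKey.trim() === "") {
    return "Integration key is required";
  }
  return null;
}
```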
<DialogHeader>
  <DialogTitle>
    {editingChannelId
      ? "Edit Notification Channel"
Shared testMutation disables all test buttons simultaneously
testMutation is a single hook instance shared across every channel row. When a test is in-progress for any channel, testMutation.isPending is true for the whole component, so every test button in the table becomes disabled — even for channels that aren't being tested.
A user who wants to test a second channel while the first is still running will see all buttons grayed out with no indication of why. Consider tracking which channel is currently under test:
```typescript
const [testingChannelId, setTestingChannelId] = useState<string | null>(null);

const testMutation = useMutation(
  trpc.alert.testChannel.mutationOptions({
    onSuccess: (result) => {
      setTestingChannelId(null);
      // ...
    },
    onError: (error) => {
      setTestingChannelId(null);
      // ...
    },
  }),
);

// In the row:
onClick={() => {
  setTestingChannelId(channel.id);
  testMutation.mutate({ id: channel.id });
}}
disabled={testingChannelId === channel.id}
```
Path: src/app/(dashboard)/alerts/page.tsx
Line: 991
Summary
- `NotificationChannel` and `AlertRuleChannel` Prisma models with SQL migration
- `channelIds` for per-rule channel routing
- `deliverToChannels` into the heartbeat alert delivery pipeline alongside existing webhook delivery

Test plan
- `npx prisma migrate deploy` against a database to apply the migration
- `AlertRuleChannel` records are created