feat: endpoint security hardening — auth, rate limiting, info leak fixes#118

Merged
TerrifiedBug merged 11 commits into main from feat/endpoint-security-hardening
Mar 29, 2026

@TerrifiedBug

Summary

Secures 7 unauthenticated endpoints identified in a security audit:

  • /api/metrics — now requires Bearer token auth (BREAKING: METRICS_AUTH_REQUIRED env var removed, auth is always on)
  • /api/agent/enroll — IP rate limited to 10 req/min
  • /api/webhooks/git — IP rate limited to 30 req/min
  • /api/setup — IP rate limited to 5 req/min
  • /api/v1/docs — requires NextAuth session + CSP header added
  • /api/v1/openapi.json — requires NextAuth session, wildcard CORS removed
  • /api/health — stripped db field from response (BREAKING only for clients that parse the response body; status-code-only health checks are unaffected)

New infrastructure

  • src/app/api/_lib/ip-rate-limit.ts — IP-based rate limit helper reusing existing RateLimiter with rightmost x-forwarded-for extraction
  • RateLimiter.checkKey() method for explicit key-based rate limiting without tier suffix
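
As a rough illustration of the rightmost x-forwarded-for extraction mentioned above, here is a minimal sketch. The names `HeaderReader` and `extractClientIp` are hypothetical and not taken from the PR; the fallback order (x-forwarded-for, then x-real-ip, then "unknown") follows the review summary of ip-rate-limit.ts. The rightmost entry is the one appended by the proxy closest to the app, so it is harder for a client to spoof than the leftmost one, assuming exactly one trusted reverse proxy sits in front of the server.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the PR's actual code.
type HeaderReader = { get(name: string): string | null };

/** Pick the client IP: rightmost x-forwarded-for entry, then x-real-ip, then "unknown". */
function extractClientIp(headers: HeaderReader): string {
  const forwarded = headers.get("x-forwarded-for");
  if (forwarded) {
    // The header is a comma-separated list; each proxy appends the peer it saw.
    const parts = forwarded
      .split(",")
      .map((part) => part.trim())
      .filter((part) => part.length > 0);
    const last = parts[parts.length - 1];
    if (last) return last;
  }
  const realIp = headers.get("x-real-ip")?.trim();
  return realIp ? realIp : "unknown";
}
```

With more than one trusted proxy in the chain, the rightmost entry would be a proxy address rather than the client, so the index to read would need to be configurable.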

Breaking changes

| Endpoint | Change | Migration |
| --- | --- | --- |
| /api/metrics | Auth required always | Add metrics.read service account token to Prometheus scrape config |
| /api/health | No db field in response | Use HTTP status code (200/503) instead of parsing body |
| /api/v1/openapi.json | No CORS, requires session | Access via browser when logged in, or use local copy |
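
For the /api/health migration, a client-side check that relies only on the status code might look like the sketch below. `isHealthy`, `checkHealth`, and `StatusFetcher` are hypothetical names introduced here for illustration; the fetch function is injected so the snippet does not assume any particular runtime.

```typescript
// Hypothetical sketch of a health probe that ignores the response body.
type StatusFetcher = (url: string) => Promise<{ status: number }>;

/** 200 means healthy; anything else (including 503) is treated as unhealthy. */
function isHealthy(status: number): boolean {
  return status === 200;
}

async function checkHealth(baseUrl: string, fetchFn: StatusFetcher): Promise<boolean> {
  // Strip trailing slashes so the path is not doubled up.
  const url = `${baseUrl.replace(/\/+$/, "")}/api/health`;
  const res = await fetchFn(url);
  return isHealthy(res.status);
}
```

On runtimes that provide a global `fetch`, it can be passed directly as `fetchFn`.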

Test plan

  • IP rate limit helper — 8 unit tests (IP extraction, 429 response, isolation, window reset)
  • /api/metrics — 4 tests (401 without token, 401 invalid token, 200 with valid token, 500 on error)
  • /api/health — 2 tests (200 without db field, 503 without db field)
  • Existing enroll, webhook, setup tests pass with rate limiters added
  • OpenAPI spec tests pass (15/15)
  • tsc --noEmit clean
  • pnpm lint clean
  • Full suite: 1155/1156 pass (1 pre-existing flaky leader-guard test)

@TerrifiedBug TerrifiedBug force-pushed the feat/endpoint-security-hardening branch from cb3ee50 to 2a6fc86 Compare March 29, 2026 17:48
@TerrifiedBug

@greptile

@greptile-apps

greptile-apps bot commented Mar 29, 2026

Greptile Summary

This PR hardens 7 previously unauthenticated endpoints identified in a security audit, adding always-on Bearer token auth to /api/metrics, IP-based rate limiting to /api/agent/enroll, /api/webhooks/git, and /api/setup, session guards to the OpenAPI spec and docs routes, and stripping the db field from /api/health responses. The new ip-rate-limit.ts helper and RateLimiter.checkKey() method are clean additions that integrate well with the existing rate-limiting infrastructure.

One security defect found:

  • /api/metrics — missing metrics.read permission check. The route calls authenticateApiKey but never calls hasPermission(ctx, "metrics.read"). As implemented, any service account holding a valid token (regardless of its assigned permissions) can read the Prometheus metrics. The PR description, documentation, and test fixtures all assume only metrics.read accounts are permitted, but the enforcement is absent from the route itself.

Confidence Score: 4/5

Safe to merge after adding the metrics.read permission check — one auth bypass exists on the metrics endpoint.

All rate-limiting and session-guard changes are correct. The single P1 finding is that the metrics route authenticates callers but never enforces the metrics.read permission, allowing any valid service account to read metrics data.

src/app/api/metrics/route.ts — missing hasPermission(ctx, "metrics.read") check after successful authentication.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/app/api/metrics/route.ts | Auth guard always required now (good), but the route never checks metrics.read permission — any valid API key can read metrics. |
| src/app/api/_lib/ip-rate-limit.ts | New IP rate-limit helper: rightmost XFF extraction, falls back to x-real-ip, then "unknown". Logic and response format look correct. |
| src/app/api/v1/_lib/rate-limiter.ts | Adds checkKey() for explicit-key rate limiting without tier suffix; logic mirrors check() correctly and shares the same sliding-window store. |
| src/app/api/v1/docs/route.ts | New route serving Swagger UI behind NextAuth session guard with a tight CSP; specUrl is constructed from the parsed request URL safely. |
| src/app/api/v1/openapi.json/route.ts | Wildcard CORS removed and session guard added; OPTIONS handler also removed cleanly. |
| src/app/api/health/route.ts | db field stripped from both 200 and 503 responses; HTTP status codes remain the authoritative health signal. |
| src/app/api/metrics/tests/route.test.ts | Tests updated for always-on auth; however no test exercises the metrics.read permission check because the route never performs it. |
| src/app/api/agent/enroll/route.ts | IP rate limit (10 req/min) prepended before all existing enrollment logic; no changes to existing auth or token handling. |
| src/app/api/webhooks/git/route.ts | IP rate limit (30 req/min) prepended before HMAC signature verification; existing webhook auth logic unchanged. |
| src/app/api/setup/route.ts | IP rate limit (5 req/min) added to both GET and POST handlers; both share the same "setup" bucket key, capping all setup activity per IP at 5 req/min combined. |
| docs/public/operations/configuration.md | New Prometheus scrape-config documentation section accurately reflects the breaking auth change. |
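
To make the shared-bucket behaviour described for /api/setup concrete, here is a minimal sliding-window sketch with an explicit-key check. `SlidingWindowLimiter`, its `checkKey` signature, and the `setup:<ip>` key format are assumptions made for illustration; the real RateLimiter in rate-limiter.ts may store and key things differently.

```typescript
// Hypothetical sketch of explicit-key sliding-window rate limiting.
interface LimitResult {
  allowed: boolean;
  retryAfterSec: number;
}

class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  /** Count a request against `key`; deny once `limit` hits fall inside the window. */
  checkKey(key: string, limit: number, now: number = Date.now()): LimitResult {
    const cutoff = now - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= limit) {
      this.hits.set(key, recent);
      const oldest = recent[0] ?? now;
      const waitMs = oldest + this.windowMs - now;
      return { allowed: false, retryAfterSec: Math.max(1, Math.ceil(waitMs / 1000)) };
    }
    recent.push(now);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSec: 0 };
  }
}

// GET and POST handlers that both use the same key share one budget per IP.
const setupKey = (ip: string): string => `setup:${ip}`;
```

Because the key does not include the HTTP method, five GET requests from one IP would exhaust the budget for a subsequent POST from that IP within the same window.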

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming Request] --> B{Endpoint}

    B --> C["/api/agent/enroll"]
    B --> D["/api/webhooks/git"]
    B --> E["/api/setup GET/POST"]
    B --> F["/api/metrics"]
    B --> G["/api/v1/docs"]
    B --> H["/api/v1/openapi.json"]
    B --> I["/api/health"]

    C --> C1{IP rate limit 10 req/min}
    D --> D1{IP rate limit 30 req/min}
    E --> E1{IP rate limit 5 req/min}

    C1 -->|exceeded| R429[429 Too Many Requests]
    D1 -->|exceeded| R429
    E1 -->|exceeded| R429

    C1 -->|allowed| C2[Enrollment token validation]
    D1 -->|allowed| D2[HMAC signature verification]
    E1 -->|allowed| E2[Setup handler]

    F --> F1{authenticateApiKey Bearer token}
    F1 -->|invalid| R401[401 Unauthorized]
    F1 -->|valid but metrics.read not checked| F2[Collect and return Prometheus metrics]

    G --> G1{NextAuth session?}
    H --> H1{NextAuth session?}
    G1 -->|no| R401
    H1 -->|no| R401
    G1 -->|yes| G2[Serve Swagger UI + CSP header]
    H1 -->|yes| H2[Return OpenAPI JSON spec]

    I --> I2[DB ping]
    I2 -->|ok| I3[200 status ok]
    I2 -->|fail| I4[503 status error]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/app/api/metrics/route.ts
Line: 14-20

Comment:
**`metrics.read` permission never checked**

`authenticateApiKey` verifies the token is valid and returns the `ServiceAccountContext` (which includes `permissions`), but the route never calls `hasPermission(ctx, "metrics.read")`. Any service account holding any valid token — `pipelines.deploy`, `secrets.manage`, etc. — can currently read the Prometheus metrics data. The PR description and docs explicitly state that only accounts with `metrics.read` should be granted access.

```suggestion
  const authHeader = request.headers.get("authorization");
  const ctx = await authenticateApiKey(authHeader);
  if (!ctx || !hasPermission(ctx, "metrics.read")) {
    return new Response("Unauthorized\n", {
      status: 401,
      headers: { "Content-Type": "text/plain; charset=utf-8" },
    });
  }
```

Also import `hasPermission` at the top of the file:
```ts
import { authenticateApiKey, hasPermission } from "@/server/middleware/api-auth";
```

**Rule Used:** ## Security & Cryptography Review Rules

When revi... ([source](https://app.greptile.com/review/custom-context?memory=7cb20c56-ca6a-40aa-8660-7fa75e6e3db2))

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "docs: document metrics endpoint auth req..."

Comment on lines +14 to 20
  const ctx = await authenticateApiKey(authHeader);
  if (!ctx) {
    return new Response("Unauthorized\n", {
      status: 401,
      headers: { "Content-Type": "text/plain; charset=utf-8" },
    });
  }

P1 metrics.read permission never checked

authenticateApiKey verifies the token is valid and returns the ServiceAccountContext (which includes permissions), but the route never calls hasPermission(ctx, "metrics.read"). Any service account holding any valid token — pipelines.deploy, secrets.manage, etc. — can currently read the Prometheus metrics data. The PR description and docs explicitly state that only accounts with metrics.read should be granted access.

Suggested change
-  const ctx = await authenticateApiKey(authHeader);
-  if (!ctx) {
-    return new Response("Unauthorized\n", {
-      status: 401,
-      headers: { "Content-Type": "text/plain; charset=utf-8" },
-    });
-  }
+  const authHeader = request.headers.get("authorization");
+  const ctx = await authenticateApiKey(authHeader);
+  if (!ctx || !hasPermission(ctx, "metrics.read")) {
+    return new Response("Unauthorized\n", {
+      status: 401,
+      headers: { "Content-Type": "text/plain; charset=utf-8" },
+    });
+  }

Also import hasPermission at the top of the file:

import { authenticateApiKey, hasPermission } from "@/server/middleware/api-auth";
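
The guard described in this comment can be sketched in isolation as follows. The `ServiceAccountContext` shape and this `hasPermission` body are assumptions for illustration only; the real helpers exported from @/server/middleware/api-auth may differ. The sketch also covers the case the review notes is untested: a valid token that lacks metrics.read.

```typescript
// Hypothetical stand-ins for the project's auth types and helper.
interface ServiceAccountContext {
  permissions: string[];
}

function hasPermission(ctx: ServiceAccountContext, permission: string): boolean {
  return ctx.permissions.some((p) => p === permission);
}

/** Status the metrics route would return for a given auth result. */
function metricsAuthStatus(ctx: ServiceAccountContext | null): 200 | 401 {
  // null means authentication failed; a non-null context must still hold metrics.read.
  if (ctx === null || !hasPermission(ctx, "metrics.read")) {
    return 401;
  }
  return 200;
}
```

This follows the suggested change in returning 401 for both failure modes; some APIs instead return 403 when the caller is authenticated but lacks the required permission.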

Rule Used: ## Security & Cryptography Review Rules

When revi... (source)

@TerrifiedBug TerrifiedBug merged commit 114ed8a into main Mar 29, 2026
11 checks passed
@TerrifiedBug TerrifiedBug deleted the feat/endpoint-security-hardening branch March 29, 2026 19:37