Skip to content

feat: OpenID Federation 1.0 — Trust Anchor, TMI, CLI, MCP + 16 security fixes#13

Open
teamktown wants to merge 17 commits intomainfrom
feat/uplift-tdd
Open

feat: OpenID Federation 1.0 — Trust Anchor, TMI, CLI, MCP + 16 security fixes#13
teamktown wants to merge 17 commits intomainfrom
feat/uplift-tdd

Conversation

@teamktown
Copy link
Copy Markdown
Owner

@teamktown teamktown commented Mar 16, 2026

Summary

Full OpenID Federation 1.0 (draft-43) implementation for MCP server deployments with comprehensive security hardening from CodeRabbit review.

Core infrastructure (Increments A–G)

  • `@letsfederate/kms` — KeyProvider interface, SoftKmsProvider (EC P-256, jose), PKCS#11 stub
  • `@letsfederate/tmi-server` — Trustmark Issuer: `POST /trustmarks/issue`, OIDF entity statement
  • `@letsfederate/ta-server` — Trust Anchor: SQLite persistence, enrollment API, revocation, admin dashboard
  • `@letsfederate/fedmgr-cli` — CLI: `trustmark issue/verify/check/adopt`, `keys init/decrypt`
  • `@letsfederate/fedmgr-mcp` — MCP server with 10 tools for federation management
  • Two-step proof-of-key enrollment (nonce challenge → signed proof)
  • `federation_list` / `federation_fetch` / `trust-mark-status` per draft-43 §8/§12
  • SQLite WAL store: subordinates, enrollment_requests, revocations, audit_log

Security hardening — 16 CodeRabbit findings addressed

# Sev Fix
1 CRIT Entity binding: `jwks_url` origin must match `entity_id` origin (prevents victim enrollment)
2 CRIT Strip private JWK fields (`d`,`p`,`q`,`dp`,`dq`,`qi`) before signing subordinate statements
3 CRIT Bootstrap JWKS sanitization — strip private fields at startup
4 CRIT `assertSafeUrl()` on `jku` in trust-validator (SSRF via jku header)
5 CRIT `TMI_ISSUE_TOKEN` Bearer auth on `POST /trustmarks/issue`
6–7 MAJ `asyncHandler` wrapper for Express 4.x async routes (ta-server + tmi-server)
8 MAJ 5s `AbortController` timeout on bootstrap JWKS fetches
9 MAJ Validate `TA_JWKS_URL` with `new URL()` at startup
11 MAJ `jose` added to `fedmgr-cli` runtime deps (was undeclared)
12 MAJ CLI `--ttl` validation: reject NaN, enforce 60–86400s
13 MAJ `assertSafeUrl()` on `jku` in CLI `trustmark verify`
14 MAJ MCP `fetchJson`: 10s timeout + 1MB response size cap
15 MAJ `requireSafeUrl()` on `ta_url` in MCP `verify_trustmark` tool
16 MAJ GitHub Actions: `secrets` context requires `${{ }}` wrapper

Additional security

  • SSRF utility `assertSafeUrl()` blocks RFC 1918, loopback, link-local, non-HTTPS
  • Algorithm confusion prevention: `algorithms: ["ES256"]` pinned in all `jwtVerify()` calls
  • `ADMIN_TOKEN` Bearer auth on all management + dashboard endpoints
  • `npm audit` → 0 vulnerabilities (serialize-javascript RCE fixed via overrides)

Docs

  • `docs/security-faq.md` — security mindfulness guide (10 attack patterns, checklists)
  • `docs/claude/skills/openid-federation/SKILL.md` — OpenID Federation design patterns skill

Test plan

  • 54 ta-server unit tests pass
  • 25 kms/tmi-server/cli unit tests pass
  • `npm run build` — all 5 packages compile cleanly
  • `npm audit` — 0 vulnerabilities
  • Integration + security tests pass

Reviewer focus areas

  1. Entity binding `enrollment/index.ts:99–116` — same-origin check
  2. Private key stripping `subordinate-statements.ts:33–45` — `stripPrivateFields()`
  3. SSRF guards — `assertSafeUrl()` in kms, ta-server, fedmgr-mcp
  4. TMI auth `tmi-server/index.ts:164–178` — `TMI_ISSUE_TOKEN` guard
  5. asyncHandler `ta-server/utils/async-handler.ts`

🤖 Generated with claude-flow

Summary by CodeRabbit

  • New Features

    • CLI for keys, trustmarks, and OCI attestations; MCP server tools; Trust Anchor and Trustmark Issuer services with federation endpoints, enrollment flows, health checks, and web dashboard.
    • KMS enhancements: in-process soft KMS and PKCS#11 HSM support; container images and local trust-lab compose for secure key handling.
  • Documentation

    • Large new/expanded docs: architecture, design decisions, testing guide, trust/signing workflows, and increment roadmaps.
  • Tests & CI

    • New unit/integration suites, test-report tooling, GitHub Actions for CI, release, SBOM, and security scans.
  • Chores

    • Expanded .gitignore and workspace/package configuration updates.

teamktown and others added 5 commits March 16, 2026 02:00
- packages/kms: KeyProvider interface + SoftKmsProvider (EC P-256 via jose)
  - Private key read from caller-supplied path (expects tmpfs-decrypted file)
  - Strips `d` field before serving JWKS; validates EC P-256 on load
  - Factory createProvider() for future pkcs11 provider
  - 7 unit tests: kid continuity, JWKS safety, JWS verify, key rejection

- packages/tmi-server: Express TMI service
  - GET  /.well-known/jwks.json  (public JWKS, Cache-Control 1h)
  - POST /trustmarks/issue       (zod-validated; returns JWS with jku header)
  - GET  /health                 (issuer echo)
  - Startup validation of TMI_ISSUER / TMI_JWKS_URL env vars
  - console replaced with process.stdout.write; x-powered-by disabled

- scripts/tmi-keys-init.sh: step CLI encrypted JWK generation (PBES2)
- scripts/tmi-decrypt.sh: decrypts priv JWE → /dev/shm tmpfs; wipes on EXIT
- root package.json: npm workspaces, drop file: deps, node>=18, new scripts
- tsconfig.base.json: composite TS config for monorepo
- .gitignore: block *.priv.jwe, *.priv.jwk, keys/.pass

Co-Authored-By: claude-flow <ruv@ruv.net>
Increment A — Docker + Local Trust Lab

- Dockerfile.tmi-server: multi-stage builder→runtime, non-root tmi user
  no private key in any image layer; TMI_PRIVATE_JWK defaults to /run/secrets
- examples/lab/docker-compose.yml:
  - registry:2 on :5000 for OCI push/pull
  - tmi-decrypt sidecar (step-cli): decrypts JWE → tmpfs, exits
  - tmi-server: depends on sidecar completing, reads key from tmpfs volume
  - tmi-secrets volume is type:tmpfs — key never persists to disk
- examples/lab/.env.example: override TMI_ISSUER / TMI_JWKS_URL
- docs/ralph-loop/increment-A-docker-lab.md: full runbook + design notes

Co-Authored-By: claude-flow <ruv@ruv.net>
Increment B — fedmgr-cli

- packages/fedmgr-cli: new @letsfederate/fedmgr-cli package
  - keys init softkms: step CLI encrypted JWK pair generation
    - asserts step in PATH; creates .pass (chmod 600); PBES2 private JWK
  - trustmark issue: POST to TMI, print compact JWS or --json full response
  - trustmark verify: fetch jku JWKS, verify JWS, exit 1 on failure
  - oci attach-trustmark: cosign attest wrapper, predicate to mkdtemp
  - oci verify-trustmark: cosign verify-attestation wrapper
- 8 structural unit tests; 15 total across packages (all pass)
- docs/ralph-loop/increment-B-cli.md: full command reference + round-trip guide

Co-Authored-By: claude-flow <ruv@ruv.net>
…resh

Increment C — OIDF Federation Scaffolding

- packages/tmi-server/src/federation/entity-statements.ts:
  - signEntityStatement(): signs OIDF entity config JWT (typ=entity-statement+jwt)
  - tmiMetadata(): federation_entity block for TMI role
  - trustAnchorMetadata(): federation_entity block with fetch/list endpoints
  - authority_hints omitted when empty (spec-compliant)

- packages/tmi-server: add GET /.well-known/openid-federation endpoint
  - Content-Type: application/entity-statement+jwt
  - Cache-Control: public, max-age=3600
  - Self-signed: iss=sub=ISSUER, jwks (pub only), metadata.federation_entity

- packages/tmi-server/test/entity-statements.test.mjs: 10 unit tests
  - Claims: iss==sub, exp>iat, jwks no `d`, authority_hints conditional
  - Headers: typ, alg, verified signature against JWKS public key

- scripts/oidf-cert.sh: conformance test runner
  - Health check, content-type check, payload decode
  - Pulls official harness or prints manual certification steps

- docs/architecture.md: full architecture refresh (supersedes april2025 doc)
  - Monorepo layout, component descriptions, OIDF compliance table
  - Security properties table, testing totals, roadmap through Increment F

Total tests: 25/25 pass (kms:7, fedmgr-cli:8, tmi-server:10)

Co-Authored-By: claude-flow <ruv@ruv.net>
Increment D — Trust Anchor + Documentation

packages/ta-server (@letsfederate/ta-server):
  - GET /.well-known/openid-federation (self-signed TA entity statement)
  - GET /.well-known/jwks.json (TA public JWKS)
  - GET /federation_list (subordinate entity IDs — spec §8.3)
  - GET /federation_fetch?sub=<entityId> (signed subordinate statement — spec §8.1)
  - GET /trust-mark-status (501 stub — Increment E)
  - GET /health (entity_id + subordinate count)
  - SubordinateRegistry: in-memory, populated at startup from TA_SUBORDINATES env
  - signSubordinateStatement(): iss=TA, sub=subordinate, typ=entity-statement+jwt
  - 12 unit tests: registry ops, JWS structure, iss/sub/exp, JWKS safety, sig check
  - All 37 tests pass across 4 packages (kms:7, cli:8, tmi:10, ta:12)

Dockerfile.ta-server: multi-stage, non-root ta user, tmpfs for private key
examples/lab/docker-compose.yml: +ta-decrypt sidecar, +ta-server
  - Startup order: decrypt sidecars → tmi-server healthy → ta-server starts
  - npm run lab:up / lab:down added to root package.json

Decisions locked (docs/decisions.md):
  - KMS order: SoftKMS → SoftHSM2 → YubiKey → Cloud KMS
  - SoftKMS-only for all dev/CI (PKCS#11 not required for tests)
  - Trustmark v1 schema frozen: iss,sub,id,iat,exp,image_digest?,repo?,evidence?

Extension guide (docs/extending-trustmarks.md):
  - How to add new trustmark types (step-by-step with code examples)
  - How to add optional fields within a version
  - Key files reference table

docs/architecture.md: refresh for Increment D
  - Added ta-server to layout and component descriptions
  - Updated OIDF compliance table (federation_list/fetch now ✅)
  - Updated test totals (37/37)
  - Updated roadmap (D → ✅)

Co-Authored-By: claude-flow <ruv@ruv.net>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a TypeScript monorepo with KMS (SoftKMS + PKCS#11), TMI and TA servers, a fedmgr CLI and MCP tools, SQLite-backed TA persistence and enrollment/management APIs, PKCS#11 integration and tests, container/Docker lab and CI workflows, key tooling/scripts, extensive docs, and many tests.

Changes

Cohort / File(s) Summary
Repo root & tooling
\.gitignore, package.json, tsconfig.base.json, .github/workflows/*
Workspace enablement, Node >=18, added scripts, CI workflows (ci/containers/release/daily-scan), base TS config, expanded gitignore patterns and test-results ignores.
KMS core & providers
packages/kms/..., packages/kms/src/providers/pkcs11.ts, packages/kms/src/types/pkcs11js.d.ts, packages/kms/test/*
New KeyProvider API, SoftKmsProvider, full Pkcs11Provider (DER→raw, PKCS#11 signing), pkcs11 type shim, factory, and unit/integration HSM tests.
TMI server
Dockerfile.tmi-server, packages/tmi-server/..., packages/tmi-server/test/*, examples/lab/*
New TMI Express app (JWKS, /trustmarks/issue, entity-statement signing), package manifest, tests, Dockerfile, lab compose entries and healthchecks.
TA server
Dockerfile.ta-server, packages/ta-server/..., packages/ta-server/src/db/*, packages/ta-server/test/*
New Trust Anchor app with SqliteFederationStore, enrollment router, management/dashboard routers, subordinate/entity signing, chain verifier, file-backed registry, migrations, UI, and tests.
CLI & MCP tooling
packages/fedmgr-cli/*, packages/fedmgr-mcp/*, packages/fedmgr-cli/test/*, scripts/tests/*
New fedmgr CLI package (keys, trustmark, oci commands), fedmgr-mcp MCP server package, CLI entrypoints, tests, and updated test harnesses.
Trust validation & chain verifier
packages/kms/src/trust-validator.ts, packages/ta-server/src/federation/chain-verifier.ts
Add trustmark validation, full OIDF-style chain-walking verifier, policy enforcement, and rich TrustResult types.
DB & persistence
packages/ta-server/src/db/store.ts, packages/ta-server/src/db/sqlite-store.ts, packages/ta-server/src/db/migrations/001-init.sql
FederationStore interface, SQLite implementation, migrations and APIs for subordinates, enrollments, revocations, audit log.
PKCS#11 test & lab
examples/lab/Dockerfile.softhsm-test, examples/lab/docker-compose.softhsm-test.yml, scripts/docker-pkcs11-entrypoint.sh
SoftHSM2 test image and compose flow plus idempotent entrypoint to run PKCS#11 integration tests.
Key scripts & secret handling
scripts/tmi-keys-init.sh, scripts/tmi-decrypt.sh, scripts/oidf-cert.sh, scripts/test-report.sh, scripts/sbom-diff.sh
Key generation (encrypted JWE), tmpfs decryption with secure cleanup, OIDF conformance runner, multi-package test reporter, and SBOM diff tooling.
Local lab & examples
examples/lab/docker-compose.yml, examples/lab/.env.example
Local trust-lab docker-compose with tmpfs secret volumes, decrypt sidecars, TA/TMI services, env examples and healthchecks.
Dashboard & UI
packages/ta-server/src/dashboard/*
Dashboard router and UI (ui.html) with API endpoints for overview, subordinates, revocations and audit.
Docs & ADRs
docs/*, docs/ralph-loop/*, docs/decisions.md, docs/architecture.md
Extensive architecture, decisions, testing, signing, trust guides, incremental delivery docs and runbooks.
Tests & results
packages/*/test/*, scripts/tests/*, test-results/*
Many new unit/integration/e2e tests, test harness adjustments, and committed test-results artifacts (JUnit/JSON).
Utilities & safety
packages/*/src/utils/async-handler.ts, packages/*/src/validate-url.ts
Express async handler utilities and SSRF-safe URL validators added across packages.
Dashboard & management APIs
packages/ta-server/src/management/*, packages/ta-server/src/enrollment/*
Admin management router, enrollment flow (nonce proof-of-key-ownership), auditing and API surface for subordinate lifecycle.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as fedmgr CLI
    participant TMI as TMI Server
    participant KMS as KMS (SoftKMS / PKCS#11)
    participant TA as TA Server

    CLI->>KMS: ensure/load keys (private/public paths)
    KMS-->>CLI: provide kid() / jwks()

    CLI->>TMI: POST /trustmarks/issue {sub, trustmark_id,...}
    TMI->>KMS: signJwt(trustmarkPayload, {alg: ES256, kid/jku})
    KMS-->>TMI: trustmark JWS
    TMI-->>CLI: 201 { trustmark_jws }

    CLI->>TA: GET /federation_fetch?sub=<entityId>
    TA->>TA: lookup subordinate, compose entity-statement
    TA->>KMS: signJwt(entityStatement, typ: "entity-statement+jwt")
    KMS-->>TA: signed entity-statement+jwt
    TA-->>CLI: entity-statement+jwt
Loading
sequenceDiagram
    participant Startup as Service Startup
    participant Decrypt as decrypt script
    participant tmpfs as tmpfs (in-memory)
    participant KMS as SoftKmsProvider
    participant External as External JWKS

    Startup->>Decrypt: decrypt private JWE -> tmpfs path
    Decrypt->>tmpfs: write decrypted private JWK (600 perms)
    Startup->>KMS: init SoftKmsProvider with tmpfs path
    KMS-->>Startup: keys validated, jwks available
    Startup->>TA: TA boots, DB migrate, load subordinates
    TA->>External: GET subordinate JWKS (from authority)
    External-->>TA: jwks JSON
    TA->>TA: upsert subordinate into store
    TA->>KMS: sign entity statements as needed
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

"🐰 I hopped through keys and docs tonight,
I signed some JWTs by moonlight bright,
TMI chirps trustmarks, TA keeps the gate,
The CLI scurries round to orchestrate,
We hide our keys and keep the lab upright."

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/uplift-tdd

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (8)
docs/extending-trustmarks.md-12-15 (1)

12-15: ⚠️ Potential issue | 🟡 Minor

Add language tags to fenced code blocks to satisfy markdownlint (MD040).

These fences are currently untyped; please annotate them (text, typescript, etc.) to remove lint warnings and improve readability.

Suggested diff
-```
+```text
 packages/tmi-server/src/index.ts      — IssueRequestSchema (zod)
 packages/schemas/trustmark-v1.json    — (planned, Increment F)

- +typescript
// Option A: discriminated union (recommended for small type sets)
const IssueRequestSchema = z.discriminatedUnion("trustmark_id", [
ExistingV1Schema,
CertifiedForMCPUse_v1,
]);
...


-```
+```typescript
export function tmiMetadata(jwksUri: string): Record<string, unknown> {
  return {
    federation_entity: {
      ...existingFields,
      ...
    },
  };
}
</details>


Also applies to: 78-87, 119-132

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/extending-trustmarks.md around lines 12 - 15, The markdown code fences
in the docs are missing language tags and trigger MD040; update the three shown
blocks and the other affected ranges (lines 78-87 and 119-132) to include
appropriate languages: use text for the file-list block and typescript for
TypeScript snippets such as the IssueRequestSchema discriminated union and the
tmiMetadata function; locate the blocks containing "IssueRequestSchema",
"tmiMetadata", and the file list and add the language identifiers so each fenced
code block is annotated (e.g., text and typescript).


</details>

</blockquote></details>
<details>
<summary>docs/ralph-loop/increment-A-docker-lab.md-13-19 (1)</summary><blockquote>

`13-19`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language to the fenced block.**

The fenced code block is missing a language tag, which triggers markdownlint MD040.


<details>
<summary>Suggested fix</summary>

```diff
-```
+```text
 examples/lab/
 ├── docker-compose.yml   # registry:2 + tmi-decrypt sidecar + tmi-server
 └── .env.example         # override TMI_ISSUER and TMI_JWKS_URL

 Dockerfile.tmi-server    # multi-stage: builder (tsc) → slim runtime
 ```
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/ralph-loop/increment-A-docker-lab.md around lines 13 - 19, The fenced
code block in the markdown lacks a language tag (triple backticks at the start
of the block), causing markdownlint MD040; update the opening fence in the block
that contains the directory listing (the snippet starting with "examples/lab/")
to include a language identifier such as "text" (i.e., change totext) so
the block is explicitly tagged.


</details>

</blockquote></details>
<details>
<summary>package.json-27-27 (1)</summary><blockquote>

`27-27`: _⚠️ Potential issue_ | _🟡 Minor_

**`source` in npm script won't export variables to caller's shell.**

The `source scripts/tmi-decrypt.sh` command runs in a subshell spawned by npm. Environment variables set by the script won't persist to the user's terminal after the script completes. Consider documenting that users should run `source scripts/tmi-decrypt.sh` directly in their shell instead of via npm.


<details>
<summary>📝 Suggested documentation or alternative</summary>

Either update the script comment to clarify usage:
```diff
-    "tmi:decrypt": "source scripts/tmi-decrypt.sh",
+    "tmi:decrypt": "echo 'Run directly: source scripts/tmi-decrypt.sh'",
```

Or use the `--start` variant (line 29) which handles this internally.
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 27, The npm script "tmi:decrypt" uses "source
scripts/tmi-decrypt.sh" which runs in a subshell so exported env vars won't
persist; update the repo so users know to run "source scripts/tmi-decrypt.sh"
directly in their shell (e.g., update README or the scripts section comment) or
change the package.json approach: replace the direct source usage with a
documented alternative such as providing a non-subshell usage (e.g., make the
script print export lines that can be eval'd by the caller) or point consumers
to the existing "--start" variant that handles environment injection; reference
the "tmi:decrypt" npm script and the scripts/tmi-decrypt.sh file when applying
the change.
```

</details>

</blockquote></details>
<details>
<summary>scripts/tmi-decrypt.sh-30-30 (1)</summary><blockquote>

`30-30`: _⚠️ Potential issue_ | _🟡 Minor_

**Quote command substitution to prevent word splitting.**

The `wc -c` output may include leading whitespace on some systems, causing `dd` to fail or behave unexpectedly. As flagged by static analysis (SC2046), the command substitution should be quoted.


<details>
<summary>🔧 Proposed fix</summary>

```diff
-    dd if=/dev/urandom of="${TMPFS_KEY}" bs=1 count=$(wc -c <"${TMPFS_KEY}") 2>/dev/null || true
+    dd if=/dev/urandom of="${TMPFS_KEY}" bs=1 count="$(wc -c <"${TMPFS_KEY}")" 2>/dev/null || true
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@scripts/tmi-decrypt.sh` at line 30, The dd invocation using TMPFS_KEY uses an
unquoted command substitution for count=$(wc -c <"${TMPFS_KEY}") which can
produce leading whitespace and cause word-splitting; update the dd call so the
count argument uses a quoted command substitution (e.g., count="$(wc -c <
"${TMPFS_KEY}")") to ensure the byte count is passed as a single token and avoid
SC2046 issues.
```

</details>

</blockquote></details>
<details>
<summary>docs/ralph-loop/increment-D-trust-anchor.md-14-14 (1)</summary><blockquote>

`14-14`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language identifiers for fenced code blocks (MD040).**

A few fenced blocks are missing language tags, which triggers markdownlint and reduces rendering quality.


<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 packages/ta-server/
 ├── src/
 │   ├── index.ts                              # Express TA server (4 endpoints)
@@
-```
+```text
@@
-```
+```text
 ✔ registry: empty list when no entries
 ✔ registry: register() and listEntityIds()
```
</details>


Also applies to: 78-78, 118-118

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/ralph-loop/increment-D-trust-anchor.md at line 14, The Markdown file
docs/ralph-loop/increment-D-trust-anchor.md has fenced code blocks missing
language identifiers; update each triple-backtick fence shown (the
packages/ta-server tree block and the trailing test output block, plus the other
occurrences noted) to include a language tag (e.g., text) so markdownlint MD040 is satisfied and rendering is correct—search for the untagged blocks
in that document and replace them with ```text.


</details>

</blockquote></details>
<details>
<summary>docs/architecture.md-23-23 (1)</summary><blockquote>

`23-23`: _⚠️ Potential issue_ | _🟡 Minor_

**Add language identifiers to fenced code blocks (markdownlint MD040).**

These fences are missing a language tag and currently trigger lint warnings.



<details>
<summary>📝 Example fix</summary>

```diff
-```
+```text
```
</details>


Also applies to: 59-59, 83-83

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/architecture.md at line 23, The markdown contains fenced code blocks
without language identifiers (triggering markdownlint MD040); update each
triple-backtick fence in docs/architecture.md (including the blocks around the
snippets shown and the other instances flagged) to include an appropriate
language tag (e.g., text, json, or the actual language used) immediately after
the opening so the fences becometext or ```json etc., ensuring all three
backtick code blocks have language identifiers.


</details>

</blockquote></details>
<details>
<summary>docs/architecture.md-107-109 (1)</summary><blockquote>

`107-109`: _⚠️ Potential issue_ | _🟡 Minor_

**Document `jku` as a JOSE header field, not a payload claim.**

Current docs show `jku` inside the trustmark payload shape, but issuance code passes `jku` via JWS protected header.



<details>
<summary>🧾 Suggested wording</summary>

```diff
-Trustmark payload:
-  { iss, sub, id, iat, exp, jku, image_digest?, repo?, evidence? }
+Trustmark:
+  Protected header: { alg, kid, jku, typ? }
+  Payload:          { iss, sub, id, iat, exp, image_digest?, repo?, evidence? }
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture.md` around lines 107 - 109, Update the documentation to
state that `jku` is a JOSE/JWS protected header field rather than a claim in the
trustmark payload: remove `jku` from the payload shape listing and add a note
that the `jku` value is set on the JWS protected header when creating/signing
the trustmark (match the issuance code that passes `jku` via the JWS protected
header), referencing the trustmark payload fields `{ iss, sub, id, iat, exp,
image_digest?, repo?, evidence? }` and the JWS protected header usage for `jku`.
```

</details>

</blockquote></details>
<details>
<summary>packages/ta-server/src/federation/subordinate-statements.ts-41-48 (1)</summary><blockquote>

`41-48`: _⚠️ Potential issue_ | _🟡 Minor_

**Reject invalid TTL values before signing.**

`ttlSeconds` currently accepts `0`/negative values, which can produce `exp <= iat` and invalid statements.



<details>
<summary>⏱️ Proposed guard</summary>

```diff
   const now = Math.floor(Date.now() / 1000);
   const ttl = config.ttlSeconds ?? 86400;
+  if (!Number.isInteger(ttl) || ttl <= 0) {
+    throw new Error("ttlSeconds must be a positive integer");
+  }
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/subordinate-statements.ts` around lines 41
- 48, The TTL handling allows 0/negative values which can make exp <= iat and
produce invalid statements; validate config.ttlSeconds before building the
payload (in the code around now, ttl, and payload) and reject or throw when
ttlSeconds is provided but not a positive integer > 0; when ttlSeconds is
undefined continue to use the default (86400) but do not accept 0 or negative
numbers—ensure the calculated ttl (used for exp = now + ttl) is > 0 so exp > iat
before signing.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (13)</summary><blockquote>

<details>
<summary>packages/kms/test/softkms.test.mjs (1)</summary><blockquote>

`27-34`: **Consider using `test.before()` for test setup.**

Top-level await for setup works but provides less control over error handling. Using `test.before()` would make setup failures more visible in test output and align with Node test runner conventions.


<details>
<summary>♻️ Proposed refactor</summary>

```diff
+let pubJwk, privJwk;
+
+test.before(async () => {
+  await fs.mkdir(TMP, { recursive: true });
+  const { publicKey, privateKey } = await generateKeyPair("ES256");
+  pubJwk = { ...(await exportJWK(publicKey)), kty: "EC", crv: "P-256", use: "sig", kid: "unit-test-kid" };
+  privJwk = { ...(await exportJWK(privateKey)), kty: "EC", crv: "P-256", use: "sig", kid: "unit-test-kid" };
+  await fs.writeFile(PUB_PATH, JSON.stringify(pubJwk), "utf8");
+  await fs.writeFile(PRIV_PATH, JSON.stringify(privJwk), "utf8");
+});
-await fs.mkdir(TMP, { recursive: true });
-
-const { publicKey, privateKey } = await generateKeyPair("ES256");
-const pubJwk = { ...(await exportJWK(publicKey)), kty: "EC", crv: "P-256", use: "sig", kid: "unit-test-kid" };
-const privJwk = { ...(await exportJWK(privateKey)), kty: "EC", crv: "P-256", use: "sig", kid: "unit-test-kid" };
-
-await fs.writeFile(PUB_PATH, JSON.stringify(pubJwk), "utf8");
-await fs.writeFile(PRIV_PATH, JSON.stringify(privJwk), "utf8");
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/test/softkms.test.mjs` around lines 27 - 34, Move the top-level
setup that creates TMP and writes PUB_PATH/PRIV_PATH into a test.before() hook
so failures surface in test output and follow Node test runner conventions: wrap
the calls that use generateKeyPair, exportJWK, fs.mkdir, and fs.writeFile inside
test.before(async () => { ... }), using the same TMP, PUB_PATH, PRIV_PATH,
publicKey/privateKey, pubJwk and privJwk symbols, and keep the JSON.stringify
and utf8 options; ensure any errors are allowed to propagate from the hook so
the test runner reports setup failures.
```

</details>

</blockquote></details>
<details>
<summary>docs/ralph-loop/increment-B-cli.md (2)</summary><blockquote>

`50-53`: **Add language specifiers to remaining code blocks.**

Lines 50-53 and 108-119 also have fenced code blocks without language specifiers. Use `text` for plain output blocks to satisfy the markdownlint-cli2 warnings (MD040).




Also applies to: 108-119

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/ralph-loop/increment-B-cli.md around lines 50 - 53, The fenced code
blocks showing CLI flags like "[--tmi http://localhost:8080] \ [--ttl 3600]
[--image-digest sha256:] \ [--repo ] " and the later block around the
repeated flags at the bottom are missing language specifiers; update those
three- and multi-line fenced blocks in increment-B-cli.md to use ```text (not

language tag to each relevant fenced code block.

14-25: Add language specifier to fenced code blocks.

Static analysis flagged missing language specifiers for fenced code blocks. Use text or plaintext for the directory tree structure to satisfy markdown linting.

📝 Proposed fix
-```
+```text
 packages/fedmgr-cli/
 ├── src/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/ralph-loop/increment-B-cli.md` around lines 14 - 25, The fenced code
block that contains the directory tree starting with "packages/fedmgr-cli/" is
missing a language specifier; update the opening backticks to include a language
(use "text" or "plaintext") so the block becomes ```text, ensuring markdown
linting passes for the directory tree snippet in increment-B-cli.md.
scripts/tmi-keys-init.sh (1)

43-48: Redundant --no-password=false flag.

The --no-password=false flag is redundant when --password-file is already specified. The step CLI will use the password file regardless. Consider removing it for clarity.

♻️ Proposed simplification
 step crypto jwk create "${PUB_JWK}" "${PRIV_JWE}" \
   --kty EC --curve P-256 \
   --use sig \
   --password-file "${PASS_FILE}" \
-  --no-password=false \
   --force
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tmi-keys-init.sh` around lines 43 - 48, The step crypto jwk create
invocation includes a redundant flag: remove the unnecessary
`--no-password=false` from the command so the `step crypto jwk create
"${PUB_JWK}" "${PRIV_JWE}" --kty EC --curve P-256 --use sig --password-file
"${PASS_FILE}" --force` call relies on the provided `PASS_FILE`; update the
command in the script where `step crypto jwk create` is used (referencing
variables `${PUB_JWK}`, `${PRIV_JWE}`, and `${PASS_FILE}`) to drop
`--no-password=false` for clarity.
packages/ta-server/src/federation/entity-statements.ts (1)

64-76: String replacement for endpoint derivation is brittle.

If jwksUri doesn't contain the exact string /.well-known/jwks.json, the replacement silently returns the original URL, leading to an incorrect federation_trust_mark_issuer_endpoint. Consider accepting a base URL and constructing both endpoints explicitly.

♻️ Proposed refactor
-export function tmiMetadata(jwksUri: string): Record<string, unknown> {
+export function tmiMetadata(baseUrl: string): Record<string, unknown> {
+  const base = baseUrl.replace(/\/$/, ""); // strip trailing slash
   return {
     federation_entity: {
       organization_name: "letsfederate TMI",
       contacts: ["ops@letsfederate.org"],
-      jwks_uri: jwksUri,
-      federation_trust_mark_issuer_endpoint: jwksUri.replace(
-        "/.well-known/jwks.json",
-        "/trustmarks/issue"
-      ),
+      jwks_uri: `${base}/.well-known/jwks.json`,
+      federation_trust_mark_issuer_endpoint: `${base}/trustmarks/issue`,
     },
   };
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/entity-statements.ts` around lines 64 - 76,
The current tmiMetadata function derives federation_trust_mark_issuer_endpoint
by doing a brittle string replace on jwksUri; instead update tmiMetadata to
build both endpoints robustly (either change the signature to accept a baseUrl
or parse jwksUri with the URL API) and then construct jwks_uri and
federation_trust_mark_issuer_endpoint via URL resolution (e.g., new
URL('/.well-known/jwks.json', base) and new URL('/trustmarks/issue', base)) so
the trust mark endpoint is correct even if jwksUri has different paths or query
params; modify tmiMetadata and callers to use the new parameter/logic and ensure
federation_trust_mark_issuer_endpoint and jwks_uri values are set from those
computed URLs.
scripts/oidf-cert.sh (1)

54-59: Base64URL decoding may silently fail for some JWTs.

JWTs use base64url encoding (with - and _ characters), but base64 -d expects standard base64 (with + and /). While the || true fallback handles failures gracefully for this diagnostic output, the decoding may produce garbled or no output for valid JWTs.

🔧 Proposed fix using tr for base64url conversion
 echo "[oidf-cert] Entity statement header:"
 curl -sf "${TMI_ISSUER}/.well-known/openid-federation" \
-  | cut -d. -f1 | base64 -d 2>/dev/null | python3 -m json.tool 2>/dev/null || true
+  | cut -d. -f1 | tr '_-' '/+' | base64 -d 2>/dev/null | python3 -m json.tool 2>/dev/null || true

 echo "[oidf-cert] Entity statement payload:"
 curl -sf "${TMI_ISSUER}/.well-known/openid-federation" \
-  | cut -d. -f2 | base64 -d 2>/dev/null | python3 -m json.tool 2>/dev/null || true
+  | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | python3 -m json.tool 2>/dev/null || true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/oidf-cert.sh` around lines 54 - 59, The diagnostic curl pipelines
that use cut and base64 -d (the two occurrences of "curl ... | cut -d. -f1/2 |
base64 -d") can silently fail for base64url-encoded JWT parts; convert base64url
to standard base64 before decoding by replacing '-' with '+' and '_' with '/'
and ensuring proper '=' padding, then pass the result to base64 --decode (or
base64 -d) and json.tool; update both pipelines (the header extraction using cut
-f1 and the payload extraction using cut -f2) to perform this base64url→base64
translation and padding prior to decoding so valid JWTs decode reliably.
packages/ta-server/package.json (1)

15-18: Consider using workspace:* protocol for internal dependencies.

Using "*" for the internal KMS dependency works with npm workspaces, but "workspace:*" is more explicit and prevents accidental resolution from npm registry if the workspace setup is misconfigured.

♻️ Proposed change
   "dependencies": {
-    "@letsfederate/kms": "*",
+    "@letsfederate/kms": "workspace:*",
     "express": "^4.19.2",
     "zod": "^3.23.8"
   },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/package.json` around lines 15 - 18, The dependency entry
for the internal package "@letsfederate/kms" uses "*" which can allow npm to
resolve from the registry; change the version spec to "workspace:*" in
package.json so npm workspaces will explicitly resolve the local workspace
package, then reinstall/update the lockfile to pick up the change.
packages/tmi-server/src/index.ts (1)

50-55: Use strict URL parsing for startup validation.

startsWith("http") accepts malformed values (e.g., httpx://...). Parse with new URL(...) and enforce http:/https: protocols.

✅ Proposed validation
-if (!ISSUER.startsWith("http")) {
-  throw new Error(`TMI_ISSUER must be an HTTP(S) URL, got: ${ISSUER}`);
-}
-if (!JWKS_URL.startsWith("http")) {
-  throw new Error(`TMI_JWKS_URL must be an HTTP(S) URL, got: ${JWKS_URL}`);
-}
+for (const [name, value] of [
+  ["TMI_ISSUER", ISSUER],
+  ["TMI_JWKS_URL", JWKS_URL],
+] as const) {
+  let parsed: URL;
+  try {
+    parsed = new URL(value);
+  } catch {
+    throw new Error(`${name} must be a valid URL, got: ${value}`);
+  }
+  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
+    throw new Error(`${name} must use http or https, got: ${value}`);
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/index.ts` around lines 50 - 55, Replace the fragile
startsWith("http") checks for ISSUER and JWKS_URL by parsing each with new
URL(...) and validating the protocol is exactly "http:" or "https:"; if parsing
fails or protocol is not allowed, throw an Error that includes the offending
ISSUER or JWKS_URL value so startup fails on invalid URLs. Ensure you update the
validation logic around the ISSUER and JWKS_URL variables in index.ts
accordingly.
examples/lab/.env.example (1)

1-9: Consider expanding lab env template with TA-related defaults.

Given this PR adds TA server/subordinate wiring, documenting TA_ENTITY_ID, TA_JWKS_URL, and TA_SUBORDINATES here would reduce setup drift for new users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/lab/.env.example` around lines 1 - 9, Add default TA-related env
vars to the lab env template by documenting TA_ENTITY_ID, TA_JWKS_URL, and
TA_SUBORDINATES in the examples/lab/.env.example so new users have sensible
defaults; update the file to include commented descriptions and example values
for TA_ENTITY_ID (e.g. an issuer URL or UUID), TA_JWKS_URL (full JWKS endpoint
URL) and TA_SUBORDINATES (comma-separated subordinate identifiers), mirroring
the style used for TMI_ISSUER and TMI_JWKS_URL to keep consistency.
packages/tmi-server/src/federation/entity-statements.ts (1)

64-73: Derive federation_trust_mark_issuer_endpoint from a parsed URL, not string replacement.

Current replace is fragile and can silently generate a wrong endpoint when jwksUri format/path differs.

🧭 Proposed refactor
 export function tmiMetadata(jwksUri: string): Record<string, unknown> {
+  const url = new URL(jwksUri);
+  if (url.pathname !== "/.well-known/jwks.json") {
+    throw new Error(
+      `jwksUri must end with /.well-known/jwks.json, got: ${jwksUri}`
+    );
+  }
+  url.pathname = "/trustmarks/issue";
+
   return {
     federation_entity: {
@@
-      federation_trust_mark_issuer_endpoint: jwksUri.replace(
-        "/.well-known/jwks.json",
-        "/trustmarks/issue"
-      ),
+      federation_trust_mark_issuer_endpoint: url.toString(),
     },
   };
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/federation/entity-statements.ts` around lines 64 -
73, The code uses string replace on jwksUri to build
federation_trust_mark_issuer_endpoint which is brittle; instead parse jwksUri
with the URL API inside tmiMetadata, then derive the endpoint by creating a new
URL that preserves the origin and sets pathname to '/trustmarks/issue' (e.g.
const u = new URL(jwksUri); u.pathname = '/trustmarks/issue';
federation_trust_mark_issuer_endpoint = u.toString()). Update tmiMetadata to use
this parsed-URL construction for federation_trust_mark_issuer_endpoint rather
than jwksUri.replace.
examples/lab/docker-compose.yml (3)

22-22: Consider removing the deprecated version field.

The version key is obsolete in Docker Compose V2 and is ignored. Removing it avoids deprecation warnings and aligns with current best practices.

Suggested change
-version: "3.9"
-
 networks:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/lab/docker-compose.yml` at line 22, Remove the deprecated top-level
version key from the docker-compose.yml (the line containing version: "3.9")
since Compose V2 ignores it; simply delete that line so the file relies on the
Compose specification without the version field and avoids deprecation warnings.

1-11: Consider adding resource limits for production deployments.

For production use of this pattern, consider adding deploy.resources.limits for memory/CPU and potentially restricting port bindings to localhost (127.0.0.1:port:port). The current configuration is appropriate for a local lab environment as documented.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/lab/docker-compose.yml` around lines 1 - 11, The compose file lacks
production resource and binding constraints; update each service (registry,
ta-server, tmi-server) to include deploy.resources.limits (set CPU and memory
limits) and, where appropriate, change port mappings from bare "host:container"
to bind to localhost (e.g., 127.0.0.1:host:container) to restrict external
access; ensure tmpfs and volume mounts remain unchanged and validate the
constraints are added under each service definition (registry, ta-server,
tmi-server).

66-67: Pin smallstep/step-cli to a specific version for reproducibility.

Using the :latest tag can lead to unpredictable behavior if the upstream image changes. For a security-focused setup, pinning to a known version ensures reproducible builds and allows for deliberate upgrade decisions. Consider pinning to 0.29.0 or a later stable release.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/lab/docker-compose.yml` around lines 66 - 67, Replace the floating
image tag under the ta-decrypt service (currently "smallstep/step-cli:latest")
with a specific version to ensure reproducible builds; update the image
reference to a pinned release such as "smallstep/step-cli:0.29.0" (or another
chosen stable version) in the docker-compose service definition so the
ta-decrypt service uses a fixed image.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/extending-trustmarks.md`:
- Line 68: The docs use a non-existent Zod API z.string().date() for the
assessed_date field; update the assessed_date schema to use a supported
validation such as z.string().datetime({ offset: true }) for full ISO 8601
timestamps or use z.string().regex(...) or z.preprocess((s) => new Date(s),
z.date()) / z.coerce.date() for date-only strings—replace the offending
z.string().date() reference for assessed_date with one of these supported
patterns and include a short note indicating which approach validates timestamp
vs date-only formats.

In `@packages/fedmgr-cli/src/bin.ts`:
- Around line 9-11: The fatal handler for program.parseAsync should preserve
error details and CommanderError exit semantics: inspect the caught err in the
.catch of program.parseAsync, if err is an instance of CommanderError use its
exitCode for process.exit and extract a meaningful message (err.message when
available), else if err is an Error use err.message, otherwise fall back to
String(err); then write the formatted message to stderr (e.g. `[fedmgr] fatal:
${message}\n`) and call process.exit with the determined exit code instead of
always exiting with 1.

In `@packages/fedmgr-cli/src/commands/oci.ts`:
- Around line 84-105: The CLI currently ignores user-specified JWKS and
certificate-regexp options: update the action that builds the args array (the
closure using opts, inside the .action after assertCosign) to use opts.jwks and
the camelCased regex options instead of hardcoded ".+"; specifically, if
opts.jwks is present push "--jwks" and opts.jwks, and replace the hardcoded
"--certificate-identity-regexp", ".+" and "--certificate-oidc-issuer-regexp",
".+" pairs with "--certificate-identity-regexp", opts.certificateIdentityRegexp
and "--certificate-oidc-issuer-regexp", opts.certificateOidcIssuerRegexp (and
guard them so you only push those flags when the corresponding opts value is
defined); also correct the .option declarations if needed to include the
<regexp>/ <url> parameter and default values so commander maps them to
opts.certificateIdentityRegexp, opts.certificateOidcIssuerRegexp and opts.jwks.

In `@packages/fedmgr-cli/src/commands/trustmark.ts`:
- Around line 94-111: The code currently accepts a jku from
decodeProtectedHeader(opts.jws) and fetches it directly; update the trustmark
verify flow to reject or require confirmation for untrusted JWKS URLs by
validating jwksUrl before calling fetch: parse jwksUrl and ensure it uses the
https: protocol and matches an allowed/whitelisted hostname pattern (or, if you
prefer stricter behavior, require opts.jwks to be explicitly provided and error
if jku is present but not in the allowed list), and if validation fails write a
clear stderr message and process.exit(1) before any network call; implement this
validation around the jwksUrl assignment and before the fetch in the same block
that uses decodeProtectedHeader/opts.jwks so functions/variables to touch are
decodeProtectedHeader, opts.jws, opts.jwks, jwksUrl and the fetch call.
- Around line 28-47: The CLI action in trustmark.ts currently sets ttl_s using
parseInt(opts.ttl, 10) without validating, allowing NaN or out-of-range values;
before building the body in the .action callback validate opts.ttl by parsing it
to an integer, check Number.isInteger and not NaN, enforce the allowed range
(60–86400), and return an error (throw, console.error+process.exit, or commander
error) when invalid; only set trustmark_id.ttl_s on the request body after
successful validation so invalid values cannot be serialized.
- Line 11: The CLI imports `import { importJWK, jwtVerify, decodeProtectedHeader
} from "jose";` which is a runtime dependency but not declared; add "jose" to
the dependencies of the `@letsfederate/fedmgr-cli` package manifest
(packages/fedmgr-cli/package.json) with an appropriate semver version (matching
workspace policy or a recent stable release), then restore/update installs
(yarn/npm install or equivalent) so the CLI can start without module-not-found
errors.

In `@packages/kms/src/index.ts`:
- Around line 67-77: The kid() method currently imports and re-exports the JWK
using importJWK/exportJWK which does not compute a RFC 7638 thumbprint; change
it to compute the thumbprint with calculateJwkThumbprint (await
calculateJwkThumbprint(this._pub as JWK, "sha256") or the library-appropriate
call) and assign that value to this._pub!.kid (remove the "default" fallback),
then set this._kid = this._pub!.kid and return it; keep the ensureLoaded() call
and the import of calculateJwkThumbprint alongside importJWK so you still
validate/normalize keys but derive the stable kid from calculateJwkThumbprint
instead of exportJWK.

In `@packages/ta-server/src/federation/subordinate-statements.ts`:
- Around line 44-50: The payload currently embeds config.subjectJwks directly
which may contain private key fields; create a sanitization step that
deep-clones config.subjectJwks and removes private JWK members (e.g., "d", "p",
"q", "dp", "dq", "qi", "oth", and symmetric "k") from each key before assigning
to payload.jwks; locate the code that builds payload (the payload constant in
subordinate-statements.ts) and replace the direct use of config.subjectJwks with
the sanitizedJwks so only public parameters (kty, use, alg, kid, n, e, x, y,
crv, etc.) are embedded.

In `@packages/ta-server/src/index.ts`:
- Around line 50-53: Add startup validation for TA_JWKS_URL by parsing the
environment value used to build JWKS_URL (const JWKS_URL) and failing fast if
it's malformed: attempt to construct a URL (e.g., using new URL(...)) from
process.env["TA_JWKS_URL"] or from the derived
`http://localhost:${PORT}/.well-known/jwks.json`, catch any thrown error, and
call processLogger.error with the invalid value and process.exit(1). Place this
check next to where JWKS_URL is defined so the app exits with a clear error
before later failures; reference JWKS_URL, TA_JWKS_URL, PORT and processLogger
in the fix.
- Around line 99-103: The JWKS fetch in packages/ta-server/src/index.ts uses
fetch(spec.jwksUrl) without a timeout, which can hang startup; update the code
that performs the JWKS fetch (the call that assigns r from fetch(spec.jwksUrl)
and the surrounding logic handling spec.jwksUrl and spec.entityId) to use an
AbortController with a short configurable timeout, pass controller.signal to
fetch, clear the timeout on success, and throw a clear error if the controller
aborts (including spec.entityId and spec.jwksUrl in the message) so stalled
endpoints cannot block startup.
- Around line 86-109: TA_SUBORDINATES parsing and JWKS fetching currently trusts
shapes and casts directly; validate the env JSON and sanitize JWKS before
creating SubordinateEntry. Ensure the parsed specs is an array and each spec has
a non-empty string entityId and a valid URL jwksUrl (validate using URL parsing)
before calling fetch; on fetch, validate the response JSON has a top-level keys
array and that each key is an object, then strip any private key members (d, p,
q, dp, dq, qi, oth or other algorithm-specific private fields) from each JWK and
validate required public fields (kty, kid, use/x5c/alg/...) before assigning to
SubordinateEntry; surface clear errors for schema or fetch/parse failures and
avoid blind casting to SubordinateEntry or jwks.
- Around line 126-130: The async Express route handlers that call kms.jwks(),
signEntityStatement(), and signSubordinateStatement() must forward promise
rejections to Express error middleware; update the handlers (the app.get for
"/.well-known/jwks.json" and the routes invoking signEntityStatement and
signSubordinateStatement) to either use a small async wrapper that calls
next(err) on rejection or append .catch(next) to the promise chain so any thrown
errors are passed to next() instead of causing unhandled rejections.

In `@packages/tmi-server/package.json`:
- Line 16: Update the internal dependency entry for the package
"@letsfederate/kms" in package.json from the non-deterministic value "*" to the
workspace protocol "workspace:*" so the monorepo uses the local workspace link;
apply the same change wherever "@letsfederate/kms" is listed in other packages'
package.json files to keep dependency resolution consistent across the repo.

In `@packages/tmi-server/src/index.ts`:
- Around line 114-146: Protect the POST "/trustmarks/issue" handler by requiring
a shared secret (e.g., process.env.ISSUE_TOKEN); in the handler (or a small
middleware referenced from this route) read the Authorization header or a
specific header like X-Issue-Token, compare securely to ISSUE_TOKEN, and return
401/403 if missing or invalid before parsing with IssueRequestSchema and before
calling kms.signJwt/JWKS_URL; ensure the check happens early (prior to payload
creation and signing) and fail fast to prevent unauthenticated minting of
trustmarks.
- Around line 159-163: The TMI entity statement currently sets authorityHints:
[] which causes the claim to be omitted; update the object (entityId: ISSUER) to
populate authorityHints with the trust-anchor issuer(s) so the trust-chain can
be discovered (e.g., authorityHints: [TRUST_ANCHOR_ISSUER] or authorityHints:
getConfig().trustAnchors). Locate the block where entityId: ISSUER is built and
replace the empty array with the configured TA issuer(s) (or add a small config
hook that returns TA issuer strings), ensuring tmiMetadata(JWKS_URL) remains
unchanged.
- Around line 78-85: The GET route that calls kms.jwks() (and other async routes
handling KMS/signing between the same section) must be wrapped with an Express
4.x async wrapper so promise rejections are forwarded to next(err). Add a small
helper like asyncRoute(fn) that returns (req,res,next) =>
Promise.resolve(fn(req,res,next)).catch(next), then update the handlers such as
the "/.well-known/jwks.json" route and the routes in the later block (the
handlers that call kms.jwks(), signing, or other async KMS methods) to use
asyncRoute(yourAsyncHandler) instead of passing the raw async function so errors
flow to Express error middleware.

---

Minor comments:
In `@docs/architecture.md`:
- Line 23: The markdown contains fenced code blocks without language identifiers
(triggering markdownlint MD040); update each triple-backtick fence in
docs/architecture.md (including the blocks around the snippets shown and the
other instances flagged) to include an appropriate language tag (e.g., text,
json, or the actual language used) immediately after the opening ``` so the
fences become ```text or ```json etc., ensuring all three backtick code blocks
have language identifiers.
- Around line 107-109: Update the documentation to state that `jku` is a
JOSE/JWS protected header field rather than a claim in the trustmark payload:
remove `jku` from the payload shape listing and add a note that the `jku` value
is set on the JWS protected header when creating/signing the trustmark (match
the issuance code that passes `jku` via the JWS protected header), referencing
the trustmark payload fields `{ iss, sub, id, iat, exp, image_digest?, repo?,
evidence? }` and the JWS protected header usage for `jku`.

In `@docs/extending-trustmarks.md`:
- Around line 12-15: The markdown code fences in the docs are missing language
tags and trigger MD040; update the three shown blocks and the other affected
ranges (lines 78-87 and 119-132) to include appropriate languages: use ```text
for the file-list block and ```typescript for TypeScript snippets such as the
IssueRequestSchema discriminated union and the tmiMetadata function; locate the
blocks containing "IssueRequestSchema", "tmiMetadata", and the file list and add
the language identifiers so each fenced code block is annotated (e.g., ```text
and ```typescript).

In `@docs/ralph-loop/increment-A-docker-lab.md`:
- Around line 13-19: The fenced code block in the markdown lacks a language tag
(triple backticks at the start of the block), causing markdownlint MD040; update
the opening fence in the block that contains the directory listing (the snippet
starting with "examples/lab/") to include a language identifier such as "text"
(i.e., change ``` to ```text) so the block is explicitly tagged.

In `@docs/ralph-loop/increment-D-trust-anchor.md`:
- Line 14: The Markdown file docs/ralph-loop/increment-D-trust-anchor.md has
fenced code blocks missing language identifiers; update each triple-backtick
fence shown (the packages/ta-server tree block and the trailing test output
block, plus the other occurrences noted) to include a language tag (e.g.,
```text) so markdownlint MD040 is satisfied and rendering is correct—search for
the untagged ``` blocks in that document and replace them with ```text.

In `@package.json`:
- Line 27: The npm script "tmi:decrypt" uses "source scripts/tmi-decrypt.sh"
which runs in a subshell so exported env vars won't persist; update the repo so
users know to run "source scripts/tmi-decrypt.sh" directly in their shell (e.g.,
update README or the scripts section comment) or change the package.json
approach: replace the direct source usage with a documented alternative such as
providing a non-subshell usage (e.g., make the script print export lines that
can be eval'd by the caller) or point consumers to the existing "--start"
variant that handles environment injection; reference the "tmi:decrypt" npm
script and the scripts/tmi-decrypt.sh file when applying the change.

In `@packages/ta-server/src/federation/subordinate-statements.ts`:
- Around line 41-48: The TTL handling allows 0/negative values which can make
exp <= iat and produce invalid statements; validate config.ttlSeconds before
building the payload (in the code around now, ttl, and payload) and reject or
throw when ttlSeconds is provided but not a positive integer > 0; when
ttlSeconds is undefined continue to use the default (86400) but do not accept 0
or negative numbers—ensure the calculated ttl (used for exp = now + ttl) is > 0
so exp > iat before signing.

In `@scripts/tmi-decrypt.sh`:
- Line 30: The dd invocation using TMPFS_KEY uses an unquoted command
substitution for count=$(wc -c <"${TMPFS_KEY}") which can produce leading
whitespace and cause word-splitting; update the dd call so the count argument
uses a quoted command substitution (e.g., count="$(wc -c < "${TMPFS_KEY}")") to
ensure the byte count is passed as a single token and avoid SC2046 issues.

---

Nitpick comments:
In `@docs/ralph-loop/increment-B-cli.md`:
- Around line 50-53: The fenced code blocks showing CLI flags like "[--tmi
http://localhost:8080] \ [--ttl 3600] \ [--image-digest sha256:<hex>] \ [--repo
<url>] \" and the later block around the repeated flags at the bottom are
missing language specifiers; update those three- and multi-line fenced blocks in
increment-B-cli.md to use ```text (not ``` or other language) so markdownlint
MD040 is satisfied by adding the text language tag to each relevant fenced code
block.
- Around line 14-25: The fenced code block that contains the directory tree
starting with "packages/fedmgr-cli/" is missing a language specifier; update the
opening backticks to include a language (use "text" or "plaintext") so the block
becomes ```text, ensuring markdown linting passes for the directory tree snippet
in increment-B-cli.md.

In `@examples/lab/.env.example`:
- Around line 1-9: Add default TA-related env vars to the lab env template by
documenting TA_ENTITY_ID, TA_JWKS_URL, and TA_SUBORDINATES in the
examples/lab/.env.example so new users have sensible defaults; update the file
to include commented descriptions and example values for TA_ENTITY_ID (e.g. an
issuer URL or UUID), TA_JWKS_URL (full JWKS endpoint URL) and TA_SUBORDINATES
(comma-separated subordinate identifiers), mirroring the style used for
TMI_ISSUER and TMI_JWKS_URL to keep consistency.

In `@examples/lab/docker-compose.yml`:
- Line 22: Remove the deprecated top-level version key from the
docker-compose.yml (the line containing version: "3.9") since Compose V2 ignores
it; simply delete that line so the file relies on the Compose specification
without the version field and avoids deprecation warnings.
- Around line 1-11: The compose file lacks production resource and binding
constraints; update each service (registry, ta-server, tmi-server) to include
deploy.resources.limits (set CPU and memory limits) and, where appropriate,
change port mappings from bare "host:container" to bind to localhost (e.g.,
127.0.0.1:host:container) to restrict external access; ensure tmpfs and volume
mounts remain unchanged and validate the constraints are added under each
service definition (registry, ta-server, tmi-server).
- Around line 66-67: Replace the floating image tag under the ta-decrypt service
(currently "smallstep/step-cli:latest") with a specific version to ensure
reproducible builds; update the image reference to a pinned release such as
"smallstep/step-cli:0.29.0" (or another chosen stable version) in the
docker-compose service definition so the ta-decrypt service uses a fixed image.

In `@packages/kms/test/softkms.test.mjs`:
- Around line 27-34: Move the top-level setup that creates TMP and writes
PUB_PATH/PRIV_PATH into a test.before() hook so failures surface in test output
and follow Node test runner conventions: wrap the calls that use
generateKeyPair, exportJWK, fs.mkdir, and fs.writeFile inside test.before(async
() => { ... }), using the same TMP, PUB_PATH, PRIV_PATH, publicKey/privateKey,
pubJwk and privJwk symbols, and keep the JSON.stringify and utf8 options; ensure
any errors are allowed to propagate from the hook so the test runner reports
setup failures.

In `@packages/ta-server/package.json`:
- Around line 15-18: The dependency entry for the internal package
"@letsfederate/kms" uses "*" which can allow npm to resolve from the registry;
change the version spec to "workspace:*" in package.json so npm workspaces will
explicitly resolve the local workspace package, then reinstall/update the
lockfile to pick up the change.

In `@packages/ta-server/src/federation/entity-statements.ts`:
- Around line 64-76: The current tmiMetadata function derives
federation_trust_mark_issuer_endpoint by doing a brittle string replace on
jwksUri; instead update tmiMetadata to build both endpoints robustly (either
change the signature to accept a baseUrl or parse jwksUri with the URL API) and
then construct jwks_uri and federation_trust_mark_issuer_endpoint via URL
resolution (e.g., new URL('/.well-known/jwks.json', base) and new
URL('/trustmarks/issue', base)) so the trust mark endpoint is correct even if
jwksUri has different paths or query params; modify tmiMetadata and callers to
use the new parameter/logic and ensure federation_trust_mark_issuer_endpoint and
jwks_uri values are set from those computed URLs.

In `@packages/tmi-server/src/federation/entity-statements.ts`:
- Around line 64-73: The code uses string replace on jwksUri to build
federation_trust_mark_issuer_endpoint which is brittle; instead parse jwksUri
with the URL API inside tmiMetadata, then derive the endpoint by creating a new
URL that preserves the origin and sets pathname to '/trustmarks/issue' (e.g.
const u = new URL(jwksUri); u.pathname = '/trustmarks/issue';
federation_trust_mark_issuer_endpoint = u.toString()). Update tmiMetadata to use
this parsed-URL construction for federation_trust_mark_issuer_endpoint rather
than jwksUri.replace.

In `@packages/tmi-server/src/index.ts`:
- Around line 50-55: Replace the fragile startsWith("http") checks for ISSUER
and JWKS_URL by parsing each with new URL(...) and validating the protocol is
exactly "http:" or "https:"; if parsing fails or protocol is not allowed, throw
an Error that includes the offending ISSUER or JWKS_URL value so startup fails
on invalid URLs. Ensure you update the validation logic around the ISSUER and
JWKS_URL variables in index.ts accordingly.

In `@scripts/oidf-cert.sh`:
- Around line 54-59: The diagnostic curl pipelines that use cut and base64 -d
(the two occurrences of "curl ... | cut -d. -f1/2 | base64 -d") can silently
fail for base64url-encoded JWT parts; convert base64url to standard base64
before decoding by replacing '-' with '+' and '_' with '/' and ensuring proper
'=' padding, then pass the result to base64 --decode (or base64 -d) and
json.tool; update both pipelines (the header extraction using cut -f1 and the
payload extraction using cut -f2) to perform this base64url→base64 translation
and padding prior to decoding so valid JWTs decode reliably.

In `@scripts/tmi-keys-init.sh`:
- Around line 43-48: The step crypto jwk create invocation includes a redundant
flag: remove the unnecessary `--no-password=false` from the command so the `step
crypto jwk create "${PUB_JWK}" "${PRIV_JWE}" --kty EC --curve P-256 --use sig
--password-file "${PASS_FILE}" --force` call relies on the provided `PASS_FILE`;
update the command in the script where `step crypto jwk create` is used
(referencing variables `${PUB_JWK}`, `${PRIV_JWE}`, and `${PASS_FILE}`) to drop
`--no-password=false` for clarity.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 547f8051-b28d-4d96-9ac9-7ef567a3a9ba

📥 Commits

Reviewing files that changed from the base of the PR and between f153a6f and c9baf04.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (42)
  • .gitignore
  • Dockerfile.ta-server
  • Dockerfile.tmi-server
  • docs/architecture.md
  • docs/decisions.md
  • docs/extending-trustmarks.md
  • docs/ralph-loop/increment-A-docker-lab.md
  • docs/ralph-loop/increment-B-cli.md
  • docs/ralph-loop/increment-C-oidf-federation.md
  • docs/ralph-loop/increment-D-trust-anchor.md
  • examples/lab/.env.example
  • examples/lab/docker-compose.yml
  • package.json
  • packages/fedmgr-cli/package.json
  • packages/fedmgr-cli/src/bin.ts
  • packages/fedmgr-cli/src/commands/keys.ts
  • packages/fedmgr-cli/src/commands/oci.ts
  • packages/fedmgr-cli/src/commands/trustmark.ts
  • packages/fedmgr-cli/src/index.ts
  • packages/fedmgr-cli/test/cli.test.mjs
  • packages/fedmgr-cli/tsconfig.json
  • packages/kms/package.json
  • packages/kms/src/index.ts
  • packages/kms/test/softkms.test.mjs
  • packages/kms/tsconfig.json
  • packages/ta-server/keys/.gitkeep
  • packages/ta-server/package.json
  • packages/ta-server/src/federation/entity-statements.ts
  • packages/ta-server/src/federation/subordinate-statements.ts
  • packages/ta-server/src/index.ts
  • packages/ta-server/test/ta-server.test.mjs
  • packages/ta-server/tsconfig.json
  • packages/tmi-server/keys/.gitkeep
  • packages/tmi-server/package.json
  • packages/tmi-server/src/federation/entity-statements.ts
  • packages/tmi-server/src/index.ts
  • packages/tmi-server/test/entity-statements.test.mjs
  • packages/tmi-server/tsconfig.json
  • scripts/oidf-cert.sh
  • scripts/tmi-decrypt.sh
  • scripts/tmi-keys-init.sh
  • tsconfig.base.json

"https://letsfederate.org/trustmarks/CertifiedForMCPUse_v1"
),
mcp_profile: z.string(), // new field specific to this type
assessed_date: z.string().date(), // ISO 8601
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Zod example likely uses a non-existent API (z.string().date()).

This snippet may fail for users copying it. Prefer a supported form (z.string().datetime(...) for timestamp, or regex/date coercion for date-only).

Suggested diff
-  assessed_date: z.string().date(),  // ISO 8601
+  assessed_date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/), // ISO 8601 date (YYYY-MM-DD)
In Zod (current stable), is `z.string().date()` a valid API? What is the recommended way to validate an ISO 8601 date-only string?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/extending-trustmarks.md` at line 68, The docs use a non-existent Zod API
z.string().date() for the assessed_date field; update the assessed_date schema
to use a supported validation such as z.string().datetime({ offset: true }) for
full ISO 8601 timestamps or use z.string().regex(...) or z.preprocess((s) => new
Date(s), z.date()) / z.coerce.date() for date-only strings—replace the offending
z.string().date() reference for assessed_date with one of these supported
patterns and include a short note indicating which approach validates timestamp
vs date-only formats.

Comment on lines +9 to +11
program.parseAsync(process.argv).catch((err: unknown) => {
process.stderr.write(`[fedmgr] fatal: ${String(err)}\n`);
process.exit(1);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Commander.js v12: what does parseAsync reject with, and how should custom handlers preserve CommanderError exitCode?

💡 Result:

What does parseAsync() reject with (Commander.js v12)?

parseAsync() returns a Promise<this>. It will reject with whatever error is thrown during parsing/execution:

  • If Commander decides to exit (e.g., unknown option/command, missing required args/options, program.error(), help/version exit) and you have enabled exitOverride(), Commander will invoke your override callback with a CommanderError (with .exitCode, .code, .message). The default exitOverride() implementation throws that CommanderError, which means an await program.parseAsync(...) will reject/throw that CommanderError. (github.com)
  • If your action handler (or hook) throws/rejects, parseAsync() rejects with that same error (it is not automatically wrapped).

How should custom handlers preserve CommanderError.exitCode?

Two common patterns:

1) You use exitOverride() and want to “handle but still fail” with the same exit code

Make sure your override either exits with err.exitCode or rethrows after setting process.exitCode (so upstream code/tests still see the failure):

import { CommanderError } from 'commander';

program.exitOverride((err) => {
  // preserve the intended status
  process.exitCode = err.exitCode;

  // keep the error semantics (parseAsync rejects, tests can assert on it)
  throw err;
});

Commander documents that if your override callback returns, Commander will still call process.exit() itself (so returning without throwing/exiting can short-circuit your custom flow). (github.com)

2) You catch around parseAsync() and want correct exit status

If you swallow or replace the error, explicitly carry over the exit code:

try {
  await program.parseAsync(process.argv);
} catch (err) {
  if (err instanceof CommanderError) process.exitCode = err.exitCode;
  throw err; // or exit(process.exitCode)
}

This way, custom handling doesn’t accidentally turn an intended non-zero exit into exit code 0. (github.com)

Citations:


🏁 Script executed:

cat -n packages/fedmgr-cli/src/bin.ts

Repository: teamktown/fedmgr

Length of output: 442


🏁 Script executed:

head -50 packages/fedmgr-cli/src/index.ts

Repository: teamktown/fedmgr

Length of output: 806


🏁 Script executed:

fd -t f "test|spec" packages/fedmgr-cli/src --type f | head -20

Repository: teamktown/fedmgr

Length of output: 42


🏁 Script executed:

cat packages/fedmgr-cli/package.json

Repository: teamktown/fedmgr

Length of output: 882


🏁 Script executed:

ls -la packages/fedmgr-cli/test/ 2>/dev/null || echo "No test directory found"

Repository: teamktown/fedmgr

Length of output: 232


🏁 Script executed:

cat packages/fedmgr-cli/test/cli.test.mjs

Repository: teamktown/fedmgr

Length of output: 3274


Preserve error detail and exit semantics in fatal path.

Current handling flattens errors via String(err) and always exits with code 1; instead, extract the message from Error objects and preserve CommanderError's exitCode when present.

Suggested adjustment
program.parseAsync(process.argv).catch((err: unknown) => {
-  process.stderr.write(`[fedmgr] fatal: ${String(err)}\n`);
+  const message = err instanceof Error ? err.message : String(err);
+  process.stderr.write(`[fedmgr] fatal: ${message}\n`);
+  const exitCode = err instanceof Error && typeof (err as any).exitCode === "number" ? (err as any).exitCode : 1;
-  process.exit(1);
+  process.exit(exitCode);
});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
program.parseAsync(process.argv).catch((err: unknown) => {
process.stderr.write(`[fedmgr] fatal: ${String(err)}\n`);
process.exit(1);
program.parseAsync(process.argv).catch((err: unknown) => {
const message = err instanceof Error ? err.message : String(err);
process.stderr.write(`[fedmgr] fatal: ${message}\n`);
const exitCode = err instanceof Error && typeof (err as any).exitCode === "number" ? (err as any).exitCode : 1;
process.exit(exitCode);
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-cli/src/bin.ts` around lines 9 - 11, The fatal handler for
program.parseAsync should preserve error details and CommanderError exit
semantics: inspect the caught err in the .catch of program.parseAsync, if err is
an instance of CommanderError use its exitCode for process.exit and extract a
meaningful message (err.message when available), else if err is an Error use
err.message, otherwise fall back to String(err); then write the formatted
message to stderr (e.g. `[fedmgr] fatal: ${message}\n`) and call process.exit
with the determined exit code instead of always exiting with 1.

Comment on lines +84 to +105
.option("--jwks <url>", "JWKS URL to verify the trustmark JWS against")
.option("--certificate-identity-regexp", ".+", "cosign --certificate-identity-regexp")
.option("--certificate-oidc-issuer-regexp", ".+", "cosign --certificate-oidc-issuer-regexp")
.option("--allow-insecure-registry", "Allow HTTP registry", false)
.action((opts: {
image: string;
type: string;
jwks?: string;
allowInsecureRegistry: boolean;
}) => {
assertCosign();

const args = [
"verify-attestation",
"--type", opts.type,
"--certificate-identity-regexp", ".+",
"--certificate-oidc-issuer-regexp", ".+",
];
if (opts.allowInsecureRegistry) {
args.push("--allow-insecure-registry");
}
args.push(opts.image);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

verify-trustmark exposes options that are currently no-ops.

--jwks is never used, and the certificate regex options are ignored because the command always passes hardcoded ".+" values. This makes CLI behavior misleading.

💡 Proposed fix
-    .option("--jwks <url>", "JWKS URL to verify the trustmark JWS against")
-    .option("--certificate-identity-regexp", ".+", "cosign --certificate-identity-regexp")
-    .option("--certificate-oidc-issuer-regexp", ".+", "cosign --certificate-oidc-issuer-regexp")
+    .option("--certificate-identity-regexp <regexp>", "cosign --certificate-identity-regexp", ".+")
+    .option("--certificate-oidc-issuer-regexp <regexp>", "cosign --certificate-oidc-issuer-regexp", ".+")
@@
-    .action((opts: {
+    .action((opts: {
       image: string;
       type: string;
-      jwks?: string;
+      certificateIdentityRegexp: string;
+      certificateOidcIssuerRegexp: string;
       allowInsecureRegistry: boolean;
     }) => {
@@
       const args = [
         "verify-attestation",
         "--type", opts.type,
-        "--certificate-identity-regexp", ".+",
-        "--certificate-oidc-issuer-regexp", ".+",
+        "--certificate-identity-regexp", opts.certificateIdentityRegexp,
+        "--certificate-oidc-issuer-regexp", opts.certificateOidcIssuerRegexp,
       ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-cli/src/commands/oci.ts` around lines 84 - 105, The CLI
currently ignores user-specified JWKS and certificate-regexp options: update the
action that builds the args array (the closure using opts, inside the .action
after assertCosign) to use opts.jwks and the camelCased regex options instead of
hardcoded ".+"; specifically, if opts.jwks is present push "--jwks" and
opts.jwks, and replace the hardcoded "--certificate-identity-regexp", ".+" and
"--certificate-oidc-issuer-regexp", ".+" pairs with
"--certificate-identity-regexp", opts.certificateIdentityRegexp and
"--certificate-oidc-issuer-regexp", opts.certificateOidcIssuerRegexp (and guard
them so you only push those flags when the corresponding opts value is defined);
also correct the .option declarations if needed to include the <regexp>/ <url>
parameter and default values so commander maps them to
opts.certificateIdentityRegexp, opts.certificateOidcIssuerRegexp and opts.jwks.

"prepublishOnly": "npm run build && npm test"
},
"dependencies": {
"@letsfederate/kms": "*",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether internal `@letsfederate/`* deps are using "*" instead of workspace protocol.

set -euo pipefail

fd 'package\.json$' -HI | while read -r f; do
  rg -nP '"@letsfederate/[^"]+"\s*:\s*"\*"' "$f" || true
done

Repository: teamktown/fedmgr

Length of output: 158


🏁 Script executed:

#!/bin/bash
# Identify all files with `@letsfederate/`* packages using "*" instead of workspace protocol

fd 'package\.json$' -HI | while read -r f; do
  matches=$(rg -nP '"@letsfederate/[^"]+"\s*:\s*"\*"' "$f" 2>/dev/null || true)
  if [ -n "$matches" ]; then
    echo "=== $f ==="
    echo "$matches"
  fi
done

Repository: teamktown/fedmgr

Length of output: 280


🏁 Script executed:

#!/bin/bash
# Find ALL `@letsfederate/`* packages using "*" versioning

fd 'package\.json$' -HI | while read -r f; do
  matches=$(rg -nP '"@letsfederate/[^"]+"\s*:\s*"\*"' "$f" 2>/dev/null || true)
  if [ -n "$matches" ]; then
    echo "=== $f ==="
    echo "$matches"
  fi
done | sort

Repository: teamktown/fedmgr

Length of output: 280


Use workspace protocol for internal package dependency.

"@letsfederate/kms": "*" is non-deterministic for a monorepo and may pull registry versions unexpectedly; use workspace:* for reliable local linking. This pattern is used across multiple packages and should be updated consistently.

Suggested fix
-    "@letsfederate/kms": "*",
+    "@letsfederate/kms": "workspace:*",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"@letsfederate/kms": "*",
"@letsfederate/kms": "workspace:*",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/package.json` at line 16, Update the internal dependency
entry for the package "@letsfederate/kms" in package.json from the
non-deterministic value "*" to the workspace protocol "workspace:*" so the
monorepo uses the local workspace link; apply the same change wherever
"@letsfederate/kms" is listed in other packages' package.json files to keep
dependency resolution consistent across the repo.

Fix jest.mock hoisting, path resolution, CLI command signatures, and
assertion patterns across the full test suite.

Unit tests:
- trust-chain-verification: fix 3-level chain key convention (each
  statement's jwks contains signer's public key for verifying prev link)
- token-exchange-service: move all jest.mock() before require() to fix
  CJS hoisting; replace app._router.handle() with supertest
- auth-commands: fix mock ordering, add crypto require, fix error
  messages in localOidcLogin, skip non-exported githubOAuthLogin
- npx-execution: add registry cleanup in afterEach, fix output assertions
- mcp-server: skip init tests (module is validateToken utility, not server)

Integration tests:
- mcp-federation: fix module path src/server/ → src/fedmgr/server/;
  mock federation-admin to avoid module-level side effects (key gen)
- federation-endpoints: use toContain() for content-type (charset suffix)

E2E tests (cli-federation-workflow):
- Use absolute path for fedmgr.js entry point
- Add workspace-scoped env vars (FEDMGR_HOME, FEDMGR_FEDERATIONS_DIR,
  FEDMGR_FED_REG, FEDMGR_FED_REG_FILE) for full isolation
- Fix distribute command to pass federation name as required arg
- Fix 4 invalid || Jest assertion chains → toMatch(/a|b/)

Source fixes:
- fedmgr.js: fix mcpInstancesDir from build/install/mcp_instances →
  mcp_instances (E2E tests checked wrong path)
- mcp-interface.js: add FEDMGR_HOME support for mcpInstancesDir/
  mcpProtocolServersDir defaults
- inspect.js: add inspect fed <name> and inspect mcp <name> type-aware
  handlers for entity config display (backward compat preserved)
- auth-commands.js: export localOidcLogin, fix error messages, fix
  password flow header

fixtures/setup.js: add FEDMGR_FED_REG_FILE env var

docs/testing.md: update directory structure, add status table,
key conventions (mock ordering, workspace isolation, trust chain keys)

Result: 130 passed, 1 pre-existing failure (federation-workflow.test.js
requires live servers), 4 skipped, 3 todo — across 14 suites

Co-Authored-By: claude-flow <ruv@ruv.net>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
scripts/tests/unit/server/mcp-server.test.js (1)

238-309: ⚠️ Potential issue | 🟠 Major

Re-enable these initialization tests (or replace with active equivalents) before merge.

Line 238, Line 270, and Line 309 convert core startup-validation tests to test.skip, which drops coverage for valid boot and required-flag failure paths. This creates a regression blind spot in server initialization behavior.

Suggested patch
-  test.skip('should initialize correctly with valid configuration', () => {
+  test('should initialize correctly with valid configuration', () => {
...
-  test.skip('should exit if name is missing', () => {
+  test('should exit if name is missing', () => {
...
-  test.skip('should exit if port is missing', () => {
+  test('should exit if port is missing', () => {

If these are flaky, prefer a temporary quarantine pattern (e.g., conditional describe behind an env flag in CI + linked tracking issue) instead of permanently skipped tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/unit/server/mcp-server.test.js` around lines 238 - 309, The
three skipped startup-validation tests ("should initialize correctly with valid
configuration", "should exit if name is missing", "should exit if port is
missing") must be re-enabled (remove .skip) or replaced with an
active/quarantined pattern so critical boot paths remain covered; edit the test
declarations in scripts/tests/unit/server/mcp-server.test.js to either remove
".skip" from those test(...) calls or wrap the describe/tests in a conditional
guard (e.g., process.env.RUN_STARTUP_TESTS) and add a linked tracking issue
comment, keep the existing assertions that use jest.isolateModules,
assertMockCalled/getMockCallArg, and mocks (mockExpress, mockHttp, mockServer,
mockExit, mockConsoleError, mockFs) so the same validation runs in CI, and if
flakiness is the concern, implement the temporary quarantine gate rather than
permanently skipping.
scripts/tests/e2e/cli-federation-workflow.test.js (1)

319-333: ⚠️ Potential issue | 🟠 Major

Error-path tests are not using TEST_CONFIG.env, so they may hit non-workspace state.

These blocks assert error text, but command calls only pass cwd. Since path resolution is env-driven, they can read/write default locations and become flaky.

Suggested fix
 const result = await executeCommand('create', ['fed', TEST_CONFIG.federationName], {
-  cwd: TEST_CONFIG.workspaceDir
+  cwd: TEST_CONFIG.workspaceDir,
+  env: TEST_CONFIG.env
 });

 await executeCommand('create', ['mcp', 'test-invalid-mcp'], {
-  cwd: TEST_CONFIG.workspaceDir
+  cwd: TEST_CONFIG.workspaceDir,
+  env: TEST_CONFIG.env
 });

 await executeCommand('inspect', ['fed', 'non-existent-federation'], {
-  cwd: TEST_CONFIG.workspaceDir
+  cwd: TEST_CONFIG.workspaceDir,
+  env: TEST_CONFIG.env
 });

Also applies to: 338-350, 355-367

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/e2e/cli-federation-workflow.test.js` around lines 319 - 333,
The tests call executeCommand('create', ...) and related commands only with cwd,
so they may operate against global/default state; update each error-path block
(the try/catch blocks around executeCommand in this test file, including the
block shown and the similar blocks at the other ranges) to pass the test
environment via the env option (i.e., include TEST_CONFIG.env alongside cwd) so
executeCommand uses the workspace-specific environment; locate calls to
executeCommand in the duplicate-creation/error-assertion sections and add the
env: TEST_CONFIG.env property to the options object.
🧹 Nitpick comments (3)
docs/testing.md (1)

145-160: Consider making suite-status numbers reproducible to avoid doc drift.

The hard-coded totals on Line 147 and per-suite counts can become stale quickly. Recommend adding a short “how these numbers were generated” note (or linking CI artifact) so updates stay reliable.

Suggested doc tweak
 ## 5. Test Suite Status (as of 2026-03-16)

 **Overall: 130 passing, 1 failing (pre-existing), 4 skipped, 3 todo — across 14 test suites**
+_Generated from local run: `npm test -- --reporters=default` (or latest CI run artifact). Update this section when test totals change._
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/testing.md` around lines 145 - 160, The document hard-codes aggregate
test counts under the "## 5. Test Suite Status (as of 2026-03-16)" heading and
per-suite counts in the table, which will drift; update the section to include a
short reproducibility note describing exactly how those totals were generated
(e.g., the CI job/artifact or the test-summary script/command used) and/or add a
link to the CI artifact that produced the numbers so future editors can re-run
or verify them; reference the "## 5. Test Suite Status" header and the table
rows (e.g., `unit/trust-chain-verification`, `e2e/federation-workflow`) when
adding the note so readers know which counts the note covers.
scripts/tests/unit/server/token-exchange-service.test.js (1)

135-207: Good HTTP endpoint test coverage.

The tests properly exercise the /token-exchange endpoint via supertest, covering success and 400 error cases. One optional improvement: consider adding a test for the 500 error path when exchangeGitHubCode or convertToFederationToken throws an exception.

💡 Optional: Add 500 error case test
test('token-exchange endpoint handles internal error', async () => {
  jest.spyOn(service, 'exchangeGitHubCode').mockRejectedValue(
    new Error('Internal failure')
  );

  const res = await request(service.app)
    .post('/token-exchange')
    .send({
      code: 'test-code',
      client_id: 'test-client-id',
      redirect_uri: 'http://localhost:3456/callback',
      provider: 'github'
    });

  expect(res.status).toBe(500);
  expect(res.body).toEqual({ error: 'Internal failure' });
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/unit/server/token-exchange-service.test.js` around lines 135 -
207, Add a test for the 500 error path on the /token-exchange endpoint by spying
on service.exchangeGitHubCode (or service.convertToFederationToken) to mock a
rejected promise (e.g., mockRejectedValue(new Error('Internal failure'))) and
then POST to service.app '/token-exchange' with the usual payload; assert that
the response status is 500 and the body contains the error message (e.g., {
error: 'Internal failure' }) to ensure internal exceptions are propagated as a
500 response.
src/fedmgr/auth-commands.js (1)

46-46: Include status code and discovery URL in discovery failure message.

Line 46 uses statusText only; many servers leave it blank. Include status and URL for actionable diagnostics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/auth-commands.js` at line 46, The throw on discovery failure only
includes discovery.statusText which is often empty; update the error to include
discovery.status and the discovery request URL along with the existing
statusText. Locate the check using discovery.ok (and the surrounding discovery
fetch logic) and change the Error message thrown when !discovery.ok to include
discovery.status and the variable/constant holding the OIDC discovery URL (the
same URL used when fetching the discovery object) so the message contains
status, statusText, and the endpoint URL for diagnostics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/tests/e2e/cli-federation-workflow.test.js`:
- Around line 260-267: The test currently asserts executeCommand(...).success
must be true which makes it hard-fail when the CLI exits non-zero (no server);
change the test to tolerate a non-zero exit by invoking executeCommand inside a
try/catch (or by awaiting and capturing the thrown error) and then assert that
either result.success is true OR the captured result/output/error message
contains an expected connection failure string (e.g. "connection refused" /
"connection error" / similar), keeping the existing check that typeof output is
'string'; update the assertions around executeCommand to accept both success and
a connection-error outcome.

In `@scripts/tests/fixtures/setup.js`:
- Around line 16-17: The setup uses process.env.FEDMGR_FED_REG_FILE to define
the full registry JSON path but the bootstrap/create logic still reads and
writes process.env.FEDMGR_FED_REG; update the initialization and all subsequent
checks/writes (including the block around the creation logic at lines 31-36) to
consistently use process.env.FEDMGR_FED_REG_FILE (instead of FEDMGR_FED_REG)
when checking for existence, creating directories/files, and writing the
registry.json so the CLI and setup target the same file path.

In `@scripts/tests/integration/mcp-federation.test.js`:
- Around line 53-58: The test currently replaces the real integration entrypoint
by mocking '../../../src/fedmgr/server/federation-admin' with fake functions
(fetchEntityStatements, validateTrustChain, distributeEntityStatements), which
hides the actual Express app surface; update the test so it no longer mocks the
federation-admin module itself but instead either imports and exercises the real
server entrypoint or only mocks external I/O boundaries (e.g., HTTP clients,
filesystem, network calls) that the real federation-admin Express app uses;
locate the jest.mock call in mcp-federation.test.js and remove or restrict it,
then adjust setup to start the real Express app exported by
src/fedmgr/server/federation-admin.js (or wire in the app's middleware/route
handlers) and mock only the external dependencies the app calls into so the test
validates runtime behavior.
- Around line 101-110: The mock for federationAdmin.distributeEntityStatements
currently returns a simple summary ({ success, distributed }) which differs from
the real distribution contract that returns per-member distributionResults;
update the mockImplementation for federationAdmin.distributeEntityStatements to
return a per-member results structure matching the runtime shape (an
array/object keyed by member with status/ok info for each request) so consumers
can inspect each member's result; locate the mock in
scripts/tests/integration/mcp-federation.test.js and replace the summary return
with a distributionResults-style payload (include each member and its
response.ok/status) to mirror the production contract.

In `@scripts/tests/unit/cli/auth-commands.test.js`:
- Line 204: Unskip the skipped test for 'githubOAuthLogin opens browser and
handles callback' and convert it to a deterministic unit test by mocking
external side effects: stub the function that opens the browser (e.g.,
open/browserOpen), mock the local callback HTTP server or request handler that
invokes the callback route, and mock the token-exchange network call used by
githubOAuthLogin (e.g., exchangeCodeForToken or fetchAccessToken) to return a
controlled token; use Jest fake timers
(jest.useFakeTimers()/advanceTimersByTime) if the flow relies on timeouts, and
assert that githubOAuthLogin calls the expected helpers and returns the mocked
token/state handling, ensuring the test reliably simulates the callback and
token exchange without real network/browser interaction.

In `@scripts/tests/unit/cli/npx-execution.test.js`:
- Around line 42-45: The test uses a REGISTRY_FILE constant that checks
FEDMGR_FED_REG first, but the CLI prefers FEDMGR_FED_REG_FILE; update the
REGISTRY_FILE resolution to match CLI precedence by checking
process.env.FEDMGR_FED_REG_FILE first, then process.env.FEDMGR_FED_REG, and
finally falling back to the existing path.resolve default so tests and CLI
target the same registry file (update the REGISTRY_FILE declaration
accordingly).

In `@src/fedmgr/auth-commands.js`:
- Around line 53-54: The code assumes the token endpoint always returns JSON
(using await tokenRes.json()) which throws on text/HTML error responses; update
the error handling around tokenRes to attempt JSON parsing in a try/catch and,
if parsing fails, fall back to await tokenRes.text() (or use
tokenRes.statusText) so you can throw a meaningful Error containing either
errData.error_description/errData.error when JSON, or the raw text/status when
not; change the throw site that currently uses errData to use this safe-parsed
fallback to avoid hiding the real auth failure.
- Around line 49-51: The password-grant token request drops client
authentication and assumes JSON errors; update the token request in the function
that constructs tokenRes (using cfg.token_endpoint, username, password,
clientId, clientSecret) to include client credentials: if clientSecret is
provided send an Authorization: Basic header (base64(clientId:clientSecret)),
otherwise include client_id in the URLSearchParams body; also change error
handling after the fetch so you only call tokenRes.json() when the response
Content-Type is JSON and otherwise call tokenRes.text() (or include the raw
text) before throwing so non-JSON error responses are handled safely.

In `@src/fedmgr/commands/inspect.js`:
- Around line 28-30: Wrap the JSON.parse calls that read configFile in try/catch
to handle malformed entity-configuration.json safely: locate the two places
where fs.readFileSync(configFile, 'utf-8') is passed into JSON.parse (the block
that sets const config and logs `Entity ID: ${config.sub}` and the similar block
at the later occurrence) and catch JSON parse errors, logging a clear error
message (including the file name and the caught error) and either skip printing
the config values or return gracefully instead of letting the process crash.
- Around line 16-17: The inspect command currently lets "inspect fed" or
"inspect mcp" fall through as if the literal string were an MCP name; add an
explicit validation that when typeOrName === 'fed' or typeOrName === 'mcp' a
non-empty name is required and throw/exit with a clear error if name is missing.
Concretely, in src/fedmgr/commands/inspect.js update the conditional blocks that
check (typeOrName === 'fed' && name) and the analogous (typeOrName === 'mcp' &&
name) to first test if typeOrName is 'fed' or 'mcp' and name is falsy, then emit
an error like "must specify a name for 'inspect fed'/'inspect mcp'" and
return/exit; otherwise continue with the existing logic that constructs
fedPath/mcpPath and the legacy inspector fallback. Ensure you update all three
similar sites in the file where these checks appear (the fed branch, the mcp
branch, and the legacy-mcp handling).

In `@src/fedmgr/server/mcp-interface.js`:
- Around line 27-32: The fallback paths for defaultInstancesDir and
defaultProtocolDir resolve to src/mcp_* because path.resolve(__dirname,
'../../...') only climbs two levels; when FEDMGR_HOME is unset change the
fallback to climb three levels so the workspace root mcp_instances and
mcp_protocol_servers are used instead: update the path.resolve calls in
defaultInstancesDir and defaultProtocolDir from '../../mcp_instances' and
'../../mcp_protocol_servers' to '../../../mcp_instances' and
'../../../mcp_protocol_servers' respectively (referencing defaultInstancesDir,
defaultProtocolDir and FEDMGR_HOME).

---

Outside diff comments:
In `@scripts/tests/e2e/cli-federation-workflow.test.js`:
- Around line 319-333: The tests call executeCommand('create', ...) and related
commands only with cwd, so they may operate against global/default state; update
each error-path block (the try/catch blocks around executeCommand in this test
file, including the block shown and the similar blocks at the other ranges) to
pass the test environment via the env option (i.e., include TEST_CONFIG.env
alongside cwd) so executeCommand uses the workspace-specific environment; locate
calls to executeCommand in the duplicate-creation/error-assertion sections and
add the env: TEST_CONFIG.env property to the options object.

In `@scripts/tests/unit/server/mcp-server.test.js`:
- Around line 238-309: The three skipped startup-validation tests ("should
initialize correctly with valid configuration", "should exit if name is
missing", "should exit if port is missing") must be re-enabled (remove .skip) or
replaced with an active/quarantined pattern so critical boot paths remain
covered; edit the test declarations in
scripts/tests/unit/server/mcp-server.test.js to either remove ".skip" from those
test(...) calls or wrap the describe/tests in a conditional guard (e.g.,
process.env.RUN_STARTUP_TESTS) and add a linked tracking issue comment, keep the
existing assertions that use jest.isolateModules,
assertMockCalled/getMockCallArg, and mocks (mockExpress, mockHttp, mockServer,
mockExit, mockConsoleError, mockFs) so the same validation runs in CI, and if
flakiness is the concern, implement the temporary quarantine gate rather than
permanently skipping.

---

Nitpick comments:
In `@docs/testing.md`:
- Around line 145-160: The document hard-codes aggregate test counts under the
"## 5. Test Suite Status (as of 2026-03-16)" heading and per-suite counts in the
table, which will drift; update the section to include a short reproducibility
note describing exactly how those totals were generated (e.g., the CI
job/artifact or the test-summary script/command used) and/or add a link to the
CI artifact that produced the numbers so future editors can re-run or verify
them; reference the "## 5. Test Suite Status" header and the table rows (e.g.,
`unit/trust-chain-verification`, `e2e/federation-workflow`) when adding the note
so readers know which counts the note covers.

In `@scripts/tests/unit/server/token-exchange-service.test.js`:
- Around line 135-207: Add a test for the 500 error path on the /token-exchange
endpoint by spying on service.exchangeGitHubCode (or
service.convertToFederationToken) to mock a rejected promise (e.g.,
mockRejectedValue(new Error('Internal failure'))) and then POST to service.app
'/token-exchange' with the usual payload; assert that the response status is 500
and the body contains the error message (e.g., { error: 'Internal failure' }) to
ensure internal exceptions are propagated as a 500 response.

In `@src/fedmgr/auth-commands.js`:
- Line 46: The throw on discovery failure only includes discovery.statusText
which is often empty; update the error to include discovery.status and the
discovery request URL along with the existing statusText. Locate the check using
discovery.ok (and the surrounding discovery fetch logic) and change the Error
message thrown when !discovery.ok to include discovery.status and the
variable/constant holding the OIDC discovery URL (the same URL used when
fetching the discovery object) so the message contains status, statusText, and
the endpoint URL for diagnostics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7855627f-4a4d-4748-909d-f165c3629bd1

📥 Commits

Reviewing files that changed from the base of the PR and between c9baf04 and 443efed.

📒 Files selected for processing (14)
  • docs/testing.md
  • scripts/tests/e2e/cli-federation-workflow.test.js
  • scripts/tests/fixtures/setup.js
  • scripts/tests/integration/federation-endpoints.test.js
  • scripts/tests/integration/mcp-federation.test.js
  • scripts/tests/unit/cli/auth-commands.test.js
  • scripts/tests/unit/cli/npx-execution.test.js
  • scripts/tests/unit/server/mcp-server.test.js
  • scripts/tests/unit/server/token-exchange-service.test.js
  • scripts/tests/unit/trust-chain-verification.test.js
  • src/fedmgr/auth-commands.js
  • src/fedmgr/commands/inspect.js
  • src/fedmgr/fedmgr.js
  • src/fedmgr/server/mcp-interface.js

Comment on lines +260 to +267
const result = await executeCommand('distribute', [TEST_CONFIG.federationName], {
cwd: TEST_CONFIG.workspaceDir,
env: TEST_CONFIG.env
});

expect(result.success).toBe(true);
expect(result.output).toContain('distributed') || expect(result.output).toContain('Distribution');
// Command may report success or a connection error if no server is running — either is acceptable
expect(typeof result.output).toBe('string');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Distribution test still hard-fails despite “connection error is acceptable” note.

executeCommand rejects on non-zero exit, and this test still requires success at Line 265. In a no-server environment, this remains flaky.

Suggested fix
-const result = await executeCommand('distribute', [TEST_CONFIG.federationName], {
-  cwd: TEST_CONFIG.workspaceDir,
-  env: TEST_CONFIG.env
-});
-
-expect(result.success).toBe(true);
-// Command may report success or a connection error if no server is running — either is acceptable
-expect(typeof result.output).toBe('string');
+try {
+  const result = await executeCommand('distribute', [TEST_CONFIG.federationName], {
+    cwd: TEST_CONFIG.workspaceDir,
+    env: TEST_CONFIG.env
+  });
+  expect(result.success).toBe(true);
+  expect(typeof result.output).toBe('string');
+} catch (error) {
+  expect(error.message).toMatch(/connection|ECONNREFUSED|ENOTFOUND|timeout/i);
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const result = await executeCommand('distribute', [TEST_CONFIG.federationName], {
cwd: TEST_CONFIG.workspaceDir,
env: TEST_CONFIG.env
});
expect(result.success).toBe(true);
expect(result.output).toContain('distributed') || expect(result.output).toContain('Distribution');
// Command may report success or a connection error if no server is running — either is acceptable
expect(typeof result.output).toBe('string');
try {
const result = await executeCommand('distribute', [TEST_CONFIG.federationName], {
cwd: TEST_CONFIG.workspaceDir,
env: TEST_CONFIG.env
});
expect(result.success).toBe(true);
expect(typeof result.output).toBe('string');
} catch (error) {
expect(error.message).toMatch(/connection|ECONNREFUSED|ENOTFOUND|timeout/i);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/e2e/cli-federation-workflow.test.js` around lines 260 - 267,
The test currently asserts executeCommand(...).success must be true which makes
it hard-fail when the CLI exits non-zero (no server); change the test to
tolerate a non-zero exit by invoking executeCommand inside a try/catch (or by
awaiting and capturing the thrown error) and then assert that either
result.success is true OR the captured result/output/error message contains an
expected connection failure string (e.g. "connection refused" / "connection
error" / similar), keeping the existing check that typeof output is 'string';
update the assertions around executeCommand to accept both success and a
connection-error outcome.

Comment on lines +16 to +17
// FEDMGR_FED_REG_FILE is the full path to the registry JSON file
process.env.FEDMGR_FED_REG_FILE = process.env.FEDMGR_FED_REG_FILE || path.join(process.env.FEDMGR_FED_REG, 'registry.json');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use FEDMGR_FED_REG_FILE consistently for registry bootstrap.

You define the full registry file path at Line 17, but creation logic still checks/writes using FEDMGR_FED_REG. If FEDMGR_FED_REG_FILE is overridden, setup can initialize a different file than the CLI reads.

Suggested fix
-const fedRegDir = path.dirname(process.env.FEDMGR_FED_REG);
+const fedRegDir = path.dirname(process.env.FEDMGR_FED_REG_FILE);

 // Create an empty registry file if it doesn't exist
-if (!fs.existsSync(process.env.FEDMGR_FED_REG)) {
-  const registryPath = path.join(process.env.FEDMGR_FED_REG, 'registry.json');
+if (!fs.existsSync(process.env.FEDMGR_FED_REG_FILE)) {
+  const registryPath = process.env.FEDMGR_FED_REG_FILE;
   fs.mkdirSync(path.dirname(registryPath), { recursive: true });
   fs.writeFileSync(registryPath, JSON.stringify({ federations: [] }));
   console.log(`Created empty registry file: ${registryPath}`);
 }

Also applies to: 31-36

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/fixtures/setup.js` around lines 16 - 17, The setup uses
process.env.FEDMGR_FED_REG_FILE to define the full registry JSON path but the
bootstrap/create logic still reads and writes process.env.FEDMGR_FED_REG; update
the initialization and all subsequent checks/writes (including the block around
the creation logic at lines 31-36) to consistently use
process.env.FEDMGR_FED_REG_FILE (instead of FEDMGR_FED_REG) when checking for
existence, creating directories/files, and writing the registry.json so the CLI
and setup target the same file path.

Comment on lines +53 to +58
// Mock federation-admin to avoid module-level side effects (key generation, server startup)
jest.mock('../../../src/fedmgr/server/federation-admin', () => ({
fetchEntityStatements: jest.fn(),
validateTrustChain: jest.fn(),
distributeEntityStatements: jest.fn()
}));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This integration test is mocking a non-production API contract.

federation-admin is mocked with fetchEntityStatements, validateTrustChain, and distributeEntityStatements, but src/fedmgr/server/federation-admin.js exposes an Express app surface instead. This makes the test pass against a synthetic contract rather than real integration behavior.

Please retarget this suite to the real integration entrypoint (or only mock external I/O boundaries) so failures reflect runtime behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/integration/mcp-federation.test.js` around lines 53 - 58, The
test currently replaces the real integration entrypoint by mocking
'../../../src/fedmgr/server/federation-admin' with fake functions
(fetchEntityStatements, validateTrustChain, distributeEntityStatements), which
hides the actual Express app surface; update the test so it no longer mocks the
federation-admin module itself but instead either imports and exercises the real
server entrypoint or only mocks external I/O boundaries (e.g., HTTP clients,
filesystem, network calls) that the real federation-admin Express app uses;
locate the jest.mock call in mcp-federation.test.js and remove or restrict it,
then adjust setup to start the real Express app exported by
src/fedmgr/server/federation-admin.js (or wire in the app's middleware/route
handlers) and mock only the external dependencies the app calls into so the test
validates runtime behavior.

Comment on lines +101 to +110
federationAdmin.distributeEntityStatements.mockImplementation(async (statement, members) => {
const results = await Promise.all(members.map(member =>
fetch(member + '/entity-statements', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(statement)
})
));
return { success: true, distributed: results.filter(r => r.ok).length };
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Mock return shape diverges from runtime distribution contract.

The mocked distributeEntityStatements response ({ success, distributed }) does not match the runtime-oriented shape used by federation distribution logic (distributionResults per member). This can mask real failures in consumers that inspect per-member status.

Proposed contract-aligned mock response
-    federationAdmin.distributeEntityStatements.mockImplementation(async (statement, members) => {
-      const results = await Promise.all(members.map(member =>
-        fetch(member + '/entity-statements', {
-          method: 'POST',
-          headers: { 'Content-Type': 'application/json' },
-          body: JSON.stringify(statement)
-        })
-      ));
-      return { success: true, distributed: results.filter(r => r.ok).length };
-    });
+    federationAdmin.distributeEntityStatements.mockImplementation(async (statement, members) => {
+      const results = await Promise.all(
+        members.map(member =>
+          fetch(member + '/entity-statements', {
+            method: 'POST',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify(statement)
+          })
+        )
+      );
+
+      const distributionResults = results.map((r, i) => ({
+        member: members[i],
+        success: r.ok,
+        status: r.status
+      }));
+
+      return {
+        success: distributionResults.every(r => r.success),
+        distributionResults
+      };
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/integration/mcp-federation.test.js` around lines 101 - 110, The
mock for federationAdmin.distributeEntityStatements currently returns a simple
summary ({ success, distributed }) which differs from the real distribution
contract that returns per-member distributionResults; update the
mockImplementation for federationAdmin.distributeEntityStatements to return a
per-member results structure matching the runtime shape (an array/object keyed
by member with status/ok info for each request) so consumers can inspect each
member's result; locate the mock in
scripts/tests/integration/mcp-federation.test.js and replace the summary return
with a distributionResults-style payload (include each member and its
response.ok/status) to mirror the production contract.


describe('GitHub OAuth Login', () => {
test('githubOAuthLogin opens browser and handles callback', async () => {
test.skip('githubOAuthLogin opens browser and handles callback', async () => {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Re-enable GitHub OAuth flow coverage before merge.

Line 204 skips the only end-to-end unit path for githubOAuthLogin, which leaves callback/state/token-exchange behavior unguarded. Please convert this to a stable deterministic test (or track an explicit follow-up with criteria and timeline) before release.

I can help rewrite this test with controlled callback invocation and timers so it can run reliably—want me to draft that patch?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/tests/unit/cli/auth-commands.test.js` at line 204, Unskip the skipped
test for 'githubOAuthLogin opens browser and handles callback' and convert it to
a deterministic unit test by mocking external side effects: stub the function
that opens the browser (e.g., open/browserOpen), mock the local callback HTTP
server or request handler that invokes the callback route, and mock the
token-exchange network call used by githubOAuthLogin (e.g., exchangeCodeForToken
or fetchAccessToken) to return a controlled token; use Jest fake timers
(jest.useFakeTimers()/advanceTimersByTime) if the flow relies on timeouts, and
assert that githubOAuthLogin calls the expected helpers and returns the mocked
token/state handling, ensuring the test reliably simulates the callback and
token exchange without real network/browser interaction.

Comment on lines 49 to 51
const tokenRes = await fetch(cfg.token_endpoint, { method:'POST', headers:{
'Content-Type':'application/x-www-form-urlencoded','Accept':'application/json',
'Authorization':`Basic ${Buffer.from(`${clientId}:${clientSecret}`).toString('base64')}`
'Content-Type':'application/x-www-form-urlencoded','Accept':'application/json'
}, body:new URLSearchParams({grant_type:'password',username,password,scope:'openid profile'}).toString() });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find expected token endpoint auth methods in repo tests/config
rg -n -C3 --type=js 'token_endpoint_auth_method|client_secret_basic|client_secret_post' .

# 2) Find password-grant token request implementations and whether they send Basic auth/client_id
rg -n -C4 --type=js "grant_type:'password'|grant_type=\"password\"|Authorization.*Basic|client_id" .

# Expected: if confidential client auth is expected in scenarios, localOidcLogin should include an auth path.

Repository: teamktown/fedmgr

Length of output: 14938


🏁 Script executed:

# Find localOidcLogin function signature and full implementation
rg -n -B5 -A20 'function localOidcLogin|export.*localOidcLogin|const localOidcLogin' --type=js src/fedmgr/auth-commands.js

Repository: teamktown/fedmgr

Length of output: 1825


🏁 Script executed:

# Check if localOidcLogin is used in the federation test
rg -n 'localOidcLogin' scripts/tests/integration/federation-endpoints.test.js

Repository: teamktown/fedmgr

Length of output: 42


🏁 Script executed:

# Check all references to localOidcLogin in the codebase
rg -n 'localOidcLogin' --type=js

Repository: teamktown/fedmgr

Length of output: 1047


🏁 Script executed:

# Read the unit tests for localOidcLogin to understand what they mock
sed -n '117,200p' scripts/tests/unit/cli/auth-commands.test.js

Repository: teamktown/fedmgr

Length of output: 2741


Reintroduce client authentication/identification for password grant requests.

Lines 49–51 drop both Authorization: Basic … and client_id usage, even though clientId and clientSecret are accepted as function parameters. This will fail against OPs requiring confidential-client authentication (see scripts/tests/integration/federation-endpoints.test.js:118, token_endpoint_auth_method: 'client_secret_basic'). The unit tests mock a permissive endpoint and do not validate client credential transmission.

Additionally, line 53 unconditionally calls tokenRes.json() on error, which can fail if the endpoint returns non-JSON error responses (e.g., HTML or plain text).

🔧 Proposed fix
-  const tokenRes = await fetch(cfg.token_endpoint, { method:'POST', headers:{
-    'Content-Type':'application/x-www-form-urlencoded','Accept':'application/json'
-  }, body:new URLSearchParams({grant_type:'password',username,password,scope:'openid profile'}).toString() });
+  const authMethods = Array.isArray(cfg.token_endpoint_auth_methods_supported)
+    ? cfg.token_endpoint_auth_methods_supported
+    : [];
+  const headers = {
+    'Content-Type':'application/x-www-form-urlencoded',
+    'Accept':'application/json'
+  };
+  const body = new URLSearchParams({
+    grant_type:'password',
+    username,
+    password,
+    scope:'openid profile'
+  });
+
+  if (authMethods.includes('client_secret_basic') && clientId && clientSecret) {
+    headers['Authorization'] = `Basic ${Buffer.from(`${clientId}:${clientSecret}`).toString('base64')}`;
+  } else if (clientId) {
+    body.set('client_id', clientId);
+    if (authMethods.includes('client_secret_post') && clientSecret) {
+      body.set('client_secret', clientSecret);
+    }
+  }
+
+  const tokenRes = await fetch(cfg.token_endpoint, {
+    method:'POST',
+    headers,
+    body: body.toString()
+  });

Also improve error handling at line 53:

  if (!tokenRes.ok) {
-   const errData = await tokenRes.json();
+   let errData;
+   try {
+     errData = await tokenRes.json();
+   } catch {
+     throw new Error(`Authentication failed: HTTP ${tokenRes.status}`);
+   }
    throw new Error(`Authentication failed: ${errData.error_description || errData.error}`);
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const tokenRes = await fetch(cfg.token_endpoint, { method:'POST', headers:{
'Content-Type':'application/x-www-form-urlencoded','Accept':'application/json',
'Authorization':`Basic ${Buffer.from(`${clientId}:${clientSecret}`).toString('base64')}`
'Content-Type':'application/x-www-form-urlencoded','Accept':'application/json'
}, body:new URLSearchParams({grant_type:'password',username,password,scope:'openid profile'}).toString() });
if (!tokenRes.ok) {
let errData;
try {
errData = await tokenRes.json();
} catch {
throw new Error(`Authentication failed: HTTP ${tokenRes.status}`);
}
throw new Error(`Authentication failed: ${errData.error_description || errData.error}`);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/auth-commands.js` around lines 49 - 51, The password-grant token
request drops client authentication and assumes JSON errors; update the token
request in the function that constructs tokenRes (using cfg.token_endpoint,
username, password, clientId, clientSecret) to include client credentials: if
clientSecret is provided send an Authorization: Basic header
(base64(clientId:clientSecret)), otherwise include client_id in the
URLSearchParams body; also change error handling after the fetch so you only
call tokenRes.json() when the response Content-Type is JSON and otherwise call
tokenRes.text() (or include the raw text) before throwing so non-JSON error
responses are handled safely.

Comment on lines +53 to +54
const errData = await tokenRes.json();
throw new Error(`Authentication failed: ${errData.error_description || errData.error}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle non-JSON token endpoint error bodies safely.

Lines 53-54 assume JSON on all failures. If the OP returns text/HTML, this throws a parse error and hides the real auth failure.

🔧 Proposed fix
   if (!tokenRes.ok) {
-    const errData = await tokenRes.json();
-    throw new Error(`Authentication failed: ${errData.error_description || errData.error}`);
+    const contentType = tokenRes.headers.get('content-type') || '';
+    let detail = `HTTP ${tokenRes.status}`;
+    if (contentType.includes('application/json')) {
+      const errData = await tokenRes.json().catch(() => ({}));
+      detail = errData.error_description || errData.error || detail;
+    } else {
+      const text = await tokenRes.text().catch(() => '');
+      if (text) detail = text;
+    }
+    throw new Error(`Authentication failed (${tokenRes.status}): ${detail}`);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/auth-commands.js` around lines 53 - 54, The code assumes the token
endpoint always returns JSON (using await tokenRes.json()) which throws on
text/HTML error responses; update the error handling around tokenRes to attempt
JSON parsing in a try/catch and, if parsing fails, fall back to await
tokenRes.text() (or use tokenRes.statusText) so you can throw a meaningful Error
containing either errData.error_description/errData.error when JSON, or the raw
text/status when not; change the throw site that currently uses errData to use
this safe-parsed fallback to avoid hiding the real auth failure.

Comment on lines +16 to +17
if (typeOrName === 'fed' && name) {
const fedPath = path.join(ctx.FEDMGR_FED_DIR, name);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add an explicit error for inspect fed|mcp without <name>.

inspect fed / inspect mcp currently falls through to legacy MCP-inspector mode and incorrectly treats fed/mcp as an MCP name.

Suggested fix
     .action((typeOrName, name, opts) => {
       const registry = ctx.loadRegistry();
 
+      if ((typeOrName === 'fed' || typeOrName === 'mcp') && !name) {
+        console.error(`❌ Missing required <name>. Usage: inspect ${typeOrName} <name>`);
+        process.exit(1);
+      }
+
       // Handle 'inspect fed <fedname>' - show federation entity configuration
       if (typeOrName === 'fed' && name) {

Also applies to: 40-41, 64-66

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/commands/inspect.js` around lines 16 - 17, The inspect command
currently lets "inspect fed" or "inspect mcp" fall through as if the literal
string were an MCP name; add an explicit validation that when typeOrName ===
'fed' or typeOrName === 'mcp' a non-empty name is required and throw/exit with a
clear error if name is missing. Concretely, in src/fedmgr/commands/inspect.js
update the conditional blocks that check (typeOrName === 'fed' && name) and the
analogous (typeOrName === 'mcp' && name) to first test if typeOrName is 'fed' or
'mcp' and name is falsy, then emit an error like "must specify a name for
'inspect fed'/'inspect mcp'" and return/exit; otherwise continue with the
existing logic that constructs fedPath/mcpPath and the legacy inspector
fallback. Ensure you update all three similar sites in the file where these
checks appear (the fed branch, the mcp branch, and the legacy-mcp handling).

Comment on lines +28 to +30
if (fs.existsSync(configFile)) {
const config = JSON.parse(fs.readFileSync(configFile, 'utf-8'));
console.log(` Entity ID: ${config.sub}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle malformed entity-configuration.json safely.

Both branches parse JSON without try/catch; a malformed file will crash the command with an uncaught exception.

Suggested fix
         if (fs.existsSync(configFile)) {
-          const config = JSON.parse(fs.readFileSync(configFile, 'utf-8'));
+          let config;
+          try {
+            config = JSON.parse(fs.readFileSync(configFile, 'utf-8'));
+          } catch (err) {
+            console.error(`❌ Invalid JSON in ${configFile}: ${err.message}`);
+            process.exit(1);
+          }
           console.log(`   Entity ID: ${config.sub}`);
           if (config.metadata && config.metadata.federation_entity) {
           if (fs.existsSync(configFile)) {
-            const config = JSON.parse(fs.readFileSync(configFile, 'utf-8'));
+            let config;
+            try {
+              config = JSON.parse(fs.readFileSync(configFile, 'utf-8'));
+            } catch (err) {
+              console.error(`❌ Invalid JSON in ${configFile}: ${err.message}`);
+              process.exit(1);
+            }
             console.log(`   Entity ID: ${config.sub}`);
           }

Also applies to: 56-58

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/commands/inspect.js` around lines 28 - 30, Wrap the JSON.parse
calls that read configFile in try/catch to handle malformed
entity-configuration.json safely: locate the two places where
fs.readFileSync(configFile, 'utf-8') is passed into JSON.parse (the block that
sets const config and logs `Entity ID: ${config.sub}` and the similar block at
the later occurrence) and catch JSON parse errors, logging a clear error message
(including the file name and the caught error) and either skip printing the
config values or return gracefully instead of letting the process crash.

Comment on lines +27 to +32
const defaultInstancesDir = process.env.FEDMGR_HOME
? path.join(process.env.FEDMGR_HOME, 'mcp_instances')
: path.resolve(__dirname, '../../mcp_instances');
const defaultProtocolDir = process.env.FEDMGR_HOME
? path.join(process.env.FEDMGR_HOME, 'mcp_protocol_servers')
: path.resolve(__dirname, '../../mcp_protocol_servers');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix fallback directory depth for default MCP paths.

When FEDMGR_HOME is unset, Line 29 and Line 32 resolve to src/mcp_instances and src/mcp_protocol_servers (from src/fedmgr/server), not workspace-level directories. This can split runtime state depending on caller path configuration.

Suggested fix
-    const defaultInstancesDir = process.env.FEDMGR_HOME
-      ? path.join(process.env.FEDMGR_HOME, 'mcp_instances')
-      : path.resolve(__dirname, '../../mcp_instances');
-    const defaultProtocolDir = process.env.FEDMGR_HOME
-      ? path.join(process.env.FEDMGR_HOME, 'mcp_protocol_servers')
-      : path.resolve(__dirname, '../../mcp_protocol_servers');
+    const defaultInstancesDir = process.env.FEDMGR_HOME
+      ? path.join(process.env.FEDMGR_HOME, 'mcp_instances')
+      : path.resolve(__dirname, '../../../mcp_instances');
+    const defaultProtocolDir = process.env.FEDMGR_HOME
+      ? path.join(process.env.FEDMGR_HOME, 'mcp_protocol_servers')
+      : path.resolve(__dirname, '../../../mcp_protocol_servers');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fedmgr/server/mcp-interface.js` around lines 27 - 32, The fallback paths
for defaultInstancesDir and defaultProtocolDir resolve to src/mcp_* because
path.resolve(__dirname, '../../...') only climbs two levels; when FEDMGR_HOME is
unset change the fallback to climb three levels so the workspace root
mcp_instances and mcp_protocol_servers are used instead: update the path.resolve
calls in defaultInstancesDir and defaultProtocolDir from '../../mcp_instances'
and '../../mcp_protocol_servers' to '../../../mcp_instances' and
'../../../mcp_protocol_servers' respectively (referencing defaultInstancesDir,
defaultProtocolDir and FEDMGR_HOME).

teamktown and others added 9 commits March 16, 2026 13:07
…trust mark status

Implements RALPH loop Increment E across kms and ta-server packages.

- chain-verifier: verifyTrustChain() walks authority_hints (not iss) from leaf
  to trust anchor per OIDF draft-43 §9; injectable fetchFn for unit tests
- trust-mark-status: TrustMarkStatusRegistry + GET /trust-mark-status endpoint
- isIntermediate(): detects non-leaf entities by federation_fetch_endpoint presence
- Pkcs11Provider: lazy pkcs11js loading, DER→raw ECDSA sig, public key export
  via C_GetAttributeValue; 8 integration tests gated behind SOFTHSM2_MODULE
- pkcs11js.d.ts: ambient declaration so kms builds in CI without native addon
- 22 new tests: 14 passing + 8 skipped (HSM integration); 35/35 ta-server green

Co-Authored-By: claude-flow <ruv@ruv.net>
… SoftHSM2 Docker test

GET /federation_list:
- Add entity_type={intermediate,leaf} filter (practical extension)
- Add limit + after pagination query params
- 11 new HTTP tests covering all filter/pagination combinations

FileSubordinateRegistry (src/federation/file-registry.ts):
- Persistent JSON-backed registry extending SubordinateRegistry
- Reads on construction, writes on every register()/remove()
- TA_REGISTRY_PATH env var activates it in ta-server (zero-dep, drop-in)
- 8 new tests: persist, reload, round-trip metadata, independent paths

SoftHSM2 Docker integration test:
- examples/lab/Dockerfile.softhsm-test: Node 20 + SoftHSM2 + pkcs11js native addon
- examples/lab/docker-compose.softhsm-test.yml: single test service, exit-code-from
- scripts/docker-pkcs11-entrypoint.sh: token init, keygen, env wiring, test run
- Runs the 8 currently-skipped Pkcs11Provider tests against a real PKCS#11 library

ta-server: 54/54 tests passing (was 35)

Co-Authored-By: claude-flow <ruv@ruv.net>
…now pass

Four bugs found and fixed by running against real SoftHSM2:

1. CKO_PUBLIC_KEY was 3 (PRIVATE_KEY) — should be 2; caused _getPublicJwk()
   to find no key object and throw "no public key found"

2. C_FindObjectsInit template values were Buffer.from([0,0,0,N]) — pkcs11js
   expects plain numbers/strings; it marshals to CK_ULONG internally

3. C_Initialize and C_Login throw CKR_CRYPTOKI_ALREADY_INITIALIZED (401) and
   CKR_USER_ALREADY_LOGGED_IN (256) respectively when called again per test.
   PKCS#11 tokens share crypto state across sessions in a process — both are
   safe to ignore.

4. CKM_ECDSA returns raw R||S (not DER); comment was wrong. Also CKM_ECDSA
   signs the *hash* (not the message) so signJwt() now pre-computes
   SHA-256(signingInput) before calling C_Sign. derEcdsaToRaw() was removed
   from the signing path. CKM_ECDSA_SHA256 unavailable in SoftHSM2 2.6.

C_Sign requires 3 args in pkcs11js: (session, data, outputBuffer).
Pre-allocated 128-byte output buffer for DER ECDSA P-256 signatures.

kms: 15/15 pass (with SOFTHSM2_MODULE set), 1 skipped (manual meta-test)

Co-Authored-By: claude-flow <ruv@ruv.net>
…, test tooling

- docs/testing.md: complete rewrite — philosophy, categories, directory structure,
  running tests (standard + native SoftHSM2 + HSM), suite status tables,
  conventions (concurrent-safe tmp dirs, module-level keygen, authority_hints),
  gotchas (8 items: CKO_PUBLIC_KEY, Buffer attrs, C_Sign 3-arg, CKM_ECDSA,
  cross-session errors, overlayfs), and FAQ (10 Q&As)

- docs/architecture.md: updated KMS section (full Pkcs11Provider description),
  ta-server section (FileSubordinateRegistry, TrustChainVerifier,
  federation_list query params, trust-mark-status), test suite totals

- docs/decisions.md: Decision 2 updated (SOFTHSM2_MODULE gating implemented);
  Decision 5 added (native SoftHSM2 over Docker for devcontainer HSM testing,
  including SoftHSM2 API constraints discovered)

- docs/ralph-loop/increment-F-*.md: new RALPH doc covering federation_list
  filters/pagination, FileSubordinateRegistry, PKCS#11 provider, all 9 bugs
  fixed, full test suite status, remaining work

- scripts/test-report.sh: dual reporter (spec→stdout + junit→test-results/);
  HSM tests gated by SOFTHSM2_MODULE; summary.log per run

- packages/kms/package.json: test:hsm, test:report, test:hsm:report scripts
- packages/ta-server/package.json: test:report script
- .gitignore: track test-results/ dir; ignore .xml/.tap/.log generated files
- test-results/: directory tracked so CI can write to known path

Co-Authored-By: claude-flow <ruv@ruv.net>
…g, SBOM, daily scan

GitHub Actions workflows:
  ci.yml         — matrix test (Node 18/20/22), JUnit upload, SBOM artifact on main
  release.yml    — semantic-release per package (kms → tmi/ta/cli in parallel);
                   npm provenance via OIDC (NPM_CONFIG_PROVENANCE=true)
  containers.yml — multi-arch Docker build → Trivy scan → cosign keyless sign →
                   CycloneDX SBOM attestation → OIDF trustmark issuance + attachment
                   (trustmark step gated on TMI_PRIV_JWE secret)
  daily-scan.yml — npm audit + Trivy FS scan + CycloneDX SBOM diff + license-check;
                   opens GitHub issues for new CVEs and new supply-chain dependencies

Semantic release per package:
  packages/{kms,tmi-server,ta-server,fedmgr-cli}/.releaserc.json
  Uses semantic-release-monorepo (scopes commit analysis to package directory)
  Tag format: @letsfederate/<pkg>@<version>
  Release order: kms first (upstream dep), then tmi-server/ta-server/fedmgr-cli

npm package hygiene:
  publishConfig.provenance: true   — links every published tarball to the GHA run
  publishConfig.access: public     — required for @letsfederate/* org packages
  files: [dist, CHANGELOG.md]      — only ship compiled output
  repository.directory: packages/* — monorepo deep links in npm registry

Root devDependencies:
  + @semantic-release/{changelog,git,github,npm}
  + semantic-release-monorepo

scripts/sbom-diff.sh:
  Compares two CycloneDX JSON SBOMs; exits 1 on new components (supply chain gate)

docs/signing.md:
  Full "say what we do / do what we say" reference:
  trust chain diagram, npm provenance verification, cosign verification commands,
  SBOM reading, secrets checklist, commit convention → semver table, release flow

Co-Authored-By: claude-flow <ruv@ruv.net>
…st guide

packages/kms/src/trust-validator.ts (new):
  Core TrustValidator module — three trust states (VALID/WARN/INVALID),
  three policies (strict/permissive/audit).
  validateTrustmark(): JWS decode → JWKS fetch → sig verify → expiry check
  validateTrustChain(): walks authority_hints up to named trust anchor,
    verifies each subordinate statement signature along the way
  applyPolicy(): enforces TRUST_POLICY; strict throws TrustPolicyError (EX_CONFIG)
  trustPolicyFromEnv(): reads TRUST_POLICY env; defaults to strict
  All INVALID states include `recommendedAction` with concrete next steps

packages/kms/src/index.ts:
  Exports validateTrustmark, validateTrustChain, applyPolicy, TrustPolicyError,
  trustPolicyFromEnv, TrustResult, TrustState, TrustPolicy types

packages/fedmgr-cli/src/commands/trustmark.ts:
  trustmark check: validates JWS + full OIDF chain to TA; exits 0/1/2
  trustmark adopt: validates source trustmark then issues adoption trustmark
    with adopted_from_iss/adopted_from_id/adopted_from_jws provenance claims
  verify: improved error messages with [TRUST:FAIL] prefix + recommended actions
  All error paths include structured [TRUST:*] prefixes and Recommended: lines

packages/tmi-server/src/index.ts:
  SELF_TRUSTMARK_JWS: validates own trustmark at startup; strict aborts with EX_CONFIG
  TMI_AUTHORITY_HINTS: JSON array propagated into entity statement authority_hints
  TRUST_POLICY env: strict (default) | permissive | audit
  adopted_from_iss/adopted_from_id/adopted_from_jws pass-through in IssueRequestSchema
  Config validation emits [TRUST:FAIL] messages with recommended actions before exit
  startServer() wraps app.listen with async trust validation

packages/ta-server/src/index.ts:
  TRUST_POLICY: governs subordinate JWKS fetch failure behaviour at startup
  strict: exits 1 if any JWKS fetch fails
  permissive/audit: skips failed subordinate with [TRUST:POLICY] notice
  loadSubordinates: [TRUST:VALID]/[TRUST:FAIL] per subordinate with recommended actions
  Config validation uses [TRUST:FAIL] + EX_CONFIG (78) exit code

docs/trust-guide.md (new — 300 lines):
  Step-by-step guide for org operators and solo developers:
  Track A: TA/TMI setup, DNS/TLS, key generation, lab, subordinate registration
  Track B: certifying tools — trustmark issue, chain check, npm provenance,
           container signing, SBOM, trustmark attachment
  Track C: personal CA, trustmark adoption, fail-secure error examples,
           permissive/audit mode, self-trustmark for servers
  Track D: scale operations — renewal automation, revocation, key rotation, multi-TA
  Trust state reference table, policy config table, env var quick reference, FAQ

Co-Authored-By: claude-flow <ruv@ruv.net>
…ashboard, and MCP server

Increment G — fedmgr enrollment & revocation infrastructure:

ta-server:
- SqliteFederationStore (better-sqlite3, WAL mode) replaces in-memory registry
  as source of truth for all subordinate state
- Two-step proof-of-key enrollment: POST /enroll → nonce challenge →
  POST /enroll/:id/complete with signed JWS → returns signed entity statement
- Management API: revoke/restore/decommission subordinates, trustmark revocation
  log, audit trail (optional ADMIN_TOKEN Bearer auth)
- Admin dashboard: dark-theme HTML UI at GET /dashboard with stat cards,
  subordinate table (Revoke/Restore/Decommission), enrollment modal, audit log
- federation_list and federation_fetch now DB-backed: only 'active' subordinates
  are served; revoked/decommissioned return 403/404 with [TRUST:WARN] messages
- trust-mark-status endpoint wired to DB revocations table via StatusChecker
  interface (no longer in-memory only)
- StatusChecker interface replaces hard TrustMarkStatusRegistry dependency
- Add better-sqlite3 and jose as direct dependencies

fedmgr-mcp:
- New @letsfederate/fedmgr-mcp package: stdio MCP server with 10 tools
  (federation_status, list_subordinates, enroll_server, complete_enrollment,
  issue_trustmark, verify_trustmark, check_trust_chain, revoke_subordinate,
  get_signed_config, initialize_local_ca)
- federation://status resource for live TA+TMI health polling
- Add npm run mcp:dev / mcp:start scripts to root

Co-Authored-By: claude-flow <ruv@ruv.net>
- data/ and test-data/ are generated by tests and CLI runs
- *.db / *.db-wal / *.db-shm are SQLite runtime files from ta-server

Co-Authored-By: claude-flow <ruv@ruv.net>
…on, XSS, auth

Closes 13 security issues identified in code review:

CRITICAL fixes:
- enrollment: add assertSafeUrl() validation on entity_id and jwks_url at
  POST /enroll to block SSRF via attacker-controlled JWKS URLs (RFC 1918,
  loopback, link-local, non-HTTP schemes all rejected; localhost allowed in
  NODE_ENV=development only)
- enrollment: pin jwtVerify() to algorithms:["ES256"] to prevent algorithm
  confusion / authentication bypass attacks
- enrollment: cap JWKS response body to 64 KB (content-length check + text
  length check) to prevent memory exhaustion via oversized response
- mcp: add assertSafeUrl() validation to all 8 MCP tool functions that accept
  user-supplied URLs (ta_url, tmi_url, entity_url, subject_url, etc.)

HIGH fixes:
- dashboard: apply dashboardAuth middleware to GET /dashboard and all
  GET /dashboard/api/* routes — same ADMIN_TOKEN Bearer semantics as the
  management router; unauthenticated reads of audit log and subordinate list
  are now blocked when ADMIN_TOKEN is set
- dashboard: HTML-escape entityId in buildFallbackHtml() to prevent XSS
  if entityId contains special characters
- dashboard: add Content-Security-Policy header to dashboard HTML responses

MEDIUM fixes:
- management: validate status query parameter against allowed enum values
  (pending|active|revoked|decommissioned); return 400 for unknown values
- management: cap DELETE /subordinates/:entityId reason query param to 500 chars
- enrollment: remove nonce from prose instructions string to reduce log exposure
  (nonce is still returned as a dedicated field)

LOW fixes:
- ta-server: configure app.set('trust proxy', TRUST_PROXY) so req.ip reflects
  the real client IP in audit log entries when behind a reverse proxy
- mcp: rename decodeJwtParts → unsafeDecodeJwtParts with warning comment

Dependency fixes (npm audit):
- npm audit fix: resolved 17 of 19 vulnerabilities automatically
  (jws 3.2.3 installed, fixing jsonwebtoken HMAC verification issue)
- Add overrides.serialize-javascript: "^7.0.4" to fix mocha RCE via
  serialize-javascript <=7.0.2 without downgrading mocha
- Result: 0 vulnerabilities remaining

All 79 tests pass (54 ta-server, 10 tmi-server, 7 kms, 8 fedmgr-cli).

Co-Authored-By: claude-flow <ruv@ruv.net>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

♻️ Duplicate comments (10)
packages/tmi-server/src/index.ts (2)

164-210: ⚠️ Potential issue | 🔴 Critical

Protect /trustmarks/issue with authentication/authorization.

This endpoint currently allows any caller to mint first-party signed trustmarks, which is a trust-model break if exposed beyond a private lab network. Consider adding a bearer token check.

🛡️ Minimal gate using env token
+const ISSUE_TOKEN = process.env["TMI_ISSUE_TOKEN"];

 app.post(
   "/trustmarks/issue",
+  (req: Request, res: Response, next: NextFunction) => {
+    if (ISSUE_TOKEN) {
+      const auth = req.header("authorization");
+      if (auth !== `Bearer ${ISSUE_TOKEN}`) {
+        res.status(401).json({ error: "unauthorized" });
+        return;
+      }
+    }
+    next();
+  },
   async (req: Request, res: Response): Promise<void> => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/index.ts` around lines 164 - 210, Add authentication
to the POST /trustmarks/issue handler by validating a bearer token before
parsing the request: read a configured secret (e.g.,
process.env.TRUSTMARK_ISSUE_TOKEN), extract and verify the Authorization header
inside the app.post("/trustmarks/issue", ...) route handler, and return 401/403
on missing/invalid token; only proceed to IssueRequestSchema parsing and calling
kms.signJwt when the token is valid (keep existing variables ISSUER, JWKS_URL,
and call to kms.signJwt unchanged).

119-126: ⚠️ Potential issue | 🟠 Major

Wrap async route handlers to catch rejections for Express 4.x error handling.

Express 4.x does not automatically forward rejected promises from async handlers to error middleware. All async routes (/.well-known/jwks.json, /trustmarks/issue, /.well-known/openid-federation) are vulnerable to unhandled rejections that will crash the server or leave requests hanging.

🔧 Proposed fix with async wrapper
-import express, { type Request, type Response } from "express";
+import express, { type NextFunction, type Request, type Response } from "express";
+
+const asyncRoute =
+  (fn: (req: Request, res: Response) => Promise<void>) =>
+  (req: Request, res: Response, next: NextFunction): void => {
+    void fn(req, res).catch(next);
+  };

 // ...

-app.get(
-  "/.well-known/jwks.json",
-  async (_req: Request, res: Response): Promise<void> => {
+app.get(
+  "/.well-known/jwks.json",
+  asyncRoute(async (_req, res) => {
     const jwks = await kms.jwks();
     res.setHeader("Cache-Control", "public, max-age=3600");
     res.json(jwks);
-  }
-);
+  })
+);

+// Add error middleware at the end
+app.use((err: unknown, _req: Request, res: Response, _next: NextFunction) => {
+  process.stderr.write(`[tmi-server] request failed: ${String(err)}\n`);
+  res.status(500).json({ error: "server_error" });
+});

Also applies to: 164-210, 219-235

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/index.ts` around lines 119 - 126, The async route
handlers (e.g., the GET handler for "/.well-known/jwks.json" and the other async
handlers for "/trustmarks/issue" and "/.well-known/openid-federation") are not
wrapped to forward rejected promises to Express 4.x error middleware; add a
reusable wrapper like asyncHandler(fn) that returns (req, res, next) =>
Promise.resolve(fn(req, res, next)).catch(next) and use it to wrap the async
functions passed to app.get/app.post so any thrown/rejected errors are forwarded
to Express error handlers via next(err).
packages/tmi-server/package.json (1)

15-18: ⚠️ Potential issue | 🟠 Major

Use workspace protocol for internal package dependency.

"@letsfederate/kms": "*" is non-deterministic and may pull registry versions unexpectedly. Use workspace:* for reliable monorepo linking.

🔧 Suggested fix
   "dependencies": {
-    "@letsfederate/kms": "*",
+    "@letsfederate/kms": "workspace:*",
     "express": "^4.19.2",
     "zod": "^3.23.8"
   },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/package.json` around lines 15 - 18, Replace the
non-deterministic dependency entry for "@letsfederate/kms" in package.json so
the monorepo uses the workspace protocol (change the version from "*" to
"workspace:*") to ensure deterministic local linking; update the dependency
string for "@letsfederate/kms" and reinstall your lockfile (or run your package
manager's install) so the workspace resolution is recorded.
packages/kms/src/index.ts (1)

67-77: ⚠️ Potential issue | 🟠 Major

Fix kid derivation to use RFC 7638 thumbprint — exportJWK does not generate a thumbprint.

The current implementation imports and re-exports the JWK, but exportJWK simply returns the key parameters without computing a thumbprint. The fallback to "default" masks this issue and creates unstable identifiers across key rotations.

🔧 Proposed fix using calculateJwkThumbprint

Update the import:

 import {
   importJWK,
   exportJWK,
   SignJWT,
+  calculateJwkThumbprint,
   type JWK,
   type JWTPayload,
   type KeyLike,
 } from "jose";

Update the kid() method:

   async kid(): Promise<string> {
     await this.ensureLoaded();
     if (!this._pub!.kid) {
-      // Derive a stable kid from the JWK thumbprint when the file has none.
-      const k = await importJWK(this._pub as JWK, "ES256");
-      const exported = await exportJWK(k);
-      this._pub!.kid = exported.kid ?? "default";
+      // Derive a stable kid from the JWK thumbprint (RFC 7638).
+      this._pub!.kid = await calculateJwkThumbprint(this._pub as JWK, "sha256");
     }
     this._kid = this._pub!.kid as string;
     return this._kid;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/index.ts` around lines 67 - 77, The kid() method currently
re-exports the JWK and falls back to "default" which doesn't produce an RFC 7638
thumbprint; replace that logic to compute a stable kid using
calculateJwkThumbprint on the public JWK after ensureLoaded() (instead of using
exportJWK), set this._pub.kid to the resulting thumbprint string, update
this._kid from this._pub.kid, and return it; keep existing imports/usage of
importJWK only if still needed for key validation, otherwise directly pass the
JWK to calculateJwkThumbprint to derive the RFC‑7638 thumbprint.
packages/fedmgr-cli/src/commands/trustmark.ts (2)

59-63: ⚠️ Potential issue | 🟠 Major

Validate --ttl before building the request body.

Invalid input passes through as NaN (serialized to null) and out-of-range values bypass the documented bounds (60-86400).

🔧 Proposed fix
+      const ttl = Number.parseInt(opts.ttl, 10);
+      if (!Number.isInteger(ttl) || ttl < 60 || ttl > 86400) {
+        process.stderr.write("[trustmark issue] --ttl must be an integer in [60, 86400]\n");
+        process.exit(1);
+      }
+
       const body: Record<string, unknown> = {
         sub: opts.sub,
         trustmark_id: opts.id,
-        ttl_s: parseInt(opts.ttl, 10),
+        ttl_s: ttl,
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-cli/src/commands/trustmark.ts` around lines 59 - 63, Validate
opts.ttl before constructing the request body: parse opts.ttl into a number
(e.g., ttl = parseInt(opts.ttl, 10)), check that it is a finite integer and
within the documented bounds 60..86400, and if not, exit or throw a
user-friendly error; only then use that validated ttl value for the body.ttl_s
field (replace the current inline parseInt in the body construction that can
produce NaN/null). Ensure you reference opts.ttl, the local ttl variable, and
the body.ttl_s field when making the change.

109-128: ⚠️ Potential issue | 🟠 Major

Restrict JWKS URL handling to avoid fetching from untrusted targets.

Using jku directly from token headers allows arbitrary network fetches. An attacker-controlled token could cause the CLI to fetch from internal endpoints or malicious servers.

🔒 Proposed hardening
       try {
         const header = decodeProtectedHeader(opts.jws);
         jwksUrl = opts.jwks ?? (header["jku"] as string);
         if (!jwksUrl) throw new Error("No jku in JWS header and --jwks not supplied");
+        const parsed = new URL(jwksUrl);
+        const isLocalHttp =
+          parsed.protocol === "http:" &&
+          (parsed.hostname === "localhost" || parsed.hostname === "127.0.0.1");
+        if (!(parsed.protocol === "https:" || isLocalHttp)) {
+          throw new Error("JWKS URL must use https (http allowed only for localhost)");
+        }
       } catch (err) {
         process.stderr.write(`[trustmark verify] Header decode failed: ${String(err)}\n`);
         process.exit(1);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-cli/src/commands/trustmark.ts` around lines 109 - 128, The
current verify action reads the JWS header jku and will fetch it directly (in
the action callback around decodeProtectedHeader and the fetch of jwksUrl),
which allows attacker-controlled network requests; change the logic so that
either require an explicit --jwks flag (do not auto-fetch jku) or validate any
jku before fetching: enforce https scheme, check the origin against a configured
whitelist of allowed hosts/origins, reject jku values that resolve to
private/loopback IPs or localhost, and only perform the fetch after these
checks; update the code paths around decodeProtectedHeader, jwksUrl assignment,
and the fetch/jwks logic to implement these validations and return a clear error
if the jku is disallowed.
packages/ta-server/src/federation/subordinate-statements.ts (1)

44-50: ⚠️ Potential issue | 🔴 Critical

Sanitize the embedded JWKS before signing the statement.

Line 49 republishes config.subjectJwks verbatim. If a caller passes private members (d, RSA private parameters, or symmetric k), the TA will publish secret key material and break the no-naked-keys policy.

🔐 Proposed hardening
+const PRIVATE_JWK_FIELDS = new Set(["d", "p", "q", "dp", "dq", "qi", "oth", "k"]);
+
+function toPublicJwks(jwks: { keys: JWK[] }): { keys: JWK[] } {
+  return {
+    keys: jwks.keys.map((key) =>
+      Object.fromEntries(
+        Object.entries(key).filter(([name]) => !PRIVATE_JWK_FIELDS.has(name))
+      ) as JWK
+    ),
+  };
+}
+
 export async function signSubordinateStatement(
   config: SubordinateConfig,
   kms: KeyProvider
 ): Promise<string> {
@@
   const payload = {
@@
-    jwks: config.subjectJwks,
+    jwks: toPublicJwks(config.subjectJwks),
     ...(config.subjectMetadata ? { metadata: config.subjectMetadata } : {}),
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/subordinate-statements.ts` around lines 44
- 50, The payload currently embeds config.subjectJwks verbatim which can leak
private key material; create and call a sanitizer (e.g., sanitizeJwks or
stripPrivateJwk) before using it in payload so only public JWK fields are
included. The sanitizer should iterate the JWKS.keys array (reference
config.subjectJwks) and remove private members such as "d", "p", "q", "dp",
"dq", "qi", "oth", "k" and any other algorithm-specific private params,
returning a new JWKS object; then replace config.subjectJwks in the payload
construction with the sanitized result. Ensure the codepaths around payload
creation in subordinate-statements.ts use the sanitized JWKS variable.
packages/ta-server/src/index.ts (2)

222-225: ⚠️ Potential issue | 🟠 Major

Wrap the async route handlers.

Same Express 4.x issue as before: rejected promises from kms.jwks(), signEntityStatement(), or signSubordinateStatement() will not hit error middleware automatically. Use a shared async wrapper or .catch(next) on each route.

Also applies to: 232-252, 301-379

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 222 - 225, The async Express
route handlers (e.g., the GET "/.well-known/jwks.json" handler that calls
kms.jwks(), and the handlers that call signEntityStatement() and
signSubordinateStatement()) must be wrapped so rejected promises are forwarded
to Express error middleware; update each async handler across the file
(including the blocks around lines referencing kms.jwks(),
signEntityStatement(), and signSubordinateStatement()) to either use a shared
async wrapper (e.g., wrapAsync(handler)) or append .catch(next) so
thrown/rejected errors call next(err) instead of being unhandled.

131-166: ⚠️ Potential issue | 🔴 Critical

Validate TA_SUBORDINATES and sanitize fetched JWKS before storing it.

This still trusts the env JSON shape and casts the fetched JWKS directly. A malformed bootstrap spec can get deep into startup, and a misconfigured subordinate JWKS endpoint that serves private members (d, p, q, etc.) would then be persisted and republished via /federation_fetch.

Also applies to: 188-191

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 131 - 166, The TA_SUBORDINATES
parsing and JWKS handling trust input and fetched data too loosely; validate the
parsed specs (ensure it's an array of objects with string entityId and jwksUrl,
optional fetchEndpoint) before using them, and sanitize/validate the fetched
JWKS (in the code around the fetch of spec.jwksUrl) to reject or strip any
private RSA/EC params (d, p, q, dp, dq, qi, oth, etc.), ensure each JWK has
expected public parameters (kty, kid, use/alg or required fields for the kty)
and non-empty keys array, and only persist the cleaned keys to the store
(replace the direct cast to { keys: JWK[] } with a validateAndSanitizeJWKS
helper and use its result when calling store methods and any other JWKS-handling
block such as the second fetch branch referenced).
packages/ta-server/src/enrollment/index.ts (1)

71-153: ⚠️ Potential issue | 🟠 Major

Forward async failures with an Express 4 wrapper.

These handlers return promises, but Express 4 does not automatically send rejected promises to next(err). A failure from fetch, importJWK, or signSubordinateStatement can bypass your error middleware and leave the request unresolved.

Suggested pattern
-import { Router, type Request, type Response } from "express";
+import { Router, type NextFunction, type Request, type Response } from "express";
@@
+const asyncRoute =
+  (
+    handler: (
+      req: Request,
+      res: Response,
+      next: NextFunction
+    ) => Promise<void>
+  ) =>
+  (req: Request, res: Response, next: NextFunction): void => {
+    void handler(req, res, next).catch(next);
+  };
@@
-  router.post("/enroll", async (req: Request, res: Response): Promise<void> => {
+  router.post("/enroll", asyncRoute(async (req: Request, res: Response): Promise<void> => {
     // ...
-  });
+  }));
@@
-    async (req: Request, res: Response): Promise<void> => {
+    asyncRoute(async (req: Request, res: Response): Promise<void> => {
       // ...
-    }
+    })
   );

Also applies to: 156-329

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 71 - 153, The async
router.post("/enroll" ...) handler (and other handlers in the same file around
156-329) can throw from operations like fetch, importJWK, or
signSubordinateStatement and Express 4 will not forward the rejection to error
middleware; change the handler signature to accept next (req, res, next) and
wrap the async body in try/catch (or use an async wrapper helper like
express-async-handler) and call next(err) on any caught error; apply the same
change to the matching handlers referenced in this file so all promise
rejections are forwarded to the error middleware.
🟠 Major comments (25)
.github/workflows/release.yml-121-148 (1)

121-148: ⚠️ Potential issue | 🟠 Major

Wire fedmgr-mcp into the release pipeline or keep it private.

This workflow releases kms, tmi-server, ta-server, and fedmgr-cli, but the new fedmgr-mcp package is public too. Without a matching semantic-release job, it can land on main and still never be versioned or published.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 121 - 148, The release pipeline
is missing a job for the new public package fedmgr-mcp so it won't be
versioned/published; either add a new GitHub Actions job (e.g.,
release-fedmgr-mcp) modeled after release-fedmgr-cli that checks out, sets up
Node, runs npm ci, builds the package (npm run build -w
`@letsfederate/fedmgr-mcp`) and runs semantic-release-monorepo with the same env
vars, or make the package private by setting "private": true in the fedmgr-mcp
package.json; update the workflow (the release-fedmgr-cli job block and the new
job name) accordingly to ensure fedmgr-mcp is included or intentionally omitted.
scripts/test-report.sh-17-18 (1)

17-18: ⚠️ Potential issue | 🟠 Major

Preserve the consolidated report when one suite fails.

With set -euo pipefail, the first failing build/test pipeline aborts the script before later packages run and before the final summary is complete. CI then loses the cross-package failure picture this helper is meant to produce.

Also applies to: 38-55, 69-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-report.sh` around lines 17 - 18, The script currently uses `set
-euo pipefail` which aborts on the first failing command and prevents completing
the consolidated report; change the header to `set -uo pipefail` (drop `-e`) and
update the per-suite execution blocks (the loops/commands around lines 38-55 and
69-83) to capture each suite's exit code (e.g., run each test command with `||
true` and store its `$?` into an array or set a `FAILED=true` flag) so all
packages run, results are appended to the consolidated report, and at the end
emit the summary and exit with non-zero if any captured exit code indicates
failure.
.github/workflows/ci.yml-112-120 (1)

112-120: ⚠️ Potential issue | 🟠 Major

Don’t let per-package SBOM generation fail silently.

The loop already omits the new fedmgr-mcp workspace, and the trailing || true hides failures for the packages it does try to scan. That leaves CI green with an incomplete SBOM bundle.

📄 Suggested loop hardening
-          for pkg in kms fedmgr-cli tmi-server ta-server; do
+          failed=0
+          for pkg in kms fedmgr-cli fedmgr-mcp tmi-server ta-server; do
             cyclonedx-npm \
               --package-lock-only \
               --output-format JSON \
               --output-file sbom-output/${pkg}-sbom.json \
-              packages/${pkg}/package.json || true
+              packages/${pkg}/package.json || failed=1
           done
+          exit "$failed"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 112 - 120, The SBOM generation loop
currently skips the new workspace and masks failures by appending "|| true", so
update the loop that iterates over pkg (the for pkg in ...; do ... cyclonedx-npm
... || true done) to include the new workspace name "fedmgr-mcp" in the package
list and remove the "|| true" so that cyclonedx-npm failures fail the job;
alternatively, replace the hardcoded list with a robust discovery (e.g.,
generate the list from workspaces or a manifest) and ensure any cyclonedx-npm
invocation returns a non-zero exit code on error so CI fails visibly.
.github/workflows/daily-scan.yml-145-149 (1)

145-149: ⚠️ Potential issue | 🟠 Major

Fix file naming and artifact run-id for SBOM baseline comparison.

The download-artifact action on line 153 lacks a run-id parameter, so it cannot retrieve the baseline from the previous workflow run. Additionally, the uploaded artifact contains fedmgr-sbom-YYYY-MM-DD.json (line 202), but the diff step expects fedmgr-sbom-baseline.json (line 163), causing a file-not-found error.

Specify run-id to fetch the previous run's artifact and rename the file to fedmgr-sbom-baseline.json during upload, or update the diff step to use the dated filename.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/daily-scan.yml around lines 145 - 149, The workflow fails
to locate the baseline because the download-artifact step lacks a run-id and the
uploaded SBOM uses a dated name; update the download-artifact step to include
the previous run's run-id (e.g., using github.run_id or by querying the previous
workflow run) and make artifact names consistent: either change the upload step
that creates fedmgr-sbom-$(date +%Y-%m-%d).json (produced by cyclonedx-npm) to
upload/rename it as fedmgr-sbom-baseline.json, or change the diff step to
reference the dated filename; ensure the steps referenced (cyclonedx-npm,
download-artifact, upload-artifact) use the same filename
(fedmgr-sbom-baseline.json) and that download-artifact specifies the correct
run-id to fetch the prior run's artifact.
packages/fedmgr-mcp/package.json-16-16 (1)

16-16: ⚠️ Potential issue | 🟠 Major

Replace the wildcard dependency with an explicit semver range.

"@letsfederate/kms": "*" allows every published fedmgr-mcp release to float to whatever KMS version is newest on npm, which can break consumers when @letsfederate/kms ships a major version. Use an explicit semver range (e.g., ^0.0.1) so consumers resolve the version this package was tested and released against.

Note: The repository uses npm workspaces; the workspace: protocol is not supported. Use explicit semver instead.

📦 Suggested manifest change
-    "@letsfederate/kms": "*",
+    "@letsfederate/kms": "^0.0.1",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-mcp/package.json` at line 16, The package.json currently pins
"@letsfederate/kms" to the wildcard "*" which allows uncontrolled major
upgrades; replace that wildcard with an explicit semver range (for example
"^0.0.1" or the specific minor/patch you tested) in the dependencies entry for
"@letsfederate/kms" so consumers resolve the tested version; update the
dependencies object where "@letsfederate/kms" appears and commit the changed
semver string.
packages/kms/package.json-36-39 (1)

36-39: ⚠️ Potential issue | 🟠 Major

Move pkcs11js to optionalDependencies to unblock installations on unsupported platforms.

The codebase already lazy-loads pkcs11js via dynamic import and provides a TypeScript stub (src/types/pkcs11js.d.ts) that allows compilation without the package. SoftKMS operates independently and is documented as the primary CI path (Decision 2). Having pkcs11js in dependencies forces installation failures on platforms where PKCS#11 bindings cannot be compiled (Alpine, Windows, etc.), even for consumers using only SoftKMS. Moving it to optionalDependencies reflects the actual architecture without breaking anything, since the code already throws a helpful error message at runtime if the module is missing.

📦 Suggested manifest change
   "dependencies": {
-    "jose": "^5.2.0",
-    "pkcs11js": "^2.1.6"
+    "jose": "^5.2.0"
+  },
+  "optionalDependencies": {
+    "pkcs11js": "^2.1.6"
   },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/package.json` around lines 36 - 39, Update the package manifest
so pkcs11js is not required at install time: move "pkcs11js" from "dependencies"
into "optionalDependencies" in package.json; keep "jose" in dependencies as-is,
and ensure the existing lazy/dynamic import usage (and the TypeScript stub
src/types/pkcs11js.d.ts) continues to cover compilation for consumers using
SoftKMS so runtime behavior is unchanged if pkcs11js is absent.
scripts/docker-pkcs11-entrypoint.sh-63-74 (1)

63-74: ⚠️ Potential issue | 🟠 Major

Fail if key setup does not leave behind the requested key.

The pkcs11-tool ... || { ... --list-objects ... } fallback treats any non-zero exit as recoverable and continues without verifying that $LABEL actually exists. If key generation fails and the key doesn't already exist, the script prints "Key setup complete" and proceeds to tests with a non-existent key label, causing failures downstream that are harder to debug.

Add a check to confirm the key exists before proceeding:

Suggested fail-fast check
-pkcs11-tool --module "$SOFTHSM2_LIB" \
-  --login --pin "$PIN" \
-  --keypairgen \
-  --key-type EC:prime256v1 \
-  --label "$LABEL" \
-  --id 01 \
-  2>&1 || {
-    # Key may already exist from a previous run — list to confirm
-    echo "[pkcs11-test] Key generation returned non-zero (may already exist)."
-    pkcs11-tool --module "$SOFTHSM2_LIB" --list-objects --login --pin "$PIN" 2>&1
-  }
+if ! pkcs11-tool --module "$SOFTHSM2_LIB" \
+  --login --pin "$PIN" \
+  --keypairgen \
+  --key-type EC:prime256v1 \
+  --label "$LABEL" \
+  --id 01; then
+  echo "[pkcs11-test] Key generation failed; checking for an existing '$LABEL' key..."
+  if ! pkcs11-tool --module "$SOFTHSM2_LIB" --list-objects --login --pin "$PIN" 2>&1 | grep -q "$LABEL"; then
+    echo "[pkcs11-test] ERROR: expected key '$LABEL' was not found." >&2
+    exit 1
+  fi
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/docker-pkcs11-entrypoint.sh` around lines 63 - 74, The key generation
fallback currently treats any non-zero exit as recoverable without verifying the
requested key exists; after running pkcs11-tool --keypairgen (and after the
fallback list in the || block), run pkcs11-tool --module "$SOFTHSM2_LIB"
--list-objects --login --pin "$PIN" and check its output for the label value in
"$LABEL" (e.g., grep -q "$LABEL"); if the label is not present, print a clear
error including "$LABEL" and exit non-zero instead of continuing to “Key setup
complete”, ensuring the script fails fast when the key was not actually created
by the pkcs11-tool keypairgen command.
.github/workflows/daily-scan.yml-110-111 (1)

110-111: ⚠️ Potential issue | 🟠 Major

Pin Trivy to an immutable ref.

aquasecurity/trivy-action@master makes this security workflow non-reproducible and allows upstream branch changes to execute without review. Use a release tag at minimum (e.g., @0.35.0), or preferably a full commit SHA per GitHub's security hardening guidance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/daily-scan.yml around lines 110 - 111, The Trivy step
"Trivy filesystem scan" currently uses an unstable ref (uses:
aquasecurity/trivy-action@master); update that step to pin the action to an
immutable ref by replacing `@master` with a specific release tag (e.g., `@0.35.0`)
or, preferably, a full commit SHA to ensure reproducible, reviewable runs; make
the change in the step that specifies uses: aquasecurity/trivy-action so the
workflow references a fixed tag or SHA.
packages/ta-server/test/trust-mark-status.test.mjs-18-18 (1)

18-18: ⚠️ Potential issue | 🟠 Major

Test the source tree, not the last build artifact.

Importing ../dist/... means this suite fails on a clean checkout and can pass against stale compiled code after src changes. Point the tests at src via tsx, or build as part of the package test command.

Also applies to: 74-74

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/test/trust-mark-status.test.mjs` at line 18, Tests import
compiled artifacts (TrustMarkStatusRegistry, buildStatusKey) from ../dist which
breaks on clean checkouts and can mask stale builds; update the test to import
the source modules instead (point at the src entry using tsx or the
source-relative module path) or ensure the package test script builds before
running tests so the test imports the freshly built code; specifically change
the import of TrustMarkStatusRegistry and buildStatusKey to reference the source
module (via tsx runner or src path) or add a pretest build step in package test
configuration.
test-results/e2e-results.json-1-1 (1)

1-1: ⚠️ Potential issue | 🟠 Major

This checked-in report captures a broken E2E baseline.

The artifact says success:false and shows the federation workflow timing out while waiting for port 9001. Please fix that readiness path before merging, or drop generated test reports from version control so the repository does not carry a known-failing snapshot.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-results/e2e-results.json` at line 1, The checked-in e2e-results.json
records a failing E2E baseline (success:false) due to the waitForServer timeout
for port 9001; either fix the readiness path by updating the waitForServer
implementation in scripts/tests/e2e/federation-workflow.test.js (the
waitForServer function) to use a reliable readiness probe (e.g., poll HTTP
health endpoint, extend timeout/retry, and confirm the test actually starts the
server bound to port 9001) or stop committing generated test artifacts by
removing test-results/e2e-results.json from the repo and adding that path to
.gitignore so known-failing snapshots aren’t stored in source control.
packages/ta-server/src/federation/trust-mark-status.ts-27-28 (1)

27-28: ⚠️ Potential issue | 🟠 Major

Use a collision-free registry key.

Concatenating with :: can alias different (sub, id) pairs when either input contains the delimiter, which makes revoke/restore target the wrong status entry. Encode the tuple structurally instead.

Minimal change
 export function buildStatusKey(sub: string, id: string): string {
-  return `${sub}::${id}`;
+  return JSON.stringify([sub, id]);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/trust-mark-status.ts` around lines 27 - 28,
The key builder buildStatusKey currently concatenates sub and id with "::",
which can collide if either contains that delimiter; change buildStatusKey to
produce a collision-free structural encoding (for example, serialize the pair as
JSON.stringify([sub,id]) or join with a length-prefix or null (\0) separator) so
every (sub,id) maps to a unique key; update any code that reads/parses the key
(if any) to use the same decoder or avoid parsing by treating the produced
string as an opaque key.
packages/ta-server/src/db/migrations/001-init.sql-2-6 (1)

2-6: ⚠️ Potential issue | 🟠 Major

This migration advertises PostgreSQL support that does not exist and would fail if attempted.

Line 2 claims "Supports SQLite (local dev) and PostgreSQL (production) with compatible syntax," but the PRAGMA statements on lines 5–6 are SQLite-only syntax that PostgreSQL cannot parse. Additionally, store.ts indicates PostgreSQL support is planned for "future Increment G"—it is not yet implemented. Either remove the PostgreSQL claim from line 2 or defer it until an actual PostgreSQL adapter exists.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/db/migrations/001-init.sql` around lines 2 - 6, The
migration file incorrectly claims PostgreSQL support while containing
SQLite-only PRAGMA statements; update the comment on the top of
packages/ta-server/src/db/migrations/001-init.sql to remove or reword the
PostgreSQL claim so it only advertises SQLite (local dev) support, and/or add a
clear TODO noting PostgreSQL support is planned in store.ts (see
FederationStore.migrate and the store.ts "future Increment G" note) so readers
aren’t misled; ensure the file no longer asserts compatibility with PostgreSQL
unless a proper PostgreSQL adapter and syntax are implemented.
package.json-27-27 (1)

27-27: ⚠️ Potential issue | 🟠 Major

npm run tmi:decrypt won't set environment variables in the caller's shell.

npm executes scripts in a subshell, so source cannot export variables back to the parent. The script's own documentation indicates this is meant for interactive use (source scripts/tmi-decrypt.sh); for automation, use npm run tmi:start instead, which correctly invokes the script with the --start flag. Either remove this npm script or document it as interactive-only and point users to tmi:start for automation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 27, The "tmi:decrypt" npm script uses "source
scripts/tmi-decrypt.sh" which cannot export environment variables to the caller
shell because npm runs scripts in a subshell; remove the "tmi:decrypt" entry
from package.json or replace it with a comment/documentation-only note, and
update documentation/README to mark "tmi:decrypt" as interactive-only and direct
automated workflows to use the existing "tmi:start" script (which invokes the
script with --start). Locate the "tmi:decrypt" key in package.json and either
delete that key or add a clear README note referencing "tmi:start" and
"scripts/tmi-decrypt.sh" so users know to source the script manually in an
interactive shell if needed.
packages/ta-server/src/dashboard/ui.html-356-367 (1)

356-367: ⚠️ Potential issue | 🟠 Major

Potential XSS via entity_id in inline event handlers.

The esc() function on line 490 only escapes single quotes, but entity_id values are injected into onclick attributes. A malicious entity ID containing \' or other characters could break out of the string context and execute arbitrary JavaScript.

// Current esc() only handles single quotes:
function esc(s) { return s.replace(/'/g, "\\'"); }

// Attack vector: entity_id = "x'); alert(1); ('"
// Produces: onclick="revokeEntity('x\'); alert(1); (\'')"
🔒 Proposed fix — use data attributes instead of inline handlers
-          ${r.status === 'active'  ? `<button class="btn btn-danger" onclick="revokeEntity('${esc(r.entity_id)}')">Revoke</button>` : ''}
-          ${r.status === 'revoked' ? `<button class="btn btn-success" onclick="restoreEntity('${esc(r.entity_id)}')">Restore</button>` : ''}
-          ${['active','revoked'].includes(r.status) ? `<button class="btn btn-ghost" style="margin-left:4px" onclick="decommissionEntity('${esc(r.entity_id)}')">Decommission</button>` : ''}
+          ${r.status === 'active'  ? `<button class="btn btn-danger" data-action="revoke" data-entity="${encodeURIComponent(r.entity_id)}">Revoke</button>` : ''}
+          ${r.status === 'revoked' ? `<button class="btn btn-success" data-action="restore" data-entity="${encodeURIComponent(r.entity_id)}">Restore</button>` : ''}
+          ${['active','revoked'].includes(r.status) ? `<button class="btn btn-ghost" style="margin-left:4px" data-action="decommission" data-entity="${encodeURIComponent(r.entity_id)}">Decommission</button>` : ''}

Then use event delegation:

document.getElementById('subs-table').addEventListener('click', (e) => {
  const btn = e.target.closest('[data-action]');
  if (!btn) return;
  const entityId = decodeURIComponent(btn.dataset.entity);
  const action = btn.dataset.action;
  if (action === 'revoke') revokeEntity(entityId);
  else if (action === 'restore') restoreEntity(entityId);
  else if (action === 'decommission') decommissionEntity(entityId);
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/dashboard/ui.html` around lines 356 - 367, The current
template injects entity_id into inline onclick handlers using esc(), which only
escapes single quotes and allows XSS; stop using inline handlers in
tbody.innerHTML (the rows map producing buttons) and instead render buttons with
safe data attributes (e.g. <button data-action="revoke"
data-entity="${encodeURIComponent(r.entity_id)}" class="btn
btn-danger">Revoke</button>) and remove usage of esc() for these buttons; add an
event delegation listener on the table container (e.g.
document.getElementById('subs-table').addEventListener('click', ...)) that reads
decodeURIComponent(btn.dataset.entity) and dispatches to the existing
revokeEntity, restoreEntity, and decommissionEntity functions based on
btn.dataset.action.
packages/fedmgr-cli/src/commands/trustmark.ts-317-325 (1)

317-325: ⚠️ Potential issue | 🟠 Major

TTL validation also missing in adopt command.

The same parseInt without validation issue exists in the adopt command at line 320.

🔧 Proposed fix
+        const ttl = Number.parseInt(opts.ttl, 10);
+        if (!Number.isInteger(ttl) || ttl < 60 || ttl > 86400) {
+          process.stderr.write("[trustmark adopt] --ttl must be an integer in [60, 86400]\n");
+          process.exit(1);
+        }
+
         const body: Record<string, unknown> = {
           sub: opts.sub,
           trustmark_id: opts.id,
-          ttl_s: parseInt(opts.ttl, 10),
+          ttl_s: ttl,
           adopted_from_iss: sourceIss,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-cli/src/commands/trustmark.ts` around lines 317 - 325, In the
adopt command where the request body is built (the block creating body with sub,
trustmark_id, ttl_s: parseInt(opts.ttl, 10), ...), validate opts.ttl before
using parseInt: parse and ensure it yields a finite positive integer (e.g.,
Number(parsed) and Number.isInteger(parsed) && parsed > 0), handle NaN/invalid
values by returning/throwing a proper error or using a safe default, and only
set ttl_s to that validated integer; update references to opts.ttl and ttl_s in
the adopt handler to use the validated value.
packages/kms/src/trust-validator.ts-289-299 (1)

289-299: ⚠️ Potential issue | 🟠 Major

Use the advertised federation fetch endpoint and verify the authority statement first.

The chain walk hard-codes /${"federation_fetch"} on the authority URL, so any entity that publishes a non-default metadata.federation_entity.federation_fetch_endpoint becomes unverifiable. It also decodes the authority statement and trusts its jwks immediately instead of first authenticating that self-signed entity statement. This has already drifted from packages/ta-server/src/federation/chain-verifier.ts.

Also applies to: 389-399

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/trust-validator.ts` around lines 289 - 299, The current
fetchSubordinateStatement hard-codes "/federation_fetch" and immediately trusts
the authority's jwks; update it to read the authority's advertised
metadata.federation_entity.federation_fetch_endpoint (falling back to
"/federation_fetch" only if absent) and call that endpoint, and change the
chain-walk logic so you first fetch and cryptographically verify the authority's
self-signed entity statement (verify its signature using existing trust-anchor
logic) before extracting and trusting its jwks to validate subordinate
statements—adjust the functions involved (fetchSubordinateStatement and the
authority-statement handling / jwks usage code) to perform fetch → verify
authority statement → extract jwks → verify subordinate statement in that order.
packages/ta-server/src/federation/chain-verifier.ts-103-105 (1)

103-105: ⚠️ Potential issue | 🟠 Major

Verify the statement claims you consume, not just the signatures.

authority_hints is read from an unverified leaf statement, and the later jwtVerify() calls never pin issuer/subject. A valid JWT for another entity can therefore satisfy this walk. Verify the leaf self-statement before using its hints, and require iss=sub=authorityId for self-statements plus iss=authorityId, sub=currentEntityId for subordinate statements.

Also applies to: 121-123, 137-139, 182-183, 230-231

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/chain-verifier.ts` around lines 103 - 105,
The code reads authority_hints from an unverified leaf JWT
(currentSelfJwt/currentPayload) and later calls jwtVerify without pinning
issuer/subject, allowing attacker-supplied JWTs to spoof the chain; before using
authority_hints (after resolveFederationStatement/decodeJwt) call jwtVerify on
the self-statement and assert jwt.payload.iss === jwt.payload.sub ===
authorityId, and for subordinate statements enforce jwtVerify with { issuer:
authorityId, subject: currentEntityId } (pin issuer and subject) whenever
verifying statements in the walk (same changes apply to the other
resolveFederationStatement/decodeJwt/jwtVerify occurrences referenced: the
blocks around where authority_hints is read and where jwtVerify is called).
packages/ta-server/src/db/sqlite-store.ts-117-125 (1)

117-125: ⚠️ Potential issue | 🟠 Major

Clear revocation fields when status moves back to a non-revoked state.

updateSubordinateStatus() preserves revocation_reason, revoked_at, and revoked_by unless new values are passed. Re-activating a subordinate therefore leaves stale revocation metadata behind, which makes the row look revoked even though status says otherwise.

🔄 Proposed fix
         `UPDATE subordinates SET
            status            = ?,
-           revocation_reason = CASE WHEN ? IS NOT NULL THEN ? ELSE revocation_reason END,
+           revocation_reason = CASE WHEN ? IN ('revoked','decommissioned')
+                                    THEN ?
+                                    ELSE NULL END,
            revoked_at        = CASE WHEN ? IN ('revoked','decommissioned')
                                     THEN datetime('now','utc')
-                                    ELSE revoked_at END,
-           revoked_by        = CASE WHEN ? IS NOT NULL THEN ? ELSE revoked_by END,
+                                    ELSE NULL END,
+           revoked_by        = CASE WHEN ? IN ('revoked','decommissioned')
+                                    THEN ?
+                                    ELSE NULL END,
            updated_at        = datetime('now','utc')
          WHERE entity_id = ?`
       )
       .run(
         status,
-        opts?.reason ?? null, opts?.reason ?? null,
         status,
-        opts?.actor ?? null, opts?.actor ?? null,
+        opts?.reason ?? null,
+        status,
+        status,
+        opts?.actor ?? null,
         entityId
       );

Also applies to: 127-133

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/db/sqlite-store.ts` around lines 117 - 125,
updateSubordinateStatus() currently preserves revocation_reason, revoked_at, and
revoked_by even when status is changed back to a non-revoked state; modify the
UPDATE SQL used in updateSubordinateStatus so those three columns are set to
NULL when the incoming status is not 'revoked' or 'decommissioned' (and
otherwise use the provided values): adjust the CASE expressions for
revocation_reason, revoked_at, and revoked_by to clear (NULL) when the status
param is NOT IN ('revoked','decommissioned'), and apply the same change to the
second similar UPDATE block referenced around lines 127-133 so re-activating a
subordinate removes stale revocation metadata.
packages/ta-server/src/federation/file-registry.ts-56-72 (1)

56-72: ⚠️ Potential issue | 🟠 Major

Don't treat corruption as “start fresh.”

Line 64 catches every read/parse failure, so a permission issue or truncated JSON file is silently converted into an empty registry. Because Line 72 overwrites the file in place, one partial write can erase the entire subordinate set on the next boot.

🧱 Proposed fix
   private _load(): void {
     try {
       const raw = fs.readFileSync(this.filePath, "utf8");
+      if (raw.trim() === "") {
+        return;
+      }
       const entries = JSON.parse(raw) as SubordinateEntry[];
       for (const entry of entries) {
         // Use super.register() so we don't trigger _save() during load.
         super.register(entry);
       }
-    } catch {
-      // File doesn't exist yet or is empty — start fresh.
+    } catch (err) {
+      if ((err as NodeJS.ErrnoException).code === "ENOENT") {
+        return;
+      }
+      throw err;
     }
   }
 
   private _save(): void {
     const entries = this.listEntityIds().map((id) => this.entries.get(id)!);
     fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
-    fs.writeFileSync(this.filePath, JSON.stringify(entries, null, 2), "utf8");
+    const tmpPath = `${this.filePath}.tmp`;
+    fs.writeFileSync(tmpPath, JSON.stringify(entries, null, 2), "utf8");
+    fs.renameSync(tmpPath, this.filePath);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/file-registry.ts` around lines 56 - 72, The
current _load() swallows all errors making permission/parse failures look like
"no file" and then _save() can clobber data; change _load() to only treat ENOENT
as "start fresh" and otherwise surface the error (log and rethrow) so
corrupted/truncated JSON or permission issues are not silently ignored—inspect
the thrown Error.code when reading/parsing file in _load() and call
super.register(...) only on success; also make _save() atomic by writing JSON to
a temp file (e.g., this.filePath + ".tmp") after ensuring the directory exists,
fsync/flush if needed, and then rename/replace the original file (fs.renameSync)
to avoid leaving a truncated file from partial writes; reference _load(),
_save(), this.filePath, listEntityIds(), entries and super.register() when
applying these changes.
packages/ta-server/src/db/sqlite-store.ts-227-229 (1)

227-229: ⚠️ Potential issue | 🟠 Major

Normalize expires_at before comparing it in SQLite.

createEnrollment() writes expires_at as toISOString() (...T...Z), but this query compares it to datetime('now','utc') (... ...). In SQLite that is a text comparison unless you normalize the column, so pending enrollments can miss expiry.

⏰ Proposed fix
       .prepare(
         `UPDATE enrollment_requests
          SET status = 'expired'
-         WHERE status = 'pending' AND expires_at < datetime('now','utc')`
+         WHERE status = 'pending'
+           AND julianday(expires_at) < julianday('now')`
       )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/db/sqlite-store.ts` around lines 227 - 229, The UPDATE
query comparing expires_at to datetime('now','utc') does a text compare because
createEnrollment() stores expires_at as toISOString() (ISO8601 with 'T' and
'Z'); fix by normalizing expires_at inside the SQL before comparison (e.g. use
datetime(replace(replace(expires_at,'T',' '),'Z','')) or
julianday(replace(replace(expires_at,'T',' '),'Z','')) so SQLite parses it as a
proper timestamp). Update the existing SQL in the enrollment_requests expiration
code (the query that sets status = 'expired') to use the normalized expires_at
expression in the WHERE clause so pending enrollments properly expire.
packages/ta-server/src/federation/chain-verifier.ts-173-177 (1)

173-177: ⚠️ Potential issue | 🟠 Major

Use the JWT's protected header (kid and alg) to select the correct verification key, not the first ES256-importable one.

importFirstKey() bypasses proper key selection by iterating to the first JWK that imports as ES256, ignoring the JWT's protected header. This breaks during key rotation or when signers use non-ES256 algorithms—the correct key may never be tested. Consider passing the entire JWKS to jwtVerify() to let the jose library use the JWT's kid header for proper key selection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/chain-verifier.ts` around lines 173 - 177,
The code currently selects a verification key via
importFirstKey(authorityJwksKeys), which ignores the JWT protected header
(kid/alg) and breaks key rotation or non-ES256 signers; instead, stop importing
the "first" key and pass the full JWKS to jose's verifier so it can pick using
the token's protected header (e.g., replace importFirstKey usage in the
chain-verifier logic that sets authorityKey and instead supply
authorityPayload.jwks (authorityJwksKeys) to jwtVerify or use
createLocalJWKSet(authorityJwksKeys) so jwtVerify selects by kid/alg), ensuring
the verification step uses the JWT header to choose the correct key.
packages/ta-server/src/index.ts-420-420 (1)

420-420: ⚠️ Potential issue | 🟠 Major

Protect /dashboard with the same admin auth.

createDashboardRouter() exposes /dashboard/api/subordinates, /dashboard/api/revocations, and /dashboard/api/audit, but it's mounted without any auth here. Those endpoints stay public even when ADMIN_TOKEN is configured for the management router.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` at line 420, The dashboard router is mounted
without admin authentication so its endpoints remain public; wrap the router
with the same admin auth middleware used by the management router (the
ADMIN_TOKEN check) before mounting. Concretely, locate the app.use("/",
createDashboardRouter(store, ENTITY_ID)) call and change it to apply the
existing admin auth middleware (for example app.use("/", adminAuthMiddleware,
createDashboardRouter(store, ENTITY_ID))) or mount it behind the same management
auth setup function used elsewhere; import or reuse the same middleware/function
that enforces ADMIN_TOKEN to ensure /dashboard/api/* endpoints require admin
auth.
packages/ta-server/src/management/index.ts-46-57 (1)

46-57: ⚠️ Potential issue | 🟠 Major

Fail closed for the management plane.

If ADMIN_TOKEN is missing, every revoke/restore/decommission/audit endpoint is public. A missing secret should not silently grant admin access; require an explicit insecure-dev opt-in or restrict these routes to loopback-only mode.

Also applies to: 69-78

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/management/index.ts` around lines 46 - 57, The
adminAuth middleware currently treats a missing adminToken as "auth disabled"
and grants access; change adminAuth to fail closed by denying access when
adminToken is undefined unless an explicit insecure opt-in flag (e.g.
INSECURE_MANAGEMENT or ENABLE_INSECURE_DEV) is set or the request comes from
loopback; update adminAuth (and the same logic used around lines 69-78) to: 1)
check for an explicit insecure opt-in configuration before allowing anonymous
access, 2) otherwise respond 401 for missing/invalid token, and 3) if you
implement loopback-only mode, detect loopback IPs via the request (req.ip or
req.connection.remoteAddress) and permit only those when adminToken is absent.
Ensure the error response remains descriptive and consistent.
packages/ta-server/src/index.ts-144-199 (1)

144-199: ⚠️ Potential issue | 🟠 Major

Don't reactivate DB rows during bootstrap.

The loop only skips entries that are already active. A subordinate that was revoked or decommissioned in the DB will be fetched again here and overwritten back to active on restart, which undoes an admin action and contradicts the "DB is authoritative afterwards" comment.

Suggested fix
-    const existing = store.getSubordinate(spec.entityId);
-    if (existing?.status === "active") {
+    const existing = store.getSubordinate(spec.entityId);
+    if (existing) {
       process.stderr.write(
-        `[ta-server] [TRUST:VALID] ${spec.entityId} already active in DB — skipping bootstrap fetch.\n`
+        `[ta-server] [TRUST:VALID] ${spec.entityId} already present in DB — skipping bootstrap fetch.\n`
       );
       continue;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 144 - 199, The bootstrap loop
currently reactivates any subordinate row because store.upsertSubordinate always
sets status:"active"; change the logic so existing DB rows are not reactivated:
after getting existing via store.getSubordinate(spec.entityId) if existing is
present then skip the fetch/upsert unless its status explicitly indicates a
pending bootstrap (e.g. status === "pending"); if you must refresh JWKS for an
existing row, update only jwks/jwksUrl/metadata while preserving
existing.status, revokedAt, revokedBy, revocationReason and notes instead of
forcing status:"active". Update the code around the specs loop and the
store.upsertSubordinate call (referencing getSubordinate and upsertSubordinate)
to implement these checks and preserve existing fields when updating.
packages/ta-server/test/intermediates.test.mjs-115-135 (1)

115-135: ⚠️ Potential issue | 🟠 Major

Add JWS signature verification to the test.

The test currently only decodes the JWS without verifying its signature. A broken signSubordinateStatement() implementation could still produce a decodable JWS and satisfy this test. Verify the signature using the TA public key before asserting payload contents, following the pattern already established in ta-server.test.mjs.

Suggested fix
-import { generateKeyPair, exportJWK, decodeJwt } from "jose";
+import { generateKeyPair, exportJWK, decodeJwt, importJWK, jwtVerify } from "jose";
@@
   const jws = await signSubordinateStatement(
@@
   );
 
+  const verificationKey = await importJWK(pubJwk, "ES256");
+  await jwtVerify(jws, verificationKey, { clockTolerance: 60 });
   const payload = decodeJwt(jws);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/test/intermediates.test.mjs` around lines 115 - 135, The
test decodes the JWS returned by signSubordinateStatement but never verifies its
signature; update the test to verify the JWS signature with the TA public key
(using the same verification helper/pattern used in ta-server.test.mjs) before
calling decodeJwt and asserting payload contents: obtain the TA public key from
TA_KMS (or the existing test helper), run the JWS verification function on the
JWS produced by signSubordinateStatement, then decode/inspect the verified
payload and assert metadata.federation_entity.federation_fetch_endpoint equals
fetchEndpoint.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dcc2bb50-441f-47a2-b807-1d25266a75ae

📥 Commits

Reviewing files that changed from the base of the PR and between 443efed and 156a668.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (61)
  • .github/workflows/ci.yml
  • .github/workflows/containers.yml
  • .github/workflows/daily-scan.yml
  • .github/workflows/release.yml
  • .gitignore
  • docs/architecture.md
  • docs/decisions.md
  • docs/ralph-loop/increment-E-trust-chain.md
  • docs/ralph-loop/increment-F-federation-list-file-registry-hsm.md
  • docs/signing.md
  • docs/testing.md
  • docs/trust-guide.md
  • examples/lab/Dockerfile.softhsm-test
  • examples/lab/docker-compose.softhsm-test.yml
  • package.json
  • packages/fedmgr-cli/.releaserc.json
  • packages/fedmgr-cli/package.json
  • packages/fedmgr-cli/src/commands/trustmark.ts
  • packages/fedmgr-mcp/README.md
  • packages/fedmgr-mcp/package.json
  • packages/fedmgr-mcp/src/bin.ts
  • packages/fedmgr-mcp/src/index.ts
  • packages/fedmgr-mcp/tsconfig.json
  • packages/kms/.releaserc.json
  • packages/kms/package.json
  • packages/kms/src/index.ts
  • packages/kms/src/providers/pkcs11.ts
  • packages/kms/src/trust-validator.ts
  • packages/kms/src/types/pkcs11js.d.ts
  • packages/kms/test/pkcs11.test.mjs
  • packages/ta-server/.releaserc.json
  • packages/ta-server/package.json
  • packages/ta-server/src/dashboard/index.ts
  • packages/ta-server/src/dashboard/ui.html
  • packages/ta-server/src/db/migrations/001-init.sql
  • packages/ta-server/src/db/sqlite-store.ts
  • packages/ta-server/src/db/store.ts
  • packages/ta-server/src/enrollment/index.ts
  • packages/ta-server/src/federation/chain-verifier.ts
  • packages/ta-server/src/federation/file-registry.ts
  • packages/ta-server/src/federation/subordinate-statements.ts
  • packages/ta-server/src/federation/trust-mark-status.ts
  • packages/ta-server/src/index.ts
  • packages/ta-server/src/management/index.ts
  • packages/ta-server/test/chain-verifier.test.mjs
  • packages/ta-server/test/federation-list.test.mjs
  • packages/ta-server/test/file-registry.test.mjs
  • packages/ta-server/test/intermediates.test.mjs
  • packages/ta-server/test/trust-mark-status.test.mjs
  • packages/tmi-server/.releaserc.json
  • packages/tmi-server/package.json
  • packages/tmi-server/src/index.ts
  • scripts/docker-pkcs11-entrypoint.sh
  • scripts/sbom-diff.sh
  • scripts/test-report.sh
  • test-results/.gitignore
  • test-results/e2e-results.json
  • test-results/integration-results.json
  • test-results/junit.xml
  • test-results/security-results.json
  • test-results/unit-results.json
✅ Files skipped from review due to trivial changes (5)
  • packages/fedmgr-mcp/README.md
  • test-results/security-results.json
  • test-results/junit.xml
  • test-results/.gitignore
  • test-results/integration-results.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/fedmgr-cli/package.json

Comment on lines +177 to +182
- name: Setup fedmgr CLI
if: secrets.TMI_PRIV_JWE != ''
run: npm ci && npm run build -w @letsfederate/fedmgr-cli

- name: Decrypt TMI key to tmpfs
if: ${{ secrets.TMI_PRIV_JWE != '' }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

secrets context is not available in step if conditions.

The secrets context cannot be accessed in step-level if expressions. This will cause the workflow to fail or behave unexpectedly. Use an environment variable set from the secret to check for presence.

🐛 Proposed fix: Use env var to check secret presence

Add a step early in the job to set an environment variable:

      - name: Check TMI secret availability
        id: tmi-check
        run: |
          if [ -n "$TMI_PRIV_JWE" ]; then
            echo "available=true" >> "$GITHUB_OUTPUT"
          else
            echo "available=false" >> "$GITHUB_OUTPUT"
          fi
        env:
          TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }}

Then update the conditional steps to use:

-      - name: Setup fedmgr CLI
-        if: secrets.TMI_PRIV_JWE != ''
+      - name: Setup fedmgr CLI
+        if: steps.tmi-check.outputs.available == 'true'
         run: npm ci && npm run build -w `@letsfederate/fedmgr-cli`

Apply the same pattern to lines 182, 197, and 226.

🧰 Tools
🪛 actionlint (1.7.11)

[error] 178-178: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)


[error] 182-182: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/containers.yml around lines 177 - 182, The step-level `if`
uses the secrets context which is not available there; add an early step named
"Check TMI secret availability" (id: tmi-check) that reads the secret via env
TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }} and writes an output like
available=true/false to $GITHUB_OUTPUT, then change the conditional on the steps
named "Setup fedmgr CLI" and "Decrypt TMI key to tmpfs" (and the other similar
steps) to use the step output check: if: ${{ steps.tmi-check.outputs.available
== 'true' }} so the steps rely on the tmi-check output instead of directly
referencing secrets in their if.

Comment on lines +159 to +162
for (const jwk of keys) {
try {
const cryptoKey = await importJWK(jwk, "ES256");
const result = await jwtVerify(jws, cryptoKey, { clockTolerance: 60 });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check the file existence and size
wc -l packages/kms/src/trust-validator.ts

Repository: teamktown/fedmgr

Length of output: 99


🏁 Script executed:

# Read the relevant sections of the file
cat -n packages/kms/src/trust-validator.ts | sed -n '150,175p'

Repository: teamktown/fedmgr

Length of output: 1106


🏁 Script executed:

# Read the second location mentioned
cat -n packages/kms/src/trust-validator.ts | sed -n '410,435p'

Repository: teamktown/fedmgr

Length of output: 1085


🏁 Script executed:

# Get broader context around the function to understand the structure
cat -n packages/kms/src/trust-validator.ts | sed -n '130,200p'

Repository: teamktown/fedmgr

Length of output: 3154


🏁 Script executed:

# Check imports and dependencies
cat -n packages/kms/src/trust-validator.ts | head -30

Repository: teamktown/fedmgr

Length of output: 1291


🏁 Script executed:

# Search for jwtVerify and importJWK to understand their origins
rg -A 2 -B 2 "import.*jwtVerify\|import.*importJWK" packages/kms/src/trust-validator.ts

Repository: teamktown/fedmgr

Length of output: 42


🏁 Script executed:

# Check package.json to understand which JWT library is being used
cat packages/kms/package.json | grep -A 5 -B 5 "jose\|jwt"

Repository: teamktown/fedmgr

Length of output: 692


🌐 Web query:

jose npm library importJWK algorithm parameter ES256 RSA EdDSA

💡 Result:

In jose (panva/jose), importJWK is:

  • importJWK(jwk, alg?, options?)
  • For asymmetric JWKs, either the JWK must include an "alg" parameter or you must pass the alg argument (2nd parameter). If neither is present, import will fail. The passed alg defaults to jwk.alg when present. Also, JWK "use", "key_ops", and "ext" can further restrict import/usage. [1]

Algorithm ↔ key-type expectations (relevant to your examples):

  • ES256 → requires a NIST P-256 EC key (kty: "EC", crv: "P-256"). [2]
  • RSA signing (RS256 / PS256, etc.) → requires an RSA key with ≥ 2048-bit modulus (kty: "RSA"). [2]
  • EdDSA → requires an Ed25519 key (kty: "OKP", crv: "Ed25519"). [2]

Also: if the JWK is symmetric (kty: "oct"), importJWK returns a Uint8Array (not a CryptoKey). [1]

Sources: [1] importJWK() docs (panva/jose) https://github.com/panva/jose/blob/main/docs/key/import/functions/importJWK.md, [2] “Algorithm Key Requirements” (panva/jose) panva/jose#210


🌐 Web query:

jose jwtVerify automatic algorithm detection from JWT header

💡 Result:

Yes. In jose (panva/jose), jwtVerify() uses the JWT’s protected header alg value to determine which JWS algorithm to verify with, and then checks that it’s acceptable for the provided key/secret. By default, all alg values applicable for the used key/secret are allowed—so if you want to prevent “algorithm agility”, you should pin/whitelist allowed algorithms via options.algorithms. Unsecured JWTs (alg: "none") are never accepted by jwtVerify(). [1]

import { jwtVerify } from 'jose'

await jwtVerify(token, key, {
  algorithms: ['RS256'], // pin accepted alg(s)
})

Source: JWTVerifyOptions.algorithms docs. [1]

References

  1. (github.com)

Citations:


🏁 Script executed:

# Check if there are any examples in the codebase showing different algorithm handling
rg -i "RS256|EdDSA|RSA" packages/kms/

Repository: teamktown/fedmgr

Length of output: 166


🏁 Script executed:

# Look at the full function signature to understand the context better
cat -n packages/kms/src/trust-validator.ts | sed -n '90,130p'

Repository: teamktown/fedmgr

Length of output: 1763


Use the JWT header's alg claim to match and import keys with the correct algorithm, not hardcoded ES256.

Lines 161 and 422 force every JWK through P-256 (ES256). This fails when:

  • The JWT header specifies RS256, PS256, EdDSA, or any non-P-256 algorithm
  • The JWKS contains valid RSA or EdDSA keys (including PKCS#11-backed signers)

The protected header is already decoded at line 100; extract its alg claim and use it when importing the JWK. Optionally filter the JWKS by alg and kid before attempting verification.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/trust-validator.ts` around lines 159 - 162, The loop that
imports JWKs uses a hardcoded algorithm ("ES256") when calling importJWK and
jwtVerify; instead extract the JWT protected header's alg claim (already decoded
earlier) and pass that alg into importJWK and jwtVerify so keys are imported
with the correct algorithm; update the verification loop in trust-validator.ts
(the block iterating over keys, using importJWK and jwtVerify with jws) to read
header.alg (validate it exists), optionally pre-filter the JWKS by alg and kid
before trying imports, and throw or skip when alg is missing or incompatible so
RSA/PS/EdDSA keys are handled correctly.

Comment on lines +16 to +18
"dependencies": {
"@letsfederate/kms": "*",
"better-sqlite3": "^12.8.0",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n -C2 '@letsfederate/kms"|semantic-release-monorepo|@semantic-release/npm|workspace:|prepublishOnly|prepare' package.json packages .github

Repository: teamktown/fedmgr

Length of output: 15356


🏁 Script executed:

#!/bin/bash
# Check for release configuration files
fd -e '.releaserc*' -o -name 'release.config.*' . 2>/dev/null

# Check if release config is in root package.json
echo "=== Checking root package.json for release config ===" 
cat package.json | grep -A 30 '"release"'

Repository: teamktown/fedmgr

Length of output: 113


🏁 Script executed:

#!/bin/bash
# Check all files in root for release config
ls -la . | grep -i release

# Check if there's releaserc in any format
find . -maxdepth 1 -name '.releaserc*' -o -name 'release.config*' 2>/dev/null

# Check package.json more thoroughly
echo "=== Full root package.json ===" 
head -100 package.json

Repository: teamktown/fedmgr

Length of output: 2725


🏁 Script executed:

#!/bin/bash
# Check the full release workflow for any inline configuration
cat .github/workflows/release.yml | grep -A 50 'semantic-release-monorepo'

Repository: teamktown/fedmgr

Length of output: 2596


🏁 Script executed:

#!/bin/bash
# Check for .releaserc files in package directories
find packages -name '.releaserc*' -o -name 'release.config.*' 2>/dev/null

# Also check each package directory for release config in package.json
echo "=== Checking ta-server package.json for release config ==="
grep -A 20 '"release"' packages/ta-server/package.json || echo "No release config found"

echo -e "\n=== Checking kms package.json for release config ==="
grep -A 20 '"release"' packages/kms/package.json || echo "No release config found"

Repository: teamktown/fedmgr

Length of output: 356


🏁 Script executed:

#!/bin/bash
echo "=== ta-server/.releaserc.json ===" 
cat packages/ta-server/.releaserc.json

echo -e "\n=== kms/.releaserc.json ===" 
cat packages/kms/.releaserc.json

Repository: teamktown/fedmgr

Length of output: 1468


Pin internal workspace dependencies to companion versions.

The .releaserc.json configuration does not rewrite internal dependencies before publishing. When @letsfederate/ta-server is published to npm, it will ship with "@letsfederate/kms": "*", allowing consumers to install any KMS version rather than the compatible monorepo release.

Change "@letsfederate/kms": "*" to a pinned version (e.g., "workspace:*" if npm workspaces are configured, or use an explicit version constraint) to ensure published packages depend on compatible KMS releases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/package.json` around lines 16 - 18, The package currently
declares the internal dependency "@letsfederate/kms": "*" which will be
published as a floating dependency; update the dependency entry in package.json
to a pinned workspace or explicit version (e.g., "@letsfederate/kms":
"workspace:*" if using npm/yarn workspaces or to the exact semver used in the
monorepo) so published `@letsfederate/ta-server` requires the compatible KMS
release; modify the "dependencies" object replacing the "*" value for the
"@letsfederate/kms" key accordingly and run install/build to verify resolution.

Comment on lines +82 to +119
const { entity_id, jwks_url, notes } = parsed.data;

// Reject if already active
const existing = store.getSubordinate(entity_id);
if (existing?.status === "active") {
res.status(409).json({
error: "already_enrolled",
message: `${entity_id} is already an active subordinate. Use /subordinates/${encodeURIComponent(entity_id)} to manage it.`,
});
return;
}

// Reject if revoked — require explicit decommission first
if (existing?.status === "revoked") {
res.status(409).json({
error: "revoked",
message:
`${entity_id} was previously revoked. To re-enroll, an administrator ` +
`must first decommission it via DELETE /subordinates/${encodeURIComponent(entity_id)}.`,
});
return;
}

// Expire any stale pending enrollments for this entity
store.expireEnrollments();

const nonce = randomBytes(32).toString("hex");
const enrollmentId = randomUUID();
const expiresAt = new Date(Date.now() + 10 * 60 * 1000); // 10 minutes

store.createEnrollment({
id: enrollmentId,
entityId: entity_id,
jwksUrl: jwks_url,
nonce,
status: "pending",
expiresAt,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Bind enrollment to the claimed entity_id.

Right now the caller chooses both entity_id and jwks_url, and completion only checks that proof_jws verifies against keys fetched from that jwks_url. That lets an attacker enroll https://victim.example with a JWKS they control, and the server-side fetch is also an SSRF primitive. Resolve the proof key from the claimed entity’s own metadata/self-statement, or at minimum enforce same-origin plus outbound allowlisting and timeouts before issuing the subordinate statement.

Also applies to: 204-221, 225-284

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 82 - 119, The
enrollment flow currently accepts a caller-supplied jwks_url and stores it (in
createEnrollment) allowing an attacker to enroll an arbitrary entity_id with
attacker-controlled keys; change the logic in the enrollment handler so the JWKS
used for the pending enrollment is resolved from the claimed entity's
authoritative metadata/self-statement (fetch and validate the entity's
metadata/JWKS tied to entity_id) rather than using the caller-provided jwks_url,
and if you cannot obtain metadata then reject the request; additionally harden
outbound fetches used to resolve metadata by enforcing same-origin or an
explicit outbound allowlist, strict timeouts and validation of the
self-statement signature before creating the enrollment record (refer to
entity_id, jwks_url, nonce, enrollmentId, store.createEnrollment and
store.expireEnrollments).

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (4)
packages/ta-server/src/index.ts (3)

79-82: ⚠️ Potential issue | 🟡 Minor

Validate TA_JWKS_URL at startup.

TA_ENTITY_ID is validated (lines 92-98), but TA_JWKS_URL is not. A malformed URL will cause less clear failures later. Add startup validation similar to TA_ENTITY_ID.

🛡️ Proposed fix
 const JWKS_URL =
   process.env["TA_JWKS_URL"] ??
   `http://localhost:${PORT}/.well-known/jwks.json`;

+try {
+  new URL(JWKS_URL);
+} catch {
+  process.stderr.write(
+    `[ta-server] [TRUST:FAIL] TA_JWKS_URL must be a valid URL, got: "${JWKS_URL}"\n`
+  );
+  process.exit(78);
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 79 - 82, The JWKS URL
(TA_JWKS_URL / constant JWKS_URL) must be validated at startup like
TA_ENTITY_ID: resolve the value you will use for JWKS_URL
(process.env["TA_JWKS_URL"] or the default
`http://localhost:${PORT}/.well-known/jwks.json`) and attempt to parse it with
the URL constructor inside a try/catch; if parsing fails throw a clear Error
indicating TA_JWKS_URL is invalid so startup fails fast. Update the
initialization that defines JWKS_URL to perform this validation (mirror the
TA_ENTITY_ID validation pattern) before any code uses JWKS_URL.

225-229: ⚠️ Potential issue | 🟠 Major

Wrap async route handlers with error forwarding for Express 4.x.

Per past review: Express 4.x requires explicit error forwarding for async handlers. The handlers for /.well-known/jwks.json, /.well-known/openid-federation, and /federation_fetch can reject without forwarding to error middleware.

Consider using a wrapper or .catch(next):

⚡ Proposed fix — async wrapper utility
+// Wrap async handlers for Express 4.x compatibility
+const asyncHandler = (fn: (req: Request, res: Response, next: NextFunction) => Promise<void>) =>
+  (req: Request, res: Response, next: NextFunction) => fn(req, res, next).catch(next);

-app.get("/.well-known/jwks.json", async (_req: Request, res: Response) => {
+app.get("/.well-known/jwks.json", asyncHandler(async (_req: Request, res: Response) => {
   const jwks = await kms.jwks();
   res.setHeader("Cache-Control", "public, max-age=3600");
   res.json(jwks);
-});
+}));

Apply similarly to /.well-known/openid-federation and /federation_fetch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 225 - 229, The async Express
route handlers (routes for "/.well-known/jwks.json",
"/.well-known/openid-federation", and "/federation_fetch") must forward
rejections to Express error middleware; wrap each async handler with an
error-forwarding helper (e.g., a small wrapper like asyncHandler(fn) that
returns (req,res,next)=>Promise.resolve(fn(req,res,next)).catch(next)) or append
.catch(next) to the promise, then replace the inline async callbacks in the
handlers for those three routes with the wrapped versions so any thrown/rejected
errors are passed to next().

158-166: ⚠️ Potential issue | 🟠 Major

Add timeout to subordinate JWKS fetches.

The JWKS fetch uses plain fetch() without timeout. A stalled endpoint can block server startup indefinitely.

⏲️ Proposed fix with AbortController
+    const controller = new AbortController();
+    const timeout = setTimeout(() => controller.abort(), 10_000);
     try {
-      const r = await fetch(spec.jwksUrl);
+      const r = await fetch(spec.jwksUrl, { signal: controller.signal });
+      clearTimeout(timeout);
       if (!r.ok) {
         throw new Error(`HTTP ${r.status} ${r.statusText}`);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 158 - 166, The JWKS fetch in
the try block that calls fetch(spec.jwksUrl) can hang; add a timeout using
AbortController: create an AbortController before calling fetch, pass its signal
to fetch(spec.jwksUrl, { signal }), and set a timer (e.g., 3–5s) that calls
controller.abort() if it fires; ensure you clear the timer after the response is
received or on error so you don't leak timers. Locate the fetch call in index.ts
(the try block that assigns jwks) and implement the AbortController + timer
around that fetch, handling the abort error appropriately.
packages/ta-server/src/enrollment/index.ts (1)

217-248: ⚠️ Potential issue | 🟠 Major

Add timeout to JWKS fetch during enrollment completion.

The JWKS fetch at line 220 lacks timeout protection. A stalled JWKS endpoint could indefinitely block enrollment completion.

⏲️ Proposed fix
       // ── Fetch the entity's JWKS ────────────────────────────────────────
       let keys: JWK[];
       try {
-        const r = await fetch(enrollment.jwksUrl);
+        const controller = new AbortController();
+        const timeout = setTimeout(() => controller.abort(), 10_000);
+        const r = await fetch(enrollment.jwksUrl, { signal: controller.signal });
+        clearTimeout(timeout);
         if (!r.ok) throw new Error(`HTTP ${r.status}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 217 - 248, The JWKS
fetch in the enrollment completion flow is missing a timeout and can hang
indefinitely; wrap the fetch(enrollment.jwksUrl) call with an AbortController
and a timeout (e.g., JWKS_FETCH_TIMEOUT_MS) so the request is aborted after the
configured milliseconds, pass controller.signal into fetch, clear the timeout on
success, and handle the abort error path in the existing catch to return the
same 502/jwks_fetch_failed response (include the abort error message). Update
the block surrounding fetch(enrollment.jwksUrl), referencing enrollment.jwksUrl,
JWKS_MAX_BYTES, and the keys variable, and ensure any timer is cleaned up to
avoid leaks.
🧹 Nitpick comments (6)
packages/ta-server/src/management/index.ts (2)

48-60: Same return next() pattern issue as dashboard module.

Apply the same fix for consistency and to satisfy static analysis.

🔧 Proposed fix
 function adminAuth(adminToken: string | undefined) {
   return (req: Request, res: Response, next: NextFunction): void => {
-    if (!adminToken) return next(); // auth disabled
+    if (!adminToken) { next(); return; } // auth disabled
     const auth = req.headers["authorization"] ?? "";
-    if (auth === `Bearer ${adminToken}`) return next();
+    if (auth === `Bearer ${adminToken}`) { next(); return; }
     res.status(401).json({
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/management/index.ts` around lines 48 - 60, The
adminAuth middleware uses the pattern "return next()" which static analysis
flags; update the function adminAuth to call next() and then explicitly return
(i.e., replace both occurrences of "return next();" with "next(); return;") so
the middleware continues correctly while satisfying linters and static
analyzers; keep the surrounding logic in adminAuth (checks for adminToken and
Bearer match) unchanged.

344-356: Consider using a proper interface instead of repeated type assertions.

The multiple type assertions ((row as { entityId: string }).entityId) are verbose and could mask type mismatches. Define or import the subordinate row interface directly.

♻️ Proposed refactor
+interface SubordinateRow {
+  entityId: string;
+  jwksUrl: string;
+  status: string;
+  isIntermediate: boolean;
+  enrolledAt: Date;
+  updatedAt: Date;
+  revocationReason: string | null;
+  revokedAt: Date | null;
+  notes: string | null;
+}

-function formatSubordinate(row: ReturnType<FederationStore["getSubordinate"]> & object) {
+function formatSubordinate(row: SubordinateRow) {
   return {
-    entity_id:        (row as { entityId: string }).entityId,
-    jwks_url:         (row as { jwksUrl: string }).jwksUrl,
-    status:           (row as { status: string }).status,
-    is_intermediate:  (row as { isIntermediate: boolean }).isIntermediate,
-    enrolled_at:      (row as { enrolledAt: Date }).enrolledAt.toISOString(),
-    updated_at:       (row as { updatedAt: Date }).updatedAt.toISOString(),
-    revocation_reason:(row as { revocationReason: string | null }).revocationReason,
-    revoked_at:       (row as { revokedAt: Date | null }).revokedAt?.toISOString() ?? null,
-    notes:            (row as { notes: string | null }).notes,
+    entity_id:         row.entityId,
+    jwks_url:          row.jwksUrl,
+    status:            row.status,
+    is_intermediate:   row.isIntermediate,
+    enrolled_at:       row.enrolledAt.toISOString(),
+    updated_at:        row.updatedAt.toISOString(),
+    revocation_reason: row.revocationReason,
+    revoked_at:        row.revokedAt?.toISOString() ?? null,
+    notes:             row.notes,
   };
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/management/index.ts` around lines 344 - 356, The
function formatSubordinate uses repeated inline type assertions which is verbose
and fragile; define a proper interface (e.g., SubordinateRow) that lists
entityId, jwksUrl, status, isIntermediate, enrolledAt, updatedAt,
revocationReason, revokedAt, notes and either change the function signature to
formatSubordinate(row: SubordinateRow) or cast row once to that interface (const
r = row as SubordinateRow) and then read r.entityId, r.jwksUrl, etc.; reference
the existing function name formatSubordinate and the original type hint
FederationStore["getSubordinate"] to ensure the new interface matches that
return shape.
packages/ta-server/src/enrollment/index.ts (1)

381-393: Same return next() pattern issue.

Apply the same fix for consistency with other middleware in the codebase.

🔧 Proposed fix
 function require_json_or_nothing(
   req: Request,
   res: Response,
   next: () => void
 ): void {
-  if (req.is("application/json") || req.method === "GET") return next();
-  if (!req.headers["content-type"]) return next();
+  if (req.is("application/json") || req.method === "GET") { next(); return; }
+  if (!req.headers["content-type"]) { next(); return; }
   res.status(415).json({
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 381 - 393, The
middleware require_json_or_nothing uses the "return next()" pattern; update it
to match the project's middleware style by calling next() and then returning
(i.e., replace each "return next()" with "next(); return;") so execution flow is
identical to other middleware; specifically adjust the calls inside
require_json_or_nothing (the check for req.is("application/json") || req.method
=== "GET" and the check for !req.headers["content-type"]) to call next() then
return instead of using "return next()".
packages/fedmgr-mcp/src/validate-url.ts (1)

8-14: Missing 0.0.0.0 and 0.* range in SSRF blocklist.

The private IPv4 patterns don't include 0.0.0.0 or the 0.* range, which can also be used for SSRF attacks (e.g., 0.0.0.0 often binds to localhost on many systems).

🛡️ Proposed fix to add 0.0.0.0 blocking
 const PRIVATE_IPV4 = [
   /^127\./,
   /^10\./,
   /^172\.(1[6-9]|2\d|3[01])\./,
   /^192\.168\./,
   /^169\.254\./,
+  /^0\./,
 ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-mcp/src/validate-url.ts` around lines 8 - 14, Update the
PRIVATE_IPV4 block in validate-url.ts (the PRIVATE_IPV4 constant) to also block
the 0.* range and explicit 0.0.0.0; add a pattern such as /^0\./ to cover any
0.* addresses and/or /^0\.0\.0\.0$/ to explicitly block that address so SSRF
attempts using 0.* or 0.0.0.0 are rejected.
packages/ta-server/src/dashboard/index.ts (1)

44-61: Fix return next() pattern flagged by static analysis.

Biome correctly identifies that return next() returns a value from a void function. While functionally correct for Express middleware, this can be made cleaner.

🔧 Proposed fix
 function dashboardAuth(adminToken: string | undefined) {
   return (req: Request, res: Response, next: NextFunction): void => {
-    if (!adminToken) return next(); // auth disabled in dev
+    if (!adminToken) { next(); return; } // auth disabled in dev
     const auth = req.headers["authorization"] ?? "";
-    if (auth === `Bearer ${adminToken}`) return next();
+    if (auth === `Bearer ${adminToken}`) { next(); return; }
     // For API endpoints return JSON; for the HTML page return 401 text
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/dashboard/index.ts` around lines 44 - 61, In
dashboardAuth, avoid using the "return next()" pattern; instead call next() and
then return to keep the function's void signature. Concretely, update both
conditionals in the middleware (the adminToken check and the auth === `Bearer
${adminToken}` check) to call next(); followed by an explicit return (e.g., if
(!adminToken) { next(); return; } and similarly for the auth check) so the
middleware still short-circuits but does not return the value of next().
packages/ta-server/src/utils/validate-url.ts (1)

16-22: Same missing 0.0.0.0 range and code duplication.

This validator is nearly identical to packages/fedmgr-mcp/src/validate-url.ts. Consider extracting to a shared package (e.g., @letsfederate/common) to avoid drift and ensure consistent SSRF protections. Also add 0.0.0.0/0.* blocking as noted in the other file.

🛡️ Proposed fix
 const PRIVATE_IPV4 = [
   /^127\./,
   /^10\./,
   /^172\.(1[6-9]|2\d|3[01])\./,
   /^192\.168\./,
   /^169\.254\./,
+  /^0\./,
 ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/utils/validate-url.ts` around lines 16 - 22, The
PRIVATE_IPV4 constant in validate-url.ts is missing blocking for the 0.0.0.0/0.*
range and is duplicated across repositories; update the PRIVATE_IPV4 array to
include a regex to block 0.* (e.g., add /^0\./) and refactor by extracting the
URL validation logic (the PRIVATE_IPV4 constant and associated validate function
in validate-url.ts) into a shared module (suggested package name
`@letsfederate/common`), then import and use that shared validator from both
packages/ta-server and packages/fedmgr-mcp to remove duplication and ensure
consistent SSRF protections.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@package.json`:
- Line 25: The package.json defines a scripts entry named "version" pointing to
"scripts/run-conformance.sh", which overrides npm's reserved version lifecycle
hook; rename the script key (e.g., to "conformance" or "check:conformance") and
update any calls or CI steps referencing "npm run version" to the new name so
that npm version lifecycle (npm version <newversion>) no longer triggers
scripts/run-conformance.sh unexpectedly, and add a short comment or
documentation if the lifecycle behavior was intentional.
- Line 27: The npm script "tmi:decrypt" currently uses "source
scripts/tmi-decrypt.sh", which runs inside npm's subshell so exported
environment variables won't persist; update this by either (A) removing the npm
script and documenting that developers should run `source
scripts/tmi-decrypt.sh` directly in their interactive shell, or (B) change the
script to execute the script (e.g., `bash scripts/tmi-decrypt.sh`) if it should
perform actions rather than export vars, or (C) modify scripts/tmi-decrypt.sh to
output key=value pairs (or a .env file) and update the npm script to write/load
that output so subsequent npm-run commands can consume the variables; reference
the "tmi:decrypt" npm script and the scripts/tmi-decrypt.sh file when making the
change.

In `@packages/fedmgr-mcp/src/index.ts`:
- Around line 287-300: In toolRevokeSubordinate, avoid unconditionally calling
String on args["reason"] (which yields "undefined"); change to check for
presence and only stringify when provided (e.g., const reason = args["reason"]
!= null ? String(args["reason"]) : ""); use a clear fallback label when building
the user-facing return (e.g., show "(none)" if reason is empty) and ensure the
JSON body sent in fetchJson uses the same reason variable so the server doesn't
receive the literal "undefined".
- Around line 237-258: The ta_url parameter is used to build jwksUrl without
SSRF validation in toolVerifyTrustmark; validate and sanitize ta_url with the
existing requireSafeUrl (or equivalent) before constructing jwksUrl, e.g. call
requireSafeUrl(String(args["ta_url"])) and use its returned safe value (or throw
a descriptive error if invalid) when creating `${safeTaUrl.replace(/\/$/,
"")}/.well-known/jwks.json`, then pass that jwksUrl into validateTrustmark;
update toolVerifyTrustmark to reject or error on unsafe URLs to prevent SSRF.
- Around line 55-69: The fetchJson helper (function fetchJson) needs a request
timeout and a response-size cap to avoid hangs and memory exhaustion: create an
AbortController, merge its signal with the incoming init.signal, start a timer
(e.g., configurable ms) that calls controller.abort() on timeout and clears on
completion; then read the response as a streamed body (using
res.body.getReader() or a streaming reader) and accumulate bytes up to a safe
limit (e.g., configurable maxBytes), aborting and throwing if exceeded; preserve
existing behavior of throwing on !res.ok (include the partial text up to the
limit), and after successful read attempt JSON.parse returning parsed object or
the raw text if JSON.parse fails; ensure timers and the controller are cleaned
up and that abort errors are surfaced as a clear timeout/size-exceeded error.

---

Duplicate comments:
In `@packages/ta-server/src/enrollment/index.ts`:
- Around line 217-248: The JWKS fetch in the enrollment completion flow is
missing a timeout and can hang indefinitely; wrap the fetch(enrollment.jwksUrl)
call with an AbortController and a timeout (e.g., JWKS_FETCH_TIMEOUT_MS) so the
request is aborted after the configured milliseconds, pass controller.signal
into fetch, clear the timeout on success, and handle the abort error path in the
existing catch to return the same 502/jwks_fetch_failed response (include the
abort error message). Update the block surrounding fetch(enrollment.jwksUrl),
referencing enrollment.jwksUrl, JWKS_MAX_BYTES, and the keys variable, and
ensure any timer is cleaned up to avoid leaks.

In `@packages/ta-server/src/index.ts`:
- Around line 79-82: The JWKS URL (TA_JWKS_URL / constant JWKS_URL) must be
validated at startup like TA_ENTITY_ID: resolve the value you will use for
JWKS_URL (process.env["TA_JWKS_URL"] or the default
`http://localhost:${PORT}/.well-known/jwks.json`) and attempt to parse it with
the URL constructor inside a try/catch; if parsing fails throw a clear Error
indicating TA_JWKS_URL is invalid so startup fails fast. Update the
initialization that defines JWKS_URL to perform this validation (mirror the
TA_ENTITY_ID validation pattern) before any code uses JWKS_URL.
- Around line 225-229: The async Express route handlers (routes for
"/.well-known/jwks.json", "/.well-known/openid-federation", and
"/federation_fetch") must forward rejections to Express error middleware; wrap
each async handler with an error-forwarding helper (e.g., a small wrapper like
asyncHandler(fn) that returns
(req,res,next)=>Promise.resolve(fn(req,res,next)).catch(next)) or append
.catch(next) to the promise, then replace the inline async callbacks in the
handlers for those three routes with the wrapped versions so any thrown/rejected
errors are passed to next().
- Around line 158-166: The JWKS fetch in the try block that calls
fetch(spec.jwksUrl) can hang; add a timeout using AbortController: create an
AbortController before calling fetch, pass its signal to fetch(spec.jwksUrl, {
signal }), and set a timer (e.g., 3–5s) that calls controller.abort() if it
fires; ensure you clear the timer after the response is received or on error so
you don't leak timers. Locate the fetch call in index.ts (the try block that
assigns jwks) and implement the AbortController + timer around that fetch,
handling the abort error appropriately.

---

Nitpick comments:
In `@packages/fedmgr-mcp/src/validate-url.ts`:
- Around line 8-14: Update the PRIVATE_IPV4 block in validate-url.ts (the
PRIVATE_IPV4 constant) to also block the 0.* range and explicit 0.0.0.0; add a
pattern such as /^0\./ to cover any 0.* addresses and/or /^0\.0\.0\.0$/ to
explicitly block that address so SSRF attempts using 0.* or 0.0.0.0 are
rejected.

In `@packages/ta-server/src/dashboard/index.ts`:
- Around line 44-61: In dashboardAuth, avoid using the "return next()" pattern;
instead call next() and then return to keep the function's void signature.
Concretely, update both conditionals in the middleware (the adminToken check and
the auth === `Bearer ${adminToken}` check) to call next(); followed by an
explicit return (e.g., if (!adminToken) { next(); return; } and similarly for
the auth check) so the middleware still short-circuits but does not return the
value of next().

In `@packages/ta-server/src/enrollment/index.ts`:
- Around line 381-393: The middleware require_json_or_nothing uses the "return
next()" pattern; update it to match the project's middleware style by calling
next() and then returning (i.e., replace each "return next()" with "next();
return;") so execution flow is identical to other middleware; specifically
adjust the calls inside require_json_or_nothing (the check for
req.is("application/json") || req.method === "GET" and the check for
!req.headers["content-type"]) to call next() then return instead of using
"return next()".

In `@packages/ta-server/src/management/index.ts`:
- Around line 48-60: The adminAuth middleware uses the pattern "return next()"
which static analysis flags; update the function adminAuth to call next() and
then explicitly return (i.e., replace both occurrences of "return next();" with
"next(); return;") so the middleware continues correctly while satisfying
linters and static analyzers; keep the surrounding logic in adminAuth (checks
for adminToken and Bearer match) unchanged.
- Around line 344-356: The function formatSubordinate uses repeated inline type
assertions which is verbose and fragile; define a proper interface (e.g.,
SubordinateRow) that lists entityId, jwksUrl, status, isIntermediate,
enrolledAt, updatedAt, revocationReason, revokedAt, notes and either change the
function signature to formatSubordinate(row: SubordinateRow) or cast row once to
that interface (const r = row as SubordinateRow) and then read r.entityId,
r.jwksUrl, etc.; reference the existing function name formatSubordinate and the
original type hint FederationStore["getSubordinate"] to ensure the new interface
matches that return shape.

In `@packages/ta-server/src/utils/validate-url.ts`:
- Around line 16-22: The PRIVATE_IPV4 constant in validate-url.ts is missing
blocking for the 0.0.0.0/0.* range and is duplicated across repositories; update
the PRIVATE_IPV4 array to include a regex to block 0.* (e.g., add /^0\./) and
refactor by extracting the URL validation logic (the PRIVATE_IPV4 constant and
associated validate function in validate-url.ts) into a shared module (suggested
package name `@letsfederate/common`), then import and use that shared validator
from both packages/ta-server and packages/fedmgr-mcp to remove duplication and
ensure consistent SSRF protections.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e598f157-892e-4798-8b7c-d37c57fd5bf9

📥 Commits

Reviewing files that changed from the base of the PR and between 156a668 and ef6978a.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (8)
  • package.json
  • packages/fedmgr-mcp/src/index.ts
  • packages/fedmgr-mcp/src/validate-url.ts
  • packages/ta-server/src/dashboard/index.ts
  • packages/ta-server/src/enrollment/index.ts
  • packages/ta-server/src/index.ts
  • packages/ta-server/src/management/index.ts
  • packages/ta-server/src/utils/validate-url.ts

"dev": "nodemon server.js",
"cli": "node src/fedmgr.js",
"version": "scripts/run-conformance.sh"
"version": "scripts/run-conformance.sh",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Overriding version shadows npm's lifecycle hook.

The version script name is reserved by npm as a lifecycle hook that runs during npm version <newversion>. Assigning it to scripts/run-conformance.sh means:

  1. Running npm run version executes the conformance script (possibly intended)
  2. Running npm version patch will also trigger this script as a lifecycle hook, which may not be intended

If this is intentional, consider documenting it. If not, rename to something like conformance or check:conformance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 25, The package.json defines a scripts entry named
"version" pointing to "scripts/run-conformance.sh", which overrides npm's
reserved version lifecycle hook; rename the script key (e.g., to "conformance"
or "check:conformance") and update any calls or CI steps referencing "npm run
version" to the new name so that npm version lifecycle (npm version
<newversion>) no longer triggers scripts/run-conformance.sh unexpectedly, and
add a short comment or documentation if the lifecycle behavior was intentional.

"version": "scripts/run-conformance.sh"
"version": "scripts/run-conformance.sh",
"tmi:keys:init": "bash scripts/tmi-keys-init.sh",
"tmi:decrypt": "source scripts/tmi-decrypt.sh",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

source in npm script does not persist environment variables.

source scripts/tmi-decrypt.sh runs in a subshell spawned by npm, so any exported environment variables will not persist to the parent shell after the script completes. This pattern doesn't achieve its apparent intent of setting up environment variables for subsequent commands.

If the goal is to export variables for manual shell use, document that users should run source scripts/tmi-decrypt.sh directly in their terminal. For npm script usage, consider an alternative approach:

Suggested alternatives

Option 1: Remove the script if it's only meant for direct shell sourcing:

-    "tmi:decrypt": "source scripts/tmi-decrypt.sh",

Option 2: If the script should execute and perform actions (not just export vars), use bash:

-    "tmi:decrypt": "source scripts/tmi-decrypt.sh",
+    "tmi:decrypt": "bash scripts/tmi-decrypt.sh",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 27, The npm script "tmi:decrypt" currently uses "source
scripts/tmi-decrypt.sh", which runs inside npm's subshell so exported
environment variables won't persist; update this by either (A) removing the npm
script and documenting that developers should run `source
scripts/tmi-decrypt.sh` directly in their interactive shell, or (B) change the
script to execute the script (e.g., `bash scripts/tmi-decrypt.sh`) if it should
perform actions rather than export vars, or (C) modify scripts/tmi-decrypt.sh to
output key=value pairs (or a .env file) and update the npm script to write/load
that output so subsequent npm-run commands can consume the variables; reference
the "tmi:decrypt" npm script and the scripts/tmi-decrypt.sh file when making the
change.

Comment on lines +287 to +300
async function toolRevokeSubordinate(args: Record<string, unknown>): Promise<string> {
const taUrl = requireSafeUrl(String(args["ta_url"] ?? DEFAULT_TA_URL), "ta_url");
const entityId = requireSafeUrl(String(args["entity_id"]), "entity_id");
const reason = String(args["reason"]);

const encodedId = encodeURIComponent(entityId);
const data = await fetchJson(`${taUrl}/subordinates/${encodedId}/revoke`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ reason }),
});

return `Subordinate ${entityId} revoked.\nReason: ${reason}\n\nResponse:\n${JSON.stringify(data, null, 2)}`;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle missing reason argument gracefully.

Line 290 converts args["reason"] to string unconditionally. If the user doesn't provide a reason, String(undefined) results in the literal string "undefined" being sent to the server.

🔧 Proposed fix
 async function toolRevokeSubordinate(args: Record<string, unknown>): Promise<string> {
   const taUrl    = requireSafeUrl(String(args["ta_url"]    ?? DEFAULT_TA_URL), "ta_url");
   const entityId = requireSafeUrl(String(args["entity_id"]),                   "entity_id");
-  const reason = String(args["reason"]);
+  const reason = args["reason"] ? String(args["reason"]) : undefined;

   const encodedId = encodeURIComponent(entityId);
   const data = await fetchJson(`${taUrl}/subordinates/${encodedId}/revoke`, {
     method: "POST",
     headers: { "Content-Type": "application/json" },
-    body: JSON.stringify({ reason }),
+    body: JSON.stringify(reason ? { reason } : {}),
   });

-  return `Subordinate ${entityId} revoked.\nReason: ${reason}\n\nResponse:\n${JSON.stringify(data, null, 2)}`;
+  return `Subordinate ${entityId} revoked.${reason ? `\nReason: ${reason}` : ""}\n\nResponse:\n${JSON.stringify(data, null, 2)}`;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async function toolRevokeSubordinate(args: Record<string, unknown>): Promise<string> {
const taUrl = requireSafeUrl(String(args["ta_url"] ?? DEFAULT_TA_URL), "ta_url");
const entityId = requireSafeUrl(String(args["entity_id"]), "entity_id");
const reason = String(args["reason"]);
const encodedId = encodeURIComponent(entityId);
const data = await fetchJson(`${taUrl}/subordinates/${encodedId}/revoke`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ reason }),
});
return `Subordinate ${entityId} revoked.\nReason: ${reason}\n\nResponse:\n${JSON.stringify(data, null, 2)}`;
}
async function toolRevokeSubordinate(args: Record<string, unknown>): Promise<string> {
const taUrl = requireSafeUrl(String(args["ta_url"] ?? DEFAULT_TA_URL), "ta_url");
const entityId = requireSafeUrl(String(args["entity_id"]), "entity_id");
const reason = args["reason"] ? String(args["reason"]) : undefined;
const encodedId = encodeURIComponent(entityId);
const data = await fetchJson(`${taUrl}/subordinates/${encodedId}/revoke`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(reason ? { reason } : {}),
});
return `Subordinate ${entityId} revoked.${reason ? `\nReason: ${reason}` : ""}\n\nResponse:\n${JSON.stringify(data, null, 2)}`;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-mcp/src/index.ts` around lines 287 - 300, In
toolRevokeSubordinate, avoid unconditionally calling String on args["reason"]
(which yields "undefined"); change to check for presence and only stringify when
provided (e.g., const reason = args["reason"] != null ? String(args["reason"]) :
""); use a clear fallback label when building the user-facing return (e.g., show
"(none)" if reason is empty) and ensure the JSON body sent in fetchJson uses the
same reason variable so the server doesn't receive the literal "undefined".

teamktown and others added 2 commits March 17, 2026 21:02
Critical fixes:
- #1 enrollment/index.ts: enforce same-origin binding between entity_id and
  jwks_url — prevents victim entity enrollment attack
- #2 subordinate-statements.ts: strip private JWK fields (d,p,q,dp,dq,qi,oth,k)
  before embedding JWKS in signed federation documents
- #3 ta-server/index.ts: sanitize bootstrap JWKS (strip private fields) + 5s
  AbortController timeout on JWKS bootstrap fetches
- #4 kms/trust-validator.ts: apply assertSafeUrl() to jku URL before SSRF fetch;
  add kms/src/validate-url.ts with full SSRF protection
- #5 tmi-server/index.ts: protect POST /trustmarks/issue with TMI_ISSUE_TOKEN
  Bearer auth; startup [TRUST:WARN] when token not set

Major fixes:
- #6/#7 asyncHandler wrapper for Express 4.x async routes in ta-server +
  tmi-server — unhandled rejections now forwarded to error middleware
- #8 Bootstrap JWKS fetch: 5s AbortController timeout
- #9 ta-server/index.ts: validate TA_JWKS_URL with new URL() at startup
- #10 tmi-server already had authority_hints — confirmed no change needed
- #11 fedmgr-cli/package.json: add jose to runtime dependencies
- #12 CLI trustmark issue: validate --ttl (reject NaN, enforce 60-86400)
- #13 CLI trustmark verify: apply assertSafeUrl to jku JWKS URL
- #14 fedmgr-mcp fetchJson: 10s timeout + 1MB size cap
- #15 fedmgr-mcp verify_trustmark: apply requireSafeUrl to ta_url
- #16 containers.yml: fix secrets context missing ${{ }} wrapper

Docs:
- docs/security-faq.md: comprehensive security mindfulness guide covering
  SSRF, algorithm confusion, private key leakage, admin auth, Express 4
  async handling, size/timeout DoS, dependency declaration, CI/CD secrets

Co-Authored-By: claude-flow <ruv@ruv.net>
Mirrors the skill at .claude/skills/openid-federation/SKILL.md in the
repo so it is versioned alongside the codebase it describes.

Covers: trust chain resolution, enrollment protocol, entity roles,
security patterns (SSRF, algorithm pinning, private key stripping,
asyncHandler, fetch timeouts), implementation checklist, common
mistakes table, and fedmgr CLI/MCP tool reference.

Co-Authored-By: claude-flow <ruv@ruv.net>
@teamktown teamktown changed the title feat: KMS + TMI + Trust Anchor + CLI — no naked keys, OIDF federation scaffolding feat: OpenID Federation 1.0 — Trust Anchor, TMI, CLI, MCP + 16 security fixes Mar 17, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

♻️ Duplicate comments (4)
packages/kms/src/trust-validator.ts (2)

163-173: ⚠️ Potential issue | 🟠 Major

Hardcoded ES256 limits algorithm support — still unaddressed from prior review.

The verification loop forces ES256 at lines 165-166. This will fail for RSA (RS256/PS256) or EdDSA keys. Extract the algorithm from the JWS header (already decoded at line 101) and use it for importJWK and jwtVerify.

🔧 Proposed fix to use header algorithm
+  const headerAlg = header["alg"] as string | undefined;
+  if (!headerAlg || typeof headerAlg !== "string") {
+    return {
+      state: "INVALID",
+      subject: "(unknown)",
+      message: "[TRUST:FAIL] JWS header missing 'alg' claim",
+      recommendedAction: "The trustmark must include an 'alg' header. Re-issue the trustmark.",
+    };
+  }
+
   for (const jwk of keys) {
     try {
-      const cryptoKey = await importJWK(jwk, "ES256");
-      const result = await jwtVerify(jws, cryptoKey, { clockTolerance: 60 });
+      const cryptoKey = await importJWK(jwk, headerAlg);
+      const result = await jwtVerify(jws, cryptoKey, {
+        clockTolerance: 60,
+        algorithms: [headerAlg],
+      });
       payload = result.payload as Record<string, unknown>;
       verified = true;
       break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/trust-validator.ts` around lines 163 - 173, The loop
currently hardcodes "ES256" for importJWK and jwtVerify; change it to read the
algorithm from the decoded JWS header (the variable that holds the parsed header
near where the JWS is decoded) and pass that algorithm into importJWK(...) and
jwtVerify(..., { algorithms: [alg] } or the appropriate param for the library),
so that importJWK and jwtVerify use header.alg (e.g., const alg = jwsHeader.alg)
instead of "ES256" when iterating over keys; keep the existing try/catch,
payload and verified assignment logic intact.

422-433: ⚠️ Potential issue | 🟠 Major

Hardcoded ES256 in chain verification — same issue as validateTrustmark.

The subordinate statement verification at line 426 also uses hardcoded ES256. Apply the same fix as in validateTrustmark by extracting the algorithm from the subordinate JWT's header.

🔧 Proposed fix
+    const subordHeader = decodeProtectedHeader(subordJwt);
+    const subordAlg = subordHeader.alg as string;
+    if (!subordAlg) {
+      return {
+        state: "INVALID",
+        subject: entityId,
+        message: `[TRUST:FAIL] Subordinate statement from ${authorityId} missing 'alg' header`,
+        chainDepth: depth,
+        recommendedAction: "The subordinate statement must include an 'alg' header.",
+      };
+    }
+
     for (const jwk of authorityJwks.keys) {
       try {
-        const cryptoKey = await importJWK(jwk, "ES256");
-        await jwtVerify(subordJwt, cryptoKey, { clockTolerance });
+        const cryptoKey = await importJWK(jwk, subordAlg);
+        await jwtVerify(subordJwt, cryptoKey, {
+          clockTolerance,
+          algorithms: [subordAlg],
+        });
         subordVerified = true;
         break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/trust-validator.ts` around lines 422 - 433, The subordinate
statement verification loop currently hardcodes "ES256" when calling importJWK
and jwtVerify; change it to extract the algorithm from the subordinate JWT
header (like the fix in validateTrustmark): parse the header of subordJwt to
read header.alg (or throw if missing), pass that alg into importJWK and
jwtVerify instead of "ES256", and keep existing clockTolerance and
authorityJwks.keys iteration logic (references: subordJwt, authorityJwks.keys,
importJWK, jwtVerify, clockTolerance, validateTrustmark).
packages/ta-server/src/enrollment/index.ts (1)

237-266: ⚠️ Potential issue | 🟡 Minor

Add a timeout to the JWKS fetch to prevent request hangs.

The fetch at line 238 has no timeout/abort signal. A stalled or slow JWKS endpoint could cause the request to hang indefinitely, degrading server responsiveness.

⏲️ Proposed fix to add fetch timeout
       // ── Fetch the entity's JWKS ────────────────────────────────────────
       let keys: JWK[];
       try {
-        const r = await fetch(enrollment.jwksUrl);
+        const controller = new AbortController();
+        const timeout = setTimeout(() => controller.abort(), 5000);
+        let r: Awaited<ReturnType<typeof fetch>>;
+        try {
+          r = await fetch(enrollment.jwksUrl, { signal: controller.signal });
+        } finally {
+          clearTimeout(timeout);
+        }
         if (!r.ok) throw new Error(`HTTP ${r.status}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 237 - 266, The JWKS
fetch in the enrollment flow (the fetch call using enrollment.jwksUrl and
JWKS_MAX_BYTES) has no timeout and can hang; fix it by using an AbortController
with a short configurable timeout (e.g., a constant like JWKS_FETCH_TIMEOUT_MS),
pass controller.signal to fetch, set a setTimeout to call controller.abort() and
clear that timeout after the fetch completes, and ensure the catch block
properly handles abort/timeout errors (treat as jwks_fetch_failed) while still
including the error text in logs and the 502 response; update the fetch block
around enrollment.jwksUrl to use this abort pattern and clean up the timer on
success/failure.
.github/workflows/containers.yml (1)

177-226: ⚠️ Potential issue | 🔴 Critical

Fix invalid secrets usage in step-level if expressions

Line 178, Line 182, Line 197, and Line 226 use secrets in step if, which is not allowed by GitHub Actions context rules and will fail lint/execution.

🐛 Proposed fix
+      - name: Check TMI secret availability
+        id: tmi-check
+        env:
+          TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }}
+        run: |
+          if [ -n "$TMI_PRIV_JWE" ]; then
+            echo "available=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "available=false" >> "$GITHUB_OUTPUT"
+          fi
+
       - name: Setup fedmgr CLI
-        if: ${{ secrets.TMI_PRIV_JWE != '' }}
+        if: ${{ steps.tmi-check.outputs.available == 'true' }}
         run: npm ci && npm run build -w `@letsfederate/fedmgr-cli`

       - name: Decrypt TMI key to tmpfs
-        if: ${{ secrets.TMI_PRIV_JWE != '' }}
+        if: ${{ steps.tmi-check.outputs.available == 'true' }}
         env:
           TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }}
           TMI_JWE_PASSPHRASE: ${{ secrets.TMI_JWE_PASSPHRASE }}

       - name: Issue trustmark for ${{ matrix.service }}
-        if: ${{ secrets.TMI_PRIV_JWE != '' }}
+        if: ${{ steps.tmi-check.outputs.available == 'true' }}
         id: trustmark
         env:
           DIGEST: ${{ steps.build.outputs.digest }}

       - name: Attach trustmark attestation
-        if: ${{ secrets.TMI_PRIV_JWE != '' && steps.trustmark.outputs.jws != '' }}
+        if: ${{ steps.tmi-check.outputs.available == 'true' && steps.trustmark.outputs.jws != '' }}
#!/bin/bash
set -euo pipefail
# Verify invalid step-level secrets usage in if expressions
rg -nP '^\s*if:\s*\$\{\{\s*secrets\.[^}]+\}\}' .github/workflows/containers.yml || true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/containers.yml around lines 177 - 226, The step-level if
expressions use the secrets context (e.g., "if: ${{ secrets.TMI_PRIV_JWE != ''
}}" in the steps named "Setup fedmgr CLI", "Decrypt TMI key to tmpfs", "Issue
trustmark for ${{ matrix.service }}" and the "Attach trustmark attestation"
step), which GitHub Actions forbids; fix by injecting the secret into an env
context (add env: TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }} at the job level or
top of the workflow) and change each step if to reference env (e.g., if: ${{
env.TMI_PRIV_JWE != '' }}) and for the last check combine with the output as if:
${{ env.TMI_PRIV_JWE != '' && steps.trustmark.outputs.jws != '' }} so the
conditions use allowed contexts instead of secrets directly.
🧹 Nitpick comments (8)
test-results/security-results.json (1)

1-1: Avoid committing non-deterministic test-run metadata.

This report includes volatile fields (startTime, endTime, per-test duration) and an absolute runner path (/workspaces/fedmgr/...), which will create noisy diffs and leak CI environment details. Prefer storing raw reports as CI artifacts only, or sanitize these fields before committing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test-results/security-results.json` at line 1, The committed JSON test report
contains non-deterministic fields that cause noisy diffs and leak CI paths;
remove or sanitize the volatile keys before committing: strip top-level
startTime and endTime, strip per-test duration under
testResults[*].assertionResults[*].duration, and scrub any absolute runner path
strings (e.g., occurrences matching /workspaces/...) from testResults[*].name or
other fields; implement this either by updating the test reporter to emit a
sanitized summary (or a deterministic minimal report) or by adding a small
post-test step (script) that loads the report, removes/normalizes those keys,
and writes the sanitized JSON instead of the raw runner output.
packages/tmi-server/src/index.ts (1)

173-232: Consider timing-safe comparison for bearer token authentication.

The token comparison on line 179 uses direct string equality (auth !== \Bearer ${TMI_ISSUE_TOKEN}`). While timing attacks on bearer tokens are low-risk in this context (attacker can't observe response timing precisely over network), using crypto.timingSafeEqual` would be defense-in-depth.

🛡️ Proposed hardening
+import crypto from "node:crypto";
+
+function timingSafeCompare(a: string, b: string): boolean {
+  const bufA = Buffer.from(a);
+  const bufB = Buffer.from(b);
+  if (bufA.length !== bufB.length) return false;
+  return crypto.timingSafeEqual(bufA, bufB);
+}

     if (TMI_ISSUE_TOKEN) {
       const auth = req.headers["authorization"] ?? "";
-      if (auth !== `Bearer ${TMI_ISSUE_TOKEN}`) {
+      if (!timingSafeCompare(auth, `Bearer ${TMI_ISSUE_TOKEN}`)) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/index.ts` around lines 173 - 232, The bearer token
comparison in the POST /trustmarks/issue handler uses direct equality on auth
and the expected `Bearer ${TMI_ISSUE_TOKEN}` which can leak timing information;
change the check to perform a timing-safe comparison by normalizing both values
to Buffers (e.g., from req.headers["authorization"] and the expected string) and
use crypto.timingSafeEqual, but also handle differing lengths safely (compare
lengths first and return 401 if lengths differ or pad/compare in constant-time).
Update the logic around the auth variable in the async handler that checks
`TMI_ISSUE_TOKEN` so it uses this timing-safe comparison instead of `auth !==
\`Bearer ${TMI_ISSUE_TOKEN}\``.
packages/kms/src/index.ts (2)

79-85: Strengthen private field stripping for defense-in-depth.

The jwks() method only strips the d field, but security-faq.md (line 84) documents a broader set of private JWK fields: ["d", "p", "q", "dp", "dq", "qi", "oth", "k"]. While EC P-256 keys only use d, stripping all private fields provides consistency with the documented pattern and guards against future key type changes.

♻️ Proposed fix to align with documented pattern
+const PRIVATE_JWK_FIELDS = ["d", "p", "q", "dp", "dq", "qi", "oth", "k"];
+
+function stripPrivateFields(jwk: JWK): JWK {
+  const pub = { ...jwk };
+  for (const f of PRIVATE_JWK_FIELDS) delete (pub as Record<string, unknown>)[f];
+  return pub;
+}
+
   async jwks(): Promise<{ keys: JWK[] }> {
     await this.ensureLoaded();
-    // Strip any private fields before serving. Public JWK file should already
-    // contain only public params, but we defensively remove `d`.
-    const { d: _d, ...pubOnly } = this._pub as JWK & { d?: string };
-    return { keys: [{ ...pubOnly, kid: await this.kid() }] };
+    const pubOnly = stripPrivateFields(this._pub as JWK);
+    return { keys: [{ ...pubOnly, kid: await this.kid() }] };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/index.ts` around lines 79 - 85, The jwks() method only
strips `d` but should remove all private JWK fields for defense-in-depth; update
jwks() (which calls ensureLoaded() and returns { keys: [...] } using this._pub
and kid()) to destructure and omit the private fields
["d","p","q","dp","dq","qi","oth","k"] (e.g., const { d, p, q, dp, dq, qi, oth,
k, ...pubOnly } = this._pub as JWK & { d?: string; p?: string; q?: string; dp?:
string; dq?: string; qi?: string; oth?: any; k?: string }) and then return the
key built from pubOnly with kid: await this.kid().

105-133: Add clearer error handling for malformed JWK files.

JSON.parse on lines 113-114 will throw a generic SyntaxError if the JWK files contain invalid JSON. Wrapping with a try-catch and re-throwing a descriptive error would improve debuggability during deployment.

♻️ Proposed improvement
     const [privRaw, pubRaw] = await Promise.all([
       fs.readFile(path.resolve(this.cfg.privateJwkPath), "utf8"),
       fs.readFile(path.resolve(this.cfg.publicJwkPath), "utf8"),
     ]);

-    const privJwk: JWK = JSON.parse(privRaw) as JWK;
-    const pubJwk: JWK = JSON.parse(pubRaw) as JWK;
+    let privJwk: JWK;
+    let pubJwk: JWK;
+    try {
+      privJwk = JSON.parse(privRaw) as JWK;
+    } catch (err) {
+      throw new Error(`SoftKmsProvider: invalid JSON in private JWK at ${this.cfg.privateJwkPath}: ${err}`);
+    }
+    try {
+      pubJwk = JSON.parse(pubRaw) as JWK;
+    } catch (err) {
+      throw new Error(`SoftKmsProvider: invalid JSON in public JWK at ${this.cfg.publicJwkPath}: ${err}`);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/index.ts` around lines 105 - 133, The ensureLoaded method
can throw a generic SyntaxError when parsing malformed JWK files; wrap the
JSON.parse of privRaw and pubRaw in a try-catch inside ensureLoaded (referencing
privJwk and pubJwk) and re-throw a descriptive Error that includes which path
(this.cfg.privateJwkPath or this.cfg.publicJwkPath) failed and the original
error message; keep existing EC P-256 validation and importJWK(privJwk, "ES256")
behavior, but ensure the catch preserves the original error as the cause or
message to aid debugging.
packages/tmi-server/src/utils/async-handler.ts (1)

1-16: Consider extracting shared utility.

This is an exact duplicate of packages/ta-server/src/utils/async-handler.ts. For maintainability, consider extracting to a shared package (e.g., @letsfederate/shared or into @letsfederate/kms). However, the duplication is acceptable for now given the utility's small size and stability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tmi-server/src/utils/async-handler.ts` around lines 1 - 16, The
asyncHandler function is duplicated (same implementation exists in the other
package) so extract it into a shared module and update both packages to import
it: create a new export (e.g., asyncHandler) in the shared package (suggested
names: `@letsfederate/shared` or `@letsfederate/kms`), move the implementation of
asyncHandler into that shared module, export it, then replace the local
implementations in both packages to import { asyncHandler } from the shared
package and remove the duplicate function declarations (ensure TypeScript types
are preserved: Request, Response, NextFunction, RequestHandler).
packages/ta-server/src/index.ts (1)

298-325: Pagination with after cursor is sensitive to ordering changes.

The /federation_list endpoint uses after for cursor-based pagination, but the underlying listSubordinates result order is not explicitly defined. If the store's ordering changes (e.g., due to concurrent inserts), pagination could skip or duplicate entries.

Consider sorting ids by a stable key (e.g., alphabetically by entityId) before applying the after filter:

   let ids = rows.map((r) => r.entityId);
+  ids.sort(); // stable ordering for cursor pagination

   const after = req.query["after"] as string | undefined;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/index.ts` around lines 298 - 325, The /federation_list
handler currently paginates using the `after` cursor on an unordered
`store.listSubordinates` result (variable `rows`/`ids`), which can cause
skips/duplicates if store ordering changes; sort the entries by a stable key
(e.g., `entityId`) after building `rows` and before deriving `ids`/applying the
`after` filter (use a stable comparator such as
`String.prototype.localeCompare`), then apply the existing `after` slicing and
`limit` logic so pagination is deterministic; reference the
`app.get("/federation_list", ...)` handler, `store.listSubordinates`, local vars
`rows` and `ids`, and the `after`/`limit` handling.
packages/ta-server/src/federation/subordinate-statements.ts (1)

96-100: Type assertion could be narrowed for safer access.

The isIntermediate function assumes federation_entity is a record with string properties. Consider adding a guard for the endpoint value.

♻️ Suggested refinement
 export function isIntermediate(entry: SubordinateEntry): boolean {
   const fe = (entry.metadata?.["federation_entity"] ?? {}) as Record<string, unknown>;
-  return typeof fe["federation_fetch_endpoint"] === "string" &&
-    (fe["federation_fetch_endpoint"] as string).length > 0;
+  const endpoint = fe["federation_fetch_endpoint"];
+  return typeof endpoint === "string" && endpoint.length > 0;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/federation/subordinate-statements.ts` around lines 96
- 100, The isIntermediate function currently force-casts
entry.metadata?.["federation_entity"] to Record<string, unknown> and then
assumes federation_fetch_endpoint is a string; change this to safely narrow the
type: check that entry.metadata exists, that federation_entity is a non-null
object (typeof === "object" && !== null), and that
federation_entity["federation_fetch_endpoint"] is a string with length > 0
before returning true. Update the guard in isIntermediate (and reference
SubordinateEntry and metadata) to avoid the broad type assertion and only treat
the endpoint as a string after the typeof check.
.github/workflows/containers.yml (1)

126-126: Pin mutable @master action reference to a specific version

The aquasecurity/trivy-action@master reference is mutable and weakens CI supply-chain security guarantees. Pin to a release tag (e.g., @v0.18.0) or a commit SHA instead.

Note: A similar pattern exists in .github/workflows/daily-scan.yml:111.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/containers.yml at line 126, Replace the mutable action
reference "aquasecurity/trivy-action@master" with a pinned release tag or commit
SHA (for example "aquasecurity/trivy-action@v0.18.0" or a specific commit) to
harden CI supply-chain security; update the uses line in the workflow where
"aquasecurity/trivy-action@master" appears and apply the same change to the
other workflow that also references this action to ensure all occurrences are
pinned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/containers.yml:
- Around line 206-208: The TMI public JWK path is incorrectly derived from the
matrix.service variable (TMI_PUB_JWK_PATH="$TMI_KEY" ...
TMI_PUB_JWK_PATH="packages/${{ matrix.service }}/keys/tmi.pub.jwk"), which can
point to the wrong package when launching the TMI process; change the
TMI_PUB_JWK_PATH assignment to use the fixed tmi-server key path (e.g., set
TMI_PUB_JWK_PATH to "packages/tmi-server/keys/tmi.pub.jwk") so the TMI process
always loads the correct public key regardless of matrix.service.
- Around line 181-193: The workflow step "Decrypt TMI key to tmpfs" uses the
external "step" CLI but never installs it; add a preceding job step (conditional
on the same secret check used by the decrypt step) that installs the step CLI
before invoking "step crypto jwe decrypt" so the decrypt command can run
successfully; name it e.g. "Install step CLI", use the same if: ${{
secrets.TMI_PRIV_JWE != '' }} guard, and run the provided curl/tar/install
commands to place the "step" binary on PATH.

In `@docs/claude/skills/openid-federation/SKILL.md`:
- Around line 63-67: The doc inconsistently describes where the jku appears
(mixing "claim" and "JWS header"); update all occurrences (e.g., the list under
"A JWT issued by a Trustmark Issuer" that references `jku`, and the other
passage flagged around the later section) to consistently state that `jku` is a
JWS/JWT header parameter (not a claim), and reword any phrase like "claim in JWS
header" to a single consistent phrase such as "JWS header parameter `jku`"
throughout the document.
- Around line 245-254: The example in the fetch timeout/size snippet uses a 1MB
cap (text.length > 1024 * 1024) which conflicts with the "Common Mistakes"
guidance of ≤64KB; change the size-check in the example (the check that throws
after reading r.text(), currently using 1024 * 1024) to use 64 * 1024 (or
otherwise match the chosen canonical cap) and update any other occurrences of
1024 * 1024 in the docs (including the other example mentioned) so the
AbortController/timeout usage (controller, timeout, controller.abort, fetch,
r.text(), JSON.parse) and the table are consistent.
- Around line 189-193: The fenced code block in SKILL.md (the triple-backtick
block that shows entity_id/jwks_url examples) lacks a language identifier;
update that code fence to include a language tag (e.g., "text") so the block
becomes ```text ... ``` to satisfy markdownlint MD040 and improve
rendering/tooling consistency.

In `@packages/fedmgr-cli/src/commands/trustmark.ts`:
- Around line 144-157: The jwtVerify call in the verification loop must be
pinned to allowed algorithms to prevent algorithm confusion attacks: update the
jwtVerify invocation inside the for-loop (where you importJWK(key, "ES256") and
call jwtVerify(opts.jws, cryptoKey)) to pass an options object including
algorithms: ["ES256"] (e.g., jwtVerify(opts.jws, cryptoKey, { algorithms:
["ES256"] })), leaving the rest of the loop logic unchanged so it still tries
each key from jwks.keys and sets verified on success.
- Around line 330-339: The adopt command builds the request body using ttl_s:
parseInt(opts.ttl, 10) but does not validate the parsed TTL; replicate the same
TTL validation logic used by the issue command to guard against NaN and
out-of-range values. In the adopt handler (where body is constructed and ttl_s
is set), parse opts.ttl into a numeric ttl, check Number.isInteger(ttl) and that
it falls within the accepted min/max range used by issue, throw or exit with a
helpful error if invalid, and then assign body["ttl_s"] = ttl; also ensure any
existing opts.evidence handling remains unchanged.

In `@packages/fedmgr-mcp/src/index.ts`:
- Around line 240-246: The request body uses the wrong field names: change the
body to send ttl_s instead of ttl and trustmark_id instead of id; specifically
update the object built in index.ts (variable body) so it sets body["ttl_s"] =
ttlSeconds (not ttl) and body["trustmark_id"] = String(args["trustmark_id"])
(instead of body["id"]), keep body["image_digest"] =
String(args["image_digest"]) and retain sub: subjectUrl; adjust any code that
reads or constructs body to use these keys.
- Around line 323-331: The raw fetch in toolGetSignedConfig bypasses
timeout/size protections; replace the fetch call with the shared fetchJson
utility (or equivalent) used elsewhere, invoking it for configUrl with timeout
10000 and maxBody 1048576, then validate res.ok as before and extract the body
as text (or ensure fetchJson returns the response text); update references to
res/res.text() accordingly so toolGetSignedConfig still throws the same HTTP
error when !res.ok but benefits from the 10s timeout and 1MB size limit.

In `@packages/kms/src/trust-validator.ts`:
- Around line 284-303: fetchEntityStatement and fetchSubordinateStatement
currently build URLs from authority_hints and call fetchFn without SSRF checks;
add SSRF validation before constructing/fetching those URLs (e.g., validate
scheme is https, resolve and reject private/loopback/internal IPs or enforce an
allowlist) and surface validation failures as errors instead of fetching. Update
the helpers (fetchEntityStatement, fetchSubordinateStatement) to call a shared
validation utility (or implement validateAuthorityHint) that takes the raw
authority_hints value, verifies hostname/IP and scheme safety, and only then
returns/permits the sanitized URL for fetchFn; keep the existing error handling
for non-OK responses.

In `@packages/kms/src/validate-url.ts`:
- Around line 43-64: The IPv6 handling in validate-url.ts is incomplete: update
the hostname checks (where parsed.hostname is used) to normalize and detect
bracketed loopback forms (e.g., "[::1]" and variants) and treat them as loopback
the same as "::1"; add checks for IPv6 Unique Local Addresses (fc00::/7, e.g.,
prefixes "fc" or "fd") and throw UrlSafetyError like other private-range checks;
detect IPv4-mapped IPv6 addresses (patterns like ::ffff:x.y.z.w or
::ffff:xxxx:xxxx) by extracting the embedded IPv4, then run the existing
PRIVATE_IPV4 tests against that IPv4 and block if matched; reference/update the
logic around hostname, the existing PRIVATE_IPV4 loop, and the IPv6 link-local
check to include these additional guards.

In `@packages/ta-server/src/enrollment/index.ts`:
- Around line 332-346: The completion path currently checks enrollment.status
=== "pending" then calls store.upsertSubordinate, store.updateEnrollment, audit,
and signSubordinateStatement, which can race; wrap the critical sequence in an
atomic database operation so only one request can complete: either (a) perform
the enrollment status change with an atomic check-and-update (e.g.,
updateEnrollment(enrollmentId, "completed") or a new method that updates WHERE
status='pending' RETURNING success) and only run
upsertSubordinate/audit/signSubordinateStatement when that update returned
success, or (b) execute the upsertSubordinate + updateEnrollment + audit +
signSubordinateStatement inside a DB transaction with a SELECT ... FOR UPDATE on
the enrollment row to serialize concurrent requests; reference
store.upsertSubordinate, store.updateEnrollment, audit, signSubordinateStatement
and the enrollment.status check to implement the chosen atomic/transactional
approach.

---

Duplicate comments:
In @.github/workflows/containers.yml:
- Around line 177-226: The step-level if expressions use the secrets context
(e.g., "if: ${{ secrets.TMI_PRIV_JWE != '' }}" in the steps named "Setup fedmgr
CLI", "Decrypt TMI key to tmpfs", "Issue trustmark for ${{ matrix.service }}"
and the "Attach trustmark attestation" step), which GitHub Actions forbids; fix
by injecting the secret into an env context (add env: TMI_PRIV_JWE: ${{
secrets.TMI_PRIV_JWE }} at the job level or top of the workflow) and change each
step if to reference env (e.g., if: ${{ env.TMI_PRIV_JWE != '' }}) and for the
last check combine with the output as if: ${{ env.TMI_PRIV_JWE != '' &&
steps.trustmark.outputs.jws != '' }} so the conditions use allowed contexts
instead of secrets directly.

In `@packages/kms/src/trust-validator.ts`:
- Around line 163-173: The loop currently hardcodes "ES256" for importJWK and
jwtVerify; change it to read the algorithm from the decoded JWS header (the
variable that holds the parsed header near where the JWS is decoded) and pass
that algorithm into importJWK(...) and jwtVerify(..., { algorithms: [alg] } or
the appropriate param for the library), so that importJWK and jwtVerify use
header.alg (e.g., const alg = jwsHeader.alg) instead of "ES256" when iterating
over keys; keep the existing try/catch, payload and verified assignment logic
intact.
- Around line 422-433: The subordinate statement verification loop currently
hardcodes "ES256" when calling importJWK and jwtVerify; change it to extract the
algorithm from the subordinate JWT header (like the fix in validateTrustmark):
parse the header of subordJwt to read header.alg (or throw if missing), pass
that alg into importJWK and jwtVerify instead of "ES256", and keep existing
clockTolerance and authorityJwks.keys iteration logic (references: subordJwt,
authorityJwks.keys, importJWK, jwtVerify, clockTolerance, validateTrustmark).

In `@packages/ta-server/src/enrollment/index.ts`:
- Around line 237-266: The JWKS fetch in the enrollment flow (the fetch call
using enrollment.jwksUrl and JWKS_MAX_BYTES) has no timeout and can hang; fix it
by using an AbortController with a short configurable timeout (e.g., a constant
like JWKS_FETCH_TIMEOUT_MS), pass controller.signal to fetch, set a setTimeout
to call controller.abort() and clear that timeout after the fetch completes, and
ensure the catch block properly handles abort/timeout errors (treat as
jwks_fetch_failed) while still including the error text in logs and the 502
response; update the fetch block around enrollment.jwksUrl to use this abort
pattern and clean up the timer on success/failure.

---

Nitpick comments:
In @.github/workflows/containers.yml:
- Line 126: Replace the mutable action reference
"aquasecurity/trivy-action@master" with a pinned release tag or commit SHA (for
example "aquasecurity/trivy-action@v0.18.0" or a specific commit) to harden CI
supply-chain security; update the uses line in the workflow where
"aquasecurity/trivy-action@master" appears and apply the same change to the
other workflow that also references this action to ensure all occurrences are
pinned.

In `@packages/kms/src/index.ts`:
- Around line 79-85: The jwks() method only strips `d` but should remove all
private JWK fields for defense-in-depth; update jwks() (which calls
ensureLoaded() and returns { keys: [...] } using this._pub and kid()) to
destructure and omit the private fields ["d","p","q","dp","dq","qi","oth","k"]
(e.g., const { d, p, q, dp, dq, qi, oth, k, ...pubOnly } = this._pub as JWK & {
d?: string; p?: string; q?: string; dp?: string; dq?: string; qi?: string; oth?:
any; k?: string }) and then return the key built from pubOnly with kid: await
this.kid().
- Around line 105-133: The ensureLoaded method can throw a generic SyntaxError
when parsing malformed JWK files; wrap the JSON.parse of privRaw and pubRaw in a
try-catch inside ensureLoaded (referencing privJwk and pubJwk) and re-throw a
descriptive Error that includes which path (this.cfg.privateJwkPath or
this.cfg.publicJwkPath) failed and the original error message; keep existing EC
P-256 validation and importJWK(privJwk, "ES256") behavior, but ensure the catch
preserves the original error as the cause or message to aid debugging.

In `@packages/ta-server/src/federation/subordinate-statements.ts`:
- Around line 96-100: The isIntermediate function currently force-casts
entry.metadata?.["federation_entity"] to Record<string, unknown> and then
assumes federation_fetch_endpoint is a string; change this to safely narrow the
type: check that entry.metadata exists, that federation_entity is a non-null
object (typeof === "object" && !== null), and that
federation_entity["federation_fetch_endpoint"] is a string with length > 0
before returning true. Update the guard in isIntermediate (and reference
SubordinateEntry and metadata) to avoid the broad type assertion and only treat
the endpoint as a string after the typeof check.

In `@packages/ta-server/src/index.ts`:
- Around line 298-325: The /federation_list handler currently paginates using
the `after` cursor on an unordered `store.listSubordinates` result (variable
`rows`/`ids`), which can cause skips/duplicates if store ordering changes; sort
the entries by a stable key (e.g., `entityId`) after building `rows` and before
deriving `ids`/applying the `after` filter (use a stable comparator such as
`String.prototype.localeCompare`), then apply the existing `after` slicing and
`limit` logic so pagination is deterministic; reference the
`app.get("/federation_list", ...)` handler, `store.listSubordinates`, local vars
`rows` and `ids`, and the `after`/`limit` handling.

In `@packages/tmi-server/src/index.ts`:
- Around line 173-232: The bearer token comparison in the POST /trustmarks/issue
handler uses direct equality on auth and the expected `Bearer
${TMI_ISSUE_TOKEN}` which can leak timing information; change the check to
perform a timing-safe comparison by normalizing both values to Buffers (e.g.,
from req.headers["authorization"] and the expected string) and use
crypto.timingSafeEqual, but also handle differing lengths safely (compare
lengths first and return 401 if lengths differ or pad/compare in constant-time).
Update the logic around the auth variable in the async handler that checks
`TMI_ISSUE_TOKEN` so it uses this timing-safe comparison instead of `auth !==
\`Bearer ${TMI_ISSUE_TOKEN}\``.

In `@packages/tmi-server/src/utils/async-handler.ts`:
- Around line 1-16: The asyncHandler function is duplicated (same implementation
exists in the other package) so extract it into a shared module and update both
packages to import it: create a new export (e.g., asyncHandler) in the shared
package (suggested names: `@letsfederate/shared` or `@letsfederate/kms`), move the
implementation of asyncHandler into that shared module, export it, then replace
the local implementations in both packages to import { asyncHandler } from the
shared package and remove the duplicate function declarations (ensure TypeScript
types are preserved: Request, Response, NextFunction, RequestHandler).

In `@test-results/security-results.json`:
- Line 1: The committed JSON test report contains non-deterministic fields that
cause noisy diffs and leak CI paths; remove or sanitize the volatile keys before
committing: strip top-level startTime and endTime, strip per-test duration under
testResults[*].assertionResults[*].duration, and scrub any absolute runner path
strings (e.g., occurrences matching /workspaces/...) from testResults[*].name or
other fields; implement this either by updating the test reporter to emit a
sanitized summary (or a deterministic minimal report) or by adding a small
post-test step (script) that loads the report, removes/normalizes those keys,
and writes the sanitized JSON instead of the raw runner output.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6fe0d375-e800-4086-8f64-2174955a1241

📥 Commits

Reviewing files that changed from the base of the PR and between ef6978a and 5f3d8b0.

📒 Files selected for processing (20)
  • .github/workflows/containers.yml
  • docs/claude/skills/openid-federation/SKILL.md
  • docs/security-faq.md
  • packages/fedmgr-cli/package.json
  • packages/fedmgr-cli/src/commands/trustmark.ts
  • packages/fedmgr-mcp/src/index.ts
  • packages/kms/src/index.ts
  • packages/kms/src/trust-validator.ts
  • packages/kms/src/validate-url.ts
  • packages/ta-server/src/enrollment/index.ts
  • packages/ta-server/src/federation/subordinate-statements.ts
  • packages/ta-server/src/index.ts
  • packages/ta-server/src/utils/async-handler.ts
  • packages/tmi-server/src/index.ts
  • packages/tmi-server/src/utils/async-handler.ts
  • test-results/e2e-results.json
  • test-results/integration-results.json
  • test-results/junit.xml
  • test-results/security-results.json
  • test-results/unit-results.json
✅ Files skipped from review due to trivial changes (1)
  • test-results/unit-results.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • test-results/junit.xml
  • test-results/integration-results.json
  • test-results/e2e-results.json

Comment on lines +181 to +193
- name: Decrypt TMI key to tmpfs
if: ${{ secrets.TMI_PRIV_JWE != '' }}
env:
TMI_PRIV_JWE: ${{ secrets.TMI_PRIV_JWE }}
TMI_JWE_PASSPHRASE: ${{ secrets.TMI_JWE_PASSPHRASE }}
run: |
mkdir -p /dev/shm/tmi-ci
echo "$TMI_PRIV_JWE" | base64 -d > /dev/shm/tmi-ci/tmi.priv.jwe
# Decrypt JWE → plaintext JWK on tmpfs
echo "$TMI_JWE_PASSPHRASE" | \
step crypto jwe decrypt \
--in /dev/shm/tmi-ci/tmi.priv.jwe \
> /dev/shm/tmi-ci/tmi.priv.jwk
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "step CLI invocations:"
rg -n 'step crypto jwe decrypt|[[:space:]]step[[:space:]]' .github/workflows/containers.yml || true
echo
echo "Potential installation references:"
rg -n 'smallstep|step-cli|install step|apt-get.*step|brew install step' .github/workflows/containers.yml || true

Repository: teamktown/fedmgr

Length of output: 246


🏁 Script executed:

# Check if step installation is implicit or happens in earlier workflow steps
cat -n .github/workflows/containers.yml | sed -n '1,195p'

Repository: teamktown/fedmgr

Length of output: 8779


🏁 Script executed:

# Find package.json files and check for step/smallstep dependencies
find . -name "package.json" -type f | while read f; do
  echo "=== $f ==="
  jq '.dependencies,.devDependencies | to_entries[] | select(.key | contains("step")) | "\(.key): \(.value)"' "$f" 2>/dev/null || true
done

# Also check if step is in the workspace
echo
echo "=== Looking for fedmgr-cli package.json ==="
find . -path "*fedmgr-cli*" -name "package.json" -type f

Repository: teamktown/fedmgr

Length of output: 439


🏁 Script executed:

# Check fedmgr-cli package.json for dependencies
echo "=== Root package.json ==="
cat package.json | jq '.dependencies // {} | keys[]' 2>/dev/null || echo "No dependencies section or invalid JSON"

echo
echo "=== fedmgr-cli package.json ==="
cat packages/fedmgr-cli/package.json | jq '.dependencies // {}, .devDependencies // {} | keys[]' 2>/dev/null || echo "No dependencies"

# Also check if `@smallstep` is anywhere
echo
echo "=== Searching for smallstep references in package files ==="
rg -i '@smallstep|"step"' ./packages/fedmgr-cli/package.json ./package.json || echo "Not found"

# Check if step is preinstalled in ubuntu-latest
echo
echo "=== Checking npm for `@smallstep/step-cli` package ==="
npm search `@smallstep/step-cli` 2>&1 | head -20 || echo "Search not available in sandbox"

Repository: teamktown/fedmgr

Length of output: 989


🌐 Web query:

step CLI preinstalled ubuntu-latest GitHub Actions runner

💡 Result:

step (Smallstep Step CLI / step-cli) is not preinstalled on the GitHub-hosted ubuntu-latest runner (which currently maps to Ubuntu 24.04)—it does not appear in the official Ubuntu 24.04 runner “Installed Software” list. [1], [2]

To use it, install it in your workflow (e.g., download a .deb/tarball from Smallstep’s release/install instructions). [3]

Example (install in a step):

- name: Install step-cli
  run: |
    curl -LO https://dl.smallstep.com/cli/docs-ca-install/latest/step_linux_amd64.tar.gz
    tar -xzf step_linux_amd64.tar.gz
    sudo install -m 0755 step_*/bin/step /usr/local/bin/step
    step version

Sources:
[1] GitHub runner-images repo (“ubuntu-latest” → Ubuntu 24.04)
[2] Ubuntu 24.04 runner image installed software list (no step/step-cli)
[3] Smallstep installation docs (download/install options for Step CLI)


🏁 Script executed:

# Check fedmgr-cli package.json scripts section
echo "=== fedmgr-cli build scripts ==="
cat packages/fedmgr-cli/package.json | jq '.scripts'

# Also check if there are any installation scripts
echo
echo "=== All scripts in fedmgr-cli ==="
cat packages/fedmgr-cli/package.json | jq 'keys'

Repository: teamktown/fedmgr

Length of output: 488


🏁 Script executed:

# Check if there's any mention of step installation elsewhere in the repo
# that might be called during the build process
echo "=== Checking for step installation in any workflow files ==="
rg -n 'step.*install|install.*step' .github/workflows/ || echo "No step installation found"

echo
echo "=== Checking Dockerfiles for step ==="
rg -n 'step' Dockerfile* || echo "No step in Dockerfiles"

echo
echo "=== Checking for setup-step action or similar ==="
rg -n 'step-cli|smallstep|setup.*step' .github/workflows/ || echo "Not found"

Repository: teamktown/fedmgr

Length of output: 270


Install step CLI before using it

Line 191 invokes step crypto jwe decrypt, but this workflow does not explicitly install the step CLI. The tool is neither preinstalled on ubuntu-latest runners nor listed as a dependency in the project. This will cause the step to fail at runtime.

Add an explicit installation step before line 181:

- name: Install step CLI
  if: ${{ secrets.TMI_PRIV_JWE != '' }}
  run: |
    curl -LO https://dl.smallstep.com/cli/docs-ca-install/latest/step_linux_amd64.tar.gz
    tar -xzf step_linux_amd64.tar.gz
    sudo install -m 0755 step_*/bin/step /usr/local/bin/step
🧰 Tools
🪛 actionlint (1.7.11)

[error] 182-182: context "secrets" is not allowed here. available contexts are "env", "github", "inputs", "job", "matrix", "needs", "runner", "steps", "strategy", "vars". see https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability for more details

(expression)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/containers.yml around lines 181 - 193, The workflow step
"Decrypt TMI key to tmpfs" uses the external "step" CLI but never installs it;
add a preceding job step (conditional on the same secret check used by the
decrypt step) that installs the step CLI before invoking "step crypto jwe
decrypt" so the decrypt command can run successfully; name it e.g. "Install step
CLI", use the same if: ${{ secrets.TMI_PRIV_JWE != '' }} guard, and run the
provided curl/tar/install commands to place the "step" binary on PATH.

Comment on lines +206 to +208
TMI_PRIV_JWK_PATH="$TMI_KEY" \
TMI_PUB_JWK_PATH="packages/${{ matrix.service }}/keys/tmi.pub.jwk" \
TMI_ISSUER="$TMI_ISSUER" \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "Workflow-configured TMI_PUB_JWK_PATH:"
rg -n 'TMI_PUB_JWK_PATH=' .github/workflows/containers.yml
echo
echo "Available tmi.pub.jwk files:"
fd 'tmi\.pub\.jwk$' packages

Repository: teamktown/fedmgr

Length of output: 210


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Checking matrix.service values in workflow ==="
rg -A 20 'matrix:' .github/workflows/containers.yml | head -30

echo
echo "=== List all directories in packages ==="
ls -la packages/ 2>/dev/null || echo "packages dir not found"

echo
echo "=== Check packages subdirectories and their structure ==="
find packages -maxdepth 2 -type d 2>/dev/null | sort

echo
echo "=== Search for any .jwk files in entire repo ==="
fd '\.jwk$' . 2>/dev/null || echo "No .jwk files found"

echo
echo "=== Check context around line 211 in containers.yml ==="
sed -n '205,215p' .github/workflows/containers.yml

Repository: teamktown/fedmgr

Length of output: 2773


Use a fixed TMI public-key path for the local TMI process

Line 207 derives TMI_PUB_JWK_PATH from ${{ matrix.service }}, but line 211 always starts packages/tmi-server/dist/index.js. When matrix.service is ta-server, this points at packages/ta-server/keys/tmi.pub.jwk, which is incorrect—the TMI server process requires the key from tmi-server regardless of which service is being deployed.

Proposed fix
-          TMI_PUB_JWK_PATH="packages/${{ matrix.service }}/keys/tmi.pub.jwk" \
+          TMI_PUB_JWK_PATH="packages/tmi-server/keys/tmi.pub.jwk" \
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
TMI_PRIV_JWK_PATH="$TMI_KEY" \
TMI_PUB_JWK_PATH="packages/${{ matrix.service }}/keys/tmi.pub.jwk" \
TMI_ISSUER="$TMI_ISSUER" \
TMI_PRIV_JWK_PATH="$TMI_KEY" \
TMI_PUB_JWK_PATH="packages/tmi-server/keys/tmi.pub.jwk" \
TMI_ISSUER="$TMI_ISSUER" \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/containers.yml around lines 206 - 208, The TMI public JWK
path is incorrectly derived from the matrix.service variable
(TMI_PUB_JWK_PATH="$TMI_KEY" ... TMI_PUB_JWK_PATH="packages/${{ matrix.service
}}/keys/tmi.pub.jwk"), which can point to the wrong package when launching the
TMI process; change the TMI_PUB_JWK_PATH assignment to use the fixed tmi-server
key path (e.g., set TMI_PUB_JWK_PATH to "packages/tmi-server/keys/tmi.pub.jwk")
so the TMI process always loads the correct public key regardless of
matrix.service.

Comment on lines +63 to +67
A JWT issued by a Trustmark Issuer (`iss`) about a subject (`sub`):
- `id` — trustmark type URI
- `iat` / `exp` — issued-at and expiry
- `jku` — JWKS URL of the TMI (for signature verification)
- Optional: `image_digest`, `repo`, `evidence`, `adopted_from_*`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify jku placement terminology consistently.

The doc alternates between claim/header wording and uses “claim in JWS header,” which is internally inconsistent terminology. Pick one consistent statement across sections.

Suggested wording cleanup
-- `jku` — JWKS URL of the TMI (for signature verification)
+- `jku` — JWS header parameter pointing to the TMI JWKS (for signature verification)

-- [ ] Fetch JWKS via `jku` claim in JWS header
+- [ ] Fetch JWKS via `jku` in the JWS protected header

Also applies to: 326-332

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/claude/skills/openid-federation/SKILL.md` around lines 63 - 67, The doc
inconsistently describes where the jku appears (mixing "claim" and "JWS
header"); update all occurrences (e.g., the list under "A JWT issued by a
Trustmark Issuer" that references `jku`, and the other passage flagged around
the later section) to consistently state that `jku` is a JWS/JWT header
parameter (not a claim), and reword any phrase like "claim in JWS header" to a
single consistent phrase such as "JWS header parameter `jku`" throughout the
document.

Comment on lines +189 to +193
```
entity_id = https://victim.example.com ← allowed origin
jwks_url = https://victim.example.com/... ← ✓ same origin
jwks_url = https://attacker.com/... ← ✗ rejected
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced code block.

Line 189 uses an untyped fence, which triggers markdownlint MD040 and weakens rendering/tooling consistency.

Suggested doc fix
-```
+```text
 entity_id = https://victim.example.com      ← allowed origin
 jwks_url  = https://victim.example.com/...  ← ✓ same origin
 jwks_url  = https://attacker.com/...        ← ✗ rejected
</details>

<!-- suggestion_start -->

<details>
<summary>📝 Committable suggestion</summary>

> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```suggestion

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 189-189: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/claude/skills/openid-federation/SKILL.md` around lines 189 - 193, The
fenced code block in SKILL.md (the triple-backtick block that shows
entity_id/jwks_url examples) lacks a language identifier; update that code fence
to include a language tag (e.g., "text") so the block becomes ```text ... ``` to
satisfy markdownlint MD040 and improve rendering/tooling consistency.

Comment on lines +245 to +254
### 4. Fetch Timeouts and Size Caps

```typescript
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10_000); // 10s
try {
const r = await fetch(url, { signal: controller.signal });
const text = await r.text();
if (text.length > 1024 * 1024) throw new Error("Response too large");
return JSON.parse(text);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unify fetch size-cap guidance (currently 1MB vs 64KB).

The timeout/size example allows 1024 * 1024 bytes, while the “Common Mistakes” table says ≤ 64KB. This mismatch can lead to inconsistent implementations.

Suggested doc fix (align to one policy)
-  if (text.length > 1024 * 1024) throw new Error("Response too large");
+  if (text.length > 64 * 1024) throw new Error("Response too large");

Also applies to: 347-349

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/claude/skills/openid-federation/SKILL.md` around lines 245 - 254, The
example in the fetch timeout/size snippet uses a 1MB cap (text.length > 1024 *
1024) which conflicts with the "Common Mistakes" guidance of ≤64KB; change the
size-check in the example (the check that throws after reading r.text(),
currently using 1024 * 1024) to use 64 * 1024 (or otherwise match the chosen
canonical cap) and update any other occurrences of 1024 * 1024 in the docs
(including the other example mentioned) so the AbortController/timeout usage
(controller, timeout, controller.abort, fetch, r.text(), JSON.parse) and the
table are consistent.

Comment on lines +240 to +246
const body: Record<string, unknown> = {
sub: subjectUrl,
ttl: ttlSeconds,
};

if (args["trustmark_id"]) body["id"] = String(args["trustmark_id"]);
if (args["image_digest"]) body["image_digest"] = String(args["image_digest"]);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Request body field names don't match TMI server API.

The TMI server expects ttl_s and trustmark_id, but this tool sends ttl and id. This will cause the server to ignore these values and use defaults instead.

Based on the TMI server implementation in packages/tmi-server/src/index.ts, the expected field names are ttl_s for TTL and trustmark_id for the trustmark identifier.

🐛 Proposed fix
   const body: Record<string, unknown> = {
     sub: subjectUrl,
-    ttl: ttlSeconds,
+    ttl_s: ttlSeconds,
   };

-  if (args["trustmark_id"]) body["id"] = String(args["trustmark_id"]);
+  if (args["trustmark_id"]) body["trustmark_id"] = String(args["trustmark_id"]);
   if (args["image_digest"]) body["image_digest"] = String(args["image_digest"]);
   if (args["repo"]) body["repo"] = String(args["repo"]);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const body: Record<string, unknown> = {
sub: subjectUrl,
ttl: ttlSeconds,
};
if (args["trustmark_id"]) body["id"] = String(args["trustmark_id"]);
if (args["image_digest"]) body["image_digest"] = String(args["image_digest"]);
const body: Record<string, unknown> = {
sub: subjectUrl,
ttl_s: ttlSeconds,
};
if (args["trustmark_id"]) body["trustmark_id"] = String(args["trustmark_id"]);
if (args["image_digest"]) body["image_digest"] = String(args["image_digest"]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-mcp/src/index.ts` around lines 240 - 246, The request body
uses the wrong field names: change the body to send ttl_s instead of ttl and
trustmark_id instead of id; specifically update the object built in index.ts
(variable body) so it sets body["ttl_s"] = ttlSeconds (not ttl) and
body["trustmark_id"] = String(args["trustmark_id"]) (instead of body["id"]),
keep body["image_digest"] = String(args["image_digest"]) and retain sub:
subjectUrl; adjust any code that reads or constructs body to use these keys.

Comment on lines +323 to +331
async function toolGetSignedConfig(args: Record<string, unknown>): Promise<string> {
const entityUrl = requireSafeUrl(String(args["entity_url"]), "entity_url").replace(/\/$/, "");
const configUrl = `${entityUrl}/.well-known/openid-federation`;

const res = await fetch(configUrl);
if (!res.ok) {
throw new Error(`GET ${configUrl} returned HTTP ${res.status} ${res.statusText}`);
}
const jwtText = await res.text();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing timeout and size protection on raw fetch.

This function uses raw fetch instead of fetchJson, bypassing the timeout (10s) and size limit (1MB) protections. Since entityUrl is user-provided, this could allow DoS via slow responses or large payloads.

⏲️ Proposed fix using fetchJson pattern
 async function toolGetSignedConfig(args: Record<string, unknown>): Promise<string> {
   const entityUrl = requireSafeUrl(String(args["entity_url"]), "entity_url").replace(/\/$/, "");
   const configUrl = `${entityUrl}/.well-known/openid-federation`;

-  const res = await fetch(configUrl);
-  if (!res.ok) {
-    throw new Error(`GET ${configUrl} returned HTTP ${res.status} ${res.statusText}`);
+  const controller = new AbortController();
+  const timeout = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
+  let res: Response;
+  try {
+    res = await fetch(configUrl, { signal: controller.signal });
+  } finally {
+    clearTimeout(timeout);
   }
-  const jwtText = await res.text();
+
+  if (!res.ok) {
+    throw new Error(`GET ${configUrl} returned HTTP ${res.status} ${res.statusText}`);
+  }
+
+  const jwtText = await res.text();
+  if (jwtText.length > FETCH_MAX_BYTES) {
+    throw new Error(`Response too large (${jwtText.length} bytes, max ${FETCH_MAX_BYTES})`);
+  }
   const { header, payload } = unsafeDecodeJwtParts(jwtText.trim());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/fedmgr-mcp/src/index.ts` around lines 323 - 331, The raw fetch in
toolGetSignedConfig bypasses timeout/size protections; replace the fetch call
with the shared fetchJson utility (or equivalent) used elsewhere, invoking it
for configUrl with timeout 10000 and maxBody 1048576, then validate res.ok as
before and extract the body as text (or ensure fetchJson returns the response
text); update references to res/res.text() accordingly so toolGetSignedConfig
still throws the same HTTP error when !res.ok but benefits from the 10s timeout
and 1MB size limit.

Comment on lines +284 to +303
const fetchEntityStatement = async (entityUrl: string): Promise<string> => {
const url = `${entityUrl.replace(/\/$/, "")}/.well-known/openid-federation`;
const r = await fetchFn(url);
if (!r.ok) {
throw new Error(`GET ${url} returned HTTP ${r.status}`);
}
return r.text();
};

const fetchSubordinateStatement = async (
taUrl: string,
subEntityId: string
): Promise<string> => {
const url = `${taUrl.replace(/\/$/, "")}/federation_fetch?sub=${encodeURIComponent(subEntityId)}`;
const r = await fetchFn(url);
if (!r.ok) {
throw new Error(`federation_fetch returned HTTP ${r.status} (sub=${subEntityId})`);
}
return r.text();
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Authority hint URLs should be SSRF-validated before fetching.

The fetchEntityStatement helper (lines 284-291) and fetchSubordinateStatement (lines 293-303) construct URLs from authority_hints values without SSRF validation. A malicious entity statement could include an authority_hints entry pointing to internal services.

🛡️ Proposed fix to add SSRF protection
   const fetchEntityStatement = async (entityUrl: string): Promise<string> => {
+    assertSafeUrl(entityUrl, "authority_hint");
     const url = `${entityUrl.replace(/\/$/, "")}/.well-known/openid-federation`;
     const r = await fetchFn(url);
     if (!r.ok) {
       throw new Error(`GET ${url} returned HTTP ${r.status}`);
     }
     return r.text();
   };

   const fetchSubordinateStatement = async (
     taUrl: string,
     subEntityId: string
   ): Promise<string> => {
+    assertSafeUrl(taUrl, "authority_url");
     const url = `${taUrl.replace(/\/$/, "")}/federation_fetch?sub=${encodeURIComponent(subEntityId)}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/trust-validator.ts` around lines 284 - 303,
fetchEntityStatement and fetchSubordinateStatement currently build URLs from
authority_hints and call fetchFn without SSRF checks; add SSRF validation before
constructing/fetching those URLs (e.g., validate scheme is https, resolve and
reject private/loopback/internal IPs or enforce an allowlist) and surface
validation failures as errors instead of fetching. Update the helpers
(fetchEntityStatement, fetchSubordinateStatement) to call a shared validation
utility (or implement validateAuthorityHint) that takes the raw authority_hints
value, verifies hostname/IP and scheme safety, and only then returns/permits the
sanitized URL for fetchFn; keep the existing error handling for non-OK
responses.

Comment on lines +43 to +64
const hostname = parsed.hostname.toLowerCase();
if ((hostname === "localhost" || hostname === "[::1]" || hostname === "::1") && !isDev) {
throw new UrlSafetyError(`${fieldName} targets loopback. Use a routable HTTPS URL.`);
}

const ipv4Match = /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/.exec(hostname);
if (ipv4Match) {
const ip = ipv4Match[1]!;
if (!(ip.startsWith("127.") && isDev)) {
for (const pattern of PRIVATE_IPV4) {
if (pattern.test(ip)) {
throw new UrlSafetyError(
`${fieldName} targets private IP ${ip}. Private ranges are blocked (SSRF protection).`
);
}
}
}
}

if (hostname.startsWith("[fe80") || hostname.startsWith("fe80")) {
throw new UrlSafetyError(`${fieldName} uses IPv6 link-local address (not routable).`);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

IPv6 SSRF protection is incomplete.

The validator has gaps in IPv6 handling:

  1. ::1 is only blocked when it appears as the hostname string, but not in bracketed form for all cases
  2. IPv6 unique local addresses (fc00::/7, fd00::/8) are not blocked
  3. IPv4-mapped IPv6 addresses like ::ffff:10.0.0.1 could bypass the IPv4 private range checks
🛡️ Proposed fix to strengthen IPv6 checks
   if (hostname.startsWith("[fe80") || hostname.startsWith("fe80")) {
     throw new UrlSafetyError(`${fieldName} uses IPv6 link-local address (not routable).`);
   }
+
+  // Block IPv6 unique local addresses (fc00::/7)
+  if (hostname.startsWith("[fc") || hostname.startsWith("[fd") ||
+      hostname.startsWith("fc") || hostname.startsWith("fd")) {
+    throw new UrlSafetyError(`${fieldName} uses IPv6 unique local address (not routable).`);
+  }
+
+  // Block IPv4-mapped IPv6 addresses (::ffff:x.x.x.x)
+  const mappedMatch = /^\[?::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]?$/i.exec(hostname);
+  if (mappedMatch) {
+    const mappedIp = mappedMatch[1]!;
+    for (const pattern of PRIVATE_IPV4) {
+      if (pattern.test(mappedIp)) {
+        throw new UrlSafetyError(
+          `${fieldName} uses IPv4-mapped IPv6 with private IP ${mappedIp}. Blocked (SSRF protection).`
+        );
+      }
+    }
+  }
 
   return parsed;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const hostname = parsed.hostname.toLowerCase();
if ((hostname === "localhost" || hostname === "[::1]" || hostname === "::1") && !isDev) {
throw new UrlSafetyError(`${fieldName} targets loopback. Use a routable HTTPS URL.`);
}
const ipv4Match = /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/.exec(hostname);
if (ipv4Match) {
const ip = ipv4Match[1]!;
if (!(ip.startsWith("127.") && isDev)) {
for (const pattern of PRIVATE_IPV4) {
if (pattern.test(ip)) {
throw new UrlSafetyError(
`${fieldName} targets private IP ${ip}. Private ranges are blocked (SSRF protection).`
);
}
}
}
}
if (hostname.startsWith("[fe80") || hostname.startsWith("fe80")) {
throw new UrlSafetyError(`${fieldName} uses IPv6 link-local address (not routable).`);
}
const hostname = parsed.hostname.toLowerCase();
if ((hostname === "localhost" || hostname === "[::1]" || hostname === "::1") && !isDev) {
throw new UrlSafetyError(`${fieldName} targets loopback. Use a routable HTTPS URL.`);
}
const ipv4Match = /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/.exec(hostname);
if (ipv4Match) {
const ip = ipv4Match[1]!;
if (!(ip.startsWith("127.") && isDev)) {
for (const pattern of PRIVATE_IPV4) {
if (pattern.test(ip)) {
throw new UrlSafetyError(
`${fieldName} targets private IP ${ip}. Private ranges are blocked (SSRF protection).`
);
}
}
}
}
if (hostname.startsWith("[fe80") || hostname.startsWith("fe80")) {
throw new UrlSafetyError(`${fieldName} uses IPv6 link-local address (not routable).`);
}
// Block IPv6 unique local addresses (fc00::/7)
if (hostname.startsWith("[fc") || hostname.startsWith("[fd") ||
hostname.startsWith("fc") || hostname.startsWith("fd")) {
throw new UrlSafetyError(`${fieldName} uses IPv6 unique local address (not routable).`);
}
// Block IPv4-mapped IPv6 addresses (::ffff:x.x.x.x)
const mappedMatch = /^\[?::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]?$/i.exec(hostname);
if (mappedMatch) {
const mappedIp = mappedMatch[1]!;
for (const pattern of PRIVATE_IPV4) {
if (pattern.test(mappedIp)) {
throw new UrlSafetyError(
`${fieldName} uses IPv4-mapped IPv6 with private IP ${mappedIp}. Blocked (SSRF protection).`
);
}
}
}
return parsed;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/kms/src/validate-url.ts` around lines 43 - 64, The IPv6 handling in
validate-url.ts is incomplete: update the hostname checks (where parsed.hostname
is used) to normalize and detect bracketed loopback forms (e.g., "[::1]" and
variants) and treat them as loopback the same as "::1"; add checks for IPv6
Unique Local Addresses (fc00::/7, e.g., prefixes "fc" or "fd") and throw
UrlSafetyError like other private-range checks; detect IPv4-mapped IPv6
addresses (patterns like ::ffff:x.y.z.w or ::ffff:xxxx:xxxx) by extracting the
embedded IPv4, then run the existing PRIVATE_IPV4 tests against that IPv4 and
block if matched; reference/update the logic around hostname, the existing
PRIVATE_IPV4 loop, and the IPv6 link-local check to include these additional
guards.

Comment on lines +332 to +346
// ── Activate the subordinate ──────────────────────────────────────
store.upsertSubordinate({
entityId: enrollment.entityId,
jwksUrl: enrollment.jwksUrl,
jwks: { keys: keys as { keys: never[] }["keys"] },
metadata: null,
status: "active",
isIntermediate: false,
revocationReason: null,
revokedAt: null,
revokedBy: null,
notes: null,
});

store.updateEnrollment(enrollmentId, "completed");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if updateEnrollment uses any atomic row locking
rg -n -A 10 "updateEnrollment" packages/ta-server/src/db/sqlite-store.ts

Repository: teamktown/fedmgr

Length of output: 516


🏁 Script executed:

#!/bin/bash
# Get the complete updateEnrollment implementation
rg -n -A 20 "updateEnrollment" packages/ta-server/src/db/sqlite-store.ts | head -40

Repository: teamktown/fedmgr

Length of output: 708


🏁 Script executed:

#!/bin/bash
# Check for upsertSubordinate implementation
rg -n -A 15 "upsertSubordinate" packages/ta-server/src/db/sqlite-store.ts

Repository: teamktown/fedmgr

Length of output: 819


🏁 Script executed:

#!/bin/bash
# Check the enrollment complete endpoint to see the full flow
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '320,350p'

Repository: teamktown/fedmgr

Length of output: 1367


🏁 Script executed:

#!/bin/bash
# Search for transaction handling in the enrollment flow
rg -n "transaction\|db\.transaction\|BEGIN\|COMMIT" packages/ta-server/src/db/sqlite-store.ts | head -20

Repository: teamktown/fedmgr

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Get the full enrollment completion endpoint to see the context
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '290,360p'

Repository: teamktown/fedmgr

Length of output: 3069


🏁 Script executed:

#!/bin/bash
# Check how enrollment is fetched and what status checks exist
rg -n -B 5 "Activate the subordinate" packages/ta-server/src/enrollment/index.ts

Repository: teamktown/fedmgr

Length of output: 275


🏁 Script executed:

#!/bin/bash
# Look at the full endpoint starting from the route definition
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '260,295p'

Repository: teamktown/fedmgr

Length of output: 1650


🏁 Script executed:

#!/bin/bash
# Search for where enrollment is loaded/fetched in the endpoint
rg -n -B 20 "Verify nonce" packages/ta-server/src/enrollment/index.ts | head -40

Repository: teamktown/fedmgr

Length of output: 912


🏁 Script executed:

#!/bin/bash
# Look at the very beginning of the endpoint handler
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '220,270p'

Repository: teamktown/fedmgr

Length of output: 2345


🏁 Script executed:

#!/bin/bash
# Search for the enrollment fetch and any status check
rg -n "enrollment\s*=" packages/ta-server/src/enrollment/index.ts | head -10

Repository: teamktown/fedmgr

Length of output: 185


🏁 Script executed:

#!/bin/bash
# Get the full context from enrollment fetch to proof verification
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '196,235p'

Repository: teamktown/fedmgr

Length of output: 1580


🏁 Script executed:

#!/bin/bash
# Check what happens with the signed statement after it's generated
cat -n packages/ta-server/src/enrollment/index.ts | sed -n '351,380p'

Repository: teamktown/fedmgr

Length of output: 1465


Concurrent completion requests can create duplicate audit entries and signed statements.

There's a time gap between the status check (line 216) and the actual update operations. Two concurrent requests can both pass the enrollment.status === "pending" check, then execute upsertSubordinateupdateEnrollmentauditsignSubordinateStatement independently. While upsertSubordinate is idempotent (via ON CONFLICT), updateEnrollment and audit are not — the latter will create duplicate audit entries and duplicate signed statements. Wrap these operations in a database transaction or perform an atomic check-and-update on the enrollment status to ensure only one request successfully completes the enrollment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ta-server/src/enrollment/index.ts` around lines 332 - 346, The
completion path currently checks enrollment.status === "pending" then calls
store.upsertSubordinate, store.updateEnrollment, audit, and
signSubordinateStatement, which can race; wrap the critical sequence in an
atomic database operation so only one request can complete: either (a) perform
the enrollment status change with an atomic check-and-update (e.g.,
updateEnrollment(enrollmentId, "completed") or a new method that updates WHERE
status='pending' RETURNING success) and only run
upsertSubordinate/audit/signSubordinateStatement when that update returned
success, or (b) execute the upsertSubordinate + updateEnrollment + audit +
signSubordinateStatement inside a DB transaction with a SELECT ... FOR UPDATE on
the enrollment row to serialize concurrent requests; reference
store.upsertSubordinate, store.updateEnrollment, audit, signSubordinateStatement
and the enrollment.status check to implement the chosen atomic/transactional
approach.

@aeoess
Copy link
Copy Markdown

aeoess commented Mar 18, 2026

This is a natural complement to what we're building with the Agent Passport System (npm: agent-passport-system, 534 tests, 61 MCP tools).

Our MCP server handles agent-to-agent identity: Ed25519 passports, scoped delegation chains with cascade revocation, and a 3-signature policy chain (intent → evaluation → receipt). But we don't currently address the trust anchor problem — how does a verifier know which issuers to trust?

OpenID Federation 1.0 solves exactly that layer. Your trust anchors and trustmark issuance provide the institutional trust that our cryptographic identity sits on top of. The composition would be:

  1. OpenID Federation establishes which organizations and trust anchors are legitimate
  2. Agent Passport provides the agent-level identity, delegation, and governance within that trust framework
  3. MCP is the transport for both

Concrete integration point: when our MCP server verifies a delegation chain, it currently checks Ed25519 signatures but has no way to verify that the root principal is a recognized entity. An OpenID Federation trust anchor could validate the root — the human or organization at the top of the delegation chain is a verified member of a federation.

We also just published an RFC for a cross-engine execution envelope that could carry federation metadata alongside governance evidence.

Would be interested in exploring what the integration surface looks like. Does fedmgr expose trust anchor verification as an API that other MCP servers could call?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants