From 1697c3bb47783cd05031ea1cb082dfa54363ae0e Mon Sep 17 00:00:00 2001
From: James Ross
Date: Tue, 3 Mar 2026 18:16:27 -0800
Subject: [PATCH 01/41] docs: add forensic architectural audit report

Zero-knowledge code extraction, critical assessment, roadmap reconciliation, and prescriptive blueprint for @git-stunts/git-cas. Covers all 31 source files, 61 test files, and 12 CLI files.
---
 CODE-EVAL.md | 605 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 605 insertions(+)
 create mode 100644 CODE-EVAL.md

diff --git a/CODE-EVAL.md b/CODE-EVAL.md
new file mode 100644
index 0000000..3ff5cce
--- /dev/null
+++ b/CODE-EVAL.md
@@ -0,0 +1,605 @@
# Forensic Architectural Audit: `@git-stunts/git-cas`

**Audit Date:** 2026-03-03
**Repository State:** `0f7f8e658e6cd094176541ac68d33b2a6ec75a91` (HEAD, `main`)
**Auditor:** Claude Opus 4.6, operating under zero-knowledge forensic protocol
**Version Under Audit:** 5.2.4

---

## Activity Log — Discovery Narrative

The exploration began at the repository root with a simultaneous five-pronged dive: core domain services, infrastructure adapters, ports/codecs/chunkers, test structure, and type definitions. The first thing that jumped out — before reading a single line of code — was the file tree. Thirty-one source files, twelve bin files, sixty-one test files. A nearly 2:1 test-to-source file ratio. That alone telegraphs intent: someone cares about correctness here.

The ports directory was my Rosetta Stone. Six abstract base classes — `CryptoPort`, `CodecPort`, `GitPersistencePort`, `GitRefPort`, `ObservabilityPort`, `ChunkingPort` — each throwing `'Not implemented'`. Textbook hexagonal architecture. I already knew this was a ports-and-adapters system before reading a single service file.

`CasService.js` at 911 lines is the gravitational center. It imports no infrastructure directly — only ports. Good. 
`KeyResolver.js` (220 lines) handles all cryptographic key orchestration, recently extracted from CasService (the M15 Prism task card confirmed this). `VaultService.js` (467 lines) operates on a separate Git ref (`refs/cas/vault`) with compare-and-swap concurrency control. + +The three crypto adapters (`NodeCryptoAdapter`, `WebCryptoAdapter`, `BunCryptoAdapter`) are where I started changing my initial opinions. I expected copy-paste sloppiness — instead I found runtime-specific optimizations (Bun's native `CryptoHasher`, Web Crypto's `subtle` API) all converging on identical cryptographic parameters: AES-256-GCM, 12-byte nonce, 16-byte tag, SHA-256 content hashing. But the behavioral discrepancies between adapters (see Phase 2) tell a more nuanced story. + +The CDC chunker (`CdcChunker.js`) surprised me. A hand-rolled buzhash rolling hash with a 64-byte sliding window, xorshift64-seeded lookup table, and three-phase processing pipeline (fill window, feed pre-minimum, scan boundary). This is not commodity code — it's a bespoke content-defined chunking engine. + +The test suite confirmed the architecture: 833+ unit tests, crypto is never mocked (always real adapters), persistence is always mocked (in-memory maps), integration tests gate on Docker (`GIT_STUNTS_DOCKER=1`). The fuzz testing coverage is noteworthy — 50-iteration fuzz rounds for crypto, chunking, and store/restore. + +The CLI (`bin/git-cas.js`, 657 lines) implements a full TEA (The Elm Architecture) interactive dashboard. That's architecturally ambitious for a storage utility. + +My opinion shifted most dramatically on the vault system. I initially expected a simple key-value store backed by a file. Instead, it's a full commit chain on `refs/cas/vault` with optimistic concurrency control, exponential backoff retries, percent-encoded slug names, and atomic compare-and-swap ref updates. This is distributed-systems thinking applied to a local Git repo. 
+ +--- + +## Phase 1: Zero-Knowledge Code Extraction + +### Deduced Value Proposition + +This system is a **content-addressed storage engine that uses Git's object database as its persistence layer**, with optional AES-256-GCM encryption, gzip compression, content-defined chunking, and a vault-based indexing system backed by Git refs. + +The core problem it solves: **storing, encrypting, versioning, and retrieving binary blobs entirely within Git's native object model** — no external servers, no sidecar databases, no LFS endpoints. Everything lives in `.git/objects` and is transportable via standard Git push/pull/clone. + +### Comprehensive Feature Set (Implemented) + +1. **Store**: Chunk a byte stream (fixed-size or CDC), optionally compress (gzip), optionally encrypt (AES-256-GCM), write chunks as Git blobs, produce a manifest. +2. **Restore**: Read chunks from Git blobs, verify SHA-256 integrity, decrypt, decompress, reassemble. +3. **Streaming Restore**: `restoreStream()` yields chunks as an async iterable — O(chunk_size) memory for unencrypted data. +4. **Content-Defined Chunking (CDC)**: Buzhash rolling hash with configurable min/max/target sizes. Deduplication-friendly. +5. **Fixed-Size Chunking**: Default 256 KiB, configurable. +6. **Merkle Tree Manifests**: Automatic manifest splitting when chunk count exceeds threshold (default 1000). Sub-manifest references with startIndex/chunkCount. +7. **Envelope Encryption**: DEK/KEK model. Random 32-byte DEK encrypts data; each recipient's KEK wraps the DEK independently. +8. **Multi-Recipient Management**: Add/remove recipients without re-encrypting data. +9. **Key Rotation**: Re-wrap DEK with new KEK. No data re-encryption — O(1) key rotation. +10. **Passphrase-Based Encryption**: PBKDF2 or scrypt KDF with configurable parameters. +11. **Vault System**: Git-ref-backed (`refs/cas/vault`) content registry with CAS (compare-and-swap) concurrency control. +12. 
**Vault Passphrase Rotation**: Re-wrap all envelope-encrypted vault entries with a new passphrase-derived KEK. +13. **Integrity Verification**: Per-chunk SHA-256 + GCM auth tag for encrypted data. +14. **Orphan Detection**: `findOrphanedChunks()` — reference-counting analysis across vault entries. +15. **Codec Pluggability**: JSON (human-readable) or CBOR (compact binary) manifests. +16. **Multi-Runtime Support**: Node.js 22, Bun, Deno — with runtime-specific crypto adapters. +17. **Observability**: Structured metrics (`chunk:stored`, `file:stored`, `integrity:pass/fail`), log levels, span tracing. +18. **CLI**: 18 commands including store, restore, verify, inspect, rotate, vault management, and an interactive TEA dashboard. +19. **Parallel I/O**: Semaphore-bounded concurrent blob writes (store) and read-ahead window (restore). +20. **File I/O Helpers**: `storeFile()` / `restoreFile()` for file-to-file convenience. + +### API Surface & Boundary + +**Public entrypoints** (as defined by package.json/jsr.json exports): + +| Entrypoint | Module | Primary Export | +|---|---|---| +| `.` (root) | `index.js` | `ContentAddressableStore` facade class | +| `./service` | `src/domain/services/CasService.js` | `CasService` (direct domain access) | +| `./schema` | `src/domain/schemas/ManifestSchema.js` | Zod schemas (ManifestSchema, ChunkSchema, etc.) 
| + +**Facade API** (`ContentAddressableStore`): + +| Method | Return | +|---|---| +| `store(options)` | `Promise` | +| `restore(options)` | `Promise<{ buffer, bytesWritten }>` | +| `restoreStream(options)` | `AsyncIterable` | +| `createTree(options)` | `Promise` (tree OID) | +| `readManifest(options)` | `Promise` | +| `verifyIntegrity(options)` | `Promise` | +| `deleteAsset(options)` | `Promise<{ slug, chunksOrphaned }>` | +| `findOrphanedChunks(options)` | `Promise<{ referenced, total }>` | +| `rotateKey(options)` | `Promise` | +| `addRecipient(options)` | `Promise` | +| `removeRecipient(options)` | `Promise` | +| `listRecipients(manifest)` | `string[]` | +| `deriveKey(options)` | `Promise<{ key, salt, params }>` | +| `getVaultService()` | `VaultService` | +| `rotateVaultPassphrase(options)` | `Promise<{ commitOid, rotatedSlugs, skippedSlugs }>` | + +**External system interface:** +- **Ingress**: File paths, byte streams (`AsyncIterable`), encryption keys (32-byte `Buffer`), passphrases (strings), vault slugs (strings). +- **Egress**: Git blob/tree OIDs (40-char hex strings), `Manifest` value objects, byte buffers, vault entries. +- **Infrastructure boundary**: All Git operations flow through `@git-stunts/plumbing` → `git` CLI subprocess. 

### Internal Architecture & Components

```
┌─────────────────────────────────────────────────────────┐
│ ContentAddressableStore (index.js) — Facade             │
│ Wires ports, exposes unified API                        │
└──────────────────────┬──────────────────────────────────┘
                       │
         ┌─────────────┼──────────────┐
         │             │              │
┌────────▼──────┐ ┌────▼─────┐  ┌─────▼──────────────────┐
│ CasService    │ │ Vault    │  │ rotateVaultPassphrase  │
│ (911 lines)   │ │ Service  │  │ (standalone function)  │
│               │ │ (467     │  └────────────────────────┘
│ ┌───────────┐ │ │  lines)  │
│ │KeyResolver│ │ └──────────┘
│ │(220 lines)│ │
│ └───────────┘ │
└───────┬───────┘
        │ depends on (ports only)
  ┌─────┼────────┬────────────┬─────────────┐
  │     │        │            │             │
┌─▼─┐ ┌─▼──┐   ┌─▼──┐     ┌───▼────┐    ┌───▼─────┐
│Git│ │Git │   │Cry-│     │Observ- │    │Chunking │
│Per│ │Ref │   │pto │     │ability │    │Port     │
│sis│ │Port│   │Port│     │Port    │    │         │
│ten│ │    │   │    │     │        │    │         │
│ce │ │    │   │    │     │        │    │         │
└─┬─┘ └─┬──┘   └─┬──┘     └───┬────┘    └───┬─────┘
  │     │        │            │             │
  ▼     ▼        ▼            ▼             ▼
┌───────────────────────────────────────────────┐
│ Infrastructure Adapters                       │
│                                               │
│ GitPersistenceAdapter   NodeCryptoAdapter     │
│ GitRefAdapter           WebCryptoAdapter      │
│ FileIOHelper            BunCryptoAdapter      │
│ EventEmitterObserver                          │
│ JsonCodec / CborCodec   SilentObserver        │
│ FixedChunker            StatsCollector        │
│ CdcChunker                                    │
└───────────────────────────────────────────────┘
```

The dependency direction is strictly inward: domain depends on ports (interfaces), infrastructure depends on ports (implements). The facade wires them together. No domain module imports any infrastructure module.

### Mechanics & Internals

#### Algorithms

**Content-Defined Chunking (Buzhash):**
- Rolling hash over a 64-byte sliding window.
- Lookup table: 256-entry `Uint32Array` generated via xorshift64 PRNG seeded with `0x6a09e667f3bcc908` (the fractional bits of √2, the same value SHA-512 uses as its first IV constant — a nice touch).
- Hash update: `hash = (rotl32(hash, 1) ^ table[outgoing] ^ table[incoming]) >>> 0`.
+- Boundary detection: `(hash & mask) === 0` where `mask = (1 << floor(log2(targetChunkSize))) - 1`. +- Three-phase pipeline: fill window (first 64 bytes), feed pre-minimum (accumulate until min chunk size), scan boundary (check on each byte until boundary or max). +- **Complexity**: O(n) where n = input bytes. Each byte requires one table lookup, one XOR, one rotate. The mask test is O(1). + +**Encryption:** +- AES-256-GCM with 12-byte random nonce and 16-byte authentication tag. +- Streaming encryption wraps the chunk pipeline (encrypt-then-chunk: the ciphertext is chunked, not the plaintext). +- DEK wrapping uses the same AES-256-GCM as data encryption — the DEK is treated as a 32-byte plaintext. + +**Key Derivation:** +- PBKDF2-HMAC-SHA-512 (default 100,000 iterations) or scrypt (default N=16384, r=8, p=1). +- Salt: 32 bytes random, stored in manifest. + +**Integrity:** +- SHA-256 digest per chunk (computed at store time, verified at restore time). +- GCM authentication tag for encrypted data (verified during decryption). +- Manifests validated by Zod schemas at construction time. + +#### Storage & Data Structures + +**Git Object Database:** +- Chunks stored as Git blobs via `git hash-object -w --stdin`. +- Manifests stored as Git blobs (JSON or CBOR encoded). +- Trees constructed via `git mktree` with mode `100644 blob` entries. +- Vault state stored as a commit chain on `refs/cas/vault`: + - Each commit points to a tree containing: `.vault.json` metadata blob + one `040000 tree` entry per vault slug. + +**In-Memory:** +- `Manifest` and `Chunk` are frozen value objects (immutable after construction). +- `Semaphore` uses a FIFO queue of promise resolvers. +- `StatsCollector` accumulates metrics in private fields. +- CDC chunker allocates a `Buffer.allocUnsafe(maxChunkSize)` working buffer per `chunk()` invocation. + +#### Memory Management + +**Store path:** +- Semaphore-bounded: at most `concurrency` chunk buffers in flight simultaneously. 
+- CDC chunker holds one `maxChunkSize` working buffer (~1 MiB default) plus the 64-byte sliding window. +- After chunking, the working buffer is copied via `Buffer.from(subarray)` — no aliasing. + +**Restore path (streaming, unencrypted):** +- Read-ahead window: up to `concurrency` chunk-sized buffers in memory. +- Chunks are yielded and become eligible for GC immediately after consumption. + +**Restore path (buffered, encrypted/compressed):** +- **All chunks are concatenated into a single buffer before decryption.** This is the documented memory amplification concern (Roadmap C1). A 1 GB encrypted file requires ~1 GB in memory for decryption, plus the decrypted result. + +**Web Crypto streaming encryption:** +- The `createEncryptionStream` on `WebCryptoAdapter` **buffers the entire stream** internally because Web Crypto's AES-GCM is a one-shot API. This silently converts O(chunk_size) memory to O(total_file_size) memory on Deno (Roadmap C4). + +#### Performance Characteristics + +| Operation | Time Complexity | Space Complexity | Blocking? 
|
|---|---|---|---|
| Store (fixed chunking) | O(n) | O(concurrency × chunkSize) | Git subprocess I/O |
| Store (CDC chunking) | O(n) | O(maxChunkSize + concurrency × chunkSize) | Git subprocess I/O |
| Restore (streaming, plain) | O(n) | O(concurrency × chunkSize) | Git subprocess I/O |
| Restore (buffered, encrypted) | O(n) | **O(n)** — full file in memory | Git subprocess I/O + decrypt |
| createTree (v1, < threshold) | O(k) where k = chunks | O(k) for tree entries | Git subprocess |
| createTree (v2, Merkle) | O(k) | O(k / threshold) sub-manifests | Git subprocess |
| readManifest (v2) | O(k) | O(sub-manifest count) reads | Git subprocess × sub-manifests |
| Key rotation | O(1) | O(1) — only re-wraps DEK | Constant |
| Vault CAS update | O(entries) | O(entries) for tree rebuild | Git subprocess |
| CDC boundary scan | O(1) per byte (table lookup + XOR) | O(1) | CPU-bound |

**Critical bottleneck:** Git subprocess spawning. Every `writeBlob`, `readBlob`, `writeTree`, `readTree` operation spawns a `git` child process. For a file with 1000 chunks at concurrency 4, that's ~1000 `git hash-object` invocations + ~1000 `git cat-file` invocations on restore. The `@git-stunts/plumbing` layer mitigates this somewhat but cannot eliminate the per-operation process overhead.

---

## Phase 2: The Critical Assessment

### Use Cases & Fitness

**Optimized for:**
- Single-file binary asset storage (firmware images, data bundles, encrypted archives) in the 1 KB to ~500 MB range.
- Git monorepos where binary assets must travel with the code.
- Air-gapped or offline environments where external services are unavailable.
- Multi-recipient access control without re-encrypting data.

**Where it will break:**
- **Files > 1 GB encrypted**: The `_restoreBuffered` path requires the entire file in memory for decryption. A 4 GB file on a machine with 8 GB RAM will OOM.
- **High-frequency writes**: Each chunk write spawns a Git subprocess. 
With ~5ms of process-spawn overhead per write, a single-threaded caller tops out at roughly 200 chunks/second, nowhere near a 1000-writes-per-second workload.
- **Large repositories (>10 GB)**: Git's own performance degrades with ODB size. `git gc` becomes slow, pack files grow.
- **Web Crypto runtime (Deno) with large files**: The streaming encryption adapter silently buffers the entire file due to Web Crypto API limitations.
- **Concurrent vault mutations from multiple processes**: The CAS retry mechanism (3 attempts, 50-200ms backoff) handles light contention but will fail under sustained concurrent writes.

### Design Trade-offs

**1. Git subprocess for every blob operation vs. libgit2/in-process Git**

- **Evidence:**

  - **Claim:** Every blob read/write spawns a `git` child process via `@git-stunts/plumbing`.
  - **Primary Evidence:** `src/infrastructure/adapters/GitPersistenceAdapter.js:11-17` (`writeBlob` calls `plumbing.execute`)
  - **Supporting Context:** `plumbing.execute()` and `plumbing.executeStream()` spawn `git` subprocesses.
  - **Discovery Path:** `index.js` → `GitPersistenceAdapter` → `plumbing.execute` → `git hash-object`
  - **Cryptographic Proof:** `git hash-object src/infrastructure/adapters/GitPersistenceAdapter.js` = `797be53113174ff8e86104fa97afda0748dd3fce`

- **Systemic effect:** Process spawn overhead (~2-10ms per invocation) dominates I/O for small chunks. A 100 MB file with 256 KiB chunks = ~400 subprocess invocations for store + ~400 for restore. The `Policy.timeout(30_000)` wrapper adds resilience but not performance.
- **Trade-off rationale:** Using the `git` CLI ensures correctness across all Git configurations (bare repos, worktrees, custom object stores, alternates) without reimplementing Git's object database. It also means zero native dependencies — critical for multi-runtime support.

**2. Encrypt-then-chunk vs. 
chunk-then-encrypt** + +- **Evidence:** + + - **Claim:** Encryption wraps the source stream before chunking, meaning ciphertext is what gets chunked — not plaintext. + - **Primary Evidence:** `src/domain/services/CasService.js:store()` — encryption stream wraps source before passing to `_chunkAndStore`. + - **Supporting Context:** The encryption stream is created first (`crypto.createEncryptionStream(key)`), then the encrypted output is piped through the chunker. + - **Cryptographic Proof:** `git hash-object src/domain/services/CasService.js` = `9d1370ca88697992847c131bba7d74f726a2cd8c` + +- **Systemic effect:** CDC deduplication is **completely defeated** for encrypted data because AES-GCM ciphertext is pseudorandom — identical plaintext produces different ciphertext (random nonce). This means encrypted CDC-chunked files get zero deduplication benefit. The chunking metadata is still recorded in the manifest, but it serves no dedup purpose. +- **Trade-off rationale:** The alternative (chunk-then-encrypt) would require per-chunk nonces and auth tags, significantly complicating the manifest schema and increasing metadata overhead. The current design keeps crypto simple (one nonce, one tag, one DEK for the whole file). + +**3. Full-buffer decrypt vs. streaming decrypt** + +- **Evidence:** + + - **Claim:** Encrypted/compressed restores buffer the entire file before decryption. + - **Primary Evidence:** `src/domain/services/CasService.js:_restoreBuffered()` — concatenates all chunk buffers then calls `decrypt()`. + - **Cryptographic Proof:** `git hash-object src/domain/services/CasService.js` = `9d1370ca88697992847c131bba7d74f726a2cd8c` + +- **Systemic effect:** Memory usage is O(file_size) for encrypted restores. The `restoreStream()` API exists and is O(chunk_size) for plaintext, but encrypted paths silently degrade to O(n). +- **Trade-off rationale:** AES-256-GCM produces a single authentication tag for the entire ciphertext. 
Verifying the tag requires processing all ciphertext. Streaming authenticated decryption would require a different AEAD construction (e.g., STREAM from libsodium, or chunked AES-GCM with per-chunk tags). + +**4. Vault as Git commit chain vs. flat file** + +- **Evidence:** + + - **Claim:** The vault uses Git commits on `refs/cas/vault` with CAS (compare-and-swap) updates. + - **Primary Evidence:** `src/domain/services/VaultService.js:VAULT_REF`, `#casUpdateRef`, `#retryMutation` + - **Cryptographic Proof:** `git hash-object src/domain/services/VaultService.js` = `d5a1ac2b1a771e9a3a7ac1652c6f40e0f0cbffaa` + +- **Systemic effect:** Every vault mutation (add, remove, init) creates a new Git commit. This provides full audit history but grows the commit graph linearly. Over thousands of vault mutations, `git log refs/cas/vault` becomes slow. The CAS semantics handle concurrent writes gracefully but are limited to 3 retries with short backoff — insufficient for high-contention scenarios. +- **Trade-off rationale:** Using Git's native commit/ref mechanism means the vault is automatically included in `git push/pull/clone`. No separate sync mechanism needed. The audit trail is a natural consequence. + +**5. Semaphore-based concurrency vs. worker pool** + +- **Evidence:** + + - **Claim:** Parallel blob I/O uses a counting semaphore, not a proper worker/thread pool. + - **Primary Evidence:** `src/domain/services/Semaphore.js` — FIFO counting semaphore; `CasService.js:_chunkAndStore` — semaphore-guarded fan-out. + - **Cryptographic Proof:** `git hash-object src/domain/services/Semaphore.js` = `507ed14668364491797a68ed906b346b01ddd488` + +- **Systemic effect:** All concurrency is async I/O multiplexing on the event loop. There's no CPU parallelism for hashing or encryption. SHA-256 and AES-GCM run on the main thread (in Node.js). For CPU-bound workloads this is a bottleneck, but since the dominant cost is Git subprocess I/O, async concurrency is the correct choice. 
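
The FIFO counting semaphore referenced in trade-off 5 is compact enough to sketch in full. This is a reconstruction of the pattern as described (a FIFO queue of promise resolvers), not the library's `Semaphore.js` verbatim:

```javascript
// Counting semaphore: acquire() resolves immediately while permits
// remain, otherwise parks the caller's resolver in a FIFO queue.
class Semaphore {
  #available;
  #waiters = [];

  constructor(permits) {
    this.#available = permits;
  }

  async acquire() {
    if (this.#available > 0) {
      this.#available -= 1;
      return;
    }
    await new Promise((resolve) => this.#waiters.push(resolve));
  }

  release() {
    const next = this.#waiters.shift(); // FIFO: oldest waiter first
    if (next) next();                   // hand the permit straight over
    else this.#available += 1;
  }
}
```

Bounding fan-out this way caps in-flight chunk buffers at `concurrency`, which is exactly the O(concurrency × chunkSize) store-path memory bound claimed in Phase 1.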
+ +### Flaws & Limitations + +#### Flaw 1: Crypto Adapter Behavioral Inconsistencies + +- **Evidence:** + + - **Claim:** The three crypto adapters have inconsistent validation and error-handling behavior. + - **Primary Evidence:** `NodeCryptoAdapter.js:26-36`, `BunCryptoAdapter.js:25-44`, `WebCryptoAdapter.js:28-44` + - **Supporting Context:** + - `NodeCryptoAdapter.encryptBuffer` is synchronous; `BunCryptoAdapter.encryptBuffer` and `WebCryptoAdapter.encryptBuffer` are async. + - `BunCryptoAdapter.decryptBuffer` calls `_validateKey(key)`; `NodeCryptoAdapter.decryptBuffer` and `WebCryptoAdapter.decryptBuffer` do not. + - `NodeCryptoAdapter.createEncryptionStream` has no premature-finalize guard; Bun and Web adapters throw `CasError('STREAM_NOT_CONSUMED')`. + - **Cryptographic Proof:** + - `git hash-object src/infrastructure/adapters/NodeCryptoAdapter.js` = `f89898c5ec1892dd965e6ed69ac5373883ed1650` + - `git hash-object src/infrastructure/adapters/BunCryptoAdapter.js` = `1d8b8ce4def9cd8be885e5065041dbe0a0b6d0ac` + - `git hash-object src/infrastructure/adapters/WebCryptoAdapter.js` = `5a70733d945387a8a8101013157811aa654958c6` + +- **Impact:** Liskov Substitution violation. Code that works correctly on Bun (where `decryptBuffer` validates the key type early) may fail with a cryptic `node:crypto` error on Node.js (where the key is passed directly to `createDecipheriv`). The missing premature-finalize guard on Node means a bug in stream consumption produces undefined behavior on Node but a clear error on Bun/Deno. +- **Severity:** Medium. The callers generally `await` all results (which papers over sync-vs-async), and CasService always calls `_validateKey` before encrypting. But the asymmetry is a maintenance hazard. + +#### Flaw 2: Memory Amplification on Encrypted Restore + +- **Evidence:** + + - **Claim:** Encrypted restores load the entire file into memory. 
+ - **Primary Evidence:** `src/domain/services/CasService.js:_restoreBuffered()` — `Buffer.concat(chunkBuffers)` before `this.decrypt()`. + - **Cryptographic Proof:** `git hash-object src/domain/services/CasService.js` = `9d1370ca88697992847c131bba7d74f726a2cd8c` + +- **Impact:** Restoring a 1 GB encrypted file requires ~2 GB of heap (ciphertext buffer + plaintext output). No guard, no warning, no configurable limit. +- **Severity:** High for large files. The roadmap acknowledges this as concern C1 and estimates ~20 LoC to add a `maxRestoreBufferSize` guard. + +#### Flaw 3: Web Crypto Stream Buffering + +- **Evidence:** + + - **Claim:** `WebCryptoAdapter.createEncryptionStream` silently buffers the entire stream. + - **Primary Evidence:** `src/infrastructure/adapters/WebCryptoAdapter.js:64-84` — `const chunks = []; for await (const chunk of source) { chunks.push(chunk); } const buffer = Buffer.concat(chunks);` + - **Cryptographic Proof:** `git hash-object src/infrastructure/adapters/WebCryptoAdapter.js` = `5a70733d945387a8a8101013157811aa654958c6` + +- **Impact:** On Deno, `createEncryptionStream` provides a streaming API but has O(n) memory behavior. Users expect O(chunk_size) memory from a streaming API. This is deceptive. +- **Severity:** Medium. Deno is a secondary runtime, and the roadmap flags this as concern C4. + +#### Flaw 4: FixedChunker Quadratic Buffer Allocation + +- **Evidence:** + + - **Claim:** `FixedChunker.chunk()` uses `Buffer.concat()` in a loop, creating a new buffer allocation per input chunk. + - **Primary Evidence:** `src/infrastructure/chunkers/FixedChunker.js:20` — `buffer = Buffer.concat([buffer, data]);` + - **Cryptographic Proof:** `git hash-object src/infrastructure/chunkers/FixedChunker.js` = `1477e185f16730ad13028454cecb1fb2ac785889` + +- **Impact:** For a source that yields many small buffers (e.g., 4 KB network reads), `Buffer.concat([buffer, data])` is called for each read. 
Each call re-copies everything accumulated since the last emitted chunk, so every input byte is copied on the order of chunkSize/readSize times (roughly 64× amplification for 4 KB reads against the 256 KiB default). In contrast, `CdcChunker` uses a pre-allocated working buffer with zero intermediate copies.
- **Severity:** Low in practice (the source is typically a file stream with 64 KiB reads), but architecturally inconsistent with the CDC chunker's careful buffer management.

#### Flaw 5: CDC Deduplication Defeated by Encrypt-Then-Chunk

- **Evidence:**

  - **Claim:** Encryption is applied before chunking, destroying content-addressable deduplication.
  - **Primary Evidence:** `src/domain/services/CasService.js:store()` — encryption wraps source before `_chunkAndStore`.
  - **Cryptographic Proof:** `git hash-object src/domain/services/CasService.js` = `9d1370ca88697992847c131bba7d74f726a2cd8c`

- **Impact:** The primary value proposition of CDC is sub-file deduplication. For encrypted files, CDC provides zero dedup benefit over fixed chunking. Users who enable both encryption and CDC chunking get CDC's overhead (rolling hash computation) without its benefit.
- **Severity:** Medium. This is an inherent limitation of the encrypt-then-chunk design. Fixing it would require per-chunk encryption (chunk-then-encrypt), which is a significant architectural change.

#### Flaw 6: No Upper Bound on Chunk Size

- **Evidence:**

  - **Claim:** `FixedChunker` accepts any positive `chunkSize` value without an upper bound.
  - **Primary Evidence:** `src/infrastructure/chunkers/FixedChunker.js:9` — no validation beyond ChunkingPort base.
  - **Supporting Context:** `CdcChunker` has configurable `maxChunkSize` (default 1 MiB) but no hard upper limit either. `resolveChunker` validates `chunkSize > 0` for fixed but has no ceiling. 
+ - **Cryptographic Proof:** `git hash-object src/infrastructure/chunkers/FixedChunker.js` = `1477e185f16730ad13028454cecb1fb2ac785889` + +- **Impact:** A user could set `chunkSize: 10 * 1024 * 1024 * 1024` (10 GB) and the system would attempt to buffer a 10 GB chunk. The roadmap flags this as concern C3. +- **Severity:** Low (user misconfiguration, not a bug in normal usage). + +#### Flaw 7: `deleteAsset` Is Misleadingly Named + +- **Evidence:** + + - **Claim:** `deleteAsset()` does not delete anything — it only reads metadata. + - **Primary Evidence:** `src/domain/services/CasService.js:deleteAsset()` — reads manifest and returns `{ slug, chunksOrphaned }`. + - **Cryptographic Proof:** `git hash-object src/domain/services/CasService.js` = `9d1370ca88697992847c131bba7d74f726a2cd8c` + +- **Impact:** API confusion. Similarly, `findOrphanedChunks()` doesn't find orphans — it finds referenced chunks. Both methods are analysis tools masquerading as lifecycle operations. +- **Severity:** Low (naming issue, not a functional defect). + +#### Flaw 8: Error.captureStackTrace Portability + +- **Evidence:** + + - **Claim:** `CasError` uses `Error.captureStackTrace` which is V8-specific. + - **Primary Evidence:** `src/domain/errors/CasError.js:5` — `Error.captureStackTrace(this, this.constructor);` + - **Cryptographic Proof:** `git hash-object src/domain/errors/CasError.js` = `6acc1da7e28ed698571f861900081d8b044cde57` + +- **Impact:** This is a no-op on non-V8 engines. Since the project targets Node (V8), Bun (JSC), and Deno (V8), it's a no-op on Bun's JavaScriptCore. Not a crash risk (it degrades gracefully), but indicates incomplete multi-runtime awareness. +- **Severity:** Negligible. + +#### Flaw 9: Missing pre-commit Hook + +- **Evidence:** + + - **Claim:** The project has a pre-push hook but no pre-commit hook. + - **Primary Evidence:** `scripts/git-hooks/pre-push` exists; `scripts/git-hooks/pre-commit` does not. 
+ - **Supporting Context:** The CLAUDE.md global instructions specify that pre-commit should run lint. The hooks directory is also named `git-hooks` rather than the conventional `hooks` specified in CLAUDE.md. + +- **Impact:** Lint failures are not caught until push time. A developer can accumulate many unlinted commits before discovering issues. +- **Severity:** Low (process issue, not a code defect). + +### Innovation vs. Commodity + +**Novel or distinctive:** +1. **Git ODB as a CAS backend** — No other library treats Git's native object store as a general-purpose content-addressed storage layer with this level of sophistication (Merkle manifests, codec pluggability, vault indexing). +2. **Buzhash CDC implementation** — Hand-rolled, well-optimized, with a clever xorshift64 seeded table. Not copy-pasted from a library. +3. **DEK/KEK envelope encryption with zero-cost key rotation** — The key rotation model (re-wrap DEK, don't re-encrypt data) is architecturally elegant and matches the patterns used by KMS systems like AWS KMS. +4. **Vault as a Git commit chain** — Using Git refs for an atomic, auditable key-value store is creative. +5. **Multi-runtime JS with runtime-specific crypto** — Three crypto adapters targeting three JS runtimes is uncommon in the Node ecosystem. + +**Commodity:** +1. **AES-256-GCM encryption** — Standard AEAD construction, correctly implemented. +2. **PBKDF2/scrypt KDF** — Standard KDF choices with standard parameters. +3. **Zod schema validation** — Standard validation library, standard usage. +4. **Hexagonal architecture** — Well-known pattern, well-executed. +5. **Commander.js CLI** — Standard CLI framework, standard usage. + +**Assessment:** This codebase introduces genuinely novel abstractions (Git ODB as CAS, vault commit chain, zero-cost key rotation) while building on commodity cryptographic primitives. The combination is the innovation — not any individual component. 
+ +--- + +## Phase 3: The Reality Check + +### Roadmap Reconciliation + +The roadmap lists 9 milestones (M7–M15). **All 9 are marked CLOSED.** There are zero open milestones. + +| Milestone | Roadmap Status | Verified in Code | Reconciliation | +|---|---|---|---| +| M7 Horizon | CLOSED (v2.0.0) | Yes — Merkle manifests (v2), compression, sub-manifests all implemented | Accurate | +| M8 Spit Shine | CLOSED (v4.0.1) | Yes — CryptoPort refactor, verify command, error handler all present | Accurate | +| M9 Cockpit | CLOSED (v4.0.1) | Yes — 18 CLI commands, --json flag, hints system all present | Accurate | +| M10 Hydra | CLOSED (v5.0.0) | Yes — CdcChunker with buzhash, resolveChunker, CDC params in manifest | Accurate | +| M11 Locksmith | CLOSED (v5.1.0) | Yes — addRecipient, removeRecipient, listRecipients, envelope encryption | Accurate | +| M12 Carousel | CLOSED (v5.2.0) | Yes — rotateKey, keyVersion tracking, DEK re-wrapping | Accurate | +| M13 Bijou | CLOSED (v3.1.0) | Yes — dashboard TUI, progress bars, encryption card, manifest view, heatmap | Accurate | +| M14 Conduit | CLOSED (v4.0.0) | Yes — restoreStream, ObservabilityPort, Semaphore, parallel I/O | Accurate | +| M15 Prism | CLOSED | Yes — async sha256 on NodeCryptoAdapter, KeyResolver extracted | Accurate | + +**Verdict: The roadmap is 100% accurate.** Every claimed milestone is verifiable in the codebase. No phantom features, no vaporware. This is unusual — most roadmaps overstate completion. + +### Backlog Triage + +The roadmap identifies 7 concerns (C1–C7) and 6 visions (V1–V6). Cross-referencing against Phase 2 findings: + +**Concerns already identified by the roadmap that Phase 2 confirmed:** + +| Concern | Roadmap Estimate | Phase 2 Finding | Agreement | +|---|---|---|---| +| C1: Memory amplification on encrypted restore | High severity, ~20 LoC | Flaw 2: Confirmed. O(n) memory for encrypted restores. 
| Full agreement | +| C2: Orphaned blob accumulation after STREAM_ERROR | Medium, ~20 LoC | Not independently discovered — the error handling drains promises correctly. Low priority. | Agreement on low urgency | +| C3: No upper bound on chunk size | Medium, ~6 LoC | Flaw 6: Confirmed. FixedChunker accepts any positive value. | Full agreement | +| C4: Web Crypto silent memory buffering | Medium, ~15 LoC | Flaw 3: Confirmed. `createEncryptionStream` buffers everything on Deno. | Full agreement | +| C5: Passphrase exposure in shell history | High, ~90 LoC | Not a code defect; architectural limitation of CLI passphrase flags. | Agreement | +| C6: No KDF brute-force rate limiting | Low, ~10 LoC | Not independently discovered. Low priority. | Agreement | +| C7: GCM nonce collision risk at scale | Low, ~20 LoC | Not practically exploitable. 2^48 encryptions needed for birthday bound on 96-bit nonce. | Agreement on low priority | + +**Critical architectural flaws from Phase 2 that ARE MISSING from the backlog:** + +1. **Crypto adapter behavioral inconsistencies (Flaw 1)** — The three adapters have different validation/error behavior. This is not mentioned in any concern or backlog item. The M15 Prism milestone addressed `sha256` async consistency but left the encrypt/decrypt inconsistencies untouched. + +2. **CDC deduplication defeated by encrypt-then-chunk (Flaw 5)** — The fundamental design decision that encryption wraps the stream before chunking is not flagged as a concern or limitation in the roadmap. The Feature Matrix claims "Sub-file deduplication: Via chunking" without noting it only works for unencrypted data. + +3. **FixedChunker quadratic buffer allocation (Flaw 4)** — Minor but missing from backlog. The CDC chunker received significant optimization attention; the fixed chunker did not. + +**Backlog items that should be deprioritized:** + +- **V1 Snapshot Trees** (~410 LoC, ~19h) — Nice to have but doesn't address any Phase 2 flaw. 
+
+- **V5 Watch Mode** (~220 LoC, ~10h) — Feature creep for a storage library.
+- **V3 Manifest Diff Engine** (~180 LoC, ~8h) — Diagnostic tooling, not a stability concern.
+
+**Backlog items that should be prioritized:**
+
+- **C1 Memory amplification guard** — This is the highest-severity technical debt. 20 LoC to add a configurable ceiling.
+- **Crypto adapter normalization** — Not in backlog. Needs to be added. ~30 LoC to align all three adapters.
+- **V4 CompressionPort** (~180 LoC, ~8h) — Gzip-only compression is a significant limitation. zstd typically matches or beats gzip's compression ratio while decompressing several times faster.
+
+---
+
+## Phase 4: The Blueprint for Success
+
+### Month 1: Triage & Foundation
+
+**Week 1–2: Crypto Adapter Normalization**
+
+Align all three crypto adapters to identical behavioral contracts:
+
+1. Add `_validateKey(key)` call to `NodeCryptoAdapter.decryptBuffer()` and `WebCryptoAdapter.decryptBuffer()`.
+2. Add premature-finalize guard to `NodeCryptoAdapter.createEncryptionStream()`.
+3. Make `NodeCryptoAdapter.encryptBuffer()` explicitly async (return `Promise`).
+4. Add a cross-adapter behavioral test suite that asserts identical behavior for all three adapters given the same inputs.
+
+*Estimated: ~50 LoC changes, ~100 LoC tests.*
+
+**Week 2: Memory Safety Guards**
+
+1. Add `maxRestoreBufferSize` option to CasService constructor (default: 512 MiB). Throw `CasError('RESTORE_BUFFER_EXCEEDED')` if the concatenated chunk buffer exceeds this limit in `_restoreBuffered()`.
+2. Add buffer size guard to `WebCryptoAdapter.createEncryptionStream()` — throw if accumulated buffer exceeds a configurable limit.
+3. Add upper bound validation to `FixedChunker` constructor (e.g., max 100 MiB) and `CdcChunker` (already has `maxChunkSize` but no ceiling on the ceiling). 
+ +*Estimated: ~40 LoC changes, ~30 LoC tests.* + +**Week 3: FixedChunker Buffer Optimization** + +Replace the `Buffer.concat([buffer, data])` loop in `FixedChunker.chunk()` with a pre-allocated working buffer pattern matching `CdcChunker`: + +```js +const buf = Buffer.allocUnsafe(this.#chunkSize); +let offset = 0; +for await (const data of source) { + let srcPos = 0; + while (srcPos < data.length) { + const n = Math.min(data.length - srcPos, this.#chunkSize - offset); + data.copy(buf, offset, srcPos, srcPos + n); + offset += n; + srcPos += n; + if (offset === this.#chunkSize) { + yield Buffer.from(buf); + offset = 0; + } + } +} +if (offset > 0) yield Buffer.from(buf.subarray(0, offset)); +``` + +*Estimated: ~20 LoC change.* + +**Week 4: Missing pre-commit Hook + Process Hygiene** + +1. Add `scripts/git-hooks/pre-commit` that runs `pnpm run lint`. +2. Rename `scripts/git-hooks/` to `scripts/hooks/` to match CLAUDE.md convention (or update CLAUDE.md — choose one). +3. Add `Error.captureStackTrace` guard in `CasError`: `if (Error.captureStackTrace) Error.captureStackTrace(this, this.constructor);` + +*Estimated: ~10 LoC changes.* + +### Month 2: Structural Evolution + +**CompressionPort Abstraction (V4)** + +The current gzip-only compression is hardcoded. Introduce a `CompressionPort` abstract class with `compress(source)` and `decompress(source)` async generator methods. Implement `GzipCompressor` (existing behavior) and `ZstdCompressor` (via `node:zlib` or `zstd-codec`). Update `CompressionSchema` to accept `'gzip' | 'zstd'`. + +*Estimated: ~180 LoC, aligns with V4 vision.* + +**Document the Encrypt-Then-Chunk Limitation** + +This is not fixable without a major architectural change (chunk-then-encrypt with per-chunk AEAD). The correct action is: + +1. Document that CDC deduplication is ineffective for encrypted data. +2. Consider emitting a warning when `encryption + chunking.strategy === 'cdc'` are both specified. +3. 
If the user explicitly opts in, allow it — but make the trade-off visible. + +*Estimated: ~10 LoC (warning), documentation update.* + +**Interactive Passphrase Prompt (V6)** + +Address concern C5 (passphrase exposure in shell history) by adding TTY-based passphrase prompts with echo disabled. Fall back to flag-based input when stdin is not a TTY. + +*Estimated: ~90 LoC, aligns with V6 vision.* + +### Month 3: Strategic Re-alignment + +**Portable Bundles (V2)** + +The air-gapped use case is a key differentiator. Implement `.casb` bundle files that package manifest + chunks for transport without Git. This enables: +- Export: `git cas export --slug --out archive.casb` +- Import: `git cas import --bundle archive.casb` + +*Estimated: ~340 LoC, aligns with V2 vision.* + +**Garbage Collection Automation** + +The `deleteAsset` and `findOrphanedChunks` methods are analysis-only. Complete the lifecycle: +1. Rename `deleteAsset` to `inspectAsset` or `getAssetMetadata` (breaking change). +2. Implement actual GC via `git prune` after vault entry removal. +3. Add `git cas gc` CLI command with `--dry-run` support. + +*Estimated: ~80 LoC.* + +**CI Hardening** + +1. Add `dependabot.yml` for dependency updates. +2. Add `CODEOWNERS` file. +3. Add security scanning (e.g., `npm audit` in CI). +4. Add `SECURITY.md` at project root (currently missing, noted in CLAUDE.md scaffolding requirements). + +--- + +### Executive Conclusion + +**Health: Strong.** This is a well-architected, thoroughly tested codebase with a clear domain model, strict port/adapter boundaries, and an unusually high test-to-code ratio (3.1:1). The 833+ unit tests with real crypto (never mocked) and fuzz coverage demonstrate a commitment to correctness that is rare in the Node.js ecosystem. 
+ +**Intellectual Property Value: Moderate-High.** The novel contributions — Git ODB as CAS, buzhash CDC with xorshift-seeded tables, zero-cost DEK/KEK key rotation, vault commit chains with CAS semantics — represent genuine engineering innovation. These are not reimplementations of existing libraries; they are original abstractions built on well-understood primitives. + +**Technical Debt: Low.** The roadmap's 7 concerns accurately catalog the known issues. Phase 2 surfaced only 3 additional findings (crypto adapter inconsistencies, encrypt-then-chunk dedup limitation, FixedChunker buffer allocation), none of which are critical. The most urgent issue — memory amplification on encrypted restore — is a ~20 LoC fix. + +**Long-term Viability: Good with caveats.** The system is viable for its target niche (Git-native encrypted binary storage). The Git subprocess bottleneck limits throughput for very high-frequency operations, but this is an acceptable trade-off for correctness and portability. The encrypt-then-chunk design is a permanent architectural constraint that limits CDC's value for encrypted data — this should be prominently documented rather than "fixed." + +**The Honest Assessment:** This codebase punches above its weight. A ~3,900 LoC core library with 12,000 LoC of tests, multi-runtime support, envelope encryption, CDC chunking, Merkle manifests, and an interactive TUI — all with zero native dependencies and no external server requirements. The architecture is clean, the test coverage is comprehensive, and the roadmap is honest. The identified flaws are minor and addressable. This is a well-maintained project by someone who takes software engineering seriously. 
+ +--- + +*Audit conducted at commit `0f7f8e658e6cd094176541ac68d33b2a6ec75a91`.* +*All blob hashes verified via `git hash-object` against live repository state.* From 12205b0f5460af13c068c480b985deb6796a84d8 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 18:23:38 -0800 Subject: [PATCH 02/41] =?UTF-8?q?feat(roadmap):=20add=20M16=20Capstone=20m?= =?UTF-8?q?ilestone=20=E2=80=94=20audit=20remediation?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add milestone M16 (Capstone, v5.3.0) to ROADMAP.md with 13 task cards addressing all negative findings from the CODE-EVAL.md forensic audit: P0 (Critical): - 16.1 Crypto adapter behavioral normalization (C8) - 16.2 Memory restore guard (C1) - 16.11 Passphrase input security (C5 + V6) P1 (Important): - 16.3 Web Crypto encryption buffer guard (C4) - 16.5 Encrypt-then-chunk dedup warning (C10) - 16.6 Chunk size upper bound (C3) - 16.10 Orphaned blob tracking (C2) P2 (Housekeeping): - 16.4 FixedChunker pre-allocated buffer (C9) - 16.7 Lifecycle method naming (deprecate deleteAsset) - 16.8 CasError portability guard - 16.9 Pre-commit hook + hooks directory - 16.12 KDF brute-force awareness (C6) - 16.13 GCM nonce collision documentation (C7) Also registers new CasError codes RESTORE_TOO_LARGE and ENCRYPTION_BUFFER_EXCEEDED, adds concerns C8–C10 to the concerns section, and cross-references all concerns to their task cards in the summary table. --- CHANGELOG.md | 6 + ROADMAP.md | 512 +++++++++++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 507 insertions(+), 11 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 194184c..022ed25 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added +- **CODE-EVAL.md** — Forensic architectural audit (zero-knowledge code extraction, critical assessment, roadmap reconciliation, prescriptive blueprint). 
+- **M16 Capstone** — New milestone in ROADMAP.md addressing all 9 audit flaws and 10 concerns (C1–C10). 13 task cards, ~698 LoC, ~21h estimated. +- **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10). +- **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. + ## [5.2.4] — Prism polish (2026-03-03) ### Fixed diff --git a/ROADMAP.md b/ROADMAP.md index 99ddfc4..13a145e 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -9,7 +9,7 @@ This roadmap is structured as: 3. **Contracts** — Return/throw semantics for all public methods 4. **Version Plan** — Table mapping versions to milestones 5. **Milestone Dependency Graph** — ASCII diagram -6. **Milestones & Task Cards** — 7 milestones (4 closed, 3 open), remaining task cards +6. **Milestones & Task Cards** — 8 milestones (7 closed, 1 open), remaining task cards 7. **Feature Matrix** — Competitive landscape vs. Git LFS, git-annex, Restic, Age, DVC 8. **Competitive Analysis** — When to use git-cas and when not to, with concrete scenarios @@ -56,6 +56,8 @@ Single registry of all error codes used across the codebase. Each code is a stri | `CANNOT_REMOVE_LAST_RECIPIENT` | Cannot remove the last recipient — at least one must remain. | Task 11.2 | | `ROTATION_NOT_SUPPORTED` | Key rotation requires envelope encryption (DEK/KEK model). Legacy manifests must be re-stored. | Task 12.1 | | `STREAM_NOT_CONSUMED` | `finalize()` called on encryption stream before the generator was fully consumed. | v4.0.1 | +| `RESTORE_TOO_LARGE` | Encrypted/compressed file exceeds `maxRestoreBufferSize`. Buffered restore would OOM. Suggest increasing limit or storing without encryption. | M16 | +| `ENCRYPTION_BUFFER_EXCEEDED` | Web Crypto adapter accumulated buffer exceeds limit during streaming encryption (Deno-specific). 
Suggest Node.js/Bun or unencrypted store. | M16 |
 
 ---
 
@@ -191,6 +193,7 @@ Return and throw semantics for every public method (current and planned).
 | v3.1.0 | M13 | Bijou | TUI dashboard & progress | ✅ |
 | v5.0.0 | M10 | Hydra | Content-defined chunking | ✅ |
 | v5.1.0 | M11 | Locksmith | Multi-recipient encryption | ✅ |
 | v5.2.0 | M12 | Carousel | Key rotation | ✅ |
+| v5.3.0 | M16 | Capstone | Audit remediation — all CODE-EVAL.md findings | 🔲 |
 
 ---
@@ -206,6 +209,8 @@ M8 Spit Shine + M9 Cockpit (v4.0.1) ✅
 M10 Hydra ──────────── ✅ v5.0.0
 M11 Locksmith ──────── ✅ v5.1.0
   └──► M12 Carousel ── ✅ v5.2.0
+M15 Prism ─────────────── ✅
+  └──► M16 Capstone ────── 🔲 v5.3.0
 ```
 
 ---
@@ -223,6 +228,7 @@ M11 Locksmith ──────── ✅ v5.1.0
 | M10| Hydra | Content-defined chunking | v5.0.0 | 4 | ~690 | ~22h | ✅ CLOSED |
 | M11| Locksmith | Multi-recipient encryption | v5.1.0 | 4 | ~580 | ~20h | ✅ CLOSED |
 | M12| Carousel | Key rotation | v5.2.0 | 4 | ~400 | ~13h | ✅ CLOSED |
+| M16| Capstone | Audit remediation | v5.3.0 | 13 | ~698 | ~21h | 🔲 OPEN |
 
 Completed task cards are in [COMPLETED_TASKS.md](./COMPLETED_TASKS.md). Superseded tasks are in [GRAVEYARD.md](./GRAVEYARD.md).
 
@@ -262,6 +268,443 @@ All tasks completed (12.1–12.4). See [COMPLETED_TASKS.md](./COMPLETED_TASKS.md
 
 ---
 
+# M16 — Capstone (v5.3.0) 🔲 OPEN
+
+Remediation milestone addressing all negative findings from the [CODE-EVAL.md](./CODE-EVAL.md) forensic architectural audit. Covers 9 code flaws (Phase 2), 7 pre-existing concerns (C1–C7), and 3 newly identified concerns (C8–C10). No new features — strictly hardening, correctness, and hygiene.
+
+**Source:** `CODE-EVAL.md` at commit `0f7f8e6`
+
+**Priority key:** P0 = critical (high severity), P1 = important (medium), P2 = housekeeping (low/negligible). 
+ +--- + +### 16.1 — Crypto Adapter Behavioral Normalization *(P0)* — C8 + +**Problem** + +The three CryptoPort adapters (Node, Bun, Web) have inconsistent validation and error-handling behavior — a Liskov Substitution violation. Specifically: + +1. `NodeCryptoAdapter.encryptBuffer()` is synchronous; Bun and Web are async. +2. `BunCryptoAdapter.decryptBuffer()` calls `_validateKey(key)`; Node and Web do not. +3. `NodeCryptoAdapter.createEncryptionStream()` has no premature-finalize guard; Bun and Web throw `CasError('STREAM_NOT_CONSUMED')`. + +Code that works on Bun (early key validation) may produce a cryptic `node:crypto` error on Node. A bug in stream consumption produces undefined behavior on Node but a clear error on Bun/Deno. + +**Fix** + +1. Add `_validateKey(key)` call to `NodeCryptoAdapter.decryptBuffer()` and `WebCryptoAdapter.decryptBuffer()`. +2. Add `streamFinalized` guard + `CasError('STREAM_NOT_CONSUMED')` to `NodeCryptoAdapter.createEncryptionStream()`. +3. Make `NodeCryptoAdapter.encryptBuffer()` explicitly `async` (return `Promise`). +4. Add a cross-adapter behavioral conformance test suite asserting identical behavior for all three adapters given the same inputs. 
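Fix step 2 amounts to a consumed flag checked in `finalize()`. A minimal sketch of the guard, simplified to a synchronous generator for brevity, with a plain `Error` carrying a `code` field standing in for `CasError`:

```js
// Sketch of the premature-finalize guard (fix step 2). The real adapter
// wraps an async cipher stream; the shape of the guard is the same.
function createGuardedStream(chunks) {
  let consumed = false;
  function* encrypt() {
    yield* chunks;
    consumed = true; // only flips once the generator is fully drained
  }
  return {
    stream: encrypt(),
    finalize() {
      if (!consumed) {
        const err = new Error('finalize() called before encryption stream was consumed');
        err.code = 'STREAM_NOT_CONSUMED';
        throw err;
      }
      return { finalized: true };
    },
  };
}
```

With this in place, Node matches the clear failure mode Bun and Deno already exhibit instead of returning garbage metadata.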
+ +**Files:** +- `src/infrastructure/adapters/NodeCryptoAdapter.js` +- `src/infrastructure/adapters/WebCryptoAdapter.js` +- New: `test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js` + +**Tests:** +```js +describe('16.1: CryptoPort LSP conformance', () => { + // Run the same assertions against all three adapters + for (const [name, adapter] of adapters) { + it(`${name}.encryptBuffer returns a Promise`, ...); + it(`${name}.decryptBuffer rejects invalid key type before crypto error`, ...); + it(`${name}.decryptBuffer rejects wrong-length key before crypto error`, ...); + it(`${name}.createEncryptionStream.finalize() throws STREAM_NOT_CONSUMED if not consumed`, ...); + } +}); +``` + +| Estimate | ~50 LoC changes, ~100 LoC tests, ~4h | +|----------|---------------------------------------| + +--- + +### 16.2 — Memory Restore Guard *(P0)* — C1 + +**Problem** + +`_restoreBuffered()` concatenates ALL chunk blobs into a single buffer before decryption. A 1 GB encrypted file requires ~2 GB of heap. No guard, no warning, no configurable limit. + +**Fix** + +Add `maxRestoreBufferSize` option to CasService constructor (default 512 MiB). Before `Buffer.concat()` in `_restoreBuffered()`, check `manifest.size` against the limit. Throw `CasError('RESTORE_TOO_LARGE')` with an actionable message. 
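The guard itself is a one-line comparison. A standalone sketch, where the 512 MiB default mirrors the task card and a plain `Error` with a `code` field stands in for `CasError`:

```js
// Illustrative restore-size guard for task 16.2 (names are assumptions;
// the real check would run before Buffer.concat() in _restoreBuffered()).
const DEFAULT_MAX_RESTORE_BUFFER = 512 * 1024 * 1024; // 512 MiB

function assertRestoreWithinLimit(manifestSize, maxRestoreBufferSize = DEFAULT_MAX_RESTORE_BUFFER) {
  if (manifestSize > maxRestoreBufferSize) {
    const err = new Error(
      `buffered restore needs ${manifestSize} bytes but maxRestoreBufferSize is ` +
      `${maxRestoreBufferSize}; raise the limit or store without encryption/compression`
    );
    err.code = 'RESTORE_TOO_LARGE';
    throw err;
  }
}
```

Checking `manifest.size` up front means the failure is immediate and actionable rather than a heap OOM midway through concatenation.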
+ +**Files:** +- `src/domain/services/CasService.js` +- `index.js` (facade wiring) +- `index.d.ts` (type update) + +**Tests:** +```js +describe('16.2: Memory guard on encrypted restore', () => { + it('throws RESTORE_TOO_LARGE when manifest.size exceeds maxRestoreBufferSize', ...); + it('succeeds when manifest.size is within maxRestoreBufferSize', ...); + it('does not apply guard to unencrypted uncompressed restoreStream', ...); + it('includes actionable hint in error message', ...); + it('default maxRestoreBufferSize is 512 MiB', ...); +}); +``` + +| Estimate | ~25 LoC changes, ~40 LoC tests, ~2h | +|----------|--------------------------------------| + +--- + +### 16.3 — Web Crypto Encryption Buffer Guard *(P1)* — C4 + +**Problem** + +`WebCryptoAdapter.createEncryptionStream()` silently buffers the entire stream because Web Crypto AES-GCM is a one-shot API. On Deno, a user calling `store()` with a large encrypted source OOMs without warning. + +**Fix** + +Track accumulated bytes in the `encrypt()` generator. When total exceeds a configurable limit (default 512 MiB), throw `CasError('ENCRYPTION_BUFFER_EXCEEDED')` with an actionable message. + +**Files:** +- `src/infrastructure/adapters/WebCryptoAdapter.js` + +**Tests:** +```js +describe('16.3: Web Crypto buffering guard', () => { + it('throws ENCRYPTION_BUFFER_EXCEEDED when accumulated bytes exceed limit', ...); + it('succeeds for data within buffer limit', ...); + it('NodeCryptoAdapter does NOT throw for large streams (true streaming)', ...); +}); +``` + +| Estimate | ~15 LoC changes, ~30 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.4 — FixedChunker Pre-Allocated Buffer *(P2)* — C9 + +**Problem** + +`FixedChunker.chunk()` uses `Buffer.concat([buffer, data])` in a loop. Each call copies the entire accumulated buffer — O(n^2 / chunkSize) total copies for many small input buffers. The CDC chunker uses a pre-allocated working buffer with zero intermediate copies. 
+ +**Fix** + +Replace the concat loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer using a copy+offset pattern, matching CdcChunker's approach. + +**Files:** +- `src/infrastructure/chunkers/FixedChunker.js` + +**Tests:** + +Existing tests cover byte-exact correctness. Add: +```js +describe('16.4: FixedChunker buffer efficiency', () => { + it('produces identical output to previous implementation (regression)', ...); + it('handles many small input buffers without excessive allocation', ...); +}); +``` + +| Estimate | ~20 LoC changes, ~15 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.5 — Encrypt-Then-Chunk Dedup Warning *(P1)* — C10 + +**Problem** + +Encryption is applied before chunking, destroying content-addressable deduplication. AES-GCM ciphertext is pseudorandom — identical plaintext produces different ciphertext. Users who enable both encryption and CDC chunking get CDC's overhead without its dedup benefit. + +This is an inherent architectural constraint (not fixable without per-chunk encryption). The correct action is documentation + a runtime warning. + +**Fix** + +1. When `store()` is called with both an encryption key/passphrase/recipients AND `chunker.strategy === 'cdc'`, emit `observability.log('warn', 'CDC deduplication is ineffective with encryption — ciphertext is pseudorandom', { strategy: 'cdc' })`. +2. Add a "Known Limitations" section to the README documenting this trade-off. 
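The underlying behavior is easy to demonstrate in isolation: with a fresh random nonce per call, AES-256-GCM maps identical plaintext to unrelated ciphertext, so no chunk hash can ever repeat across stores. A standalone Node.js illustration (not library code; it only mirrors the documented parameters of AES-256-GCM with a 12-byte nonce):

```js
import { createCipheriv, randomBytes } from 'node:crypto';

// Encrypt the same plaintext twice with fresh 96-bit nonces. The two
// ciphertexts share no structure, which is why content-defined chunking
// cannot deduplicate encrypted streams.
const key = randomBytes(32);
const plaintext = Buffer.from('identical content, stored twice');

function encryptOnce() {
  const nonce = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, nonce);
  return Buffer.concat([cipher.update(plaintext), cipher.final()]);
}

const first = encryptOnce();
const second = encryptOnce();
// first.equals(second) is false with overwhelming probability
```

This is exactly the property AEAD is supposed to provide, which is why the warning in the fix is the right response rather than an attempted workaround.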
+ +**Files:** +- `src/domain/services/CasService.js` (warning in `store()`) + +**Tests:** +```js +describe('16.5: Encrypt-then-chunk dedup warning', () => { + it('emits warning when encryption + CDC chunking are combined', ...); + it('does not warn for encryption + fixed chunking', ...); + it('does not warn for CDC chunking without encryption', ...); +}); +``` + +| Estimate | ~10 LoC changes, ~20 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.6 — Chunk Size Upper Bound *(P1)* — C3 + +**Problem** + +`CasService` enforces a minimum chunk size (1024 bytes) but no maximum. A user can configure a 4 GB chunk size. Additionally, `FixedChunker` and `CdcChunker` accept arbitrarily large values without validation. + +**Fix** + +1. Add `if (chunkSize > MAX_CHUNK_SIZE)` guard in `CasService` constructor. 100 MiB is the cap — generous while staying within Git hosting limits. +2. Emit `observability.log('warn', ...)` when chunkSize exceeds 10 MiB. +3. Add matching validation in `FixedChunker` constructor: `if (chunkSize > 100 * 1024 * 1024) throw new RangeError(...)`. +4. Add matching validation in `CdcChunker` constructor for `maxChunkSize`. 
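A sketch of the shared bounds check; the 100 MiB cap and 10 MiB warning threshold come from the steps above, while the function name and messages are illustrative:

```js
// Illustrative chunk-size validation for task 16.6.
const MIN_CHUNK_SIZE = 1024;              // existing minimum
const MAX_CHUNK_SIZE = 100 * 1024 * 1024; // proposed hard cap
const WARN_CHUNK_SIZE = 10 * 1024 * 1024; // soft warning threshold

function validateChunkSize(chunkSize, warn = () => {}) {
  if (!Number.isInteger(chunkSize) || chunkSize < MIN_CHUNK_SIZE) {
    throw new RangeError(`chunkSize must be an integer >= ${MIN_CHUNK_SIZE}, got ${chunkSize}`);
  }
  if (chunkSize > MAX_CHUNK_SIZE) {
    throw new RangeError(`chunkSize ${chunkSize} exceeds the 100 MiB maximum`);
  }
  if (chunkSize > WARN_CHUNK_SIZE) {
    warn(`chunkSize ${chunkSize} exceeds 10 MiB; large chunks reduce dedup granularity`);
  }
  return chunkSize;
}
```

Centralizing the check keeps `CasService` and both chunker constructors enforcing the same envelope.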
+ +**Files:** +- `src/domain/services/CasService.js` +- `src/infrastructure/chunkers/FixedChunker.js` +- `src/infrastructure/chunkers/CdcChunker.js` + +**Tests:** +```js +describe('16.6: Chunk size upper bound', () => { + it('CasService throws when chunkSize exceeds 100 MiB', ...); + it('CasService accepts chunkSize of exactly 100 MiB', ...); + it('FixedChunker throws when chunkSize exceeds 100 MiB', ...); + it('CdcChunker throws when maxChunkSize exceeds 100 MiB', ...); + it('logs warning when chunkSize exceeds 10 MiB', ...); +}); +``` + +| Estimate | ~15 LoC changes, ~30 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.7 — Lifecycle Method Naming *(P2)* + +**Problem** + +`deleteAsset()` does not delete anything — it reads a manifest and returns metadata about what would be orphaned. `findOrphanedChunks()` doesn't find orphans — it collects referenced chunk OIDs. Both names are misleading. + +**Fix** + +1. Add `inspectAsset({ treeOid })` as the canonical name. `deleteAsset` becomes a deprecated alias that delegates to `inspectAsset`. +2. Add `collectReferencedChunks({ treeOids })` as the canonical name. `findOrphanedChunks` becomes a deprecated alias. +3. Emit `observability.log('warn', 'deleteAsset() is deprecated — use inspectAsset()')` on deprecated path. +4. Update `index.d.ts` with `@deprecated` JSDoc on old methods. + +This is a **non-breaking** deprecation. Removal is deferred to a future major version. 
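The alias pattern is small enough to sketch in full, with placeholder method bodies (the real implementations read manifests from Git):

```js
// Sketch of the non-breaking deprecation alias for task 16.7.
class CasServiceSketch {
  constructor(log = () => {}) { this._log = log; }

  inspectAsset({ treeOid }) {
    return { treeOid, slug: null, chunksOrphaned: [] }; // placeholder metadata
  }

  /** @deprecated use inspectAsset() */
  deleteAsset(args) {
    this._log('warn', 'deleteAsset() is deprecated — use inspectAsset()');
    return this.inspectAsset(args);
  }
}
```

Existing callers keep working and get a single observability warning per call, while new code migrates to the honest name.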
+ +**Files:** +- `src/domain/services/CasService.js` +- `index.js` (facade) +- `index.d.ts` + +**Tests:** +```js +describe('16.7: Lifecycle method naming', () => { + it('inspectAsset returns { slug, chunksOrphaned }', ...); + it('deleteAsset delegates to inspectAsset (deprecated alias)', ...); + it('collectReferencedChunks returns { referenced, total }', ...); + it('findOrphanedChunks delegates to collectReferencedChunks (deprecated alias)', ...); +}); +``` + +| Estimate | ~30 LoC changes, ~25 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.8 — CasError Portability Guard *(P2)* + +**Problem** + +`CasError` calls `Error.captureStackTrace(this, this.constructor)` unconditionally. This is V8-specific — it's a no-op on Bun's JavaScriptCore engine. While it doesn't crash (JSC silently ignores it), it indicates incomplete multi-runtime awareness. + +**Fix** + +Guard the call: `if (Error.captureStackTrace) Error.captureStackTrace(this, this.constructor);` + +**Files:** +- `src/domain/errors/CasError.js` + +**Tests:** +```js +describe('16.8: CasError multi-runtime portability', () => { + it('creates CasError with code and meta', ...); + it('does not throw when Error.captureStackTrace is unavailable', ...); +}); +``` + +| Estimate | ~3 LoC changes, ~10 LoC tests, ~0.5h | +|----------|---------------------------------------| + +--- + +### 16.9 — Pre-Commit Hook + Hooks Directory *(P2)* + +**Problem** + +The project has a `pre-push` hook but no `pre-commit` hook. Lint failures are not caught until push time. Additionally, the hooks directory is `scripts/git-hooks/` rather than `scripts/hooks/` per the CLAUDE.md convention. + +**Fix** + +1. Rename `scripts/git-hooks/` to `scripts/hooks/`. +2. Update `scripts/install-hooks.sh` to reference the new path. +3. Add `scripts/hooks/pre-commit` that runs `pnpm run lint`. +4. Update `.git/config` hooksPath if already set. 
+ +**Files:** +- `scripts/git-hooks/pre-push` → `scripts/hooks/pre-push` +- New: `scripts/hooks/pre-commit` +- `scripts/install-hooks.sh` + +| Estimate | ~15 LoC, ~0.5h | +|----------|-----------------| + +--- + +### 16.10 — Orphaned Blob Tracking *(P1)* — C2 + +**Problem** + +When `_chunkAndStore()` throws `STREAM_ERROR`, chunks already written to Git are orphaned. The error meta reports `chunksDispatched` but not the blob OIDs of successful writes. There's no visibility into what was orphaned. + +**Fix** + +1. After `Promise.allSettled(pending)`, collect blob OIDs from fulfilled results. +2. Include `orphanedBlobs: string[]` in the `STREAM_ERROR` meta. +3. Emit `observability.metric('error', { action: 'orphaned_blobs', count, blobs })`. + +**Files:** +- `src/domain/services/CasService.js` + +**Tests:** +```js +describe('16.10: Orphaned blob tracking on STREAM_ERROR', () => { + it('includes orphanedBlobs array in STREAM_ERROR meta', ...); + it('orphanedBlobs contains blob OIDs from successful writes before failure', ...); + it('orphanedBlobs is empty when stream fails before any writes', ...); + it('emits orphaned_blobs metric via observability', ...); +}); +``` + +| Estimate | ~20 LoC changes, ~30 LoC tests, ~2h | +|----------|--------------------------------------| + +--- + +### 16.11 — Passphrase Input Security *(P0)* — C5 + V6 + +**Problem** + +`--vault-passphrase ` puts the passphrase in shell history and process listings. The `GIT_CAS_PASSPHRASE` env var is better but still visible in `/proc//environ`. + +**Fix** + +1. **Interactive prompt**: When `--vault-passphrase` is passed without a value and stdin is a TTY, prompt with echo disabled. Confirmation on first use (store/init). +2. **File-based input**: Add `--vault-passphrase-file ` flag that reads from a file. +3. **Stdin pipe**: `--vault-passphrase -` reads from stdin. +4. **Documentation**: Security warning in `--help` and README. 
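The precedence between these input paths can be isolated into a pure resolution function, sketched here with injected dependencies so the ordering is testable (names are assumptions, not the CLI's actual API):

```js
// Illustrative passphrase-source resolution for task 16.11. Precedence:
// file flag, then literal flag value ('-' means stdin and is handled
// elsewhere), then interactive TTY prompt, else a hard error.
function resolvePassphrase({ fileContents, flagValue, isTTY, promptFn }) {
  if (fileContents !== undefined) {
    return fileContents.replace(/\r?\n$/, ''); // trim a single trailing newline
  }
  if (flagValue !== undefined && flagValue !== '-') {
    return flagValue;
  }
  if (isTTY && promptFn) {
    return promptFn(); // echo-disabled prompt lives behind this callback
  }
  throw new Error('no passphrase source available in non-TTY mode');
}
```

Keeping the file reader and the TTY prompt behind injected values is what makes the non-TTY error path unit-testable, per the test list for this card.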
+ +**Files:** +- `bin/git-cas.js` +- New: `bin/ui/passphrase-prompt.js` + +**Tests:** +```js +describe('16.11: Passphrase input security', () => { + it('reads passphrase from file when --vault-passphrase-file is used', ...); + it('errors when no passphrase source is available in non-TTY mode', ...); + it('--vault-passphrase-file trims trailing newline', ...); +}); +``` + +| Estimate | ~90 LoC, ~30 LoC tests, ~4h | +|----------|------------------------------| + +--- + +### 16.12 — KDF Brute-Force Awareness *(P2)* — C6 + +**Problem** + +`deriveKey()` and the restore path have no rate limiting or audit trail. An attacker can brute-force passphrases at full CPU speed. + +**Fix** + +1. Emit `observability.metric('error', { action: 'decryption_failed', slug })` on every `INTEGRITY_ERROR` during passphrase-based restore. +2. In the CLI layer, add a 1-second delay after each failed passphrase attempt. + +**Files:** +- `src/domain/services/CasService.js` (observability metric) +- `bin/git-cas.js` (CLI delay) + +**Tests:** +```js +describe('16.12: KDF brute-force awareness', () => { + it('emits decryption_failed metric on wrong passphrase', ...); + it('emits metric with slug context for audit trail', ...); + it('library API does NOT rate-limit (callers manage their own policy)', ...); +}); +``` + +| Estimate | ~10 LoC changes, ~20 LoC tests, ~1h | +|----------|--------------------------------------| + +--- + +### 16.13 — GCM Nonce Collision Documentation *(P2)* — C7 + +**Problem** + +AES-256-GCM uses a 96-bit random nonce. Birthday bound is ~2^48; NIST recommends limiting to 2^32 invocations per key. There's no tracking, no warning, and no documentation of the bound. + +**Fix** + +1. Add `SECURITY.md` at project root documenting: GCM nonce bound, recommended key rotation frequency, KDF parameter guidance, passphrase entropy recommendations. +2. Add `encryptionCount` field to vault metadata. Increment per `store()` with encryption. 
Emit observability warning when count exceeds 2^31. + +**Files:** +- New: `SECURITY.md` +- `src/domain/services/VaultService.js` (counter increment) + +**Tests:** +```js +describe('16.13: Nonce usage tracking', () => { + it('vault metadata includes encryptionCount after encrypted store', ...); + it('encryptionCount increments per encrypted store', ...); + it('warns via observability when encryptionCount exceeds threshold', ...); +}); +``` + +| Estimate | ~25 LoC changes, ~20 LoC tests, ~2h | +|----------|--------------------------------------| + +--- + +### M16 Summary + +| Task | Theme | Priority | Severity | Audit Ref | Concern Ref | ~LoC | ~Hours | +|------|-------|----------|----------|-----------|-------------|------|--------| +| 16.1 | Crypto adapter normalization | P0 | High | Flaw 1 | C8 | ~150 | ~4h | +| 16.2 | Memory restore guard | P0 | High | Flaw 2 | C1 | ~65 | ~2h | +| 16.3 | Web Crypto buffer guard | P1 | Medium | Flaw 3 | C4 | ~45 | ~1h | +| 16.4 | FixedChunker buffer optimization | P2 | Low | Flaw 4 | C9 | ~35 | ~1h | +| 16.5 | Encrypt-then-chunk dedup warning | P1 | Medium | Flaw 5 | C10 | ~30 | ~1h | +| 16.6 | Chunk size upper bound | P1 | Medium | Flaw 6 | C3 | ~45 | ~1h | +| 16.7 | Lifecycle method naming | P2 | Low | Flaw 7 | — | ~55 | ~1h | +| 16.8 | CasError portability guard | P2 | Negligible | Flaw 8 | — | ~13 | ~0.5h | +| 16.9 | Pre-commit hook + hooks dir | P2 | Low | Flaw 9 | — | ~15 | ~0.5h | +| 16.10 | Orphaned blob tracking | P1 | Medium | — | C2 | ~50 | ~2h | +| 16.11 | Passphrase input security | P0 | High | — | C5+V6 | ~120 | ~4h | +| 16.12 | KDF brute-force awareness | P2 | Low | — | C6 | ~30 | ~1h | +| 16.13 | GCM nonce collision docs + counter | P2 | Low | — | C7 | ~45 | ~2h | +| **Total** | | | | | | **~698** | **~21h** | + +### Recommended Execution Order + +**Phase 1 — Safety nets (P0):** +16.8, 16.9, 16.1, 16.2, 16.11 + +**Phase 2 — Correctness (P1):** +16.6, 16.3, 16.5, 16.10 + +**Phase 3 — Polish (P2):** +16.4, 16.7, 
16.12, 16.13 + +--- + # 7) Feature Matrix Competitive landscape for content-addressed storage, encrypted binary assets, and large-file Git tooling. Rows represent the union of features across the space — not just what git-cas offers, but what users encounter and expect when evaluating tools in this category. @@ -1170,17 +1613,64 @@ describe('Concern 7: Nonce uniqueness', () => { --- +## Concern 8: Crypto Adapter Liskov Substitution Violation + +**Source:** CODE-EVAL.md, Flaw 1 + +**The Problem** + +The three `CryptoPort` implementations (Node, Bun, Web) differ in observable behavior: + +1. `NodeCryptoAdapter.encryptBuffer()` is synchronous (returns plain object), while Bun and Web return `Promise`. +2. `BunCryptoAdapter.decryptBuffer()` calls `_validateKey(key)` before decryption; Node and Web do not — the invalid key hits `node:crypto` directly, producing a less informative error. +3. `NodeCryptoAdapter.createEncryptionStream()` has no premature-finalize guard. Calling `finalize()` before consuming the stream returns garbage metadata on Node, but throws a clear `CasError('STREAM_NOT_CONSUMED')` on Bun and Deno. + +M15 Prism fixed the `sha256()` async inconsistency but left these three discrepancies untouched. + +**Mitigation:** Task 16.1. + +--- + +## Concern 9: FixedChunker Quadratic Buffer Allocation + +**Source:** CODE-EVAL.md, Flaw 4 + +**The Problem** + +`FixedChunker.chunk()` uses `Buffer.concat([buffer, data])` inside its async loop. Each call allocates a new buffer and copies the accumulated bytes. For a source yielding many small buffers (e.g., 4 KiB network reads into a 256 KiB chunk), this is O(n^2 / chunkSize) total byte copies. The CdcChunker, by contrast, uses a pre-allocated `Buffer.allocUnsafe(maxChunkSize)` with zero intermediate copies. + +**Mitigation:** Task 16.4. 
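The asymmetry behind this concern can be quantified with a simple copy-count model (illustrative arithmetic only, not a benchmark; the piece sizes are hypothetical):

```js
// Copy-count model for Concern 9. Buffer.concat re-copies the accumulated
// bytes on every append; a pre-allocated working buffer does not.
function concatCopyBytes(pieceSize, pieces) {
  let accumulated = 0;
  let copied = 0;
  for (let i = 0; i < pieces; i++) {
    copied += accumulated + pieceSize; // concat copies old buffer + new piece
    accumulated += pieceSize;
  }
  return copied;
}

function preallocCopyBytes(pieceSize, pieces) {
  return pieceSize * pieces; // each byte is copied exactly once
}
```

For 64 reads of 4 KiB into a 256 KiB chunk, the concat path copies roughly 8.5 MB of bytes versus 256 KiB for the pre-allocated path, and the gap widens linearly as the piece count grows.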
+ +--- + +## Concern 10: CDC Deduplication Defeated by Encrypt-Then-Chunk + +**Source:** CODE-EVAL.md, Flaw 5 + +**The Problem** + +Encryption is applied to the source stream *before* chunking. AES-GCM ciphertext is pseudorandom — identical plaintext produces different ciphertext (different random nonce each time). This means content-defined chunking (CDC) provides **zero deduplication benefit** for encrypted files. Users who combine `recipients` (or `encryptionKey`) with `chunking: { strategy: 'cdc' }` get CDC's computational overhead without its primary value proposition. + +This is a fundamental architectural constraint of the encrypt-then-chunk design. The alternative (chunk-then-encrypt) would require per-chunk nonces and auth tags, significantly complicating the manifest schema. This is documented as a known limitation, not a fixable bug. + +**Mitigation:** Task 16.5 (runtime warning + documentation). + +--- + ## Summary Table -| # | Type | Severity | Fix Cost | Recommended Action | -|---|------|----------|----------|-------------------| -| C1 | Memory amplification | High | ~20 LoC | Add `maxRestoreBufferSize` guard | -| C2 | Orphaned blobs | Medium | ~20 LoC | Report orphaned blob OIDs in error meta | -| C3 | No chunk size cap | Medium | ~6 LoC | Enforce 100 MiB maximum | -| C4 | Web Crypto buffering | Medium | ~15 LoC | Add buffer size guard in WebCryptoAdapter | -| C5 | Passphrase exposure | High | ~90 LoC | Interactive prompt + file-based input | -| C6 | KDF no rate limit | Low | ~10 LoC | Observability metric + CLI delay | -| C7 | GCM nonce collision | Low | ~20 LoC | Document bound + vault usage counter | +| # | Type | Severity | Fix Cost | Recommended Action | Task | +|---|------|----------|----------|--------------------|------| +| C1 | Memory amplification | High | ~20 LoC | Add `maxRestoreBufferSize` guard | **16.2** | +| C2 | Orphaned blobs | Medium | ~20 LoC | Report orphaned blob OIDs in error meta | **16.10** | +| C3 | No chunk size cap | 
Medium | ~6 LoC | Enforce 100 MiB maximum | **16.6** | +| C4 | Web Crypto buffering | Medium | ~15 LoC | Add buffer size guard in WebCryptoAdapter | **16.3** | +| C5 | Passphrase exposure | High | ~90 LoC | Interactive prompt + file-based input | **16.11** | +| C6 | KDF no rate limit | Low | ~10 LoC | Observability metric + CLI delay | **16.12** | +| C7 | GCM nonce collision | Low | ~20 LoC | Document bound + vault usage counter | **16.13** | +| C8 | Crypto adapter LSP violation | Medium | ~50 LoC | Normalize validation + finalize guards | **16.1** | +| C9 | FixedChunker quadratic alloc | Low | ~20 LoC | Pre-allocated buffer | **16.4** | +| C10 | Encrypt-then-chunk dedup loss | Medium | ~10 LoC | Runtime warning + documentation | **16.5** | | # | Type | Theme | Est. Cost | |---|------|-------|-----------| @@ -1189,4 +1679,4 @@ describe('Concern 7: Nonce uniqueness', () => { | V3 | Feature | Manifest diff engine | ~180 LoC, ~8h | | V4 | Feature | CompressionPort + zstd/brotli/lz4 | ~180 LoC, ~8h | | V5 | Feature | Watch mode (continuous sync) | ~220 LoC, ~10h | -| V6 | Feature | Interactive passphrase prompt | ~90 LoC, ~4h | +| V6 | Feature | Interactive passphrase prompt | ~90 LoC, ~4h — subsumed by **16.11** | From 7dfbc0e57f125ba2fd36e28c72d4091e1216a794 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:30:44 -0800 Subject: [PATCH 03/41] fix(errors): guard Error.captureStackTrace for cross-runtime portability CasError now checks for Error.captureStackTrace before calling it. This prevents a TypeError on runtimes where captureStackTrace is not defined (e.g. Firefox, older Deno builds). 
Task: 16.8 --- CHANGELOG.md | 3 ++ src/domain/errors/CasError.js | 2 +- test/unit/domain/errors/CasError.test.js | 37 ++++++++++++++++++++++++ 3 files changed, 41 insertions(+), 1 deletion(-) create mode 100644 test/unit/domain/errors/CasError.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 022ed25..8ab9cd1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10). - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. +### Fixed +- **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). + ## [5.2.4] — Prism polish (2026-03-03) ### Fixed diff --git a/src/domain/errors/CasError.js b/src/domain/errors/CasError.js index 6acc1da..9120c95 100644 --- a/src/domain/errors/CasError.js +++ b/src/domain/errors/CasError.js @@ -15,6 +15,6 @@ export default class CasError extends Error { this.name = this.constructor.name; this.code = code; this.meta = meta; - Error.captureStackTrace(this, this.constructor); + if (Error.captureStackTrace) Error.captureStackTrace(this, this.constructor); } } diff --git a/test/unit/domain/errors/CasError.test.js b/test/unit/domain/errors/CasError.test.js new file mode 100644 index 0000000..eca298f --- /dev/null +++ b/test/unit/domain/errors/CasError.test.js @@ -0,0 +1,37 @@ +import { describe, it, expect, vi, afterEach } from 'vitest'; +import CasError from '../../../../src/domain/errors/CasError.js'; + +describe('CasError', () => { + it('sets name, code, and meta properties', () => { + const err = new CasError('boom', 'TEST_CODE', { foo: 'bar' }); + 
expect(err.name).toBe('CasError'); + expect(err.message).toBe('boom'); + expect(err.code).toBe('TEST_CODE'); + expect(err.meta).toEqual({ foo: 'bar' }); + }); + + it('defaults meta to empty object', () => { + const err = new CasError('msg', 'CODE'); + expect(err.meta).toEqual({}); + }); + + it('is an instance of Error', () => { + const err = new CasError('msg', 'CODE'); + expect(err).toBeInstanceOf(Error); + }); + + it('constructs correctly when Error.captureStackTrace is unavailable', () => { + const original = Error.captureStackTrace; + Error.captureStackTrace = undefined; + try { + const err = new CasError('no-stack', 'NO_STACK', { x: 1 }); + expect(err.name).toBe('CasError'); + expect(err.code).toBe('NO_STACK'); + expect(err.meta).toEqual({ x: 1 }); + expect(err.message).toBe('no-stack'); + expect(err).toBeInstanceOf(Error); + } finally { + Error.captureStackTrace = original; + } + }); +}); From c713d63d0d967bdbceb4f5f6e2eb9cb820e712e1 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:31:28 -0800 Subject: [PATCH 04/41] =?UTF-8?q?chore(hooks):=20rename=20git-hooks=20?= =?UTF-8?q?=E2=86=92=20hooks,=20add=20pre-commit=20lint=20gate?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - git mv scripts/git-hooks/pre-push → scripts/hooks/pre-push - New scripts/hooks/pre-commit runs `pnpm run lint` - install-hooks.sh updated to point at scripts/hooks/ Task: 16.9 --- CHANGELOG.md | 1 + scripts/hooks/pre-commit | 13 +++++++++++++ scripts/{git-hooks => hooks}/pre-push | 0 scripts/install-hooks.sh | 2 +- 4 files changed, 15 insertions(+), 1 deletion(-) create mode 100755 scripts/hooks/pre-commit rename scripts/{git-hooks => hooks}/pre-push (100%) diff --git a/CHANGELOG.md b/CHANGELOG.md index 8ab9cd1..fb8e706 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,6 +15,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed - **16.8 — CasError portability guard** — 
`Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). +- **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/scripts/hooks/pre-commit b/scripts/hooks/pre-commit new file mode 100755 index 0000000..d5e25a7 --- /dev/null +++ b/scripts/hooks/pre-commit @@ -0,0 +1,13 @@ +#!/usr/bin/env bash + +# pre-commit git hook +# Lint must pass cleanly. Zero errors, zero warnings. + +set -e + +echo "Running pre-commit lint gate..." + +echo "→ Linting..." +pnpm run lint + +echo "✅ Lint passed." diff --git a/scripts/git-hooks/pre-push b/scripts/hooks/pre-push similarity index 100% rename from scripts/git-hooks/pre-push rename to scripts/hooks/pre-push diff --git a/scripts/install-hooks.sh b/scripts/install-hooks.sh index fe569e9..567f8d9 100644 --- a/scripts/install-hooks.sh +++ b/scripts/install-hooks.sh @@ -6,7 +6,7 @@ set -e SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -HOOKS_DIR="${SCRIPT_DIR}/git-hooks" +HOOKS_DIR="${SCRIPT_DIR}/hooks" # Make all hooks executable chmod +x "${HOOKS_DIR}"/* From 52405368896bb669be1c08a6746e81cc2c9140e0 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:32:01 -0800 Subject: [PATCH 05/41] fix(lint): fix curly brace and unused imports from 16.8/16.9 - CasError.js: add curly braces around captureStackTrace guard - CasError.test.js: remove unused vi and afterEach imports --- src/domain/errors/CasError.js | 4 +++- test/unit/domain/errors/CasError.test.js | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/domain/errors/CasError.js b/src/domain/errors/CasError.js index 9120c95..54f9ba3 100644 --- a/src/domain/errors/CasError.js +++ b/src/domain/errors/CasError.js @@ -15,6 +15,8 @@ export 
default class CasError extends Error { this.name = this.constructor.name; this.code = code; this.meta = meta; - if (Error.captureStackTrace) Error.captureStackTrace(this, this.constructor); + if (Error.captureStackTrace) { + Error.captureStackTrace(this, this.constructor); + } } } diff --git a/test/unit/domain/errors/CasError.test.js b/test/unit/domain/errors/CasError.test.js index eca298f..9f7fb99 100644 --- a/test/unit/domain/errors/CasError.test.js +++ b/test/unit/domain/errors/CasError.test.js @@ -1,4 +1,4 @@ -import { describe, it, expect, vi, afterEach } from 'vitest'; +import { describe, it, expect } from 'vitest'; import CasError from '../../../../src/domain/errors/CasError.js'; describe('CasError', () => { From 8163d46984cddca20d1a51890620b3d6fa20499d Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:34:06 -0800 Subject: [PATCH 06/41] fix(crypto): normalize adapter behavioral contracts across runtimes - NodeCryptoAdapter.encryptBuffer() now async (returns Promise) - decryptBuffer() validates key on Node and Web adapters - NodeCryptoAdapter.createEncryptionStream finalize() guards with STREAM_NOT_CONSUMED before stream consumption - New CryptoAdapter.conformance.test.js asserts identical contracts Task: 16.1 --- CHANGELOG.md | 1 + .../adapters/NodeCryptoAdapter.js | 12 +++- .../adapters/WebCryptoAdapter.js | 1 + .../CryptoAdapter.conformance.test.js | 72 +++++++++++++++++++ 4 files changed, 85 insertions(+), 1 deletion(-) create mode 100644 test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index fb8e706..3706200 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -16,6 +16,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). 
- **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. +- **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/infrastructure/adapters/NodeCryptoAdapter.js b/src/infrastructure/adapters/NodeCryptoAdapter.js index f89898c..c333f76 100644 --- a/src/infrastructure/adapters/NodeCryptoAdapter.js +++ b/src/infrastructure/adapters/NodeCryptoAdapter.js @@ -1,6 +1,7 @@ import { createHash, createCipheriv, createDecipheriv, randomBytes, pbkdf2, scrypt } from 'node:crypto'; import { promisify } from 'node:util'; import CryptoPort from '../../ports/CryptoPort.js'; +import CasError from '../../domain/errors/CasError.js'; /** * Node.js implementation of CryptoPort using node:crypto. @@ -30,7 +31,7 @@ export default class NodeCryptoAdapter extends CryptoPort { * @param {Buffer|Uint8Array} key - 32-byte encryption key. 
- * @returns {{ buf: Buffer, meta: import('../../ports/CryptoPort.js').EncryptionMeta }}
+ * @returns {Promise<{ buf: Buffer, meta: import('../../ports/CryptoPort.js').EncryptionMeta }>}
 */
- encryptBuffer(buffer, key) {
+ async encryptBuffer(buffer, key) {
 this._validateKey(key);
 const nonce = randomBytes(12);
 const cipher = createCipheriv('aes-256-gcm', key, nonce);
@@ -50,6 +51,7 @@
 * @returns {Buffer}
 */
 decryptBuffer(buffer, key, meta) {
+ this._validateKey(key);
 const nonce = Buffer.from(meta.nonce, 'base64');
 const tag = Buffer.from(meta.tag, 'base64');
 const decipher = createDecipheriv('aes-256-gcm', key, nonce);
@@ -66,6 +68,7 @@
 this._validateKey(key);
 const nonce = randomBytes(12);
 const cipher = createCipheriv('aes-256-gcm', key, nonce);
+ let streamFinalized = false;
 /** @param {AsyncIterable} source */
 const encrypt = async function* (source) {
@@ -79,9 +82,16 @@
 if (final.length > 0) {
 yield final;
 }
+ streamFinalized = true;
 };
 const finalize = () => {
+ if (!streamFinalized) {
+ throw new CasError(
+ 'Cannot finalize before the encrypt stream is fully consumed',
+ 'STREAM_NOT_CONSUMED',
+ );
+ }
 const tag = cipher.getAuthTag();
 return this._buildMeta(nonce.toString('base64'), tag.toString('base64'));
 };
diff --git a/src/infrastructure/adapters/WebCryptoAdapter.js b/src/infrastructure/adapters/WebCryptoAdapter.js
index 5a70733..310da32 100644
--- a/src/infrastructure/adapters/WebCryptoAdapter.js
+++ b/src/infrastructure/adapters/WebCryptoAdapter.js
@@ -73,6 +73,7 @@ export default class WebCryptoAdapter extends CryptoPort {
 * @returns {Promise}
 */
 async decryptBuffer(buffer, key, meta) {
+ this._validateKey(key);
 const nonce = this.#fromBase64(meta.nonce);
 const tag = this.#fromBase64(meta.tag);
 const cryptoKey = await this.#importKey(key);
diff --git a/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js
b/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js new file mode 100644 index 0000000..8e45d14 --- /dev/null +++ b/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js @@ -0,0 +1,72 @@ +import { describe, it, expect } from 'vitest'; +import NodeCryptoAdapter from '../../../../src/infrastructure/adapters/NodeCryptoAdapter.js'; +import WebCryptoAdapter from '../../../../src/infrastructure/adapters/WebCryptoAdapter.js'; +import CasError from '../../../../src/domain/errors/CasError.js'; + +/** + * Conformance test suite that asserts identical behavioral contracts across + * all crypto adapters that can run in the current environment. + */ + +const adapters = [ + ['NodeCryptoAdapter', new NodeCryptoAdapter()], + ['WebCryptoAdapter', new WebCryptoAdapter()], +]; + +// BunCryptoAdapter is only available in Bun runtime — skip in Node/Deno +if (typeof globalThis.Bun !== 'undefined') { + const { default: BunCryptoAdapter } = await import( + '../../../../src/infrastructure/adapters/BunCryptoAdapter.js' + ); + adapters.push(['BunCryptoAdapter', new BunCryptoAdapter()]); +} + +describe.each(adapters)('%s conformance', (_name, adapter) => { + const key = Buffer.alloc(32, 0xab); + + it('encryptBuffer returns a Promise (thenable)', async () => { + const result = adapter.encryptBuffer(Buffer.from('hello'), key); + expect(typeof result.then).toBe('function'); + const { buf, meta } = await result; + expect(buf).toBeInstanceOf(Buffer); + expect(meta.encrypted).toBe(true); + }); + + it('decryptBuffer rejects INVALID_KEY_TYPE for string key', async () => { + const { buf, meta } = await adapter.encryptBuffer(Buffer.from('test'), key); + await expect( + Promise.resolve().then(() => adapter.decryptBuffer(buf, 'not-a-buffer', meta)), + ).rejects.toThrow(CasError); + + try { + await Promise.resolve().then(() => adapter.decryptBuffer(buf, 'not-a-buffer', meta)); + } catch (err) { + expect(err.code).toBe('INVALID_KEY_TYPE'); + } + }); + + it('decryptBuffer 
rejects INVALID_KEY_LENGTH for 16-byte key', async () => { + const shortKey = Buffer.alloc(16, 0xcc); + const { buf, meta } = await adapter.encryptBuffer(Buffer.from('test'), key); + await expect( + Promise.resolve().then(() => adapter.decryptBuffer(buf, shortKey, meta)), + ).rejects.toThrow(CasError); + + try { + await Promise.resolve().then(() => adapter.decryptBuffer(buf, shortKey, meta)); + } catch (err) { + expect(err.code).toBe('INVALID_KEY_LENGTH'); + } + }); + + it('createEncryptionStream.finalize() throws STREAM_NOT_CONSUMED before consumption', () => { + const { finalize } = adapter.createEncryptionStream(key); + expect(() => finalize()).toThrow(CasError); + + try { + finalize(); + } catch (err) { + expect(err.code).toBe('STREAM_NOT_CONSUMED'); + } + }); +}); From ea4eeae470e1019048f909c3e503465e3ed0646d Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:37:10 -0800 Subject: [PATCH 07/41] feat(restore): add maxRestoreBufferSize guard for buffered restore CasService._restoreBuffered now checks total chunk size against maxRestoreBufferSize (default 512 MiB) before buffering encrypted or compressed data. Throws RESTORE_TOO_LARGE with {size, limit} meta when exceeded. Unencrypted streaming restore is unaffected. Task: 16.2 --- CHANGELOG.md | 1 + index.d.ts | 2 + index.js | 5 +- src/domain/services/CasService.js | 33 ++++- .../services/CasService.restoreGuard.test.js | 134 ++++++++++++++++++ 5 files changed, 166 insertions(+), 9 deletions(-) create mode 100644 test/unit/domain/services/CasService.restoreGuard.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 3706200..504c2e1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -17,6 +17,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). 
- **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. - **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. +- **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/index.d.ts b/index.d.ts index c59de13..fe1b044 100644 --- a/index.d.ts +++ b/index.d.ts @@ -171,6 +171,8 @@ export interface ContentAddressableStoreOptions { concurrency?: number; chunking?: ChunkingConfig; chunker?: ChunkingPort; + /** Maximum bytes to buffer during encrypted/compressed restore. @default 536870912 (512 MiB) */ + maxRestoreBufferSize?: number; } /** A single vault entry. */ diff --git a/index.js b/index.js index 1a643fb..b26cd14 100644 --- a/index.js +++ b/index.js @@ -65,8 +65,8 @@ export default class ContentAddressableStore { * @param {{ strategy: string, chunkSize?: number, targetChunkSize?: number, minChunkSize?: number, maxChunkSize?: number }} [options.chunking] - Chunking strategy config. * @param {import('./src/ports/ChunkingPort.js').default} [options.chunker] - Pre-built ChunkingPort instance (advanced). 
*/ - constructor({ plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker }) { - this.#config = { plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker }; + constructor({ plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker, maxRestoreBufferSize }) { + this.#config = { plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker, maxRestoreBufferSize }; this.service = null; this.#servicePromise = null; } @@ -111,6 +111,7 @@ export default class ContentAddressableStore { merkleThreshold: cfg.merkleThreshold, concurrency: cfg.concurrency, chunker, + maxRestoreBufferSize: cfg.maxRestoreBufferSize, }); const ref = new GitRefAdapter({ diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 9d1370c..8c8df0b 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -35,11 +35,9 @@ export default class CasService { * @param {number} [options.concurrency=1] - Maximum parallel chunk I/O operations. * @param {import('../../ports/ChunkingPort.js').default} [options.chunker] - Chunking strategy (default FixedChunker). 
*/ - constructor({ persistence, codec, crypto, observability, chunkSize = 256 * 1024, merkleThreshold = 1000, concurrency = 1, chunker }) { + constructor({ persistence, codec, crypto, observability, chunkSize = 256 * 1024, merkleThreshold = 1000, concurrency = 1, chunker, maxRestoreBufferSize = 512 * 1024 * 1024 }) { CasService._validateObservability(observability); - if (chunkSize < 1024) { - throw new Error('Chunk size must be at least 1024 bytes'); - } + CasService.#validateConstructorArgs(chunkSize, merkleThreshold, concurrency); this.persistence = persistence; this.codec = codec; this.crypto = crypto; @@ -47,15 +45,26 @@ export default class CasService { this.chunkSize = chunkSize; /** @type {import('../../ports/ChunkingPort.js').default} */ this.chunker = chunker || new FixedChunker({ chunkSize }); + this.merkleThreshold = merkleThreshold; + this.concurrency = concurrency; + this.maxRestoreBufferSize = maxRestoreBufferSize; + this.#keyResolver = new KeyResolver(crypto); + } + + /** + * Validates constructor numeric arguments. + * @private + */ + static #validateConstructorArgs(chunkSize, merkleThreshold, concurrency) { + if (chunkSize < 1024) { + throw new Error('Chunk size must be at least 1024 bytes'); + } if (!Number.isInteger(merkleThreshold) || merkleThreshold < 1) { throw new Error('Merkle threshold must be a positive integer'); } - this.merkleThreshold = merkleThreshold; if (!Number.isInteger(concurrency) || concurrency < 1) { throw new Error('Concurrency must be a positive integer'); } - this.concurrency = concurrency; - this.#keyResolver = new KeyResolver(crypto); } /** @@ -469,6 +478,16 @@ export default class CasService { * @private */ async *_restoreBuffered(manifest, key) { + const totalSize = manifest.chunks.reduce((acc, c) => acc + c.size, 0); + if (totalSize > this.maxRestoreBufferSize) { + throw new CasError( + `Encrypted/compressed restore would buffer ${totalSize} bytes ` + + `(limit: ${this.maxRestoreBufferSize}). 
Increase maxRestoreBufferSize ` + + 'or store without encryption.', + 'RESTORE_TOO_LARGE', + { size: totalSize, limit: this.maxRestoreBufferSize }, + ); + } let buffer = Buffer.concat(await this._readAndVerifyChunks(manifest.chunks)); if (manifest.encryption?.encrypted) { diff --git a/test/unit/domain/services/CasService.restoreGuard.test.js b/test/unit/domain/services/CasService.restoreGuard.test.js new file mode 100644 index 0000000..96f9dfd --- /dev/null +++ b/test/unit/domain/services/CasService.restoreGuard.test.js @@ -0,0 +1,134 @@ +import { describe, it, expect, vi } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; +import CasError from '../../../../src/domain/errors/CasError.js'; +import SilentObserver from '../../../../src/infrastructure/adapters/SilentObserver.js'; +import Manifest from '../../../../src/domain/value-objects/Manifest.js'; + +const testCrypto = await getTestCryptoAdapter(); + +function setup({ maxRestoreBufferSize } = {}) { + const mockPersistence = { + writeBlob: vi.fn().mockResolvedValue('mock-blob-oid'), + writeTree: vi.fn().mockResolvedValue('mock-tree-oid'), + readBlob: vi.fn().mockResolvedValue(Buffer.alloc(1024, 0xaa)), + readTree: vi.fn(), + }; + const opts = { + persistence: mockPersistence, + crypto: testCrypto, + codec: new JsonCodec(), + chunkSize: 1024, + observability: new SilentObserver(), + }; + if (maxRestoreBufferSize !== undefined) { + opts.maxRestoreBufferSize = maxRestoreBufferSize; + } + const service = new CasService(opts); + return { mockPersistence, service }; +} + +function makeEncryptedManifest(chunkSizes) { + const chunks = chunkSizes.map((size, i) => ({ + index: i, + size, + digest: 'a'.repeat(64), + blob: `blob-${i}`, + })); + return new Manifest({ + slug: 'test', + filename: 'test.bin', + size: chunkSizes.reduce((a, b) => 
a + b, 0), + chunks, + encryption: { + algorithm: 'aes-256-gcm', + nonce: Buffer.alloc(12).toString('base64'), + tag: Buffer.alloc(16).toString('base64'), + encrypted: true, + }, + }); +} + +describe('CasService — RESTORE_TOO_LARGE throws on exceed', () => { + it('throws RESTORE_TOO_LARGE when chunk sizes exceed limit', async () => { + const { service } = setup({ maxRestoreBufferSize: 2000 }); + const manifest = makeEncryptedManifest([1024, 1024, 1024]); + + await expect( + service.restoreStream({ manifest, encryptionKey: Buffer.alloc(32, 0xab) }).next(), + ).rejects.toThrow(CasError); + + try { + await service.restoreStream({ manifest, encryptionKey: Buffer.alloc(32, 0xab) }).next(); + } catch (err) { + expect(err.code).toBe('RESTORE_TOO_LARGE'); + expect(err.meta.size).toBe(3072); + expect(err.meta.limit).toBe(2000); + } + }); +}); + +describe('CasService — RESTORE_TOO_LARGE succeeds within limit', () => { + it('succeeds when within limit', async () => { + const { service, mockPersistence } = setup({ maxRestoreBufferSize: 4096 }); + const key = Buffer.alloc(32, 0xab); + + async function* source() { yield Buffer.alloc(512, 0xaa); } + const manifest = await service.store({ source: source(), slug: 'ok', filename: 'ok.bin', encryptionKey: key }); + + const storedBlobArgs = mockPersistence.writeBlob.mock.calls.map((c) => c[0]); + let blobIdx = 0; + mockPersistence.readBlob.mockImplementation(() => Promise.resolve(storedBlobArgs[blobIdx++] || Buffer.alloc(0))); + + const chunks = []; + for await (const chunk of service.restoreStream({ manifest, encryptionKey: key })) { + chunks.push(chunk); + } + expect(chunks.length).toBeGreaterThan(0); + }); +}); + +describe('CasService — RESTORE_TOO_LARGE defaults and meta', () => { + it('default maxRestoreBufferSize is 512 MiB', () => { + const { service } = setup(); + expect(service.maxRestoreBufferSize).toBe(512 * 1024 * 1024); + }); + + it('error meta includes size and limit', async () => { + const { service } = setup({ 
maxRestoreBufferSize: 100 }); + const manifest = makeEncryptedManifest([50, 60]); + + try { + await service.restoreStream({ manifest, encryptionKey: Buffer.alloc(32, 0xab) }).next(); + } catch (err) { + expect(err.code).toBe('RESTORE_TOO_LARGE'); + expect(err.meta).toHaveProperty('size', 110); + expect(err.meta).toHaveProperty('limit', 100); + } + }); +}); + +describe('CasService — RESTORE_TOO_LARGE does not affect streaming', () => { + it('does not apply to unencrypted/uncompressed restoreStream', async () => { + const { service, mockPersistence } = setup({ maxRestoreBufferSize: 10 }); + const manifest = new Manifest({ + slug: 'plain', + filename: 'plain.bin', + size: 2048, + chunks: [ + { index: 0, size: 1024, digest: 'a'.repeat(64), blob: 'blob-0' }, + { index: 1, size: 1024, digest: 'a'.repeat(64), blob: 'blob-1' }, + ], + }); + + mockPersistence.readBlob.mockResolvedValue(Buffer.alloc(1024, 0xcc)); + service._sha256 = vi.fn().mockResolvedValue('a'.repeat(64)); + + const chunks = []; + for await (const chunk of service.restoreStream({ manifest })) { + chunks.push(chunk); + } + expect(chunks).toHaveLength(2); + }); +}); From 23ce968980925dd0f5e61b2469620b0a9d3da456 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:39:18 -0800 Subject: [PATCH 08/41] feat(cli): add passphrase file input and interactive TTY prompt MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - New bin/ui/passphrase-prompt.js with promptPassphrase() and readPassphraseFile() - --vault-passphrase-file option on store, restore, vault init - resolvePassphrase() now async with priority chain: file → flag → env → TTY prompt → undefined - vault init uses confirm: true for passphrase confirmation Task: 16.11 --- CHANGELOG.md | 1 + bin/git-cas.js | 35 ++++++++++--- bin/ui/passphrase-prompt.js | 66 +++++++++++++++++++++++++ test/unit/cli/passphrase-prompt.test.js | 31 ++++++++++++ 4 files changed, 126 insertions(+), 7 deletions(-) create mode 
100644 bin/ui/passphrase-prompt.js create mode 100644 test/unit/cli/passphrase-prompt.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 504c2e1..e913574 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. - **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. - **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. +- **16.11 — Passphrase input security** — New `--vault-passphrase-file ` CLI option reads passphrase from file (use `-` for stdin). Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. 
## [5.2.4] — Prism polish (2026-03-03) diff --git a/bin/git-cas.js b/bin/git-cas.js index 5e0f30d..631de09 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -12,6 +12,7 @@ import { renderManifestView } from './ui/manifest-view.js'; import { renderHeatmap } from './ui/heatmap.js'; import { runAction } from './actions.js'; import { filterEntries, formatTable, formatTabSeparated } from './ui/vault-list.js'; +import { readPassphraseFile, promptPassphrase } from './ui/passphrase-prompt.js'; const getJson = () => program.opts().json; @@ -75,13 +76,30 @@ async function deriveVaultKey(cas, metadata, passphrase) { } /** - * Resolve passphrase from --vault-passphrase flag or GIT_CAS_PASSPHRASE env var. + * Resolve passphrase from (in priority order): + * 1. --vault-passphrase-file + * 2. --vault-passphrase + * 3. GIT_CAS_PASSPHRASE env var + * 4. Interactive TTY prompt (if stdin is a TTY) * * @param {Record} opts - * @returns {string | undefined} + * @param {{ confirm?: boolean }} [extra] + * @returns {Promise} */ -function resolvePassphrase(opts) { - return opts.vaultPassphrase ?? 
process.env.GIT_CAS_PASSPHRASE; +async function resolvePassphrase(opts, extra = {}) { + if (opts.vaultPassphraseFile) { + return await readPassphraseFile(opts.vaultPassphraseFile); + } + if (opts.vaultPassphrase) { + return opts.vaultPassphrase; + } + if (process.env.GIT_CAS_PASSPHRASE) { + return process.env.GIT_CAS_PASSPHRASE; + } + if (process.stdin.isTTY) { + return await promptPassphrase({ confirm: extra.confirm || false }); + } + return undefined; } /** @@ -95,7 +113,7 @@ async function resolveEncryptionKey(cas, opts) { if (opts.keyFile) { return readKeyFile(opts.keyFile); } - const passphrase = resolvePassphrase(opts); + const passphrase = await resolvePassphrase(opts); if (!passphrase) { return undefined; } @@ -186,9 +204,10 @@ program .option('--tree', 'Also create a Git tree and print its OID') .option('--force', 'Overwrite existing vault entry') .option('--vault-passphrase ', 'Vault-level passphrase for encryption (prefer GIT_CAS_PASSPHRASE env var)') + .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {string} */ file, /** @type {Record} */ opts) => { - if (opts.recipient && (opts.keyFile || resolvePassphrase(opts))) { + if (opts.recipient && (opts.keyFile || await resolvePassphrase(opts))) { throw new Error('Provide --key-file/--vault-passphrase or --recipient, not both'); } if (opts.force && !opts.tree) { @@ -275,6 +294,7 @@ program .option('--oid ', 'Direct tree OID') .option('--key-file ', 'Path to 32-byte raw encryption key file') .option('--vault-passphrase ', 'Vault-level passphrase for decryption (prefer GIT_CAS_PASSPHRASE env var)') + .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { validateRestoreFlags(opts); @@ -345,13 +365,14 @@ vault .command('init') .description('Initialize 
the vault') .option('--vault-passphrase ', 'Passphrase for vault-level encryption (prefer GIT_CAS_PASSPHRASE env var)') + .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') .option('--algorithm ', 'KDF algorithm (pbkdf2 or scrypt)', 'pbkdf2') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { const cas = createCas(opts.cwd); /** @type {{ passphrase?: string, kdfOptions?: { algorithm: 'pbkdf2' | 'scrypt' } }} */ const initOpts = {}; - const passphrase = resolvePassphrase(opts); + const passphrase = await resolvePassphrase(opts, { confirm: true }); if (passphrase) { initOpts.passphrase = passphrase; initOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; diff --git a/bin/ui/passphrase-prompt.js b/bin/ui/passphrase-prompt.js new file mode 100644 index 0000000..ce64ae3 --- /dev/null +++ b/bin/ui/passphrase-prompt.js @@ -0,0 +1,66 @@ +import { createInterface } from 'node:readline'; +import { readFile } from 'node:fs/promises'; + +/** + * Prompts for a passphrase on stderr with echo disabled. + * + * @param {Object} [options] + * @param {boolean} [options.confirm=false] - Require confirmation (ask twice). + * @returns {Promise} + */ +export async function promptPassphrase({ confirm = false } = {}) { + if (!process.stdin.isTTY) { + throw new Error( + 'Cannot prompt for passphrase: stdin is not a TTY. ' + + 'Use --vault-passphrase-file or GIT_CAS_PASSPHRASE.', + ); + } + const pass = await readHidden('Passphrase: '); + if (confirm) { + const pass2 = await readHidden('Confirm passphrase: '); + if (pass !== pass2) { + throw new Error('Passphrases do not match'); + } + } + return pass; +} + +/** + * Reads a passphrase from a file path, or from stdin when path is '-'. + * + * @param {string} filePath - File path, or '-' for stdin. 
+ * @returns {Promise} + */ +export async function readPassphraseFile(filePath) { + if (filePath === '-') { + const chunks = []; + for await (const chunk of process.stdin) { + chunks.push(chunk); + } + return Buffer.concat(chunks).toString('utf8').replace(/\n$/, ''); + } + const content = await readFile(filePath, 'utf8'); + return content.replace(/\n$/, ''); +} + +/** + * Reads a line with echo disabled. + * @param {string} prompt - Prompt text. + * @returns {Promise} + */ +function readHidden(prompt) { + return new Promise((resolve) => { + const rl = createInterface({ + input: process.stdin, + output: process.stderr, + terminal: true, + }); + process.stderr.write(prompt); + rl.question('', (answer) => { + rl.close(); + process.stderr.write('\n'); + resolve(answer); + }); + rl._writeToOutput = () => {}; + }); +} diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js new file mode 100644 index 0000000..fd97e76 --- /dev/null +++ b/test/unit/cli/passphrase-prompt.test.js @@ -0,0 +1,31 @@ +import { describe, it, expect, afterEach } from 'vitest'; +import { writeFile, unlink } from 'node:fs/promises'; +import { tmpdir } from 'node:os'; +import { join } from 'node:path'; +import { readPassphraseFile } from '../../../bin/ui/passphrase-prompt.js'; + +describe('readPassphraseFile', () => { + const tmpPath = join(tmpdir(), `test-passphrase-${Date.now()}.txt`); + + afterEach(async () => { + try { await unlink(tmpPath); } catch { /* may not exist */ } + }); + + it('reads from file and trims trailing newline', async () => { + await writeFile(tmpPath, 'my-secret\n', 'utf8'); + const result = await readPassphraseFile(tmpPath); + expect(result).toBe('my-secret'); + }); + + it('preserves content without trailing newline', async () => { + await writeFile(tmpPath, 'no-newline', 'utf8'); + const result = await readPassphraseFile(tmpPath); + expect(result).toBe('no-newline'); + }); + + it('preserves internal newlines', async () => { + await 
writeFile(tmpPath, 'line1\nline2\n', 'utf8'); + const result = await readPassphraseFile(tmpPath); + expect(result).toBe('line1\nline2'); + }); +}); From cf915483af629a5cb39cc716c62ab274077bf099 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:40:39 -0800 Subject: [PATCH 09/41] feat(chunking): enforce 100 MiB upper bound on chunk size CasService, FixedChunker, and CdcChunker constructors now throw when chunk size exceeds 100 MiB. CasService warns when > 10 MiB. Prevents accidental creation of excessively large blobs. Task: 16.6 --- CHANGELOG.md | 1 + src/domain/services/CasService.js | 7 +++ src/infrastructure/chunkers/CdcChunker.js | 5 +++ src/infrastructure/chunkers/FixedChunker.js | 5 +++ .../CasService.chunkSizeBound.test.js | 44 +++++++++++++++++++ .../chunkers/ChunkerBounds.test.js | 35 +++++++++++++++ 6 files changed, 97 insertions(+) create mode 100644 test/unit/domain/services/CasService.chunkSizeBound.test.js create mode 100644 test/unit/infrastructure/chunkers/ChunkerBounds.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index e913574..55fb76b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -19,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. - **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. - **16.11 — Passphrase input security** — New `--vault-passphrase-file ` CLI option reads passphrase from file (use `-` for stdin). 
Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. +- **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 8c8df0b..7d0ec03 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -43,6 +43,9 @@ export default class CasService { this.crypto = crypto; this.observability = observability; this.chunkSize = chunkSize; + if (chunkSize > 10 * 1024 * 1024) { + observability.log('warn', `Chunk size ${chunkSize} exceeds 10 MiB — consider a smaller value`, { chunkSize }); + } /** @type {import('../../ports/ChunkingPort.js').default} */ this.chunker = chunker || new FixedChunker({ chunkSize }); this.merkleThreshold = merkleThreshold; @@ -59,6 +62,10 @@ export default class CasService { if (chunkSize < 1024) { throw new Error('Chunk size must be at least 1024 bytes'); } + const MAX_CHUNK_SIZE = 100 * 1024 * 1024; + if (chunkSize > MAX_CHUNK_SIZE) { + throw new Error(`Chunk size must not exceed ${MAX_CHUNK_SIZE} bytes (100 MiB)`); + } if (!Number.isInteger(merkleThreshold) || merkleThreshold < 1) { throw new Error('Merkle threshold must be a positive integer'); } diff --git a/src/infrastructure/chunkers/CdcChunker.js b/src/infrastructure/chunkers/CdcChunker.js index 0eaac3d..536f65c 100644 --- a/src/infrastructure/chunkers/CdcChunker.js +++ b/src/infrastructure/chunkers/CdcChunker.js @@ -277,6 +277,11 @@ export default class CdcChunker extends ChunkingPort { `targetChunkSize (${targetChunkSize}) must be in [${minChunkSize}, ${maxChunkSize}]`, ); } + if (maxChunkSize > 100 * 1024 * 1024) { + throw new RangeError( + `maxChunkSize must not exceed 104857600 bytes (100 MiB), got 
${maxChunkSize}`, + ); + } this.#minChunkSize = minChunkSize; this.#maxChunkSize = maxChunkSize; diff --git a/src/infrastructure/chunkers/FixedChunker.js b/src/infrastructure/chunkers/FixedChunker.js index 1477e18..69b8e7a 100644 --- a/src/infrastructure/chunkers/FixedChunker.js +++ b/src/infrastructure/chunkers/FixedChunker.js @@ -17,6 +17,11 @@ export default class FixedChunker extends ChunkingPort { */ constructor({ chunkSize = 262144 } = {}) { super(); + if (chunkSize > 100 * 1024 * 1024) { + throw new RangeError( + `Chunk size must not exceed 104857600 bytes (100 MiB), got ${chunkSize}`, + ); + } this.#chunkSize = chunkSize; } diff --git a/test/unit/domain/services/CasService.chunkSizeBound.test.js b/test/unit/domain/services/CasService.chunkSizeBound.test.js new file mode 100644 index 0000000..05d6c9b --- /dev/null +++ b/test/unit/domain/services/CasService.chunkSizeBound.test.js @@ -0,0 +1,44 @@ +import { describe, it, expect, vi } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; +import SilentObserver from '../../../../src/infrastructure/adapters/SilentObserver.js'; + +const testCrypto = await getTestCryptoAdapter(); + +const MiB = 1024 * 1024; + +function makeService(chunkSize, observability) { + return new CasService({ + persistence: { writeBlob: vi.fn(), writeTree: vi.fn(), readBlob: vi.fn() }, + crypto: testCrypto, + codec: new JsonCodec(), + chunkSize, + observability: observability || new SilentObserver(), + }); +} + +describe('CasService — chunk size upper bound', () => { + it('throws when chunkSize > 100 MiB', () => { + expect(() => makeService(100 * MiB + 1)).toThrow(/must not exceed/i); + }); + + it('accepts exactly 100 MiB', () => { + const service = makeService(100 * MiB); + expect(service.chunkSize).toBe(100 * MiB); + }); + + it('warns when chunkSize > 10 
MiB', () => { + const observability = { + metric: vi.fn(), + log: vi.fn(), + span: vi.fn().mockReturnValue({ end: vi.fn() }), + }; + makeService(11 * MiB, observability); + expect(observability.log).toHaveBeenCalledWith( + 'warn', + expect.stringContaining('exceeds 10 MiB'), + expect.objectContaining({ chunkSize: 11 * MiB }), + ); + }); +}); diff --git a/test/unit/infrastructure/chunkers/ChunkerBounds.test.js b/test/unit/infrastructure/chunkers/ChunkerBounds.test.js new file mode 100644 index 0000000..7a86559 --- /dev/null +++ b/test/unit/infrastructure/chunkers/ChunkerBounds.test.js @@ -0,0 +1,35 @@ +import { describe, it, expect } from 'vitest'; +import FixedChunker from '../../../../src/infrastructure/chunkers/FixedChunker.js'; +import CdcChunker from '../../../../src/infrastructure/chunkers/CdcChunker.js'; + +const MiB = 1024 * 1024; + +describe('FixedChunker — chunk size upper bound', () => { + it('throws when chunkSize > 100 MiB', () => { + expect(() => new FixedChunker({ chunkSize: 100 * MiB + 1 })).toThrow(RangeError); + }); + + it('accepts exactly 100 MiB', () => { + const chunker = new FixedChunker({ chunkSize: 100 * MiB }); + expect(chunker.params.chunkSize).toBe(100 * MiB); + }); +}); + +describe('CdcChunker — chunk size upper bound', () => { + it('throws when maxChunkSize > 100 MiB', () => { + expect(() => new CdcChunker({ + maxChunkSize: 100 * MiB + 1, + minChunkSize: 1024, + targetChunkSize: 50 * MiB, + })).toThrow(RangeError); + }); + + it('accepts exactly 100 MiB as maxChunkSize', () => { + const chunker = new CdcChunker({ + maxChunkSize: 100 * MiB, + minChunkSize: 1024, + targetChunkSize: 50 * MiB, + }); + expect(chunker.params.max).toBe(100 * MiB); + }); +}); From 314f36f0280cc40195b016ada17f646530877a64 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:42:56 -0800 Subject: [PATCH 10/41] feat(crypto): add encryption buffer guard to WebCryptoAdapter WebCryptoAdapter now accepts maxEncryptionBufferSize (default 512 MiB). 
Streaming encryption throws ENCRYPTION_BUFFER_EXCEEDED when the accumulated plaintext exceeds the limit, since Web Crypto AES-GCM is a one-shot API that buffers all data. Refactored encrypt generator into a static private method. Task: 16.3 --- CHANGELOG.md | 1 + .../adapters/WebCryptoAdapter.js | 77 ++++++++++++------- .../WebCryptoAdapter.bufferGuard.test.js | 67 ++++++++++++++++ 3 files changed, 116 insertions(+), 29 deletions(-) create mode 100644 test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 55fb76b..16b346b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -20,6 +20,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. - **16.11 — Passphrase input security** — New `--vault-passphrase-file ` CLI option reads passphrase from file (use `-` for stdin). Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. +- **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. 
## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/infrastructure/adapters/WebCryptoAdapter.js b/src/infrastructure/adapters/WebCryptoAdapter.js index 310da32..d46ae92 100644 --- a/src/infrastructure/adapters/WebCryptoAdapter.js +++ b/src/infrastructure/adapters/WebCryptoAdapter.js @@ -9,6 +9,18 @@ import CasError from '../../domain/errors/CasError.js'; * AES-GCM is a one-shot API (the GCM tag is computed over the entire plaintext). */ export default class WebCryptoAdapter extends CryptoPort { + /** @type {number} */ + #maxEncryptionBufferSize; + + /** + * @param {Object} [options] + * @param {number} [options.maxEncryptionBufferSize=536870912] - Max bytes to buffer during streaming encryption (default 512 MiB). + */ + constructor({ maxEncryptionBufferSize = 512 * 1024 * 1024 } = {}) { + super(); + this.#maxEncryptionBufferSize = maxEncryptionBufferSize; + } + /** * @override * @param {Buffer|Uint8Array} buf - Data to hash. @@ -105,49 +117,56 @@ export default class WebCryptoAdapter extends CryptoPort { this._validateKey(key); const nonce = this.randomBytes(12); const cryptoKeyPromise = this.#importKey(key); + const maxBuf = this.#maxEncryptionBufferSize; + const state = { /** @type {Uint8Array|null} */ tag: null, consumed: false }; + + const encrypt = WebCryptoAdapter.#makeEncryptGenerator({ cryptoKeyPromise, nonce, maxBuf, state }); - // Web Crypto buffers all data for the one-shot AES-GCM call (GCM tag spans the whole plaintext). 
- /** @type {Buffer[]} */ - const chunks = []; - /** @type {Uint8Array|null} */ - let finalTag = null; - let streamConsumed = false; + const finalize = () => { + if (!state.consumed) { + throw new CasError('Cannot finalize before the encrypt stream is fully consumed', 'STREAM_NOT_CONSUMED'); + } + return this._buildMeta(this.#toBase64(nonce), this.#toBase64(/** @type {Uint8Array} */ (state.tag))); + }; + + return { encrypt, finalize }; + } - /** @param {AsyncIterable} source */ - const encrypt = async function* (source) { + /** + * Builds the encrypt async generator for createEncryptionStream. + * @param {{ cryptoKeyPromise: Promise, nonce: Buffer|Uint8Array, maxBuf: number, state: { tag: Uint8Array|null, consumed: boolean } }} ctx + * @returns {(source: AsyncIterable) => AsyncGenerator} + */ + static #makeEncryptGenerator({ cryptoKeyPromise, nonce, maxBuf, state }) { + return async function* (source) { + /** @type {Buffer[]} */ + const chunks = []; + let accumulatedBytes = 0; for await (const chunk of source) { + accumulatedBytes += chunk.length; + if (accumulatedBytes > maxBuf) { + throw new CasError( + `Streaming encryption buffered ${accumulatedBytes} bytes (limit: ${maxBuf}). ` + + 'Web Crypto AES-GCM buffers all data. 
Use Node.js/Bun or store without encryption for large files.', + 'ENCRYPTION_BUFFER_EXCEEDED', + { accumulated: accumulatedBytes, limit: maxBuf }, + ); + } chunks.push(chunk); } - const buffer = Buffer.concat(chunks); const cryptoKey = await cryptoKeyPromise; const encrypted = await globalThis.crypto.subtle.encrypt( // @ts-ignore -- Uint8Array satisfies BufferSource at runtime { name: 'AES-GCM', iv: /** @type {Uint8Array} */ (nonce) }, - cryptoKey, - buffer + cryptoKey, buffer, ); - const fullBuffer = new Uint8Array(encrypted); const tagLength = 16; - const ciphertext = fullBuffer.slice(0, -tagLength); - finalTag = fullBuffer.slice(-tagLength); - streamConsumed = true; - - yield Buffer.from(ciphertext); + state.tag = fullBuffer.slice(-tagLength); + state.consumed = true; + yield Buffer.from(fullBuffer.slice(0, -tagLength)); }; - - const finalize = () => { - if (!streamConsumed) { - throw new CasError( - 'Cannot finalize before the encrypt stream is fully consumed', - 'STREAM_NOT_CONSUMED', - ); - } - return this._buildMeta(this.#toBase64(nonce), this.#toBase64(/** @type {Uint8Array} */ (finalTag))); - }; - - return { encrypt, finalize }; } /** diff --git a/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js b/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js new file mode 100644 index 0000000..e22d010 --- /dev/null +++ b/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js @@ -0,0 +1,67 @@ +import { describe, it, expect } from 'vitest'; +import WebCryptoAdapter from '../../../../src/infrastructure/adapters/WebCryptoAdapter.js'; +import NodeCryptoAdapter from '../../../../src/infrastructure/adapters/NodeCryptoAdapter.js'; +import CasError from '../../../../src/domain/errors/CasError.js'; + +const key = Buffer.alloc(32, 0xab); + +async function* makeSource(totalBytes, chunkSize = 1024) { + let remaining = totalBytes; + while (remaining > 0) { + const size = Math.min(chunkSize, remaining); + yield 
Buffer.alloc(size, 0xcc); + remaining -= size; + } +} + +async function consumeStream(encrypt, source) { + const chunks = []; + for await (const chunk of encrypt(source)) { + chunks.push(chunk); + } + return chunks; +} + +describe('WebCryptoAdapter — ENCRYPTION_BUFFER_EXCEEDED', () => { + it('throws ENCRYPTION_BUFFER_EXCEEDED when data exceeds limit', async () => { + const adapter = new WebCryptoAdapter({ maxEncryptionBufferSize: 2000 }); + const { encrypt } = adapter.createEncryptionStream(key); + + await expect( + consumeStream(encrypt, makeSource(3000)), + ).rejects.toThrow(CasError); + + try { + const adapter2 = new WebCryptoAdapter({ maxEncryptionBufferSize: 2000 }); + const { encrypt: encrypt2 } = adapter2.createEncryptionStream(key); + await consumeStream(encrypt2, makeSource(3000)); + } catch (err) { + expect(err.code).toBe('ENCRYPTION_BUFFER_EXCEEDED'); + expect(err.meta.limit).toBe(2000); + } + }); + + it('succeeds within limit', async () => { + const adapter = new WebCryptoAdapter({ maxEncryptionBufferSize: 4096 }); + const { encrypt, finalize } = adapter.createEncryptionStream(key); + + const chunks = await consumeStream(encrypt, makeSource(1024)); + expect(chunks.length).toBeGreaterThan(0); + + const meta = finalize(); + expect(meta.encrypted).toBe(true); + }); +}); + +describe('NodeCryptoAdapter — no buffer guard for streaming', () => { + it('does NOT throw for same-size stream (true streaming)', async () => { + const adapter = new NodeCryptoAdapter(); + const { encrypt, finalize } = adapter.createEncryptionStream(key); + + const chunks = await consumeStream(encrypt, makeSource(3000)); + expect(chunks.length).toBeGreaterThan(0); + + const meta = finalize(); + expect(meta.encrypted).toBe(true); + }); +}); From 263c608c830c3f2d1ae671f8d32ecf81c12154b6 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:44:17 -0800 Subject: [PATCH 11/41] feat(store): warn when CDC chunking is combined with encryption MIME-Version: 1.0 Content-Type: 
text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CDC deduplication is ineffective with encryption since ciphertext is pseudorandom — content-defined boundaries provide no dedup benefit. CasService.store() now emits an observability warning for this case. Task: 16.5 --- CHANGELOG.md | 1 + src/domain/services/CasService.js | 7 ++ .../services/CasService.dedupWarning.test.js | 65 +++++++++++++++++++ 3 files changed, 73 insertions(+) create mode 100644 test/unit/domain/services/CasService.dedupWarning.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 16b346b..b6e9232 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,6 +21,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.11 — Passphrase input security** — New `--vault-passphrase-file ` CLI option reads passphrase from file (use `-` for stdin). Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. - **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. +- **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit. 
## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 7d0ec03..2dfd689 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -273,6 +273,13 @@ export default class CasService { const manifestData = this._buildManifestData(slug, filename, compression); const processedSource = compression ? this._compressStream(source) : source; + if (keyInfo.key && this.chunker.strategy === 'cdc') { + this.observability.log( + 'warn', + 'CDC deduplication is ineffective with encryption — ciphertext is pseudorandom', + { strategy: 'cdc' }, + ); + } if (keyInfo.key) { const { encrypt, finalize } = this.crypto.createEncryptionStream(keyInfo.key); await this._chunkAndStore(encrypt(processedSource), manifestData); diff --git a/test/unit/domain/services/CasService.dedupWarning.test.js b/test/unit/domain/services/CasService.dedupWarning.test.js new file mode 100644 index 0000000..d2d0342 --- /dev/null +++ b/test/unit/domain/services/CasService.dedupWarning.test.js @@ -0,0 +1,65 @@ +import { describe, it, expect, vi } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; +import CdcChunker from '../../../../src/infrastructure/chunkers/CdcChunker.js'; +import FixedChunker from '../../../../src/infrastructure/chunkers/FixedChunker.js'; + +const testCrypto = await getTestCryptoAdapter(); + +function makeObserver() { + return { + metric: vi.fn(), + log: vi.fn(), + span: vi.fn().mockReturnValue({ end: vi.fn() }), + }; +} + +function makeService(chunker, observability) { + return new CasService({ + persistence: { writeBlob: vi.fn().mockResolvedValue('oid'), writeTree: vi.fn(), readBlob: vi.fn() }, + crypto: testCrypto, + codec: new JsonCodec(), + chunkSize: 1024, + observability, + chunker, + }); +} + 
+describe('CasService — CDC + encryption dedup warning', () => { + it('emits warning when encryption + CDC', async () => { + const obs = makeObserver(); + const service = makeService(new CdcChunker({ minChunkSize: 1024, targetChunkSize: 2048, maxChunkSize: 4096 }), obs); + const key = Buffer.alloc(32, 0xab); + + async function* source() { yield Buffer.alloc(2048, 0xcc); } + await service.store({ source: source(), slug: 'enc-cdc', filename: 'f.bin', encryptionKey: key }); + + const warnCalls = obs.log.mock.calls.filter((c) => c[0] === 'warn' && c[1].includes('CDC deduplication')); + expect(warnCalls).toHaveLength(1); + expect(warnCalls[0][2]).toEqual({ strategy: 'cdc' }); + }); + + it('does NOT warn for encryption + fixed chunking', async () => { + const obs = makeObserver(); + const service = makeService(new FixedChunker({ chunkSize: 1024 }), obs); + const key = Buffer.alloc(32, 0xab); + + async function* source() { yield Buffer.alloc(2048, 0xcc); } + await service.store({ source: source(), slug: 'enc-fixed', filename: 'f.bin', encryptionKey: key }); + + const warnCalls = obs.log.mock.calls.filter((c) => c[0] === 'warn' && c[1].includes('CDC deduplication')); + expect(warnCalls).toHaveLength(0); + }); + + it('does NOT warn for CDC without encryption', async () => { + const obs = makeObserver(); + const service = makeService(new CdcChunker({ minChunkSize: 1024, targetChunkSize: 2048, maxChunkSize: 4096 }), obs); + + async function* source() { yield Buffer.alloc(2048, 0xcc); } + await service.store({ source: source(), slug: 'plain-cdc', filename: 'f.bin' }); + + const warnCalls = obs.log.mock.calls.filter((c) => c[0] === 'warn' && c[1].includes('CDC deduplication')); + expect(warnCalls).toHaveLength(0); + }); +}); From 23474deb6d3f994d824139e317202f00a8251558 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:48:46 -0800 Subject: [PATCH 12/41] feat(store): track orphaned blobs on stream failure MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 8bit STREAM_ERROR now includes meta.orphanedBlobs — an array of OIDs for blobs that were successfully persisted before the stream failed. The error metric also reports the orphanedBlobs count for observability. Resolves task 16.10. --- CHANGELOG.md | 1 + src/domain/services/CasService.js | 13 ++- .../services/CasService.orphanedBlobs.test.js | 93 +++++++++++++++++++ 3 files changed, 103 insertions(+), 4 deletions(-) create mode 100644 test/unit/domain/services/CasService.orphanedBlobs.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index b6e9232..d5c323d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. - **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. - **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit. +- **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability. 
## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 2dfd689..f4a4ab9 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -143,15 +143,20 @@ export default class CasService { launchWrite(chunk, nextIndex++); } } catch (err) { - await Promise.allSettled(pending); + const settled = await Promise.allSettled(pending); + const orphanedBlobs = settled + .filter((r) => r.status === 'fulfilled') + .map((r) => r.value.blob); if (err instanceof CasError) { throw err; } const casErr = new CasError( `Stream error during store: ${err.message}`, 'STREAM_ERROR', - { chunksDispatched: nextIndex, originalError: err }, + { chunksDispatched: nextIndex, orphanedBlobs, originalError: err }, ); - await Promise.allSettled(pending); - this.observability.metric('error', { code: casErr.code, message: casErr.message }); + this.observability.metric('error', { + code: casErr.code, message: casErr.message, + orphanedBlobs: orphanedBlobs.length, + }); throw casErr; } diff --git a/test/unit/domain/services/CasService.orphanedBlobs.test.js b/test/unit/domain/services/CasService.orphanedBlobs.test.js new file mode 100644 index 0000000..7c2d466 --- /dev/null +++ b/test/unit/domain/services/CasService.orphanedBlobs.test.js @@ -0,0 +1,93 @@ +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; + +const testCrypto = await getTestCryptoAdapter(); + +function failingSource(chunksBeforeError, chunkSize = 1024) { + let yielded = 0; + return { + [Symbol.asyncIterator]() { + return { + async next() { + if (yielded >= chunksBeforeError) { + throw new Error('simulated stream failure'); + } + yielded++; + return { value: Buffer.alloc(chunkSize, 0xaa), done: false }; + 
},
+      };
+    },
+  };
+}
+
+function buildService() {
+  let blobCounter = 0;
+  const mockPersistence = {
+    writeBlob: vi.fn().mockImplementation(() => Promise.resolve(`blob-${blobCounter++}`)),
+    writeTree: vi.fn().mockResolvedValue('mock-tree-oid'),
+    readBlob: vi.fn().mockResolvedValue(Buffer.from('data')),
+  };
+  const observability = {
+    metric: vi.fn(),
+    log: vi.fn(),
+    span: vi.fn().mockReturnValue({ end: vi.fn() }),
+  };
+  const service = new CasService({
+    persistence: mockPersistence,
+    crypto: testCrypto,
+    codec: new JsonCodec(),
+    chunkSize: 1024,
+    observability,
+  });
+  return { service, mockPersistence, observability };
+}
+
+describe('CasService — orphaned blob tracking in STREAM_ERROR', () => {
+  let service;
+  let observability;
+
+  beforeEach(() => {
+    ({ service, observability } = buildService());
+  });
+
+  it('STREAM_ERROR meta includes orphanedBlobs array', async () => {
+    try {
+      await service.store({ source: failingSource(3), slug: 'fail', filename: 'f.bin' });
+    } catch (err) {
+      expect(err.code).toBe('STREAM_ERROR');
+      expect(Array.isArray(err.meta.orphanedBlobs)).toBe(true);
+    }
+  });
+
+  it('orphanedBlobs contain OIDs from successful writes', async () => {
+    try {
+      await service.store({ source: failingSource(3), slug: 'fail', filename: 'f.bin' });
+    } catch (err) {
+      expect(err.meta.orphanedBlobs.length).toBe(3);
+      expect(err.meta.orphanedBlobs).toContain('blob-0');
+      expect(err.meta.orphanedBlobs).toContain('blob-1');
+      expect(err.meta.orphanedBlobs).toContain('blob-2');
+    }
+  });
+
+  it('empty array when stream fails before any writes', async () => {
+    try {
+      await service.store({ source: failingSource(0), slug: 'fail', filename: 'f.bin' });
+    } catch (err) {
+      expect(err.meta.orphanedBlobs).toEqual([]);
+    }
+  });
+
+  it('emits metric with orphaned blob count', async () => {
+    try {
+      await service.store({ source: failingSource(2), slug: 'fail', filename: 'f.bin' });
+    } catch {
+      // expected
+    }
+    const errorMetrics =
observability.metric.mock.calls.filter((c) => c[0] === 'error'); + expect(errorMetrics.length).toBeGreaterThan(0); + expect(errorMetrics[0][1]).toHaveProperty('orphanedBlobs', 2); + }); +}); From 605036fb8b0edf11aef7b7791f14d4223e4a0d4d Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:50:18 -0800 Subject: [PATCH 13/41] perf(chunking): replace Buffer.concat loop with pre-allocated buffer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit FixedChunker.chunk() now uses a pre-allocated Buffer.allocUnsafe(chunkSize) working buffer with a copy+offset pattern, matching CdcChunker's approach. Eliminates O(n²/chunkSize) total copies when the source yields many small buffers. Resolves task 16.4. --- CHANGELOG.md | 1 + src/infrastructure/chunkers/FixedChunker.js | 22 ++++--- .../chunkers/FixedChunker.test.js | 63 +++++++++++++++++++ 3 files changed, 79 insertions(+), 7 deletions(-) create mode 100644 test/unit/infrastructure/chunkers/FixedChunker.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index d5c323d..f864ec1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. - **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit. - **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability. 
+- **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/src/infrastructure/chunkers/FixedChunker.js b/src/infrastructure/chunkers/FixedChunker.js index 69b8e7a..ef76c63 100644 --- a/src/infrastructure/chunkers/FixedChunker.js +++ b/src/infrastructure/chunkers/FixedChunker.js @@ -41,18 +41,26 @@ export default class FixedChunker extends ChunkingPort { * @yields {Buffer} */ async *chunk(source) { - let buffer = Buffer.alloc(0); + const cs = this.#chunkSize; + const buf = Buffer.allocUnsafe(cs); + let offset = 0; for await (const data of source) { - buffer = Buffer.concat([buffer, data]); - while (buffer.length >= this.#chunkSize) { - yield buffer.slice(0, this.#chunkSize); - buffer = buffer.slice(this.#chunkSize); + let srcPos = 0; + while (srcPos < data.length) { + const n = Math.min(cs - offset, data.length - srcPos); + data.copy(buf, offset, srcPos, srcPos + n); + offset += n; + srcPos += n; + if (offset === cs) { + yield Buffer.from(buf); + offset = 0; + } } } - if (buffer.length > 0) { - yield buffer; + if (offset > 0) { + yield Buffer.from(buf.subarray(0, offset)); } } } diff --git a/test/unit/infrastructure/chunkers/FixedChunker.test.js b/test/unit/infrastructure/chunkers/FixedChunker.test.js new file mode 100644 index 0000000..78233f7 --- /dev/null +++ b/test/unit/infrastructure/chunkers/FixedChunker.test.js @@ -0,0 +1,63 @@ +import { describe, it, expect } from 'vitest'; +import FixedChunker from '../../../../src/infrastructure/chunkers/FixedChunker.js'; + +async function* toAsyncIter(buffers) { + for (const b of buffers) { yield b; } +} + +async function collect(iter) { + const result = []; + for await (const chunk of iter) { result.push(chunk); } + return result; +} + +describe('16.4: FixedChunker 
pre-allocated buffer — regression', () => { + it('produces byte-exact output for a single large input', async () => { + const chunkSize = 64; + const chunker = new FixedChunker({ chunkSize }); + const input = Buffer.alloc(200); + for (let i = 0; i < input.length; i++) { input[i] = i & 0xff; } + + const chunks = await collect(chunker.chunk(toAsyncIter([input]))); + expect(chunks.map((c) => c.length)).toEqual([64, 64, 64, 8]); + expect(Buffer.concat(chunks).equals(input)).toBe(true); + }); + + it('exact multiple of chunkSize produces no partial', async () => { + const chunkSize = 128; + const chunker = new FixedChunker({ chunkSize }); + const input = Buffer.alloc(chunkSize * 3, 0xbb); + const chunks = await collect(chunker.chunk(toAsyncIter([input]))); + expect(chunks.length).toBe(3); + expect(chunks.every((c) => c.length === chunkSize)).toBe(true); + }); +}); + +describe('16.4: FixedChunker pre-allocated buffer — edge cases', () => { + it('many small input buffers reassemble correctly', async () => { + const chunkSize = 256; + const chunker = new FixedChunker({ chunkSize }); + const total = 1024; + const smallBufs = Array.from({ length: total }, (_, i) => Buffer.from([i & 0xff])); + + const chunks = await collect(chunker.chunk(toAsyncIter(smallBufs))); + expect(chunks.length).toBe(4); + const reassembled = Buffer.concat(chunks); + for (let i = 0; i < total; i++) { + expect(reassembled[i]).toBe(i & 0xff); + } + }); + + it('empty source produces no chunks', async () => { + const chunker = new FixedChunker({ chunkSize: 64 }); + const chunks = await collect(chunker.chunk(toAsyncIter([]))); + expect(chunks.length).toBe(0); + }); + + it('single byte produces one partial chunk', async () => { + const chunker = new FixedChunker({ chunkSize: 64 }); + const chunks = await collect(chunker.chunk(toAsyncIter([Buffer.from([42])]))); + expect(chunks.length).toBe(1); + expect(chunks[0]).toEqual(Buffer.from([42])); + }); +}); From 47828e5314eb40b21b4ef2f93531ed5263e00020 Mon Sep 17 
00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:52:26 -0800 Subject: [PATCH 14/41] refactor(api): rename lifecycle methods with deprecated aliases MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add inspectAsset() and collectReferencedChunks() as canonical names for deleteAsset() and findOrphanedChunks() respectively. The old names were misleading: deleteAsset() performs no deletion, and findOrphanedChunks() collects referenced chunks rather than finding orphans. Neither performs a destructive operation. Old names are preserved as deprecated aliases that emit observability warnings. Updated CasService, the facade, and all .d.ts files. Resolves task 16.7. --- CHANGELOG.md | 1 + index.d.ts | 10 ++ index.js | 24 +++- src/domain/services/CasService.d.ts | 10 ++ src/domain/services/CasService.js | 28 ++++- .../services/CasService.lifecycle.test.js | 119 ++++++++++++++++++ 6 files changed, 188 insertions(+), 4 deletions(-) create mode 100644 test/unit/domain/services/CasService.lifecycle.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index f864ec1..00da690 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -24,6 +24,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit. - **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability. - **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. 
+- **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/index.d.ts b/index.d.ts index fe1b044..a8c301c 100644 --- a/index.d.ts +++ b/index.d.ts @@ -343,10 +343,20 @@ export default class ContentAddressableStore { readManifest(options: { treeOid: string }): Promise; + inspectAsset(options: { + treeOid: string; + }): Promise<{ slug: string; chunksOrphaned: number }>; + + /** @deprecated Use {@link inspectAsset} instead. */ deleteAsset(options: { treeOid: string; }): Promise<{ slug: string; chunksOrphaned: number }>; + collectReferencedChunks(options: { + treeOids: string[]; + }): Promise<{ referenced: Set; total: number }>; + + /** @deprecated Use {@link collectReferencedChunks} instead. */ findOrphanedChunks(options: { treeOids: string[]; }): Promise<{ referenced: Set; total: number }>; diff --git a/index.js b/index.js index b26cd14..1f65e6c 100644 --- a/index.js +++ b/index.js @@ -315,7 +315,18 @@ export default class ContentAddressableStore { } /** - * Returns deletion metadata for an asset stored in a Git tree. + * Reads a manifest from a Git tree and returns inspection metadata. + * @param {Object} options + * @param {string} options.treeOid - Git tree OID of the asset. + * @returns {Promise<{ slug: string, chunksOrphaned: number }>} + */ + async inspectAsset(options) { + const service = await this.#getService(); + return await service.inspectAsset(options); + } + + /** + * @deprecated Use {@link inspectAsset} instead. * @param {Object} options * @param {string} options.treeOid - Git tree OID of the asset. 
* @returns {Promise<{ slug: string, chunksOrphaned: number }>} @@ -331,6 +342,17 @@ export default class ContentAddressableStore { * @param {string[]} options.treeOids - Git tree OIDs to analyze. * @returns {Promise<{ referenced: Set, total: number }>} */ + async collectReferencedChunks(options) { + const service = await this.#getService(); + return await service.collectReferencedChunks(options); + } + + /** + * @deprecated Use {@link collectReferencedChunks} instead. + * @param {Object} options + * @param {string[]} options.treeOids - Git tree OIDs to analyze. + * @returns {Promise<{ referenced: Set, total: number }>} + */ async findOrphanedChunks(options) { const service = await this.#getService(); return await service.findOrphanedChunks(options); diff --git a/src/domain/services/CasService.d.ts b/src/domain/services/CasService.d.ts index 358579b..80440a8 100644 --- a/src/domain/services/CasService.d.ts +++ b/src/domain/services/CasService.d.ts @@ -131,10 +131,20 @@ export default class CasService { readManifest(options: { treeOid: string }): Promise; + inspectAsset(options: { + treeOid: string; + }): Promise<{ slug: string; chunksOrphaned: number }>; + + /** @deprecated Use {@link inspectAsset} instead. */ deleteAsset(options: { treeOid: string; }): Promise<{ slug: string; chunksOrphaned: number }>; + collectReferencedChunks(options: { + treeOids: string[]; + }): Promise<{ referenced: Set; total: number }>; + + /** @deprecated Use {@link collectReferencedChunks} instead. */ findOrphanedChunks(options: { treeOids: string[]; }): Promise<{ referenced: Set; total: number }>; diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index f4a4ab9..9c3a847 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -664,7 +664,7 @@ export default class CasService { } /** - * Returns deletion metadata for an asset stored in a Git tree. + * Reads a manifest from a Git tree and returns inspection metadata. 
* Does not perform any destructive Git operations. * * @param {Object} options @@ -672,7 +672,7 @@ export default class CasService { * @returns {Promise<{ chunksOrphaned: number, slug: string }>} * @throws {CasError} MANIFEST_NOT_FOUND if the tree has no manifest */ - async deleteAsset({ treeOid }) { + async inspectAsset({ treeOid }) { const manifest = await this.readManifest({ treeOid }); return { slug: manifest.slug, @@ -680,6 +680,17 @@ export default class CasService { }; } + /** + * @deprecated Use {@link inspectAsset} instead. + * @param {Object} options + * @param {string} options.treeOid - Git tree OID of the asset + * @returns {Promise<{ chunksOrphaned: number, slug: string }>} + */ + async deleteAsset(options) { + this.observability.log('warn', 'deleteAsset() is deprecated — use inspectAsset()'); + return await this.inspectAsset(options); + } + /** * Aggregates referenced chunk blob OIDs across multiple stored assets. * Analysis only — does not delete or modify anything. @@ -689,7 +700,7 @@ export default class CasService { * @returns {Promise<{ referenced: Set, total: number }>} * @throws {CasError} MANIFEST_NOT_FOUND if any treeOid lacks a manifest */ - async findOrphanedChunks({ treeOids }) { + async collectReferencedChunks({ treeOids }) { const referenced = new Set(); let total = 0; @@ -704,6 +715,17 @@ export default class CasService { return { referenced, total }; } + /** + * @deprecated Use {@link collectReferencedChunks} instead. + * @param {Object} options + * @param {string[]} options.treeOids - Git tree OIDs to analyze + * @returns {Promise<{ referenced: Set, total: number }>} + */ + async findOrphanedChunks(options) { + this.observability.log('warn', 'findOrphanedChunks() is deprecated — use collectReferencedChunks()'); + return await this.collectReferencedChunks(options); + } + /** * Derives an encryption key from a passphrase using PBKDF2 or scrypt. 
* @param {Object} options diff --git a/test/unit/domain/services/CasService.lifecycle.test.js b/test/unit/domain/services/CasService.lifecycle.test.js new file mode 100644 index 0000000..acd7630 --- /dev/null +++ b/test/unit/domain/services/CasService.lifecycle.test.js @@ -0,0 +1,119 @@ +import { describe, it, expect, vi } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; +import { digestOf } from '../../../helpers/crypto.js'; + +const testCrypto = await getTestCryptoAdapter(); + +function makeChunk(index, seed, blobOid) { + return { index, size: 1024, digest: digestOf(seed), blob: blobOid }; +} + +function setup() { + const mockPersistence = { + writeBlob: vi.fn(), + writeTree: vi.fn(), + readBlob: vi.fn(), + readTree: vi.fn(), + }; + const observability = { + metric: vi.fn(), + log: vi.fn(), + span: vi.fn().mockReturnValue({ end: vi.fn() }), + }; + const service = new CasService({ + persistence: mockPersistence, + crypto: testCrypto, + codec: new JsonCodec(), + chunkSize: 1024, + observability, + }); + return { mockPersistence, observability, service }; +} + +function mockManifest(mockPersistence, manifest) { + const codec = new JsonCodec(); + mockPersistence.readTree.mockResolvedValue([ + { mode: '100644', type: 'blob', oid: 'mf-oid', name: 'manifest.json' }, + ]); + mockPersistence.readBlob.mockResolvedValue(codec.encode(manifest)); +} + +describe('16.7: inspectAsset (canonical name)', () => { + it('returns { slug, chunksOrphaned }', async () => { + const { service, mockPersistence } = setup(); + const manifest = { + slug: 'asset-1', filename: 'f.bin', size: 2048, + chunks: [makeChunk(0, 'c0', 'b0'), makeChunk(1, 'c1', 'b1')], + }; + mockManifest(mockPersistence, manifest); + const result = await service.inspectAsset({ treeOid: 'tree-1' }); + expect(result).toEqual({ slug: 
'asset-1', chunksOrphaned: 2 }); + }); +}); + +describe('16.7: deleteAsset (deprecated alias)', () => { + it('delegates to inspectAsset and returns same result', async () => { + const { service, mockPersistence } = setup(); + const manifest = { + slug: 'asset-2', filename: 'g.bin', size: 1024, + chunks: [makeChunk(0, 'd0', 'b0')], + }; + mockManifest(mockPersistence, manifest); + const result = await service.deleteAsset({ treeOid: 'tree-2' }); + expect(result).toEqual({ slug: 'asset-2', chunksOrphaned: 1 }); + }); + + it('emits deprecation warning via observability', async () => { + const { service, mockPersistence, observability } = setup(); + const manifest = { + slug: 'x', filename: 'x.bin', size: 0, chunks: [], + }; + mockManifest(mockPersistence, manifest); + await service.deleteAsset({ treeOid: 'tree-x' }); + expect(observability.log).toHaveBeenCalledWith( + 'warn', 'deleteAsset() is deprecated — use inspectAsset()', + ); + }); +}); + +describe('16.7: collectReferencedChunks (canonical name)', () => { + it('returns { referenced, total }', async () => { + const { service, mockPersistence } = setup(); + const manifest = { + slug: 'asset-3', filename: 'h.bin', size: 2048, + chunks: [makeChunk(0, 'e0', 'b0'), makeChunk(1, 'e1', 'b1')], + }; + mockManifest(mockPersistence, manifest); + const result = await service.collectReferencedChunks({ treeOids: ['tree-3'] }); + expect(result.referenced.size).toBe(2); + expect(result.total).toBe(2); + }); +}); + +describe('16.7: findOrphanedChunks (deprecated alias)', () => { + it('delegates to collectReferencedChunks', async () => { + const { service, mockPersistence } = setup(); + const manifest = { + slug: 'asset-4', filename: 'i.bin', size: 1024, + chunks: [makeChunk(0, 'f0', 'b0')], + }; + mockManifest(mockPersistence, manifest); + const result = await service.findOrphanedChunks({ treeOids: ['tree-4'] }); + expect(result.referenced.size).toBe(1); + expect(result.total).toBe(1); + }); + + it('emits deprecation warning via 
observability', async () => { + const { service, mockPersistence, observability } = setup(); + const manifest = { + slug: 'y', filename: 'y.bin', size: 0, chunks: [], + }; + mockManifest(mockPersistence, manifest); + await service.findOrphanedChunks({ treeOids: ['tree-y'] }); + expect(observability.log).toHaveBeenCalledWith( + 'warn', 'findOrphanedChunks() is deprecated — use collectReferencedChunks()', + ); + }); +}); From 67f9bcd7b451cab6a9951ac750344ccd958c3414 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:55:34 -0800 Subject: [PATCH 15/41] feat(security): add KDF brute-force awareness metrics and CLI delay MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CasService emits a decryption_failed metric with slug context on every INTEGRITY_ERROR during encrypted restore, providing an audit trail for monitoring failed passphrase attempts. The CLI layer adds a 1-second delay after INTEGRITY_ERROR to slow brute-force attacks. The library API itself imposes no rate-limiting — callers manage their own policy. Resolves task 16.12. --- CHANGELOG.md | 1 + bin/actions.js | 12 ++ src/domain/services/CasService.js | 9 +- test/unit/cli/actions.test.js | 34 ++++++ .../services/CasService.kdfBruteForce.test.js | 103 ++++++++++++++++++ 5 files changed, 158 insertions(+), 1 deletion(-) create mode 100644 test/unit/domain/services/CasService.kdfBruteForce.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 00da690..18f1165 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -25,6 +25,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability. 
- **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. - **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. +- **16.12 — KDF brute-force awareness** — `CasService` now emits `decryption_failed` metric with slug context when decryption fails with `INTEGRITY_ERROR` during encrypted restore. CLI adds a 1-second delay after `INTEGRITY_ERROR` to slow brute-force attempts. Library API imposes no delay — callers manage their own rate-limiting policy. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/bin/actions.js b/bin/actions.js index d1cb54a..c6388b8 100644 --- a/bin/actions.js +++ b/bin/actions.js @@ -56,6 +56,15 @@ function getHint(code) { return undefined; } +/** + * Delay utility for rate-limiting after sensitive failures. + * @param {number} ms + * @returns {Promise} + */ +function delay(ms) { + return new Promise((resolve) => { setTimeout(resolve, ms); }); +} + /** * Wrap a command action with structured error handling. 
* @@ -68,6 +77,9 @@ export function runAction(fn, getJson) { try { await fn(...args); } catch (/** @type {any} */ err) { + if (err?.code === 'INTEGRITY_ERROR') { + await delay(1000); + } writeError(err, getJson()); process.exitCode = 1; } diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 9c3a847..05d9bca 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -510,7 +510,14 @@ export default class CasService { let buffer = Buffer.concat(await this._readAndVerifyChunks(manifest.chunks)); if (manifest.encryption?.encrypted) { - buffer = await this.decrypt({ buffer, key, meta: manifest.encryption }); + try { + buffer = await this.decrypt({ buffer, key, meta: manifest.encryption }); + } catch (err) { + if (err instanceof CasError && err.code === 'INTEGRITY_ERROR') { + this.observability.metric('error', { action: 'decryption_failed', slug: manifest.slug }); + } + throw err; + } } if (manifest.compression) { diff --git a/test/unit/cli/actions.test.js b/test/unit/cli/actions.test.js index 3d4fd3d..e7839c5 100644 --- a/test/unit/cli/actions.test.js +++ b/test/unit/cli/actions.test.js @@ -109,6 +109,40 @@ describe('runAction', () => { }); }); +describe('runAction — INTEGRITY_ERROR rate-limiting', () => { + let stderrSpy; + const originalExitCode = process.exitCode; + + beforeEach(() => { + process.exitCode = undefined; + stderrSpy = vi.spyOn(process.stderr, 'write').mockImplementation(() => true); + }); + + afterEach(() => { + process.exitCode = originalExitCode; + stderrSpy.mockRestore(); + }); + + it('delays ~1s on INTEGRITY_ERROR before writing output', async () => { + const err = Object.assign(new Error('bad key'), { code: 'INTEGRITY_ERROR' }); + const action = runAction(async () => { throw err; }, () => false); + const start = Date.now(); + await action(); + const elapsed = Date.now() - start; + expect(elapsed).toBeGreaterThanOrEqual(900); + expect(process.exitCode).toBe(1); + }); + + it('no delay 
for non-INTEGRITY_ERROR codes', async () => { + const err = Object.assign(new Error('gone'), { code: 'MISSING_KEY' }); + const action = runAction(async () => { throw err; }, () => false); + const start = Date.now(); + await action(); + const elapsed = Date.now() - start; + expect(elapsed).toBeLessThan(200); + }); +}); + describe('HINTS', () => { it('contains expected error codes', () => { expect(HINTS).toHaveProperty('MISSING_KEY'); diff --git a/test/unit/domain/services/CasService.kdfBruteForce.test.js b/test/unit/domain/services/CasService.kdfBruteForce.test.js new file mode 100644 index 0000000..26a9922 --- /dev/null +++ b/test/unit/domain/services/CasService.kdfBruteForce.test.js @@ -0,0 +1,103 @@ +import { describe, it, expect, vi } from 'vitest'; +import CasService from '../../../../src/domain/services/CasService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; +import JsonCodec from '../../../../src/infrastructure/codecs/JsonCodec.js'; +import Manifest from '../../../../src/domain/value-objects/Manifest.js'; + +const testCrypto = await getTestCryptoAdapter(); + +const CHUNK_DATA = Buffer.alloc(128, 0xaa); +const CHUNK_DIGEST = await testCrypto.sha256(CHUNK_DATA); + +function setup() { + const observability = { + metric: vi.fn(), + log: vi.fn(), + span: vi.fn().mockReturnValue({ end: vi.fn() }), + }; + const mockPersistence = { + writeBlob: vi.fn(), + writeTree: vi.fn(), + readBlob: vi.fn().mockResolvedValue(CHUNK_DATA), + readTree: vi.fn(), + }; + const service = new CasService({ + persistence: mockPersistence, + crypto: testCrypto, + codec: new JsonCodec(), + chunkSize: 1024, + observability, + }); + return { service, observability }; +} + +function encryptedManifest(slug) { + return new Manifest({ + slug, + filename: `${slug}.bin`, + size: 128, + chunks: [ + { index: 0, size: 128, digest: CHUNK_DIGEST, blob: 'blob-0' }, + ], + encryption: { + algorithm: 'aes-256-gcm', + nonce: 'deadbeef', + tag: 'cafebabe', + encrypted: 
true, + }, + }); +} + +describe('16.12: KDF brute-force — decryption_failed metric', () => { + it('emits metric on wrong key', async () => { + const { service, observability } = setup(); + const manifest = encryptedManifest('secret-file'); + const wrongKey = testCrypto.randomBytes(32); + + try { + await service.restore({ manifest, encryptionKey: wrongKey }); + expect.unreachable('should have thrown'); + } catch (err) { + expect(err.code).toBe('INTEGRITY_ERROR'); + } + + const dfMetrics = observability.metric.mock.calls.filter( + (c) => c[0] === 'error' && c[1].action === 'decryption_failed', + ); + expect(dfMetrics.length).toBe(1); + }); + + it('includes slug context for audit trail', async () => { + const { service, observability } = setup(); + const manifest = encryptedManifest('audit-slug'); + const wrongKey = testCrypto.randomBytes(32); + + try { + await service.restore({ manifest, encryptionKey: wrongKey }); + } catch { + // expected + } + + const dfMetrics = observability.metric.mock.calls.filter( + (c) => c[0] === 'error' && c[1].action === 'decryption_failed', + ); + expect(dfMetrics[0][1]).toHaveProperty('slug', 'audit-slug'); + }); +}); + +describe('16.12: KDF brute-force — library rate-limiting', () => { + it('library API does NOT rate-limit', async () => { + const { service } = setup(); + const manifest = encryptedManifest('rate-test'); + const wrongKey = testCrypto.randomBytes(32); + + const start = Date.now(); + try { + await service.restore({ manifest, encryptionKey: wrongKey }); + } catch { + // expected + } + const elapsed = Date.now() - start; + expect(elapsed).toBeLessThan(500); + }); +}); From aae160a59900cb5a87b8e28e8f3017dab8e1c223 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 19:59:31 -0800 Subject: [PATCH 16/41] feat(security): add encryption counter and move SECURITY.md to root Move docs/SECURITY.md to project root with new sections covering GCM nonce bounds (2^32 NIST limit), recommended key rotation frequency, KDF 
parameter guidance (PBKDF2/scrypt), and passphrase entropy. VaultService now tracks encryptionCount in vault metadata, incremented on each addToVault when the vault has encryption configured. An observability warning fires when the count exceeds 2^31, providing a safety margin before the NIST 2^32 limit. VaultService accepts an optional observability port (no-op default for backward compat). Resolves task 16.13. --- CHANGELOG.md | 1 + README.md | 2 +- docs/SECURITY.md => SECURITY.md | 53 +++++++++-- index.d.ts | 2 + index.js | 2 +- src/domain/services/VaultService.js | 22 ++++- .../VaultService.encryptionCount.test.js | 95 +++++++++++++++++++ 7 files changed, 164 insertions(+), 13 deletions(-) rename docs/SECURITY.md => SECURITY.md (91%) create mode 100644 test/unit/domain/services/VaultService.encryptionCount.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index 18f1165..aab1940 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. - **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. - **16.12 — KDF brute-force awareness** — `CasService` now emits `decryption_failed` metric with slug context when decryption fails with `INTEGRITY_ERROR` during encrypted restore. CLI adds a 1-second delay after `INTEGRITY_ERROR` to slow brute-force attempts. Library API imposes no delay — callers manage their own rate-limiting policy. 
+- **16.13 — GCM nonce collision docs + encryption counter** — `SECURITY.md` moved to project root with new sections: GCM nonce bound (2^32 NIST limit), key rotation frequency, KDF parameter guidance, and passphrase entropy recommendations. Vault metadata now tracks `encryptionCount`, incremented per encrypted `addToVault()`. Observability warning emitted when count exceeds 2^31. `VaultService` accepts optional `observability` port. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/README.md b/README.md index 21b3946..1013a25 100644 --- a/README.md +++ b/README.md @@ -304,7 +304,7 @@ git cas store ./data.bin --slug my-data --tree --json - [Guide](./GUIDE.md) — progressive walkthrough - [API Reference](./docs/API.md) — full method documentation - [Architecture](./ARCHITECTURE.md) — hexagonal design overview -- [Security](./docs/SECURITY.md) — crypto design and threat model +- [Security](./SECURITY.md) — crypto design and threat model ## When to use git-cas (and when not to) diff --git a/docs/SECURITY.md b/SECURITY.md similarity index 91% rename from docs/SECURITY.md rename to SECURITY.md index b626c26..00a5b5e 100644 --- a/docs/SECURITY.md +++ b/SECURITY.md @@ -4,15 +4,50 @@ This document describes the security architecture, cryptographic design, and lim ## Table of Contents -1. [Threat Model](#threat-model) -2. [Cryptographic Design](#cryptographic-design) -3. [Key Handling](#key-handling) -4. [Encryption Flow](#encryption-flow) -5. [Decryption Flow](#decryption-flow) -6. [Chunk Digest Verification](#chunk-digest-verification) -7. [Limitations](#limitations) -8. [Git Object Immutability](#git-object-immutability) -9. [Error Codes for Security Operations](#error-codes-for-security-operations) +1. [Operational Limits](#operational-limits) +2. [Threat Model](#threat-model) +3. [Cryptographic Design](#cryptographic-design) +4. [Key Handling](#key-handling) +5. [Encryption Flow](#encryption-flow) +6. [Decryption Flow](#decryption-flow) +7. 
[Chunk Digest Verification](#chunk-digest-verification) +8. [Limitations](#limitations) +9. [Git Object Immutability](#git-object-immutability) +10. [Error Codes for Security Operations](#error-codes-for-security-operations) + +--- + +## Operational Limits + +### GCM Nonce Bound + +AES-256-GCM uses a 96-bit random nonce per encryption. NIST SP 800-38D recommends limiting to **2^32 invocations per key** to keep the nonce collision probability below roughly 2^-32. The birthday bound is approximately 2^48 for random 96-bit nonces, but the conservative NIST guidance of 2^32 accounts for the catastrophic consequences of a collision (keystream reuse, which exposes the XOR of the colliding plaintexts, and recovery of the GHASH authentication key, which enables forgery). + +git-cas tracks encryption operations via `encryptionCount` in vault metadata. When the count exceeds **2^31** (2,147,483,648), an observability warning is emitted, providing a safety margin before the 2^32 NIST limit. + +**Recommended key rotation frequency**: Rotate the vault passphrase (or encryption key) before `encryptionCount` reaches 2^31, or every 90 days, whichever comes first. + +### KDF Parameter Guidance + +When using passphrase-based encryption, git-cas derives keys using PBKDF2 or scrypt. + +| Algorithm | Recommended Parameters | Notes | |-----------|----------------------|-------| | PBKDF2 | iterations ≥ 600,000 (SHA-256) | OWASP 2024 recommendation | | scrypt | N=2^17, r=8, p=1 | ~128 MiB memory | + +Higher iteration counts / cost parameters increase resistance to brute-force attacks but also increase the time to derive a key. Choose parameters based on your threat model and latency tolerance. 
+ +### Passphrase Entropy Recommendations + +| Entropy (bits) | Example | Brute-Force Resistance | +|---------------|---------|----------------------| +| < 40 | `password123` | Trivially crackable | +| 40–60 | 4–5 random dictionary words | Weak against GPU attacks | +| 60–80 | 6+ random dictionary words or 12+ mixed characters | Moderate | +| > 80 | 8+ random dictionary words or 16+ mixed characters | Strong | + +**Minimum recommendation**: 80+ bits of entropy for vault passphrases. Use a random passphrase generator (e.g., Diceware) rather than human-chosen passwords. --- diff --git a/index.d.ts b/index.d.ts index a8c301c..05fa0e4 100644 --- a/index.d.ts +++ b/index.d.ts @@ -184,6 +184,8 @@ export interface VaultEntry { /** Vault metadata stored in .vault.json. */ export interface VaultMetadata { version: number; + /** Number of encrypted store operations performed with this vault key. */ + encryptionCount?: number; encryption?: { cipher: string; kdf: { diff --git a/index.js b/index.js index 1f65e6c..0ff524d 100644 --- a/index.js +++ b/index.js @@ -118,7 +118,7 @@ export default class ContentAddressableStore { plumbing: cfg.plumbing, policy: cfg.policy, }); - this.#vault = new VaultService({ persistence, ref, crypto }); + this.#vault = new VaultService({ persistence, ref, crypto, observability: this.service.observability }); return this.service; } diff --git a/src/domain/services/VaultService.js b/src/domain/services/VaultService.js index d5a1ac2..09009b8 100644 --- a/src/domain/services/VaultService.js +++ b/src/domain/services/VaultService.js @@ -80,16 +80,22 @@ function hasControlChars(str) { export default class VaultService { static VAULT_REF = VAULT_REF; + /** @type {number} Nonce usage warning threshold (2^31). 
*/ + static ENCRYPTION_COUNT_WARN = 2 ** 31; + /** * @param {Object} options * @param {import('../../ports/GitPersistencePort.js').default} options.persistence * @param {import('../../ports/GitRefPort.js').default} options.ref * @param {import('../../ports/CryptoPort.js').default} options.crypto + * @param {import('../../ports/ObservabilityPort.js').default} [options.observability] */ - constructor({ persistence, ref, crypto }) { + constructor({ persistence, ref, crypto, observability }) { this.persistence = persistence; this.ref = ref; this.crypto = crypto; + /** @type {{ metric: Function, log: Function, span: Function }} */ + this.observability = observability || { metric() {}, log() {}, span: () => ({ end() {} }) }; } // --------------------------------------------------------------------------- @@ -389,9 +395,21 @@ export default class VaultService { } const isUpdate = state.entries.has(slug); state.entries.set(slug, treeOid); + const metadata = state.metadata || { version: 1 }; + if (metadata.encryption) { + metadata.encryptionCount = (metadata.encryptionCount || 0) + 1; + if (metadata.encryptionCount >= VaultService.ENCRYPTION_COUNT_WARN) { + this.observability.log( + 'warn', + `Vault encryption count (${metadata.encryptionCount}) exceeds ` + + `${VaultService.ENCRYPTION_COUNT_WARN} — rotate your key`, + { encryptionCount: metadata.encryptionCount }, + ); + } + } return { entries: state.entries, - metadata: state.metadata || { version: 1 }, + metadata, message: isUpdate ? 
`vault: update ${slug}` : `vault: add ${slug}`, }; }); diff --git a/test/unit/domain/services/VaultService.encryptionCount.test.js b/test/unit/domain/services/VaultService.encryptionCount.test.js new file mode 100644 index 0000000..c349149 --- /dev/null +++ b/test/unit/domain/services/VaultService.encryptionCount.test.js @@ -0,0 +1,95 @@ +import { describe, it, expect, vi } from 'vitest'; +import VaultService from '../../../../src/domain/services/VaultService.js'; +import { getTestCryptoAdapter } from '../../../helpers/crypto-adapter.js'; + +const testCrypto = await getTestCryptoAdapter(); + +function encryptedMetadata(overrides = {}) { + return { + version: 1, + encryption: { + cipher: 'aes-256-gcm', + kdf: { algorithm: 'pbkdf2', salt: 'c2FsdA==', iterations: 100000, keyLength: 32 }, + }, + ...overrides, + }; +} + +function setup(metadata = encryptedMetadata()) { + const observability = { + metric: vi.fn(), + log: vi.fn(), + span: vi.fn().mockReturnValue({ end: vi.fn() }), + }; + const persistence = { + writeBlob: vi.fn().mockResolvedValue('blob-oid'), + writeTree: vi.fn().mockResolvedValue('tree-oid'), + readBlob: vi.fn().mockResolvedValue(Buffer.from(JSON.stringify(metadata))), + readTree: vi.fn().mockResolvedValue([ + { mode: '100644', type: 'blob', oid: 'meta-oid', name: '.vault.json' }, + ]), + }; + const ref = { + resolveRef: vi.fn().mockResolvedValue('commit-oid'), + resolveTree: vi.fn().mockResolvedValue('root-tree-oid'), + createCommit: vi.fn().mockResolvedValue('new-commit-oid'), + updateRef: vi.fn().mockResolvedValue(undefined), + }; + const vault = new VaultService({ + persistence, ref, crypto: testCrypto, observability, + }); + return { vault, persistence, ref, observability }; +} + +describe('16.13: Nonce usage tracking — encryptionCount', () => { + it('vault metadata includes encryptionCount after add', async () => { + const { vault, persistence } = setup(); + await vault.addToVault({ slug: 'asset-1', treeOid: 'tree-1' }); + + const writtenMetadata 
= JSON.parse(persistence.writeBlob.mock.calls[0][0]); + expect(writtenMetadata).toHaveProperty('encryptionCount', 1); + }); + + it('encryptionCount increments per encrypted store', async () => { + const meta = encryptedMetadata({ encryptionCount: 5 }); + const { vault, persistence } = setup(meta); + await vault.addToVault({ slug: 'asset-2', treeOid: 'tree-2' }); + + const writtenMetadata = JSON.parse(persistence.writeBlob.mock.calls[0][0]); + expect(writtenMetadata.encryptionCount).toBe(6); + }); +}); + +describe('16.13: Nonce usage tracking — threshold warning', () => { + it('warns when encryptionCount exceeds threshold', async () => { + const threshold = VaultService.ENCRYPTION_COUNT_WARN; + const meta = encryptedMetadata({ encryptionCount: threshold - 1 }); + const { vault, observability } = setup(meta); + await vault.addToVault({ slug: 'asset-3', treeOid: 'tree-3' }); + + const warnCalls = observability.log.mock.calls.filter( + (c) => c[0] === 'warn' && c[1].includes('encryption count'), + ); + expect(warnCalls.length).toBe(1); + }); + + it('no warning below threshold', async () => { + const meta = encryptedMetadata({ encryptionCount: 0 }); + const { vault, observability } = setup(meta); + await vault.addToVault({ slug: 'asset-4', treeOid: 'tree-4' }); + + const warnCalls = observability.log.mock.calls.filter( + (c) => c[0] === 'warn' && c[1].includes('encryption count'), + ); + expect(warnCalls.length).toBe(0); + }); + + it('no counter increment for unencrypted vault', async () => { + const meta = { version: 1 }; + const { vault, persistence } = setup(meta); + await vault.addToVault({ slug: 'plain-1', treeOid: 'tree-p' }); + + const writtenMetadata = JSON.parse(persistence.writeBlob.mock.calls[0][0]); + expect(writtenMetadata).not.toHaveProperty('encryptionCount'); + }); +}); From fd3eab4284e99aea5f35ddcb09a37b475506edba Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 20:02:32 -0800 Subject: [PATCH 17/41] docs(roadmap): add V7 (OTLP 
observability adapter) and V8 (auto-rotation) --- ROADMAP.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ROADMAP.md b/ROADMAP.md index 13a145e..2c540bb 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1680,3 +1680,5 @@ This is a fundamental architectural constraint of the encrypt-then-chunk design. | V4 | Feature | CompressionPort + zstd/brotli/lz4 | ~180 LoC, ~8h | | V5 | Feature | Watch mode (continuous sync) | ~220 LoC, ~10h | | V6 | Feature | Interactive passphrase prompt | ~90 LoC, ~4h — subsumed by **16.11** | +| V7 | Feature | Prometheus/OpenTelemetry ObservabilityPort adapter — export metrics (chunk throughput, encryption counts, error rates) to Prometheus or OTLP. The `decryption_failed` and `encryptionCount` metrics from M16 are natural candidates for alerting dashboards. | ~150 LoC, ~6h | +| V8 | Feature | `encryptionCount` auto-rotation — when count reaches a configurable threshold, automatically trigger `rotateVaultPassphrase` with a new passphrase derived from the old one, making nonce exhaustion impossible for long-lived vaults. | ~120 LoC, ~5h | From 4d908a08d3a8743b3e64f1eab77afca957f113a1 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 22:38:04 -0800 Subject: [PATCH 18/41] test(vault): wire observability port into VaultService tests VaultService.test.js now passes a mockObservability() port (vi.fn() mocks) to all 46 tests instead of relying on the silent no-op default. rotateVaultPassphrase.test.js now passes SilentObserver explicitly. If observability wiring breaks, the test suite will catch it. 
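The silent default these tests replace is a null-object port: when no adapter is injected, callers get inert methods instead of `undefined`, so hot paths never need `if (this.observability)` guards. A sketch of the shape (mirroring the fallback in the VaultService diff above; the factory name is illustrative):

```javascript
// Null-object observability port: safe to call, observes nothing.
function silentObserver() {
  return {
    metric() {},                // discard all metrics
    log() {},                   // discard all log lines
    span: () => ({ end() {} }), // spans are created and ended, but inert
  };
}

// Every call is safe even though nothing is wired up:
const obs = silentObserver();
obs.log('warn', 'never reaches any sink');
obs.metric('file', { bytes: 1024 });
obs.span('restore').end();
```

The trade-off the commit message points at: a silent default keeps call sites clean, but tests that rely on it can never detect broken wiring, hence the explicit `mockObservability()` injection.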
--- CHANGELOG.md | 3 +++ test/unit/domain/services/rotateVaultPassphrase.test.js | 2 +- test/unit/vault/VaultService.test.js | 5 +++++ 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index aab1940..ff2a198 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10). - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. +### Changed +- **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all 78 tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. + ### Fixed - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. 
diff --git a/test/unit/domain/services/rotateVaultPassphrase.test.js b/test/unit/domain/services/rotateVaultPassphrase.test.js index 4539557..a16c9cc 100644 --- a/test/unit/domain/services/rotateVaultPassphrase.test.js +++ b/test/unit/domain/services/rotateVaultPassphrase.test.js @@ -34,7 +34,7 @@ async function createDeps(repoDir) { const service = new CasService({ persistence, codec: new JsonCodec(), crypto, observability: new SilentObserver(), chunkSize: 1024, }); - const vault = new VaultService({ persistence, ref, crypto }); + const vault = new VaultService({ persistence, ref, crypto, observability: new SilentObserver() }); return { service, vault }; } diff --git a/test/unit/vault/VaultService.test.js b/test/unit/vault/VaultService.test.js index a85e219..93d3697 100644 --- a/test/unit/vault/VaultService.test.js +++ b/test/unit/vault/VaultService.test.js @@ -34,11 +34,16 @@ function mockCrypto() { }; } +function mockObservability() { + return { metric: vi.fn(), log: vi.fn(), span: vi.fn().mockReturnValue({ end: vi.fn() }) }; +} + function createVault(overrides = {}) { return new VaultService({ persistence: overrides.persistence || mockPersistence(), ref: overrides.ref || mockRef(), crypto: overrides.crypto || mockCrypto(), + observability: overrides.observability || mockObservability(), }); } From 804cfd5bf20452620177e0a7e71059e63fd30ce0 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 22:40:35 -0800 Subject: [PATCH 19/41] docs(changelog): fix test count (46, not 78) --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index ff2a198..f238c37 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,7 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. 
### Changed -- **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all 78 tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. +- **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all 46 tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. ### Fixed - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). From 8cb6c3423fda3810372625181dc6773329c83a47 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 22:41:02 -0800 Subject: [PATCH 20/41] docs(changelog): drop exact test count from observability entry --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index f238c37..6c2ea95 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,7 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. ### Changed -- **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all 46 tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. +- **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all tests instead of relying on the silent no-op default. 
`rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. ### Fixed - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). From a1dc2c941eac38f3977e81e2a36bb57354ca8ab2 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:13:34 -0800 Subject: [PATCH 21/41] fix(cli): defer passphrase prompt until vault encryption is confirmed - resolveEncryptionKey now checks vault metadata before calling resolvePassphrase, avoiding unnecessary TTY prompts for unencrypted vaults. - Store action recipient-conflict check uses hasPassphraseSource() (flag/env inspection) instead of awaiting resolvePassphrase(), which could consume stdin as a side effect. - readPassphraseFile strips trailing CRLF (\r\n) in addition to LF, preventing mismatched passphrases from Windows-edited files. --- bin/git-cas.js | 27 ++++++++++++++++++------- bin/ui/passphrase-prompt.js | 4 ++-- test/unit/cli/passphrase-prompt.test.js | 6 ++++++ 3 files changed, 28 insertions(+), 9 deletions(-) diff --git a/bin/git-cas.js b/bin/git-cas.js index 631de09..0cf848c 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -86,6 +86,17 @@ async function deriveVaultKey(cas, metadata, passphrase) { * @param {{ confirm?: boolean }} [extra] * @returns {Promise} */ +/** + * Returns true when a non-interactive passphrase source exists (flag or env). + * Does NOT trigger prompts or consume stdin. 
+ * + * @param {Record} opts + * @returns {boolean} + */ +function hasPassphraseSource(opts) { + return Boolean(opts.vaultPassphraseFile || opts.vaultPassphrase || process.env.GIT_CAS_PASSPHRASE); +} + async function resolvePassphrase(opts, extra = {}) { if (opts.vaultPassphraseFile) { return await readPassphraseFile(opts.vaultPassphraseFile); @@ -113,16 +124,18 @@ async function resolveEncryptionKey(cas, opts) { if (opts.keyFile) { return readKeyFile(opts.keyFile); } + const metadata = await cas.getVaultMetadata(); + if (!metadata?.encryption) { + if (hasPassphraseSource(opts)) { + process.stderr.write('warning: passphrase ignored (vault is not encrypted)\n'); + } + return undefined; + } const passphrase = await resolvePassphrase(opts); if (!passphrase) { return undefined; } - const metadata = await cas.getVaultMetadata(); - if (metadata?.encryption) { - return deriveVaultKey(cas, metadata, passphrase); - } - process.stderr.write('warning: passphrase ignored (vault is not encrypted)\n'); - return undefined; + return deriveVaultKey(cas, metadata, passphrase); } /** @@ -207,7 +220,7 @@ program .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {string} */ file, /** @type {Record} */ opts) => { - if (opts.recipient && (opts.keyFile || await resolvePassphrase(opts))) { + if (opts.recipient && (opts.keyFile || hasPassphraseSource(opts))) { throw new Error('Provide --key-file/--vault-passphrase or --recipient, not both'); } if (opts.force && !opts.tree) { diff --git a/bin/ui/passphrase-prompt.js b/bin/ui/passphrase-prompt.js index ce64ae3..3e261c9 100644 --- a/bin/ui/passphrase-prompt.js +++ b/bin/ui/passphrase-prompt.js @@ -37,10 +37,10 @@ export async function readPassphraseFile(filePath) { for await (const chunk of process.stdin) { chunks.push(chunk); } - return Buffer.concat(chunks).toString('utf8').replace(/\n$/, ''); + return 
Buffer.concat(chunks).toString('utf8').replace(/\r?\n$/, ''); } const content = await readFile(filePath, 'utf8'); - return content.replace(/\n$/, ''); + return content.replace(/\r?\n$/, ''); } /** diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index fd97e76..3587713 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -28,4 +28,10 @@ describe('readPassphraseFile', () => { const result = await readPassphraseFile(tmpPath); expect(result).toBe('line1\nline2'); }); + + it('strips trailing CRLF (Windows line ending)', async () => { + await writeFile(tmpPath, 'win-secret\r\n', 'utf8'); + const result = await readPassphraseFile(tmpPath); + expect(result).toBe('win-secret'); + }); }); From a3db8adf0b42a9e7f52888d9fefeaa436a296990 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:15:45 -0800 Subject: [PATCH 22/41] fix: validate constructor params for buffer/chunk size bounds - CasService: validate maxRestoreBufferSize (integer >= 1024). #validateConstructorArgs now accepts an object to stay within max-params lint rule. - WebCryptoAdapter: validate maxEncryptionBufferSize (finite, positive). - FixedChunker: validate chunkSize lower bound (positive integer). Prevents infinite loops from chunkSize=0 or NaN. 
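To see why the lower bound matters: a fixed-size chunking loop that advances by `chunkSize` bytes never makes progress at 0, and NaN poisons the loop condition. A minimal sketch of the failure mode and the guard (not the library's FixedChunker, just the principle):

```javascript
function assertChunkSize(chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError(`chunkSize must be a positive integer, got ${chunkSize}`);
  }
}

function* fixedChunks(buffer, chunkSize) {
  assertChunkSize(chunkSize); // without this, chunkSize=0 loops forever
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    yield buffer.subarray(offset, offset + chunkSize); // subarray clamps at end
  }
}

const sizes = [...fixedChunks(Buffer.alloc(10), 4)].map((c) => c.length);
console.log(sizes); // [ 4, 4, 2 ]
```

Failing at construction turns a hung process into an immediate, attributable `RangeError`.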
--- src/domain/services/CasService.js | 7 +++-- .../adapters/WebCryptoAdapter.js | 3 ++ src/infrastructure/chunkers/FixedChunker.js | 3 ++ .../services/CasService.restoreGuard.test.js | 29 +++++++++++++++---- .../WebCryptoAdapter.bufferGuard.test.js | 18 ++++++++++++ .../chunkers/ChunkerBounds.test.js | 23 +++++++++++++++ 6 files changed, 76 insertions(+), 7 deletions(-) diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 05d9bca..6050efa 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -37,7 +37,7 @@ export default class CasService { */ constructor({ persistence, codec, crypto, observability, chunkSize = 256 * 1024, merkleThreshold = 1000, concurrency = 1, chunker, maxRestoreBufferSize = 512 * 1024 * 1024 }) { CasService._validateObservability(observability); - CasService.#validateConstructorArgs(chunkSize, merkleThreshold, concurrency); + CasService.#validateConstructorArgs({ chunkSize, merkleThreshold, concurrency, maxRestoreBufferSize }); this.persistence = persistence; this.codec = codec; this.crypto = crypto; @@ -58,7 +58,7 @@ export default class CasService { * Validates constructor numeric arguments. 
* @private */ - static #validateConstructorArgs(chunkSize, merkleThreshold, concurrency) { + static #validateConstructorArgs({ chunkSize, merkleThreshold, concurrency, maxRestoreBufferSize }) { if (chunkSize < 1024) { throw new Error('Chunk size must be at least 1024 bytes'); } @@ -72,6 +72,9 @@ export default class CasService { if (!Number.isInteger(concurrency) || concurrency < 1) { throw new Error('Concurrency must be a positive integer'); } + if (!Number.isInteger(maxRestoreBufferSize) || maxRestoreBufferSize < 1024) { + throw new Error('maxRestoreBufferSize must be a positive integer >= 1024'); + } } /** diff --git a/src/infrastructure/adapters/WebCryptoAdapter.js b/src/infrastructure/adapters/WebCryptoAdapter.js index d46ae92..e40f1ed 100644 --- a/src/infrastructure/adapters/WebCryptoAdapter.js +++ b/src/infrastructure/adapters/WebCryptoAdapter.js @@ -18,6 +18,9 @@ export default class WebCryptoAdapter extends CryptoPort { */ constructor({ maxEncryptionBufferSize = 512 * 1024 * 1024 } = {}) { super(); + if (!Number.isFinite(maxEncryptionBufferSize) || maxEncryptionBufferSize <= 0) { + throw new RangeError('maxEncryptionBufferSize must be a finite positive number'); + } this.#maxEncryptionBufferSize = maxEncryptionBufferSize; } diff --git a/src/infrastructure/chunkers/FixedChunker.js b/src/infrastructure/chunkers/FixedChunker.js index ef76c63..4444823 100644 --- a/src/infrastructure/chunkers/FixedChunker.js +++ b/src/infrastructure/chunkers/FixedChunker.js @@ -17,6 +17,9 @@ export default class FixedChunker extends ChunkingPort { */ constructor({ chunkSize = 262144 } = {}) { super(); + if (!Number.isInteger(chunkSize) || chunkSize < 1) { + throw new RangeError(`chunkSize must be a positive integer, got ${chunkSize}`); + } if (chunkSize > 100 * 1024 * 1024) { throw new RangeError( `Chunk size must not exceed 104857600 bytes (100 MiB), got ${chunkSize}`, diff --git a/test/unit/domain/services/CasService.restoreGuard.test.js 
b/test/unit/domain/services/CasService.restoreGuard.test.js index 96f9dfd..576c912 100644 --- a/test/unit/domain/services/CasService.restoreGuard.test.js +++ b/test/unit/domain/services/CasService.restoreGuard.test.js @@ -96,22 +96,41 @@ describe('CasService — RESTORE_TOO_LARGE defaults and meta', () => { }); it('error meta includes size and limit', async () => { - const { service } = setup({ maxRestoreBufferSize: 100 }); - const manifest = makeEncryptedManifest([50, 60]); + const { service } = setup({ maxRestoreBufferSize: 2048 }); + const manifest = makeEncryptedManifest([1100, 1100]); try { await service.restoreStream({ manifest, encryptionKey: Buffer.alloc(32, 0xab) }).next(); } catch (err) { expect(err.code).toBe('RESTORE_TOO_LARGE'); - expect(err.meta).toHaveProperty('size', 110); - expect(err.meta).toHaveProperty('limit', 100); + expect(err.meta).toHaveProperty('size', 2200); + expect(err.meta).toHaveProperty('limit', 2048); } }); }); +describe('CasService — maxRestoreBufferSize validation', () => { + it('throws for non-integer', () => { + expect(() => setup({ maxRestoreBufferSize: 1.5 })).toThrow(); + }); + + it('throws for value below 1024', () => { + expect(() => setup({ maxRestoreBufferSize: 512 })).toThrow(); + }); + + it('throws for NaN', () => { + expect(() => setup({ maxRestoreBufferSize: NaN })).toThrow(); + }); + + it('accepts 1024', () => { + const { service } = setup({ maxRestoreBufferSize: 1024 }); + expect(service.maxRestoreBufferSize).toBe(1024); + }); +}); + describe('CasService — RESTORE_TOO_LARGE does not affect streaming', () => { it('does not apply to unencrypted/uncompressed restoreStream', async () => { - const { service, mockPersistence } = setup({ maxRestoreBufferSize: 10 }); + const { service, mockPersistence } = setup({ maxRestoreBufferSize: 1024 }); const manifest = new Manifest({ slug: 'plain', filename: 'plain.bin', diff --git a/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js 
b/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js index e22d010..2d9bf62 100644 --- a/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js +++ b/test/unit/infrastructure/adapters/WebCryptoAdapter.bufferGuard.test.js @@ -53,6 +53,24 @@ describe('WebCryptoAdapter — ENCRYPTION_BUFFER_EXCEEDED', () => { }); }); +describe('WebCryptoAdapter — maxEncryptionBufferSize validation', () => { + it('throws for NaN', () => { + expect(() => new WebCryptoAdapter({ maxEncryptionBufferSize: NaN })).toThrow(RangeError); + }); + + it('throws for 0', () => { + expect(() => new WebCryptoAdapter({ maxEncryptionBufferSize: 0 })).toThrow(RangeError); + }); + + it('throws for negative', () => { + expect(() => new WebCryptoAdapter({ maxEncryptionBufferSize: -1 })).toThrow(RangeError); + }); + + it('throws for Infinity', () => { + expect(() => new WebCryptoAdapter({ maxEncryptionBufferSize: Infinity })).toThrow(RangeError); + }); +}); + describe('NodeCryptoAdapter — no buffer guard for streaming', () => { it('does NOT throw for same-size stream (true streaming)', async () => { const adapter = new NodeCryptoAdapter(); diff --git a/test/unit/infrastructure/chunkers/ChunkerBounds.test.js b/test/unit/infrastructure/chunkers/ChunkerBounds.test.js index 7a86559..2a5a2d6 100644 --- a/test/unit/infrastructure/chunkers/ChunkerBounds.test.js +++ b/test/unit/infrastructure/chunkers/ChunkerBounds.test.js @@ -15,6 +15,29 @@ describe('FixedChunker — chunk size upper bound', () => { }); }); +describe('FixedChunker — chunk size lower bound', () => { + it('throws when chunkSize is 0', () => { + expect(() => new FixedChunker({ chunkSize: 0 })).toThrow(RangeError); + }); + + it('throws when chunkSize is negative', () => { + expect(() => new FixedChunker({ chunkSize: -1 })).toThrow(RangeError); + }); + + it('throws when chunkSize is NaN', () => { + expect(() => new FixedChunker({ chunkSize: NaN })).toThrow(RangeError); + }); + + it('throws when chunkSize is not an 
integer', () => { + expect(() => new FixedChunker({ chunkSize: 1.5 })).toThrow(RangeError); + }); + + it('accepts chunkSize of 1', () => { + const chunker = new FixedChunker({ chunkSize: 1 }); + expect(chunker.params.chunkSize).toBe(1); + }); +}); + describe('CdcChunker — chunk size upper bound', () => { it('throws when maxChunkSize > 100 MiB', () => { expect(() => new CdcChunker({ From e93053e3789e52ef1fbe51dcaee8355379d01e80 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:16:35 -0800 Subject: [PATCH 23/41] fix(restore): enforce size limit after decompression _restoreBuffered already checked pre-decompression chunk sizes against maxRestoreBufferSize, but decompression can inflate far beyond that bound. A second check after _decompress now throws RESTORE_TOO_LARGE when the decompressed buffer exceeds the configured limit. --- src/domain/services/CasService.js | 7 +++++ .../services/CasService.restoreGuard.test.js | 26 +++++++++++++++++++ 2 files changed, 33 insertions(+) diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index 6050efa..b841b54 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -525,6 +525,13 @@ export default class CasService { if (manifest.compression) { buffer = await this._decompress(buffer); + if (buffer.length > this.maxRestoreBufferSize) { + throw new CasError( + `Decompressed restore is ${buffer.length} bytes (limit: ${this.maxRestoreBufferSize})`, + 'RESTORE_TOO_LARGE', + { size: buffer.length, limit: this.maxRestoreBufferSize }, + ); + } } this.observability.metric('file', { diff --git a/test/unit/domain/services/CasService.restoreGuard.test.js b/test/unit/domain/services/CasService.restoreGuard.test.js index 576c912..bce80ba 100644 --- a/test/unit/domain/services/CasService.restoreGuard.test.js +++ b/test/unit/domain/services/CasService.restoreGuard.test.js @@ -109,6 +109,32 @@ describe('CasService — RESTORE_TOO_LARGE defaults and meta', () => { }); 
}); +describe('CasService — RESTORE_TOO_LARGE after decompression', () => { + it('throws when decompressed size exceeds limit', async () => { + const { service, mockPersistence } = setup({ maxRestoreBufferSize: 4096 }); + const key = Buffer.alloc(32, 0xab); + + // Store a small encrypted+compressed manifest that fits pre-decompression + async function* source() { yield Buffer.alloc(2048, 0xaa); } + const manifest = await service.store({ + source: source(), slug: 'bomb', filename: 'bomb.bin', + encryptionKey: key, compression: { algorithm: 'gzip' }, + }); + + // Wire readBlob to return the stored blobs + const storedBlobs = mockPersistence.writeBlob.mock.calls.map((c) => c[0]); + let idx = 0; + mockPersistence.readBlob.mockImplementation(() => Promise.resolve(storedBlobs[idx++] || Buffer.alloc(0))); + + // Mock _decompress to return a buffer larger than the limit + service._decompress = vi.fn().mockResolvedValue(Buffer.alloc(8192, 0xbb)); + + await expect( + service.restoreStream({ manifest, encryptionKey: key }).next(), + ).rejects.toMatchObject({ code: 'RESTORE_TOO_LARGE' }); + }); +}); + describe('CasService — maxRestoreBufferSize validation', () => { it('throws for non-integer', () => { expect(() => setup({ maxRestoreBufferSize: 1.5 })).toThrow(); From e257aeed793684cb35ee65156734ebd3bd8c25d8 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:17:47 -0800 Subject: [PATCH 24/41] test: harden error-path assertions to fail on missing throws - orphanedBlobs: add expect.unreachable after store() in 3 error tests. - restoreGuard: add expect.unreachable in size/limit meta test. - kdfBruteForce: assert INTEGRITY_ERROR code, not just timing. - conformance: consolidate double try/catch into toMatchObject / objectContaining single assertions; remove unused CasError import. 
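The anti-pattern this commit removes: a bare `try { … } catch (err) { expect(…) }` passes vacuously when the code under test stops throwing, because the catch body never runs. One generic way to make the missing throw itself a failure (a hypothetical helper, not vitest's `expect.unreachable`):

```javascript
// Runs fn and returns the error it threw; fails loudly if it did not throw.
function captureError(fn) {
  try {
    fn();
  } catch (err) {
    return err;
  }
  throw new Error('expected fn to throw, but it returned normally');
}

const err = captureError(() => {
  const e = new Error('stream failed');
  e.code = 'STREAM_ERROR';
  throw e;
});
console.log(err.code); // STREAM_ERROR

// A non-throwing fn now fails the test instead of passing silently:
// captureError(() => {}); // → Error: expected fn to throw…
```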
--- .../services/CasService.kdfBruteForce.test.js | 7 +++-- .../services/CasService.orphanedBlobs.test.js | 3 +++ .../services/CasService.restoreGuard.test.js | 1 + .../CryptoAdapter.conformance.test.js | 27 ++++--------------- 4 files changed, 14 insertions(+), 24 deletions(-) diff --git a/test/unit/domain/services/CasService.kdfBruteForce.test.js b/test/unit/domain/services/CasService.kdfBruteForce.test.js index 26a9922..063893b 100644 --- a/test/unit/domain/services/CasService.kdfBruteForce.test.js +++ b/test/unit/domain/services/CasService.kdfBruteForce.test.js @@ -92,12 +92,15 @@ describe('16.12: KDF brute-force — library rate-limiting', () => { const wrongKey = testCrypto.randomBytes(32); const start = Date.now(); + let caught; try { await service.restore({ manifest, encryptionKey: wrongKey }); - } catch { - // expected + expect.unreachable('should have thrown INTEGRITY_ERROR'); + } catch (err) { + caught = err; } const elapsed = Date.now() - start; + expect(caught?.code).toBe('INTEGRITY_ERROR'); expect(elapsed).toBeLessThan(500); }); }); diff --git a/test/unit/domain/services/CasService.orphanedBlobs.test.js b/test/unit/domain/services/CasService.orphanedBlobs.test.js index 7c2d466..5903ccd 100644 --- a/test/unit/domain/services/CasService.orphanedBlobs.test.js +++ b/test/unit/domain/services/CasService.orphanedBlobs.test.js @@ -55,6 +55,7 @@ describe('CasService — orphaned blob tracking in STREAM_ERROR', () => { it('STREAM_ERROR meta includes orphanedBlobs array', async () => { try { await service.store({ source: failingSource(3), slug: 'fail', filename: 'f.bin' }); + expect.unreachable('should have thrown STREAM_ERROR'); } catch (err) { expect(err.code).toBe('STREAM_ERROR'); expect(Array.isArray(err.meta.orphanedBlobs)).toBe(true); @@ -64,6 +65,7 @@ describe('CasService — orphaned blob tracking in STREAM_ERROR', () => { it('orphanedBlobs contain OIDs from successful writes', async () => { try { await service.store({ source: failingSource(3), slug: 'fail', 
filename: 'f.bin' }); + expect.unreachable('should have thrown STREAM_ERROR'); } catch (err) { expect(err.meta.orphanedBlobs.length).toBe(3); expect(err.meta.orphanedBlobs).toContain('blob-0'); @@ -75,6 +77,7 @@ describe('CasService — orphaned blob tracking in STREAM_ERROR', () => { it('empty array when stream fails before any writes', async () => { try { await service.store({ source: failingSource(0), slug: 'fail', filename: 'f.bin' }); + expect.unreachable('should have thrown STREAM_ERROR'); } catch (err) { expect(err.meta.orphanedBlobs).toEqual([]); } diff --git a/test/unit/domain/services/CasService.restoreGuard.test.js b/test/unit/domain/services/CasService.restoreGuard.test.js index bce80ba..f4402db 100644 --- a/test/unit/domain/services/CasService.restoreGuard.test.js +++ b/test/unit/domain/services/CasService.restoreGuard.test.js @@ -101,6 +101,7 @@ describe('CasService — RESTORE_TOO_LARGE defaults and meta', () => { try { await service.restoreStream({ manifest, encryptionKey: Buffer.alloc(32, 0xab) }).next(); + expect.unreachable('should have thrown RESTORE_TOO_LARGE'); } catch (err) { expect(err.code).toBe('RESTORE_TOO_LARGE'); expect(err.meta).toHaveProperty('size', 2200); diff --git a/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js b/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js index 8e45d14..3361a41 100644 --- a/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js +++ b/test/unit/infrastructure/adapters/CryptoAdapter.conformance.test.js @@ -1,7 +1,6 @@ import { describe, it, expect } from 'vitest'; import NodeCryptoAdapter from '../../../../src/infrastructure/adapters/NodeCryptoAdapter.js'; import WebCryptoAdapter from '../../../../src/infrastructure/adapters/WebCryptoAdapter.js'; -import CasError from '../../../../src/domain/errors/CasError.js'; /** * Conformance test suite that asserts identical behavioral contracts across @@ -36,13 +35,7 @@ describe.each(adapters)('%s conformance', (_name, 
adapter) => { const { buf, meta } = await adapter.encryptBuffer(Buffer.from('test'), key); await expect( Promise.resolve().then(() => adapter.decryptBuffer(buf, 'not-a-buffer', meta)), - ).rejects.toThrow(CasError); - - try { - await Promise.resolve().then(() => adapter.decryptBuffer(buf, 'not-a-buffer', meta)); - } catch (err) { - expect(err.code).toBe('INVALID_KEY_TYPE'); - } + ).rejects.toMatchObject({ code: 'INVALID_KEY_TYPE' }); }); it('decryptBuffer rejects INVALID_KEY_LENGTH for 16-byte key', async () => { @@ -50,23 +43,13 @@ describe.each(adapters)('%s conformance', (_name, adapter) => { const { buf, meta } = await adapter.encryptBuffer(Buffer.from('test'), key); await expect( Promise.resolve().then(() => adapter.decryptBuffer(buf, shortKey, meta)), - ).rejects.toThrow(CasError); - - try { - await Promise.resolve().then(() => adapter.decryptBuffer(buf, shortKey, meta)); - } catch (err) { - expect(err.code).toBe('INVALID_KEY_LENGTH'); - } + ).rejects.toMatchObject({ code: 'INVALID_KEY_LENGTH' }); }); it('createEncryptionStream.finalize() throws STREAM_NOT_CONSUMED before consumption', () => { const { finalize } = adapter.createEncryptionStream(key); - expect(() => finalize()).toThrow(CasError); - - try { - finalize(); - } catch (err) { - expect(err.code).toBe('STREAM_NOT_CONSUMED'); - } + expect(() => finalize()).toThrow( + expect.objectContaining({ code: 'STREAM_NOT_CONSUMED' }), + ); }); }); From 621f361f440426329b9b9cf70eea36028b831f46 Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:18:43 -0800 Subject: [PATCH 25/41] docs: fix JSDoc return types, add maxRestoreBufferSize param, fix heading level - NodeCryptoAdapter.encryptBuffer @returns wrapped in Promise<>. - index.js constructor JSDoc and #config type include maxRestoreBufferSize. - ROADMAP.md: add ## Task Cards heading to fix MD001 heading-level jump. 
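The conformance rewrite above collapses each duplicated `toThrow(CasError)` + try/catch pair into a single structural assertion (`rejects.toMatchObject`, `expect.objectContaining`). Outside a test runner the same check is a few lines; this is a minimal sketch with a hypothetical `rejectsWithCode` helper, not code from the suite:

```javascript
// Minimal sketch of asserting an error's `code` on a rejected promise,
// mirroring the rejects.toMatchObject({ code: ... }) pattern in the diff.
// `rejectsWithCode` is a hypothetical helper, not part of the test suite.
async function rejectsWithCode(promise, expectedCode) {
  try {
    await promise;
  } catch (err) {
    return Boolean(err) && err.code === expectedCode;
  }
  return false; // resolved: the expected error never fired
}
```

The helper returns `false` when the promise resolves, which is exactly the silent-pass-through failure mode the error-path hardening in this series eliminates.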
--- ROADMAP.md | 2 ++ index.js | 3 ++- src/infrastructure/adapters/NodeCryptoAdapter.js | 2 +- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index 2c540bb..4f79fc4 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -278,6 +278,8 @@ Remediation milestone addressing all negative findings from the [CODE-EVAL.md](. --- +## Task Cards + ### 16.1 — Crypto Adapter Behavioral Normalization *(P0)* — C8 **Problem** diff --git a/index.js b/index.js index 0ff524d..85f9154 100644 --- a/index.js +++ b/index.js @@ -64,6 +64,7 @@ export default class ContentAddressableStore { * @param {number} [options.concurrency=1] - Maximum parallel chunk I/O operations. * @param {{ strategy: string, chunkSize?: number, targetChunkSize?: number, minChunkSize?: number, maxChunkSize?: number }} [options.chunking] - Chunking strategy config. * @param {import('./src/ports/ChunkingPort.js').default} [options.chunker] - Pre-built ChunkingPort instance (advanced). + * @param {number} [options.maxRestoreBufferSize=536870912] - Max buffered restore size in bytes for encrypted/compressed restores (default 512 MiB). 
*/ constructor({ plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker, maxRestoreBufferSize }) { this.#config = { plumbing, chunkSize, codec, policy, crypto, observability, merkleThreshold, concurrency, chunking, chunker, maxRestoreBufferSize }; @@ -71,7 +72,7 @@ export default class ContentAddressableStore { this.#servicePromise = null; } - /** @type {{ plumbing: *, chunkSize?: number, codec?: *, policy?: *, crypto?: *, observability?: *, merkleThreshold?: number, concurrency?: number, chunking?: *, chunker?: * }} */ + /** @type {{ plumbing: *, chunkSize?: number, codec?: *, policy?: *, crypto?: *, observability?: *, merkleThreshold?: number, concurrency?: number, chunking?: *, chunker?: *, maxRestoreBufferSize?: number }} */ #config; /** @type {VaultService|null} */ #vault = null; diff --git a/src/infrastructure/adapters/NodeCryptoAdapter.js b/src/infrastructure/adapters/NodeCryptoAdapter.js index c333f76..a317a11 100644 --- a/src/infrastructure/adapters/NodeCryptoAdapter.js +++ b/src/infrastructure/adapters/NodeCryptoAdapter.js @@ -29,7 +29,7 @@ export default class NodeCryptoAdapter extends CryptoPort { * @override * @param {Buffer|Uint8Array} buffer - Plaintext to encrypt. * @param {Buffer|Uint8Array} key - 32-byte encryption key. 
- * @returns {{ buf: Buffer, meta: import('../../ports/CryptoPort.js').EncryptionMeta }} + * @returns {Promise<{ buf: Buffer, meta: import('../../ports/CryptoPort.js').EncryptionMeta }>} */ async encryptBuffer(buffer, key) { this._validateKey(key); From c910cda941f5fbbd396ef7d4ecd299c1e98fcf6d Mon Sep 17 00:00:00 2001 From: James Ross Date: Tue, 3 Mar 2026 23:19:25 -0800 Subject: [PATCH 26/41] docs(changelog): add PR feedback fixes to unreleased section --- CHANGELOG.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6c2ea95..77fde54 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,8 +15,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed - **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. +- **`NodeCryptoAdapter.encryptBuffer` JSDoc** — `@returns` annotation corrected to `Promise<...>`, matching the async implementation. +- **`maxRestoreBufferSize` documented** — constructor JSDoc and `#config` type in `ContentAddressableStore` now include the parameter. +- **ROADMAP.md heading level** — added `## Task Cards` heading between `# M16` and `### 16.1` to satisfy MD001 heading-increment rule. ### Fixed +- **Post-decompression size guard** — `_restoreBuffered` now enforces `maxRestoreBufferSize` after decompression, not just before. Compressed payloads that inflate beyond the configured limit now throw `RESTORE_TOO_LARGE` instead of silently allocating unbounded memory. +- **CLI passphrase prompt deferral** — `resolveEncryptionKey` now checks vault metadata before calling `resolvePassphrase`, avoiding unnecessary TTY prompts for unencrypted vaults. Store action recipient-conflict check inspects flags/env without consuming stdin. 
+- **CRLF passphrase normalization** — `readPassphraseFile` now strips trailing `\r\n` (Windows line endings) in addition to `\n`, preventing passphrase mismatches from Windows-edited files. +- **Constructor validation** — `CasService.maxRestoreBufferSize` (integer >= 1024), `WebCryptoAdapter.maxEncryptionBufferSize` (finite, positive), and `FixedChunker.chunkSize` (positive integer) are now validated at construction time, preventing silent misconfiguration. +- **Error-path test hardening** — `orphanedBlobs`, `restoreGuard`, `kdfBruteForce`, and `conformance` tests now fail explicitly when expected errors are not thrown (previously silent pass-through). - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. - **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. 
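The CRLF normalization noted above comes down to one regex. A minimal sketch of the behavior, assuming the same `/\r?\n$/` pattern the changelog describes:

```javascript
// Strips one trailing LF or CRLF from a passphrase string. Interior
// newlines and other trailing characters are deliberately left alone.
function stripTrailingNewline(text) {
  return text.replace(/\r?\n$/, '');
}
```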
From 2593cdddcb7eb40d4cf8ec593f26d64ecadda02f Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 10:28:26 -0700 Subject: [PATCH 27/41] fix: resolve 19 pre-PR review findings Major fixes: - CasService.d.ts: add missing maxRestoreBufferSize and chunker to CasServiceOptions interface, preventing type drift for ./service consumers - VaultService.addToVault: shallow-copy state.metadata before mutation, preventing encryptionCount from double-incrementing on CAS retries Minor fixes: - CasService: add Number.isInteger check for chunkSize validation - CasService: attach orphanedBlobs meta to CasError on re-throw in _chunkAndStore instead of discarding it - CasService: add @param JSDoc for maxRestoreBufferSize - VaultService: fix observability type annotation (ObservabilityPort) - bin/git-cas.js: fix orphaned JSDoc block for resolvePassphrase - passphrase-prompt: reject empty passphrases, warn on insecure file permissions, add error/close listeners to readline - WebCryptoAdapter: document static #makeEncryptGenerator pattern - actions.test.js: use fake timers instead of real 1s wall-clock delay - CHANGELOG: re-categorize features/refactors misplaced under Fixed - ROADMAP: fix version plan table sort order - GUIDE: update chunkSize error message to match implementation --- CHANGELOG.md | 24 ++++++------ GUIDE.md | 2 +- ROADMAP.md | 2 +- bin/git-cas.js | 22 +++++------ bin/ui/passphrase-prompt.js | 34 ++++++++++++++++- src/domain/services/CasService.d.ts | 10 +++++ src/domain/services/CasService.js | 10 +++-- src/domain/services/VaultService.js | 7 +++- .../adapters/WebCryptoAdapter.js | 5 +++ test/unit/cli/actions.test.js | 14 +++---- test/unit/cli/passphrase-prompt.test.js | 38 ++++++++++++++++--- .../domain/services/CasService.errors.test.js | 4 +- 12 files changed, 127 insertions(+), 45 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 77fde54..e3c20e6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,32 +12,34 @@ and this project adheres to 
[Semantic Versioning](https://semver.org/spec/v2.0.0
 - **M16 Capstone** — New milestone in ROADMAP.md addressing all 9 audit flaws and 10 concerns (C1–C10). 13 task cards, ~698 LoC, ~21h estimated.
 - **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10).
 - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table.
+- **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected.
+- **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected.
+- **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit.
+- **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability.
+- **16.11 — Passphrase input security** — New `--vault-passphrase-file <path>` CLI option reads passphrase from file (use `-` for stdin). Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. Empty passphrases rejected. File permission warning on group/world-readable files.
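The 16.11 resolution order can be sketched as a straight-line priority chain. The injected `readFileSource` and `promptTty` callbacks are hypothetical stand-ins for the CLI's file reader and TTY prompt, used here only to keep the sketch self-contained:

```javascript
// Sketch of the documented priority: file → flag → env → TTY → undefined.
// `readFileSource` and `promptTty` are illustrative injected dependencies,
// not the CLI's actual helpers.
async function resolvePassphraseSketch(opts, env, { isTty, readFileSource, promptTty }) {
  if (opts.vaultPassphraseFile) return readFileSource(opts.vaultPassphraseFile);
  if (opts.vaultPassphrase) return opts.vaultPassphrase;
  if (env.GIT_CAS_PASSPHRASE) return env.GIT_CAS_PASSPHRASE;
  if (isTty) return promptTty();
  return undefined; // non-interactive and no source configured
}
```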
+- **16.12 — KDF brute-force awareness** — `CasService` now emits `decryption_failed` metric with slug context when decryption fails with `INTEGRITY_ERROR` during encrypted restore. CLI adds a 1-second delay after `INTEGRITY_ERROR` to slow brute-force attempts. Library API imposes no delay — callers manage their own rate-limiting policy. +- **16.13 — GCM nonce collision docs + encryption counter** — `SECURITY.md` moved to project root with new sections: GCM nonce bound (2^32 NIST limit), key rotation frequency, KDF parameter guidance, and passphrase entropy recommendations. Vault metadata now tracks `encryptionCount`, incremented per encrypted `addToVault()`. Observability warning emitted when count exceeds 2^31. `VaultService` accepts optional `observability` port. +- **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. ### Changed - **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. - **`NodeCryptoAdapter.encryptBuffer` JSDoc** — `@returns` annotation corrected to `Promise<...>`, matching the async implementation. - **`maxRestoreBufferSize` documented** — constructor JSDoc and `#config` type in `ContentAddressableStore` now include the parameter. - **ROADMAP.md heading level** — added `## Task Cards` heading between `# M16` and `### 16.1` to satisfy MD001 heading-increment rule. +- **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. 
`decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. +- **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. ### Fixed - **Post-decompression size guard** — `_restoreBuffered` now enforces `maxRestoreBufferSize` after decompression, not just before. Compressed payloads that inflate beyond the configured limit now throw `RESTORE_TOO_LARGE` instead of silently allocating unbounded memory. - **CLI passphrase prompt deferral** — `resolveEncryptionKey` now checks vault metadata before calling `resolvePassphrase`, avoiding unnecessary TTY prompts for unencrypted vaults. Store action recipient-conflict check inspects flags/env without consuming stdin. - **CRLF passphrase normalization** — `readPassphraseFile` now strips trailing `\r\n` (Windows line endings) in addition to `\n`, preventing passphrase mismatches from Windows-edited files. -- **Constructor validation** — `CasService.maxRestoreBufferSize` (integer >= 1024), `WebCryptoAdapter.maxEncryptionBufferSize` (finite, positive), and `FixedChunker.chunkSize` (positive integer) are now validated at construction time, preventing silent misconfiguration. +- **Constructor validation** — `CasService.maxRestoreBufferSize` (integer >= 1024), `CasService.chunkSize` (integer >= 1024), `WebCryptoAdapter.maxEncryptionBufferSize` (finite, positive), and `FixedChunker.chunkSize` (positive integer) are now validated at construction time, preventing silent misconfiguration. - **Error-path test hardening** — `orphanedBlobs`, `restoreGuard`, `kdfBruteForce`, and `conformance` tests now fail explicitly when expected errors are not thrown (previously silent pass-through). 
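The 16.4 strategy mentioned above replaces a `Buffer.concat()` loop with one reusable chunk-sized working buffer. The following is an illustrative fixed-size chunker sketch, not the project's `FixedChunker`:

```javascript
// Illustrative fixed-size chunker using one pre-allocated working buffer.
// Incoming bytes are copied into `work`; a chunk is emitted whenever it
// fills, so total copying stays O(n) even for many tiny input buffers.
async function* fixedChunks(source, chunkSize) {
  const work = Buffer.allocUnsafe(chunkSize);
  let filled = 0;
  for await (const buf of source) {
    let offset = 0;
    while (offset < buf.length) {
      const n = Math.min(chunkSize - filled, buf.length - offset);
      buf.copy(work, filled, offset, offset + n);
      filled += n;
      offset += n;
      if (filled === chunkSize) {
        yield Buffer.from(work); // copy out so `work` can be reused
        filled = 0;
      }
    }
  }
  if (filled > 0) {
    yield Buffer.from(work.subarray(0, filled)); // trailing partial chunk
  }
}
```

Each full chunk is copied out with `Buffer.from(work)` before the working buffer is reused, which is what keeps the strategy safe despite the shared allocation.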
+- **Orphaned blob enrichment on CasError re-throw** — `_chunkAndStore` now attaches `orphanedBlobs` metadata to existing `CasError` instances before re-throwing, instead of discarding the information. +- **VaultService metadata mutation on retry** — `addToVault` now shallow-copies `state.metadata` before mutation, preventing `encryptionCount` from being incremented multiple times across CAS retries. - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. -- **16.1 — Crypto adapter behavioral normalization** — `NodeCryptoAdapter.encryptBuffer` now returns a Promise (was sync), matching Bun/Web. `decryptBuffer` validates key on all adapters. `NodeCryptoAdapter.createEncryptionStream` guards `finalize()` with `STREAM_NOT_CONSUMED`. New conformance test suite asserts identical contracts across all adapters. -- **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. -- **16.11 — Passphrase input security** — New `--vault-passphrase-file ` CLI option reads passphrase from file (use `-` for stdin). Interactive TTY prompt added as fallback when no other passphrase source is available. `resolvePassphrase` is now async with priority: file → flag → env → TTY → undefined. - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. 
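The retry-safety fix for `addToVault` above hinges on copy-before-mutate. A minimal sketch of the idea, using a hypothetical `bumpEncryptionCount` in place of the vault code:

```javascript
// Copy before mutating: a CAS retry re-runs the update against the same
// in-memory `state`, so mutating state.metadata in place would compound
// the increment across attempts. `bumpEncryptionCount` is illustrative.
function bumpEncryptionCount(state) {
  const metadata = { ...(state.metadata || { version: 1 }) };
  if (metadata.encryption) {
    metadata.encryptionCount = (metadata.encryptionCount || 0) + 1;
  }
  return { ...state, metadata };
}
```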
-- **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. -- **16.5 — Encrypt-then-chunk dedup warning** — `CasService.store()` now logs a warning when encryption is combined with CDC chunking, since ciphertext is pseudorandom and content-defined boundaries provide no dedup benefit. -- **16.10 — Orphaned blob tracking** — `STREAM_ERROR` now includes `meta.orphanedBlobs` — an array of OIDs for blobs successfully written before the stream failure. Error metric includes `orphanedBlobs` count for observability. -- **16.4 — FixedChunker pre-allocated buffer** — Replaced `Buffer.concat()` loop with a pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer, eliminating O(n²) copies for many small input buffers. Matches the allocation strategy used by `CdcChunker`. -- **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. -- **16.12 — KDF brute-force awareness** — `CasService` now emits `decryption_failed` metric with slug context when decryption fails with `INTEGRITY_ERROR` during encrypted restore. CLI adds a 1-second delay after `INTEGRITY_ERROR` to slow brute-force attempts. Library API imposes no delay — callers manage their own rate-limiting policy. -- **16.13 — GCM nonce collision docs + encryption counter** — `SECURITY.md` moved to project root with new sections: GCM nonce bound (2^32 NIST limit), key rotation frequency, KDF parameter guidance, and passphrase entropy recommendations. 
Vault metadata now tracks `encryptionCount`, incremented per encrypted `addToVault()`. Observability warning emitted when count exceeds 2^31. `VaultService` accepts optional `observability` port. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/GUIDE.md b/GUIDE.md index be6ea26..b72c470 100644 --- a/GUIDE.md +++ b/GUIDE.md @@ -1564,7 +1564,7 @@ file size. However, the restore operation currently concatenates all chunks into a single buffer, so restoring very large files requires enough memory to hold the entire file. -### Q: I get "Chunk size must be at least 1024 bytes" +### Q: I get "Chunk size must be an integer >= 1024 bytes" The minimum chunk size is 1 KiB. This prevents pathologically small chunks that would create excessive Git objects. Increase your `chunkSize` parameter. diff --git a/ROADMAP.md b/ROADMAP.md index 4f79fc4..9a82ecc 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -193,8 +193,8 @@ Return and throw semantics for every public method (current and planned). | v3.1.0 | M13 | Bijou | TUI dashboard & progress | ✅ | | v5.0.0 | M10 | Hydra | Content-defined chunking | ✅ | | v5.1.0 | M11 | Locksmith | Multi-recipient encryption | ✅ | -| v5.3.0 | M16 | Capstone | Audit remediation — all CODE-EVAL.md findings | 🔲 | | v5.2.0 | M12 | Carousel | Key rotation | ✅ | +| v5.3.0 | M16 | Capstone | Audit remediation — all CODE-EVAL.md findings | 🔲 | --- diff --git a/bin/git-cas.js b/bin/git-cas.js index 0cf848c..461e2b8 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -75,17 +75,6 @@ async function deriveVaultKey(cas, metadata, passphrase) { return key; } -/** - * Resolve passphrase from (in priority order): - * 1. --vault-passphrase-file - * 2. --vault-passphrase - * 3. GIT_CAS_PASSPHRASE env var - * 4. Interactive TTY prompt (if stdin is a TTY) - * - * @param {Record} opts - * @param {{ confirm?: boolean }} [extra] - * @returns {Promise} - */ /** * Returns true when a non-interactive passphrase source exists (flag or env). 
* Does NOT trigger prompts or consume stdin. @@ -97,6 +86,17 @@ function hasPassphraseSource(opts) { return Boolean(opts.vaultPassphraseFile || opts.vaultPassphrase || process.env.GIT_CAS_PASSPHRASE); } +/** + * Resolve passphrase from (in priority order): + * 1. --vault-passphrase-file + * 2. --vault-passphrase + * 3. GIT_CAS_PASSPHRASE env var + * 4. Interactive TTY prompt (if stdin is a TTY) + * + * @param {Record} opts + * @param {{ confirm?: boolean }} [extra] + * @returns {Promise} + */ async function resolvePassphrase(opts, extra = {}) { if (opts.vaultPassphraseFile) { return await readPassphraseFile(opts.vaultPassphraseFile); diff --git a/bin/ui/passphrase-prompt.js b/bin/ui/passphrase-prompt.js index 3e261c9..5da323c 100644 --- a/bin/ui/passphrase-prompt.js +++ b/bin/ui/passphrase-prompt.js @@ -1,5 +1,5 @@ import { createInterface } from 'node:readline'; -import { readFile } from 'node:fs/promises'; +import { readFile, stat } from 'node:fs/promises'; /** * Prompts for a passphrase on stderr with echo disabled. @@ -16,6 +16,9 @@ export async function promptPassphrase({ confirm = false } = {}) { ); } const pass = await readHidden('Passphrase: '); + if (!pass) { + throw new Error('Passphrase must not be empty'); + } if (confirm) { const pass2 = await readHidden('Confirm passphrase: '); if (pass !== pass2) { @@ -25,6 +28,24 @@ export async function promptPassphrase({ confirm = false } = {}) { return pass; } +/** + * Warns to stderr if the file at `filePath` is group- or world-readable. + * + * @param {string} filePath + */ +async function warnInsecurePermissions(filePath) { + try { + const st = await stat(filePath); + if (st.mode & 0o077) { + process.stderr.write( + `warning: ${filePath} has insecure permissions — consider chmod 600\n`, + ); + } + } catch { + // stat may fail on non-Unix or non-existent paths; silently skip. + } +} + /** * Reads a passphrase from a file path, or from stdin when path is '-'. 
* @@ -39,24 +60,33 @@ export async function readPassphraseFile(filePath) { } return Buffer.concat(chunks).toString('utf8').replace(/\r?\n$/, ''); } + await warnInsecurePermissions(filePath); const content = await readFile(filePath, 'utf8'); return content.replace(/\r?\n$/, ''); } /** * Reads a line with echo disabled. + * + * Uses Node.js private API `rl._writeToOutput` to suppress echo — + * this is an intentional access to an undocumented API for password + * input, as there is no public readline API for hidden input. + * * @param {string} prompt - Prompt text. * @returns {Promise} */ function readHidden(prompt) { - return new Promise((resolve) => { + return new Promise((resolve, reject) => { const rl = createInterface({ input: process.stdin, output: process.stderr, terminal: true, }); process.stderr.write(prompt); + rl.on('error', reject); + rl.on('close', () => reject(new Error('readline closed without input'))); rl.question('', (answer) => { + rl.removeAllListeners('close'); rl.close(); process.stderr.write('\n'); resolve(answer); diff --git a/src/domain/services/CasService.d.ts b/src/domain/services/CasService.d.ts index 80440a8..069e82a 100644 --- a/src/domain/services/CasService.d.ts +++ b/src/domain/services/CasService.d.ts @@ -46,6 +46,13 @@ export interface ObservabilityPort { span(name: string): { end(meta?: Record): void }; } +/** Port interface for chunking strategies (fixed, CDC, etc.). */ +export interface ChunkingPort { + chunk(source: AsyncIterable): AsyncIterable; + readonly strategy: string; + readonly params: Record; +} + /** Constructor options for {@link CasService}. */ export interface CasServiceOptions { persistence: GitPersistencePort; @@ -55,6 +62,8 @@ export interface CasServiceOptions { chunkSize?: number; merkleThreshold?: number; concurrency?: number; + chunker?: ChunkingPort; + maxRestoreBufferSize?: number; } /** Options for key derivation. 
*/ @@ -90,6 +99,7 @@ export default class CasService { readonly chunkSize: number; readonly merkleThreshold: number; readonly concurrency: number; + readonly maxRestoreBufferSize: number; constructor(options: CasServiceOptions); diff --git a/src/domain/services/CasService.js b/src/domain/services/CasService.js index b841b54..851fd75 100644 --- a/src/domain/services/CasService.js +++ b/src/domain/services/CasService.js @@ -34,6 +34,7 @@ export default class CasService { * @param {number} [options.merkleThreshold=1000] - Chunk count threshold for Merkle manifests. * @param {number} [options.concurrency=1] - Maximum parallel chunk I/O operations. * @param {import('../../ports/ChunkingPort.js').default} [options.chunker] - Chunking strategy (default FixedChunker). + * @param {number} [options.maxRestoreBufferSize=536870912] - Max bytes for buffered restore (default 512 MiB). */ constructor({ persistence, codec, crypto, observability, chunkSize = 256 * 1024, merkleThreshold = 1000, concurrency = 1, chunker, maxRestoreBufferSize = 512 * 1024 * 1024 }) { CasService._validateObservability(observability); @@ -59,8 +60,8 @@ export default class CasService { * @private */ static #validateConstructorArgs({ chunkSize, merkleThreshold, concurrency, maxRestoreBufferSize }) { - if (chunkSize < 1024) { - throw new Error('Chunk size must be at least 1024 bytes'); + if (!Number.isInteger(chunkSize) || chunkSize < 1024) { + throw new Error('Chunk size must be an integer >= 1024 bytes'); } const MAX_CHUNK_SIZE = 100 * 1024 * 1024; if (chunkSize > MAX_CHUNK_SIZE) { @@ -150,7 +151,10 @@ export default class CasService { const orphanedBlobs = settled .filter((r) => r.status === 'fulfilled') .map((r) => r.value.blob); - if (err instanceof CasError) { throw err; } + if (err instanceof CasError) { + err.meta = { ...err.meta, orphanedBlobs }; + throw err; + } const casErr = new CasError( `Stream error during store: ${err.message}`, 'STREAM_ERROR', diff --git 
a/src/domain/services/VaultService.js b/src/domain/services/VaultService.js index 09009b8..c793a39 100644 --- a/src/domain/services/VaultService.js +++ b/src/domain/services/VaultService.js @@ -94,7 +94,7 @@ export default class VaultService { this.persistence = persistence; this.ref = ref; this.crypto = crypto; - /** @type {{ metric: Function, log: Function, span: Function }} */ + /** @type {import('../../ports/ObservabilityPort.js').default} */ this.observability = observability || { metric() {}, log() {}, span: () => ({ end() {} }) }; } @@ -395,8 +395,11 @@ export default class VaultService { } const isUpdate = state.entries.has(slug); state.entries.set(slug, treeOid); - const metadata = state.metadata || { version: 1 }; + // Shallow copy to avoid mutating readState()'s object on CAS retries. + const metadata = { ...(state.metadata || { version: 1 }) }; if (metadata.encryption) { + // Tracks nonce-relevant operations: every addToVault on an encrypted + // vault implies an encryption occurred at the store layer. metadata.encryptionCount = (metadata.encryptionCount || 0) + 1; if (metadata.encryptionCount >= VaultService.ENCRYPTION_COUNT_WARN) { this.observability.log( diff --git a/src/infrastructure/adapters/WebCryptoAdapter.js b/src/infrastructure/adapters/WebCryptoAdapter.js index e40f1ed..1032934 100644 --- a/src/infrastructure/adapters/WebCryptoAdapter.js +++ b/src/infrastructure/adapters/WebCryptoAdapter.js @@ -137,6 +137,11 @@ export default class WebCryptoAdapter extends CryptoPort { /** * Builds the encrypt async generator for createEncryptionStream. + * + * A static method is used (rather than closures) because `async function*` + * cannot be an arrow function — `this` binding would be lost. The `state` + * object bridges mutable data between the generator and `finalize()`. 
+ * * @param {{ cryptoKeyPromise: Promise, nonce: Buffer|Uint8Array, maxBuf: number, state: { tag: Uint8Array|null, consumed: boolean } }} ctx * @returns {(source: AsyncIterable) => AsyncGenerator} */ diff --git a/test/unit/cli/actions.test.js b/test/unit/cli/actions.test.js index e7839c5..9019b82 100644 --- a/test/unit/cli/actions.test.js +++ b/test/unit/cli/actions.test.js @@ -116,9 +116,11 @@ describe('runAction — INTEGRITY_ERROR rate-limiting', () => { beforeEach(() => { process.exitCode = undefined; stderrSpy = vi.spyOn(process.stderr, 'write').mockImplementation(() => true); + vi.useFakeTimers(); }); afterEach(() => { + vi.useRealTimers(); process.exitCode = originalExitCode; stderrSpy.mockRestore(); }); @@ -126,20 +128,18 @@ describe('runAction — INTEGRITY_ERROR rate-limiting', () => { it('delays ~1s on INTEGRITY_ERROR before writing output', async () => { const err = Object.assign(new Error('bad key'), { code: 'INTEGRITY_ERROR' }); const action = runAction(async () => { throw err; }, () => false); - const start = Date.now(); - await action(); - const elapsed = Date.now() - start; - expect(elapsed).toBeGreaterThanOrEqual(900); + const promise = action(); + await vi.advanceTimersByTimeAsync(1000); + await promise; expect(process.exitCode).toBe(1); + expect(stderrSpy).toHaveBeenCalled(); }); it('no delay for non-INTEGRITY_ERROR codes', async () => { const err = Object.assign(new Error('gone'), { code: 'MISSING_KEY' }); const action = runAction(async () => { throw err; }, () => false); - const start = Date.now(); await action(); - const elapsed = Date.now() - start; - expect(elapsed).toBeLessThan(200); + expect(process.exitCode).toBe(1); }); }); diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index 3587713..77df918 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -4,13 +4,13 @@ import { tmpdir } from 'node:os'; import { join } from 'node:path'; import { 
readPassphraseFile } from '../../../bin/ui/passphrase-prompt.js'; -describe('readPassphraseFile', () => { - const tmpPath = join(tmpdir(), `test-passphrase-${Date.now()}.txt`); +const tmpPath = join(tmpdir(), `test-passphrase-${Date.now()}.txt`); - afterEach(async () => { - try { await unlink(tmpPath); } catch { /* may not exist */ } - }); +afterEach(async () => { + try { await unlink(tmpPath); } catch { /* may not exist */ } +}); +describe('readPassphraseFile', () => { it('reads from file and trims trailing newline', async () => { await writeFile(tmpPath, 'my-secret\n', 'utf8'); const result = await readPassphraseFile(tmpPath); @@ -35,3 +35,31 @@ describe('readPassphraseFile', () => { expect(result).toBe('win-secret'); }); }); + +describe('readPassphraseFile — permission warnings', () => { + it('warns on group/world-readable file permissions', async () => { + const writeSpy = []; + const origWrite = process.stderr.write; + process.stderr.write = (/** @type {any} */ chunk) => { writeSpy.push(String(chunk)); return true; }; + try { + await writeFile(tmpPath, 'secret\n', { mode: 0o644 }); + await readPassphraseFile(tmpPath); + expect(writeSpy.some((s) => s.includes('permissions'))).toBe(true); + } finally { + process.stderr.write = origWrite; + } + }); + + it('no warning for restricted file permissions', async () => { + const writeSpy = []; + const origWrite = process.stderr.write; + process.stderr.write = (/** @type {any} */ chunk) => { writeSpy.push(String(chunk)); return true; }; + try { + await writeFile(tmpPath, 'secret\n', { mode: 0o600 }); + await readPassphraseFile(tmpPath); + expect(writeSpy.some((s) => s.includes('permissions'))).toBe(false); + } finally { + process.stderr.write = origWrite; + } + }); +}); diff --git a/test/unit/domain/services/CasService.errors.test.js b/test/unit/domain/services/CasService.errors.test.js index ef1d13a..c06172c 100644 --- a/test/unit/domain/services/CasService.errors.test.js +++ 
b/test/unit/domain/services/CasService.errors.test.js @@ -26,13 +26,13 @@ describe('CasService – constructor – chunkSize validation', () => { it('throws when chunkSize is 0', () => { expect( () => new CasService({ persistence: mockPersistence, crypto: testCrypto, codec: new JsonCodec(), chunkSize: 0, observability: new SilentObserver() }), - ).toThrow('Chunk size must be at least 1024 bytes'); + ).toThrow('Chunk size must be an integer >= 1024 bytes'); }); it('throws when chunkSize is 512', () => { expect( () => new CasService({ persistence: mockPersistence, crypto: testCrypto, codec: new JsonCodec(), chunkSize: 512, observability: new SilentObserver() }), - ).toThrow('Chunk size must be at least 1024 bytes'); + ).toThrow('Chunk size must be an integer >= 1024 bytes'); }); it('accepts chunkSize of exactly 1024', () => { From 50b9c119bf4401b7dc1b86f9d00fa5d99d543b2a Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 10:39:46 -0700 Subject: [PATCH 28/41] feat(cli): expose store/restore configuration flags and .casrc config file MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add CLI flags to `git cas store`: --gzip, --strategy, --chunk-size, --concurrency, --codec, --merkle-threshold, --target-chunk-size, --min-chunk-size, --max-chunk-size. Add CLI flags to `git cas restore`: --concurrency, --max-restore-buffer. Introduce `.casrc` JSON config file support — placed at the repo root, it provides defaults for all CLI flags. CLI flags always take precedence. Fix ROADMAP.md M16 milestone summary LoC/hours discrepancy (~430/~28h corrected to ~698/~21h to match the detailed task breakdown). 
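The precedence rule this patch describes — CLI flags win, `.casrc` fills the gaps, unset keys stay absent — can be sketched in isolation. This `mergeConfig` is a simplified stand-in for the real `bin/config.js` (hypothetical names, not the actual implementation), showing only the merge behavior:

```javascript
// Sketch of the `.casrc` precedence rule: a CLI value beats the
// config-file value; a key set in neither source is omitted entirely,
// so library defaults still apply downstream.
function mergeConfig(cliOpts, rcConfig) {
  const pick = (cli, rc) => (cli !== undefined ? cli : rc);
  const casConfig = {};
  for (const key of ['chunkSize', 'concurrency', 'merkleThreshold']) {
    const value = pick(cliOpts[key], rcConfig[key]);
    if (value !== undefined) casConfig[key] = value;
  }
  return casConfig;
}

// --chunk-size on the CLI beats .casrc; .casrc concurrency fills the gap.
const merged = mergeConfig(
  { chunkSize: 4096 },                  // parsed CLI flags
  { chunkSize: 65536, concurrency: 4 }, // loaded .casrc
);
console.log(merged); // { chunkSize: 4096, concurrency: 4 }
```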
--- CHANGELOG.md | 4 ++ GUIDE.md | 45 +++++++++++++ README.md | 31 +++++++++ ROADMAP.md | 2 +- bin/config.js | 123 +++++++++++++++++++++++++++++++++++ bin/git-cas.js | 49 ++++++++++++-- test/unit/cli/config.test.js | 123 +++++++++++++++++++++++++++++++++++ 7 files changed, 370 insertions(+), 7 deletions(-) create mode 100644 bin/config.js create mode 100644 test/unit/cli/config.test.js diff --git a/CHANGELOG.md b/CHANGELOG.md index e3c20e6..98b1386 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- **CLI store flags** — `--gzip`, `--strategy <strategy>`, `--chunk-size <bytes>`, `--concurrency <n>`, `--codec <codec>`, `--merkle-threshold <count>`, `--target-chunk-size <bytes>`, `--min-chunk-size <bytes>`, `--max-chunk-size <bytes>`. All library-level chunking, compression, codec, and concurrency options are now accessible from the CLI. +- **CLI restore flags** — `--concurrency <n>`, `--max-restore-buffer <bytes>`. Parallel I/O and restore buffer limit now configurable from CLI. +- **`.casrc` config file** — JSON config file at the repository root provides default values for CLI flags. CLI flags always take precedence. Supports: `chunkSize`, `strategy`, `concurrency`, `codec`, `compression`, `merkleThreshold`, `maxRestoreBufferSize`, and `cdc.*` sub-keys. - **CODE-EVAL.md** — Forensic architectural audit (zero-knowledge code extraction, critical assessment, roadmap reconciliation, prescriptive blueprint). - **M16 Capstone** — New milestone in ROADMAP.md addressing all 9 audit flaws and 10 concerns (C1–C10). 13 task cards, ~698 LoC, ~21h estimated. - **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10).
@@ -40,6 +43,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.8 — CasError portability guard** — `Error.captureStackTrace` now guarded with a runtime check. CasError constructs correctly on runtimes where `captureStackTrace` is unavailable (e.g. Firefox, older Deno). - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. +- **ROADMAP.md M16 summary** — Corrected LoC/hours from `~430/~28h` to `~698/~21h` to match the detailed task breakdown. ## [5.2.4] — Prism polish (2026-03-03) diff --git a/GUIDE.md b/GUIDE.md index b72c470..46d6f1a 100644 --- a/GUIDE.md +++ b/GUIDE.md @@ -575,6 +575,51 @@ git cas restore a1b2c3d4e5f67890... --out ./decrypted-vacation.jpg --key-file ./ # Output: 524288 ``` +### Compression, Chunking, and Codec Flags + +```bash +# Enable gzip compression +git cas store ./data.bin --slug my-data --tree --gzip + +# Use CDC (content-defined chunking) for sub-file deduplication +git cas store ./data.bin --slug my-data --tree --strategy cdc + +# Customize chunk size and enable parallel I/O +git cas store ./data.bin --slug my-data --tree --chunk-size 65536 --concurrency 4 + +# Use CBOR codec for smaller manifests +git cas store ./data.bin --slug my-data --tree --codec cbor + +# CDC with custom parameters +git cas store ./data.bin --slug my-data --tree \ + --strategy cdc --target-chunk-size 32768 \ + --min-chunk-size 8192 --max-chunk-size 131072 + +# Restore with parallel I/O +git cas restore --slug my-data --out ./data.bin --concurrency 4 +``` + +### Project Config File (`.casrc`) + +Place a `.casrc` JSON file at the repository root to set defaults. CLI flags +always take precedence. 
+ +```json +{ + "chunkSize": 65536, + "strategy": "cdc", + "concurrency": 4, + "codec": "json", + "compression": "gzip", + "merkleThreshold": 500, + "cdc": { + "minChunkSize": 8192, + "targetChunkSize": 32768, + "maxChunkSize": 131072 + } +} +``` + ### Working Directory By default the CLI operates in the current directory. Use `--cwd` to point at diff --git a/README.md b/README.md index 1013a25..4ecad3f 100644 --- a/README.md +++ b/README.md @@ -295,10 +295,41 @@ git cas vault init git cas store ./secret.bin --slug vault-entry --tree git cas restore --slug vault-entry --out ./decrypted.bin +# Compression, chunking, codec, concurrency +git cas store ./data.bin --slug my-data --tree --gzip +git cas store ./data.bin --slug my-data --tree --strategy cdc +git cas store ./data.bin --slug my-data --tree --chunk-size 65536 --concurrency 4 +git cas store ./data.bin --slug my-data --tree --codec cbor + +# Restore with concurrency +git cas restore --slug my-data --out ./data.bin --concurrency 4 + # JSON output on any command (for CI/scripting) git cas store ./data.bin --slug my-data --tree --json ``` +### `.casrc` — Project Config File + +Place a `.casrc` JSON file at the repository root to set defaults for CLI flags. +CLI flags always take precedence over `.casrc` values. 
+ +```json +{ + "chunkSize": 65536, + "strategy": "cdc", + "concurrency": 4, + "codec": "json", + "compression": "gzip", + "merkleThreshold": 500, + "maxRestoreBufferSize": 1073741824, + "cdc": { + "minChunkSize": 8192, + "targetChunkSize": 32768, + "maxChunkSize": 131072 + } +} +``` + ## Documentation - [Guide](./GUIDE.md) — progressive walkthrough diff --git a/ROADMAP.md b/ROADMAP.md index 9a82ecc..e99942f 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -228,7 +228,7 @@ M15 Prism ─────────────── ✅ | M10| Hydra | Content-defined chunking | v5.0.0 | 4 | ~690 | ~22h | ✅ CLOSED | | M11| Locksmith | Multi-recipient encryption | v5.1.0 | 4 | ~580 | ~20h | ✅ CLOSED | | M12| Carousel | Key rotation | v5.2.0 | 4 | ~400 | ~13h | ✅ CLOSED | -| M16| Capstone | Audit remediation | v5.3.0 | 13 | ~430 | ~28h | 🔲 OPEN | +| M16| Capstone | Audit remediation | v5.3.0 | 13 | ~698 | ~21h | 🔲 OPEN | Completed task cards are in [COMPLETED_TASKS.md](./COMPLETED_TASKS.md). Superseded tasks are in [GRAVEYARD.md](./GRAVEYARD.md). diff --git a/bin/config.js b/bin/config.js new file mode 100644 index 0000000..a8aa35c --- /dev/null +++ b/bin/config.js @@ -0,0 +1,123 @@ +/** + * @fileoverview Loads `.casrc` project config from the Git working directory. + * + * `.casrc` is a JSON file placed at the repository root that provides default + * values for CLI flags. CLI flags always take precedence over `.casrc` values. 
+ * + * Supported keys: + * chunkSize — Chunk size in bytes (integer >= 1024, default 262144) + * strategy — Chunking strategy: "fixed" or "cdc" (default "fixed") + * concurrency — Parallel chunk I/O operations (positive integer, default 1) + * codec — Manifest codec: "json" or "cbor" (default "json") + * compression — Compression algorithm: "gzip" or false (default false) + * merkleThreshold — Chunk count threshold for Merkle sub-manifests (default 1000) + * maxRestoreBufferSize — Max bytes for buffered restore (default 536870912) + * cdc.minChunkSize — CDC minimum chunk size + * cdc.targetChunkSize — CDC target chunk size + * cdc.maxChunkSize — CDC maximum chunk size + */ + +import { readFileSync } from 'node:fs'; +import { resolve } from 'node:path'; + +const FILENAME = '.casrc'; + +/** + * @typedef {Object} CasConfig + * @property {number} [chunkSize] + * @property {string} [strategy] + * @property {number} [concurrency] + * @property {string} [codec] + * @property {string|false} [compression] + * @property {number} [merkleThreshold] + * @property {number} [maxRestoreBufferSize] + * @property {{ minChunkSize?: number, targetChunkSize?: number, maxChunkSize?: number }} [cdc] + */ + +/** + * Loads `.casrc` from the given directory, returning an empty object if not found. + * + * @param {string} cwd - Directory to search for `.casrc`. + * @returns {CasConfig} + */ +export function loadConfig(cwd) { + const filePath = resolve(cwd, FILENAME); + try { + const raw = readFileSync(filePath, 'utf8'); + const config = JSON.parse(raw); + if (typeof config !== 'object' || config === null || Array.isArray(config)) { + throw new Error(`${FILENAME}: expected a JSON object`); + } + return config; + } catch (err) { + if (err.code === 'ENOENT') { + return {}; + } + if (err instanceof SyntaxError) { + throw new Error(`${FILENAME}: invalid JSON — ${err.message}`); + } + throw err; + } +} + +/** + * Sets key on target if value is not undefined. 
+ * @param {Record} target + * @param {string} key + * @param {any} value + */ +function setIfDefined(target, key, value) { + if (value !== undefined) { target[key] = value; } +} + +/** + * Resolves chunking config from merged CLI + config values. + * @param {{ strategy?: string, chunkSize?: number, cliOpts: Record, config: CasConfig }} opts + * @returns {Record|undefined} + */ +function resolveChunking({ strategy, chunkSize, cliOpts, config }) { + if (strategy === 'cdc') { + const cdcConf = config.cdc || {}; + return { + strategy: 'cdc', + targetChunkSize: cliOpts.targetChunkSize ?? cdcConf.targetChunkSize, + minChunkSize: cliOpts.minChunkSize ?? cdcConf.minChunkSize, + maxChunkSize: cliOpts.maxChunkSize ?? cdcConf.maxChunkSize, + }; + } + if (strategy === 'fixed' && chunkSize !== undefined) { + return { strategy: 'fixed', chunkSize }; + } + return undefined; +} + +/** + * Merges CLI options over `.casrc` defaults. CLI flags take precedence. + * + * @param {Record} cliOpts - Parsed CLI options. + * @param {CasConfig} config - Loaded `.casrc` config. + * @returns {{ casConfig: Record, storeExtras: Record }} + */ +export function mergeConfig(cliOpts, config) { + const strategy = cliOpts.strategy || config.strategy; + const chunkSize = cliOpts.chunkSize ?? config.chunkSize; + + /** @type {Record} */ + const casConfig = {}; + setIfDefined(casConfig, 'concurrency', cliOpts.concurrency ?? config.concurrency); + setIfDefined(casConfig, 'chunkSize', chunkSize); + setIfDefined(casConfig, 'merkleThreshold', cliOpts.merkleThreshold ?? config.merkleThreshold); + setIfDefined(casConfig, 'maxRestoreBufferSize', cliOpts.maxRestoreBufferSize ?? 
config.maxRestoreBufferSize); + setIfDefined(casConfig, 'chunking', resolveChunking({ strategy, chunkSize, cliOpts, config })); + + const codec = cliOpts.codec || config.codec; + if (codec === 'cbor') { casConfig.codec = 'cbor'; } + + /** @type {Record} */ + const storeExtras = {}; + if (cliOpts.gzip || config.compression === 'gzip') { + storeExtras.compression = { algorithm: 'gzip' }; + } + + return { casConfig, storeExtras }; +} diff --git a/bin/git-cas.js b/bin/git-cas.js index 461e2b8..9a5abf2 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -3,7 +3,7 @@ import { readFileSync } from 'node:fs'; import { program } from 'commander'; import GitPlumbing, { ShellRunnerFactory } from '@git-stunts/plumbing'; -import ContentAddressableStore, { EventEmitterObserver } from '../index.js'; +import ContentAddressableStore, { EventEmitterObserver, CborCodec } from '../index.js'; import Manifest from '../src/domain/value-objects/Manifest.js'; import { createStoreProgress, createRestoreProgress } from './ui/progress.js'; import { renderEncryptionCard } from './ui/encryption-card.js'; @@ -13,6 +13,7 @@ import { renderHeatmap } from './ui/heatmap.js'; import { runAction } from './actions.js'; import { filterEntries, formatTable, formatTabSeparated } from './ui/vault-list.js'; import { readPassphraseFile, promptPassphrase } from './ui/passphrase-prompt.js'; +import { loadConfig, mergeConfig } from './config.js'; const getJson = () => program.opts().json; @@ -38,16 +39,21 @@ function readKeyFile(keyFilePath) { } /** - * Create a CAS instance for the given working directory with an optional observability adapter. + * Create a CAS instance for the given working directory. 
* * @param {string} cwd - * @param {{ observability?: import('../index.js').ObservabilityPort }} [opts] + * @param {Record} [opts] * @returns {ContentAddressableStore} */ function createCas(cwd, opts = {}) { const runner = ShellRunnerFactory.create(); const plumbing = new GitPlumbing({ runner, cwd }); - return new ContentAddressableStore({ plumbing, observability: opts.observability }); + /** @type {Record} */ + const casOpts = { plumbing, ...opts }; + if (casOpts.codec === 'cbor') { + casOpts.codec = new CborCodec(); + } + return new ContentAddressableStore(casOpts); } /** @@ -208,6 +214,13 @@ function parseRecipient(value, previous) { return list; } +/** @param {string} v */ +const parseIntFlag = (v) => { + const n = parseInt(v, 10); + if (Number.isNaN(n)) { throw new Error(`Expected an integer, got "${v}"`); } + return n; +}; + program .command('store <file>') .description('Store a file into Git CAS') @@ -218,6 +231,15 @@ .option('--force', 'Overwrite existing vault entry') .option('--vault-passphrase <passphrase>', 'Vault-level passphrase for encryption (prefer GIT_CAS_PASSPHRASE env var)') .option('--vault-passphrase-file <path>', 'Read vault passphrase from file (use - for stdin)') + .option('--gzip', 'Enable gzip compression') + .option('--strategy <strategy>', 'Chunking strategy: fixed or cdc') + .option('--chunk-size <bytes>', 'Chunk size in bytes', parseIntFlag) + .option('--concurrency <n>', 'Parallel chunk I/O operations', parseIntFlag) + .option('--codec <codec>', 'Manifest codec: json or cbor') + .option('--target-chunk-size <bytes>', 'CDC target chunk size', parseIntFlag) + .option('--min-chunk-size <bytes>', 'CDC minimum chunk size', parseIntFlag) + .option('--max-chunk-size <bytes>', 'CDC maximum chunk size', parseIntFlag) + .option('--merkle-threshold <count>', 'Chunk count threshold for Merkle sub-manifests', parseIntFlag) .option('--cwd <dir>', 'Git working directory', '.') .action(runAction(async (/** @type {string} */ file, /** @type {Record} */ opts) => { if (opts.recipient && (opts.keyFile ||
hasPassphraseSource(opts))) { @@ -229,9 +251,13 @@ program const json = program.opts().json; const quiet = program.opts().quiet || json; const observer = new EventEmitterObserver(); - const cas = createCas(opts.cwd, { observability: observer }); + + const config = loadConfig(opts.cwd); + const { casConfig, storeExtras } = mergeConfig(opts, config); + const cas = createCas(opts.cwd, { observability: observer, ...casConfig }); const storeOpts = await buildStoreOpts(cas, file, opts); + Object.assign(storeOpts, storeExtras); const progress = createStoreProgress({ filePath: file, chunkSize: cas.chunkSize, quiet }); progress.attach(observer); let manifest; @@ -308,12 +334,23 @@ .option('--key-file <path>', 'Path to 32-byte raw encryption key file') .option('--vault-passphrase <passphrase>', 'Vault-level passphrase for decryption (prefer GIT_CAS_PASSPHRASE env var)') .option('--vault-passphrase-file <path>', 'Read vault passphrase from file (use - for stdin)') + .option('--concurrency <n>', 'Parallel chunk I/O operations', parseIntFlag) + .option('--max-restore-buffer <bytes>', 'Max bytes for buffered encrypted/compressed restore', parseIntFlag) .option('--cwd <dir>', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { validateRestoreFlags(opts); const quiet = program.opts().quiet || program.opts().json; const observer = new EventEmitterObserver(); - const cas = createCas(opts.cwd, { observability: observer }); + + const config = loadConfig(opts.cwd); + /** @type {Record} */ + const casConfig = {}; + const concurrency = opts.concurrency ?? config.concurrency; + const maxRestoreBufferSize = opts.maxRestoreBuffer ??
config.maxRestoreBufferSize; + if (concurrency !== undefined) { casConfig.concurrency = concurrency; } + if (maxRestoreBufferSize !== undefined) { casConfig.maxRestoreBufferSize = maxRestoreBufferSize; } + + const cas = createCas(opts.cwd, { observability: observer, ...casConfig }); const treeOid = opts.oid || await cas.resolveVaultEntry({ slug: opts.slug }); const manifest = await cas.readManifest({ treeOid }); diff --git a/test/unit/cli/config.test.js b/test/unit/cli/config.test.js new file mode 100644 index 0000000..4727d7f --- /dev/null +++ b/test/unit/cli/config.test.js @@ -0,0 +1,123 @@ +import { describe, it, expect, afterEach } from 'vitest'; +import { writeFileSync, mkdirSync, rmSync } from 'node:fs'; +import { join } from 'node:path'; +import { tmpdir } from 'node:os'; +import { loadConfig, mergeConfig } from '../../../bin/config.js'; + +const tmpDir = join(tmpdir(), `casrc-test-${Date.now()}`); + +function setup() { + mkdirSync(tmpDir, { recursive: true }); +} + +function teardown() { + rmSync(tmpDir, { recursive: true, force: true }); +} + +describe('loadConfig', () => { + afterEach(teardown); + + it('returns empty object when .casrc does not exist', () => { + setup(); + expect(loadConfig(tmpDir)).toEqual({}); + }); + + it('loads valid JSON from .casrc', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ chunkSize: 65536, strategy: 'cdc' })); + const config = loadConfig(tmpDir); + expect(config.chunkSize).toBe(65536); + expect(config.strategy).toBe('cdc'); + }); + + it('throws on invalid JSON', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), '{bad json'); + expect(() => loadConfig(tmpDir)).toThrow(/invalid JSON/); + }); + + it('throws on non-object JSON', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), '"just a string"'); + expect(() => loadConfig(tmpDir)).toThrow(/expected a JSON object/); + }); + + it('throws on array JSON', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), '[1, 2, 3]'); + 
expect(() => loadConfig(tmpDir)).toThrow(/expected a JSON object/); + }); +}); + +describe('mergeConfig — CLI overrides', () => { + it('CLI flags override config', () => { + const { casConfig } = mergeConfig({ chunkSize: 4096, strategy: 'fixed' }, { chunkSize: 65536 }); + expect(casConfig.chunkSize).toBe(4096); + expect(casConfig.chunking).toEqual({ strategy: 'fixed', chunkSize: 4096 }); + }); + + it('config fills in when CLI omits flags', () => { + const { casConfig } = mergeConfig({}, { concurrency: 4, chunkSize: 32768 }); + expect(casConfig.concurrency).toBe(4); + expect(casConfig.chunkSize).toBe(32768); + }); +}); + +describe('mergeConfig — CDC strategy', () => { + it('CDC strategy merges cdc sub-config', () => { + const config = { cdc: { targetChunkSize: 8192, minChunkSize: 2048, maxChunkSize: 16384 } }; + const { casConfig } = mergeConfig({ strategy: 'cdc' }, config); + expect(casConfig.chunking).toEqual({ + strategy: 'cdc', + targetChunkSize: 8192, + minChunkSize: 2048, + maxChunkSize: 16384, + }); + }); + + it('CDC CLI params override cdc sub-config', () => { + const config = { cdc: { targetChunkSize: 8192, minChunkSize: 2048, maxChunkSize: 16384 } }; + const { casConfig } = mergeConfig({ strategy: 'cdc', targetChunkSize: 4096 }, config); + expect(casConfig.chunking.targetChunkSize).toBe(4096); + expect(casConfig.chunking.minChunkSize).toBe(2048); + }); +}); + +describe('mergeConfig — compression', () => { + it('gzip from CLI', () => { + const { storeExtras } = mergeConfig({ gzip: true }, {}); + expect(storeExtras.compression).toEqual({ algorithm: 'gzip' }); + }); + + it('gzip from config', () => { + const { storeExtras } = mergeConfig({}, { compression: 'gzip' }); + expect(storeExtras.compression).toEqual({ algorithm: 'gzip' }); + }); + + it('no compression by default', () => { + const { storeExtras } = mergeConfig({}, {}); + expect(storeExtras.compression).toBeUndefined(); + }); +}); + +describe('mergeConfig — codec and thresholds', () => { + it('cbor 
codec from CLI', () => { + const { casConfig } = mergeConfig({ codec: 'cbor' }, {}); + expect(casConfig.codec).toBe('cbor'); + }); + + it('cbor codec from config', () => { + const { casConfig } = mergeConfig({}, { codec: 'cbor' }); + expect(casConfig.codec).toBe('cbor'); + }); + + it('merkleThreshold from config', () => { + const { casConfig } = mergeConfig({}, { merkleThreshold: 500 }); + expect(casConfig.merkleThreshold).toBe(500); + }); + + it('maxRestoreBufferSize from config', () => { + const { casConfig } = mergeConfig({}, { maxRestoreBufferSize: 1024 * 1024 }); + expect(casConfig.maxRestoreBufferSize).toBe(1024 * 1024); + }); +}); From ab0318363bdf61fa76535fbb4b461a7e60d96e29 Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 10:47:34 -0700 Subject: [PATCH 29/41] =?UTF-8?q?docs(roadmap):=20update=20backlog=20?= =?UTF-8?q?=E2=80=94=20mark=20C1=E2=80=93C10=20mitigated,=20V6=20done,=20a?= =?UTF-8?q?dd=20V9=E2=80=93V12?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit All 10 concerns (C1–C10) now marked ✅ MITIGATED with status lines referencing their implementing tasks. Vision 6 (passphrase prompt) marked ✅ DONE — subsumed by Task 16.11. New visions added for remaining CLI/UX gaps: - V9: `vault status` command (metadata, encryptionCount, nonce health) - V10: `gc` command (collectReferencedChunks + git gc) - V11: KDF parameter tuning via `.casrc` - V12: File-level passphrase CLI flag Summary table updated with Status column. Backlog section updated to reflect CLI parity achievement (.casrc + store/restore flags). --- ROADMAP.md | 186 ++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 140 insertions(+), 46 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index e99942f..81fdffc 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1098,6 +1098,9 @@ Consistency and DRY fixes surfaced by architecture audit. No new features, no AP Ideas for future milestones. 
Not committed, not prioritized — just captured. +### CLI Parity *(recently shipped)* +All library-level configuration is now accessible from the CLI: `--gzip`, `--strategy`, `--chunk-size`, `--concurrency`, `--codec`, `--merkle-threshold`, CDC parameters, `--max-restore-buffer`. Project config file (`.casrc`) provides repository-level defaults. See V9–V12 for remaining CLI gaps. + ### Named Vaults Multiple vaults instead of one. Refs move from `refs/cas/vault` to `refs/cas/vaults/<name>`. Default vault is `default`. CLI gets `--vault <name>` flag. @@ -1108,7 +1111,8 @@ Multiple vaults instead of one. Refs move from `refs/cas/vau ### Vault Management - **Move into vault** — `git cas vault add --slug <slug> --oid <oid>` to adopt an existing CAS tree into the vault (the API `addToVault()` already supports this; just needs a CLI command). -- **Purge from CAS** — remove an entry from the vault and run `git gc` to reclaim storage. Tricky because git doesn't delete individual objects — you remove refs and let GC handle it. +- **Vault status** — `git cas vault status` shows metadata, `encryptionCount`, entry count, nonce health. See V9. +- **Purge from CAS** — remove an entry from the vault and run `git gc` to reclaim storage. See V10. ### Publish / Mount - **Publish to working tree** — `git cas publish --slug assets/hero --to docs/hero.gif` reconstitutes a vault entry into the repo's working tree so it's servable by GitHub (markdown images, Pages, etc.).
CRLF normalization for Windows compatibility. + +--- + +## Vision 9: Vault Status Command **The Pitch** -Replace `--vault-passphrase "my secret"` (visible in shell history, `ps` output, and CI logs) with an interactive TTY prompt that reads the passphrase from stdin with echo disabled. Like `gpg`, `ssh-keygen`, and `sudo`. +`git cas vault status` — a single command to show vault health: entry count, encryption state, `encryptionCount` (nonce usage), KDF parameters, and nonce health assessment. The `encryptionCount` and `decryption_failed` metrics from M16 are already tracked in vault metadata but have no CLI surface. ```shell -$ git cas store ./secrets.tar.gz --slug prod-secrets --vault-passphrase -Enter vault passphrase: •••••••••• -Confirm passphrase: •••••••••• +$ git cas vault status +Entries: 42 +Encryption: aes-256-gcm (pbkdf2, 600000 iterations) +Encryption count: 1,247 / 2,147,483,648 (0.00%) +Nonce health: ✅ Safe (rotate key before 2^31) ``` -Falls back to `GIT_CAS_PASSPHRASE` env var for non-interactive contexts (CI). The flag `--vault-passphrase` without a value triggers the prompt; with a value, uses it directly (backward compatible). +| Phase | Work | ~LoC | ~Hours | +|-------|------|------|--------| +| 1. Command + metadata display | Read vault metadata, format output (text + JSON) | ~40 | ~1h | +| 2. Nonce health assessment | Compare `encryptionCount` against thresholds, emit warning levels | ~15 | ~0.5h | +| 3. Tests | Mock vault state, verify output format | ~20 | ~0.5h | +| **Total** | | **~60** | **~2h** | -**Mini Battle Plan** +--- + +## Vision 10: GC Command + +**The Pitch** + +`git cas gc` — identify unreferenced chunks across vault entries and optionally trigger `git gc`. Wraps `collectReferencedChunks()` with a user-facing report. + +```shell +$ git cas gc --dry-run +Referenced chunks: 1,247 +Unreferenced blobs: 23 (estimated 4.2 MiB) +Run without --dry-run to trigger git gc. 
+``` | Phase | Work | ~LoC | ~Hours | |-------|------|------|--------| -| 1. TTY reader | `readPassphrase(prompt: string): Promise` — opens `/dev/tty` (Unix) or `CON` (Windows), sets raw mode, reads until Enter, echoes `•` per character. | ~40 | ~2h | -| 2. CLI integration | When `--vault-passphrase` is passed without a value and stdin is a TTY, call `readPassphrase()`. On store (first use), prompt twice for confirmation. | ~20 | ~1h | -| 3. Tests | Mock TTY input, verify echo suppression, verify confirmation match/mismatch, verify env var fallback. | ~30 | ~1h | -| **Total** | | **~90** | **~4h** | +| 1. Chunk analysis | Call `collectReferencedChunks` for all vault entries, compare against all CAS blobs | ~40 | ~2h | +| 2. `git gc` integration | Optionally invoke `git gc` after analysis. `--dry-run` flag for preview. | ~20 | ~1h | +| 3. Tests + safety | Confirm dry-run default, test output format | ~20 | ~1h | +| **Total** | | **~80** | **~4h** | + +--- -**This directly mitigates Concern 5 (shell history exposure) below.** +## Vision 11: KDF Parameter Tuning via `.casrc` + +**The Pitch** + +Allow `.casrc` to specify KDF parameters for vault init and rotation: `kdf.algorithm`, `kdf.iterations` (PBKDF2), `kdf.cost`/`kdf.blockSize`/`kdf.parallelization` (scrypt). Reject insecure values below OWASP minimums (100,000 iterations for PBKDF2-SHA-512, N=8192 for scrypt). + +| Phase | Work | ~LoC | ~Hours | +|-------|------|------|--------| +| 1. Config schema + validation | Add `kdf` key to `.casrc` schema, validate against OWASP minimums | ~25 | ~1h | +| 2. Wire into vault init/rotate | `loadConfig` merges KDF params into `kdfOptions` | ~10 | ~0.5h | +| 3. Tests | Reject weak params, merge precedence | ~15 | ~0.5h | +| **Total** | | **~40** | **~2h** | + +--- + +## Vision 12: File-Level Passphrase CLI + +**The Pitch** + +`--passphrase` flag for standalone encrypted store without requiring vault encryption. 
The library already supports `passphrase` + `kdfOptions` on `store()`/`storeFile()` — this just wires it through the CLI. + +```shell +# Store with a one-off passphrase (not vault-level) +git cas store ./secrets.bin --slug one-off --passphrase "my secret" + +# Restore with the same passphrase +git cas restore --slug one-off --out ./secrets.bin --passphrase "my secret" +``` + +Mutually exclusive with `--key-file`, `--recipient`, and `--vault-passphrase`. Supports the same resolution chain as vault passphrases: `--passphrase-file`, `--passphrase`, env var `GIT_CAS_FILE_PASSPHRASE`, TTY prompt. + +| Phase | Work | ~LoC | ~Hours | +|-------|------|------|--------| +| 1. CLI flag + store wiring | Add `--passphrase` to store/restore, wire into `storeFile`/`restoreFile` with `passphrase` + `kdfOptions` | ~20 | ~0.5h | +| 2. Mutual exclusion validation | Reject conflicting encryption flags | ~5 | ~0.25h | +| 3. Tests | Store/restore round-trip, conflict validation | ~15 | ~0.5h | +| **Total** | | **~30** | **~1h** | --- @@ -1375,7 +1445,9 @@ Architectural and security concerns identified during code review, with proposed --- -## Concern 1: Memory Amplification on Encrypted/Compressed Restore +## Concern 1: Memory Amplification on Encrypted/Compressed Restore ✅ MITIGATED + +**Status:** Task 16.2 implemented `maxRestoreBufferSize` guard (default 512 MiB). Post-decompression guard added. CLI exposes `--max-restore-buffer`. **The Problem** @@ -1410,7 +1482,9 @@ describe('Concern 1: Memory guard on encrypted restore', () => { --- -## Concern 2: Orphaned Blob Accumulation After STREAM_ERROR +## Concern 2: Orphaned Blob Accumulation After STREAM_ERROR ✅ MITIGATED + +**Status:** Task 16.10 implemented. `STREAM_ERROR` meta includes `orphanedBlobs` array. Observability metric emitted. 
**The Problem** @@ -1445,7 +1519,9 @@ describe('Concern 2: Orphaned blob tracking on STREAM_ERROR', () => { --- -## Concern 3: No Upper Bound on Chunk Size +## Concern 3: No Upper Bound on Chunk Size ✅ MITIGATED + +**Status:** Task 16.6 implemented. 100 MiB hard cap on CasService, FixedChunker, CdcChunker. Warning at 10 MiB. **The Problem** @@ -1476,7 +1552,9 @@ describe('Concern 3: Chunk size upper bound', () => { --- -## Concern 4: Web Crypto Adapter Silent Memory Buffering +## Concern 4: Web Crypto Adapter Silent Memory Buffering ✅ MITIGATED + +**Status:** Task 16.3 implemented. `ENCRYPTION_BUFFER_EXCEEDED` thrown when accumulated bytes exceed `maxEncryptionBufferSize` (default 512 MiB). **The Problem** @@ -1509,7 +1587,9 @@ describe('Concern 4: Web Crypto buffering guard', () => { --- -## Concern 5: Passphrase Exposure in Shell History and Process Listings +## Concern 5: Passphrase Exposure in Shell History and Process Listings ✅ MITIGATED + +**Status:** Task 16.11 implemented. `--vault-passphrase-file`, interactive TTY prompt, stdin pipe. See Vision 6. **The Problem** @@ -1545,7 +1625,9 @@ describe('Concern 5: Passphrase input security', () => { --- -## Concern 6: No KDF Brute-Force Rate Limiting +## Concern 6: No KDF Brute-Force Rate Limiting ✅ MITIGATED + +**Status:** Task 16.12 implemented. `decryption_failed` observability metric. CLI 1s delay on `INTEGRITY_ERROR`. **The Problem** @@ -1580,7 +1662,9 @@ describe('Concern 6: KDF brute-force awareness', () => { --- -## Concern 7: GCM Nonce Collision Risk at Scale +## Concern 7: GCM Nonce Collision Risk at Scale ✅ MITIGATED + +**Status:** Task 16.13 implemented. `SECURITY.md` documents GCM bound. Vault tracks `encryptionCount`. Warning at 2^31. 
**The Problem** @@ -1615,10 +1699,12 @@ describe('Concern 7: Nonce uniqueness', () => { --- -## Concern 8: Crypto Adapter Liskov Substitution Violation +## Concern 8: Crypto Adapter Liskov Substitution Violation ✅ MITIGATED **Source:** CODE-EVAL.md, Flaw 1 +**Status:** Task 16.1 implemented. All adapters: async `encryptBuffer`, `_validateKey` in `decryptBuffer`, `STREAM_NOT_CONSUMED` guard. Conformance test suite. + **The Problem** The three `CryptoPort` implementations (Node, Bun, Web) differ in observable behavior: @@ -1633,10 +1719,12 @@ M15 Prism fixed the `sha256()` async inconsistency but left these three discrepa --- -## Concern 9: FixedChunker Quadratic Buffer Allocation +## Concern 9: FixedChunker Quadratic Buffer Allocation ✅ MITIGATED **Source:** CODE-EVAL.md, Flaw 4 +**Status:** Task 16.4 implemented. Pre-allocated `Buffer.allocUnsafe(chunkSize)` working buffer. + **The Problem** `FixedChunker.chunk()` uses `Buffer.concat([buffer, data])` inside its async loop. Each call allocates a new buffer and copies the accumulated bytes. For a source yielding many small buffers (e.g., 4 KiB network reads into a 256 KiB chunk), this is O(n^2 / chunkSize) total byte copies. The CdcChunker, by contrast, uses a pre-allocated `Buffer.allocUnsafe(maxChunkSize)` with zero intermediate copies. @@ -1645,10 +1733,12 @@ M15 Prism fixed the `sha256()` async inconsistency but left these three discrepa --- -## Concern 10: CDC Deduplication Defeated by Encrypt-Then-Chunk +## Concern 10: CDC Deduplication Defeated by Encrypt-Then-Chunk ✅ MITIGATED **Source:** CODE-EVAL.md, Flaw 5 +**Status:** Task 16.5 implemented. Runtime warning when encryption + CDC combined. + **The Problem** Encryption is applied to the source stream *before* chunking. AES-GCM ciphertext is pseudorandom — identical plaintext produces different ciphertext (different random nonce each time). This means content-defined chunking (CDC) provides **zero deduplication benefit** for encrypted files. 
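The effect is easy to demonstrate with Node's own primitives — a standalone sketch using the cryptographic parameters documented above (AES-256-GCM, fresh random 12-byte nonce), not library code:

```javascript
import { randomBytes, createCipheriv } from 'node:crypto';

const key = randomBytes(32);
const plaintext = Buffer.from('identical chunk content, stored twice');

// Mirror the documented parameters: AES-256-GCM with a fresh random 12-byte nonce.
function encryptOnce(buf) {
  const nonce = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, nonce);
  return Buffer.concat([cipher.update(buf), cipher.final()]);
}

const a = encryptOnce(plaintext);
const b = encryptOnce(plaintext);

// Same key, same plaintext — but the two ciphertexts share no structure, so a
// content-defined chunker downstream finds no repeated boundaries to deduplicate.
console.log(a.equals(b)); // false
```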
Users who combine `recipients` (or `encryptionKey`) with `chunking: { strategy: 'cdc' }` get CDC's computational overhead without its primary value proposition. @@ -1661,26 +1751,30 @@ This is a fundamental architectural constraint of the encrypt-then-chunk design. ## Summary Table -| # | Type | Severity | Fix Cost | Recommended Action | Task | -|---|------|----------|----------|--------------------|------| -| C1 | Memory amplification | High | ~20 LoC | Add `maxRestoreBufferSize` guard | **16.2** | -| C2 | Orphaned blobs | Medium | ~20 LoC | Report orphaned blob OIDs in error meta | **16.10** | -| C3 | No chunk size cap | Medium | ~6 LoC | Enforce 100 MiB maximum | **16.6** | -| C4 | Web Crypto buffering | Medium | ~15 LoC | Add buffer size guard in WebCryptoAdapter | **16.3** | -| C5 | Passphrase exposure | High | ~90 LoC | Interactive prompt + file-based input | **16.11** | -| C6 | KDF no rate limit | Low | ~10 LoC | Observability metric + CLI delay | **16.12** | -| C7 | GCM nonce collision | Low | ~20 LoC | Document bound + vault usage counter | **16.13** | -| C8 | Crypto adapter LSP violation | Medium | ~50 LoC | Normalize validation + finalize guards | **16.1** | -| C9 | FixedChunker quadratic alloc | Low | ~20 LoC | Pre-allocated buffer | **16.4** | -| C10 | Encrypt-then-chunk dedup loss | Medium | ~10 LoC | Runtime warning + documentation | **16.5** | - -| # | Type | Theme | Est. 
Cost | -|---|------|-------|-----------| -| V1 | Feature | Snapshot trees (directory store) | ~410 LoC, ~19h | -| V2 | Feature | Portable bundles (air-gap transfer) | ~340 LoC, ~15h | -| V3 | Feature | Manifest diff engine | ~180 LoC, ~8h | -| V4 | Feature | CompressionPort + zstd/brotli/lz4 | ~180 LoC, ~8h | -| V5 | Feature | Watch mode (continuous sync) | ~220 LoC, ~10h | -| V6 | Feature | Interactive passphrase prompt | ~90 LoC, ~4h — subsumed by **16.11** | -| V7 | Feature | Prometheus/OpenTelemetry ObservabilityPort adapter — export metrics (chunk throughput, encryption counts, error rates) to Prometheus or OTLP. The `decryption_failed` and `encryptionCount` metrics from M16 are natural candidates for alerting dashboards. | ~150 LoC, ~6h | -| V8 | Feature | `encryptionCount` auto-rotation — when count reaches a configurable threshold, automatically trigger `rotateVaultPassphrase` with a new passphrase derived from the old one, making nonce exhaustion impossible for long-lived vaults. 
| ~120 LoC, ~5h | +| # | Type | Severity | Fix Cost | Recommended Action | Task | Status | +|---|------|----------|----------|--------------------|------|--------| +| C1 | Memory amplification | High | ~20 LoC | Add `maxRestoreBufferSize` guard | **16.2** | ✅ Done | +| C2 | Orphaned blobs | Medium | ~20 LoC | Report orphaned blob OIDs in error meta | **16.10** | ✅ Done | +| C3 | No chunk size cap | Medium | ~6 LoC | Enforce 100 MiB maximum | **16.6** | ✅ Done | +| C4 | Web Crypto buffering | Medium | ~15 LoC | Add buffer size guard in WebCryptoAdapter | **16.3** | ✅ Done | +| C5 | Passphrase exposure | High | ~90 LoC | Interactive prompt + file-based input | **16.11** | ✅ Done | +| C6 | KDF no rate limit | Low | ~10 LoC | Observability metric + CLI delay | **16.12** | ✅ Done | +| C7 | GCM nonce collision | Low | ~20 LoC | Document bound + vault usage counter | **16.13** | ✅ Done | +| C8 | Crypto adapter LSP violation | Medium | ~50 LoC | Normalize validation + finalize guards | **16.1** | ✅ Done | +| C9 | FixedChunker quadratic alloc | Low | ~20 LoC | Pre-allocated buffer | **16.4** | ✅ Done | +| C10 | Encrypt-then-chunk dedup loss | Medium | ~10 LoC | Runtime warning + documentation | **16.5** | ✅ Done | + +| # | Type | Theme | Est. 
Cost | Status | +|---|------|-------|-----------|--------| +| V1 | Feature | Snapshot trees (directory store) | ~410 LoC, ~19h | 🔲 Open | +| V2 | Feature | Portable bundles (air-gap transfer) | ~340 LoC, ~15h | 🔲 Open | +| V3 | Feature | Manifest diff engine | ~180 LoC, ~8h | 🔲 Open | +| V4 | Feature | CompressionPort + zstd/brotli/lz4 | ~180 LoC, ~8h | 🔲 Open | +| V5 | Feature | Watch mode (continuous sync) | ~220 LoC, ~10h | 🔲 Open | +| V6 | Feature | Interactive passphrase prompt | ~90 LoC, ~4h | ✅ Done — subsumed by **16.11** | +| V7 | Feature | Prometheus/OpenTelemetry ObservabilityPort adapter | ~150 LoC, ~6h | 🔲 Open | +| V8 | Feature | `encryptionCount` auto-rotation | ~120 LoC, ~5h | 🔲 Open | +| V9 | Feature | `vault status` command — show metadata, `encryptionCount`, entry count, nonce health | ~60 LoC, ~2h | 🔲 Open | +| V10 | Feature | `gc` command — `collectReferencedChunks` + `git gc` for orphan cleanup | ~80 LoC, ~4h | 🔲 Open | +| V11 | Feature | KDF parameter tuning via `.casrc` — `kdf.iterations`, `kdf.cost`, `kdf.blockSize`, `kdf.parallelization` with validation (reject insecure values below OWASP minimums) | ~40 LoC, ~2h | 🔲 Open | +| V12 | Feature | File-level passphrase CLI — `--passphrase` flag for standalone encrypted store without vault encryption. Library already supports `passphrase` + `kdfOptions` on `store()`/`storeFile()`. | ~30 LoC, ~1h | 🔲 Open | From ecb94b28213e43ae70a1290e39fd8797712a337e Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:06:57 -0700 Subject: [PATCH 30/41] fix(types): add observability param to VaultService constructor type The runtime VaultService constructor already accepts an optional observability port, but the .d.ts declaration was missing it. Add the parameter to keep types in sync with the implementation. 
--- index.d.ts | 1 + 1 file changed, 1 insertion(+) diff --git a/index.d.ts b/index.d.ts index 05fa0e4..3e685b5 100644 --- a/index.d.ts +++ b/index.d.ts @@ -217,6 +217,7 @@ export declare class VaultService { persistence: GitPersistencePortBase; ref: GitRefPortBase; crypto: CryptoPortBase; + observability?: ObservabilityPort; }); /** Validates a vault slug. Throws CasError with code INVALID_SLUG on failure. */ From 1e8de6fd3b2756407d08e7622da562a4e1c43abd Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:07:18 -0700 Subject: [PATCH 31/41] fix(cli): use nullish coalescing for strategy and codec config merging Replace || with ?? so that empty-string CLI values (which are falsy but intentional) do not silently fall through to .casrc defaults. --- bin/config.js | 4 ++-- test/unit/cli/config.test.js | 12 ++++++++++++ 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/bin/config.js b/bin/config.js index a8aa35c..cb52e05 100644 --- a/bin/config.js +++ b/bin/config.js @@ -99,7 +99,7 @@ function resolveChunking({ strategy, chunkSize, cliOpts, config }) { * @returns {{ casConfig: Record, storeExtras: Record }} */ export function mergeConfig(cliOpts, config) { - const strategy = cliOpts.strategy || config.strategy; + const strategy = cliOpts.strategy ?? config.strategy; const chunkSize = cliOpts.chunkSize ?? config.chunkSize; /** @type {Record} */ @@ -110,7 +110,7 @@ export function mergeConfig(cliOpts, config) { setIfDefined(casConfig, 'maxRestoreBufferSize', cliOpts.maxRestoreBufferSize ?? config.maxRestoreBufferSize); setIfDefined(casConfig, 'chunking', resolveChunking({ strategy, chunkSize, cliOpts, config })); - const codec = cliOpts.codec || config.codec; + const codec = cliOpts.codec ?? 
config.codec; if (codec === 'cbor') { casConfig.codec = 'cbor'; } /** @type {Record} */ diff --git a/test/unit/cli/config.test.js b/test/unit/cli/config.test.js index 4727d7f..19e6493 100644 --- a/test/unit/cli/config.test.js +++ b/test/unit/cli/config.test.js @@ -100,6 +100,18 @@ describe('mergeConfig — compression', () => { }); }); +describe('mergeConfig — nullish coalescing', () => { + it('empty-string CLI strategy does not fall through to config', () => { + const { casConfig } = mergeConfig({ strategy: '' }, { strategy: 'cdc' }); + expect(casConfig.chunking).toBeUndefined(); + }); + + it('empty-string CLI codec does not fall through to config', () => { + const { casConfig } = mergeConfig({ codec: '' }, { codec: 'cbor' }); + expect(casConfig.codec).toBeUndefined(); + }); +}); + describe('mergeConfig — codec and thresholds', () => { it('cbor codec from CLI', () => { const { casConfig } = mergeConfig({ codec: 'cbor' }, {}); From f4302331b50f544be132fb12bbe25b6f970d4e14 Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:07:45 -0700 Subject: [PATCH 32/41] fix(cli): reject empty passphrases from all input sources readPassphraseFile now throws when a file or stdin yields an empty string after newline stripping. resolvePassphrase validates --vault- passphrase flag and GIT_CAS_PASSPHRASE env var with .trim() check. 
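Patches 31 and 32 both hinge on how JavaScript treats the empty string; a minimal standalone illustration of the two behaviors:

```javascript
// (1) `||` treats '' as absent; `??` only falls back on null/undefined.
const cliStrategy = '';             // user passed an intentionally empty flag
console.log(cliStrategy || 'cdc');  // 'cdc' — old merge silently used the .casrc default
console.log(cliStrategy ?? 'cdc');  // ''   — new merge preserves the CLI value
console.log(undefined ?? 'cdc');    // 'cdc' — an absent flag still falls back

// (2) Passphrase input: strip exactly one trailing LF/CRLF, then reject emptiness.
const strip = (s) => s.replace(/\r?\n$/, '');
console.log(strip('my-secret\r\n')); // 'my-secret'
console.log(strip('\n') === '');     // true — would now throw "Passphrase must not be empty"
```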
--- bin/git-cas.js | 2 ++ bin/ui/passphrase-prompt.js | 8 ++++++-- test/unit/cli/passphrase-prompt.test.js | 12 ++++++++++++ 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/bin/git-cas.js b/bin/git-cas.js index 9a5abf2..9da92a5 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -108,9 +108,11 @@ async function resolvePassphrase(opts, extra = {}) { return await readPassphraseFile(opts.vaultPassphraseFile); } if (opts.vaultPassphrase) { + if (!opts.vaultPassphrase.trim()) { throw new Error('Passphrase must not be empty'); } return opts.vaultPassphrase; } if (process.env.GIT_CAS_PASSPHRASE) { + if (!process.env.GIT_CAS_PASSPHRASE.trim()) { throw new Error('Passphrase must not be empty'); } return process.env.GIT_CAS_PASSPHRASE; } if (process.stdin.isTTY) { diff --git a/bin/ui/passphrase-prompt.js b/bin/ui/passphrase-prompt.js index 5da323c..b04a128 100644 --- a/bin/ui/passphrase-prompt.js +++ b/bin/ui/passphrase-prompt.js @@ -58,11 +58,15 @@ export async function readPassphraseFile(filePath) { for await (const chunk of process.stdin) { chunks.push(chunk); } - return Buffer.concat(chunks).toString('utf8').replace(/\r?\n$/, ''); + const stdinResult = Buffer.concat(chunks).toString('utf8').replace(/\r?\n$/, ''); + if (!stdinResult) { throw new Error('Passphrase must not be empty'); } + return stdinResult; } await warnInsecurePermissions(filePath); const content = await readFile(filePath, 'utf8'); - return content.replace(/\r?\n$/, ''); + const trimmed = content.replace(/\r?\n$/, ''); + if (!trimmed) { throw new Error('Passphrase must not be empty'); } + return trimmed; } /** diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index 77df918..066327c 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -36,6 +36,18 @@ describe('readPassphraseFile', () => { }); }); +describe('readPassphraseFile — empty passphrase rejection', () => { + it('rejects file containing only 
LF', async () => { + await writeFile(tmpPath, '\n', 'utf8'); + await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); + }); + + it('rejects file containing only CRLF', async () => { + await writeFile(tmpPath, '\r\n', 'utf8'); + await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); + }); +}); + describe('readPassphraseFile — permission warnings', () => { it('warns on group/world-readable file permissions', async () => { const writeSpy = []; From e8045e017dab4925b2d13af28c003dad6d2bd3a2 Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:08:01 -0700 Subject: [PATCH 33/41] feat(cli): add passphrase-file support to vault rotate command vault rotate now accepts --old-passphrase-file and --new-passphrase-file flags, bringing it to parity with the store/restore passphrase-file support. The old --old-passphrase and --new-passphrase flags are retained as non-required options. --- bin/git-cas.js | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/bin/git-cas.js b/bin/git-cas.js index 9da92a5..5decc39 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -554,16 +554,26 @@ vault vault .command('rotate') .description('Rotate vault-level encryption passphrase') - .requiredOption('--old-passphrase ', 'Current vault passphrase') - .requiredOption('--new-passphrase ', 'New vault passphrase') + .option('--old-passphrase ', 'Current vault passphrase') + .option('--new-passphrase ', 'New vault passphrase') + .option('--old-passphrase-file ', 'Read old passphrase from file (- for stdin)') + .option('--new-passphrase-file ', 'Read new passphrase from file (- for stdin)') .option('--algorithm ', 'KDF algorithm (pbkdf2 or scrypt)') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { + const oldPassphrase = opts.oldPassphraseFile + ? 
await readPassphraseFile(opts.oldPassphraseFile) + : opts.oldPassphrase; + const newPassphrase = opts.newPassphraseFile + ? await readPassphraseFile(opts.newPassphraseFile) + : opts.newPassphrase; + if (!oldPassphrase) { throw new Error('Old passphrase required (--old-passphrase or --old-passphrase-file)'); } + if (!newPassphrase) { throw new Error('New passphrase required (--new-passphrase or --new-passphrase-file)'); } const cas = createCas(opts.cwd); /** @type {{ oldPassphrase: string, newPassphrase: string, kdfOptions?: { algorithm: 'pbkdf2' | 'scrypt' } }} */ const rotateOpts = { - oldPassphrase: opts.oldPassphrase, - newPassphrase: opts.newPassphrase, + oldPassphrase, + newPassphrase, }; if (opts.algorithm) { rotateOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; From 33a1fc6c52155278061e4fcf8269bef0d75ccf4f Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:10:01 -0700 Subject: [PATCH 34/41] fix(cli): validate KDF algorithm and .casrc config types Add validateKdfAlgorithm() guard before vault init and vault rotate pass untrusted strings to the KDF. Add validateConfig() to loadConfig() that rejects out-of-range and wrong-type values for all .casrc keys. Also fix passphrase-prompt tests to write temp files with mode 0o600, eliminating spurious "insecure permissions" warnings during test runs. 
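For reference, a `.casrc` that satisfies every constraint enforced by the new `validateConfig()` — integer minimums, enum membership, and the `cdc` sub-object shape (values here are illustrative, not recommendations):

```json
{
  "chunkSize": 65536,
  "strategy": "cdc",
  "concurrency": 4,
  "codec": "cbor",
  "compression": "gzip",
  "merkleThreshold": 500,
  "maxRestoreBufferSize": 1048576,
  "cdc": { "minChunkSize": 2048, "targetChunkSize": 8192, "maxChunkSize": 16384 }
}
```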
--- bin/config.js | 56 +++++++++++++++++++++++++ bin/git-cas.js | 13 ++++++ test/unit/cli/config.test.js | 51 ++++++++++++++++++++++ test/unit/cli/passphrase-prompt.test.js | 12 +++--- 4 files changed, 126 insertions(+), 6 deletions(-) diff --git a/bin/config.js b/bin/config.js index cb52e05..0ce38f5 100644 --- a/bin/config.js +++ b/bin/config.js @@ -34,6 +34,61 @@ const FILENAME = '.casrc'; * @property {{ minChunkSize?: number, targetChunkSize?: number, maxChunkSize?: number }} [cdc] */ +/** + * @param {any} value + * @param {string} name + * @param {number} min + */ +function assertInt(value, name, min) { + if (value === undefined) { return; } + if (!Number.isInteger(value) || value < min) { + throw new Error(`${FILENAME}: ${name} must be an integer >= ${min}`); + } +} + +/** + * @param {any} value + * @param {string} name + * @param {string[]} allowed + */ +function assertEnum(value, name, allowed) { + if (value === undefined) { return; } + if (!allowed.includes(value)) { + throw new Error(`${FILENAME}: ${name} must be ${allowed.map((v) => `"${v}"`).join(' or ')}`); + } +} + +/** + * @param {Record} config + */ +function validateCdc(config) { + if (config.cdc === undefined) { return; } + if (typeof config.cdc !== 'object' || config.cdc === null || Array.isArray(config.cdc)) { + throw new Error(`${FILENAME}: cdc must be an object`); + } + for (const key of ['minChunkSize', 'targetChunkSize', 'maxChunkSize']) { + assertInt(config.cdc[key], `cdc.${key}`, 1); + } +} + +/** + * Validates `.casrc` config values after parsing. 
+ * + * @param {Record} config + */ +function validateConfig(config) { + assertInt(config.chunkSize, 'chunkSize', 1024); + assertEnum(config.strategy, 'strategy', ['fixed', 'cdc']); + assertInt(config.concurrency, 'concurrency', 1); + assertEnum(config.codec, 'codec', ['json', 'cbor']); + if (config.compression !== undefined && config.compression !== false) { + assertEnum(config.compression, 'compression', ['gzip']); + } + assertInt(config.merkleThreshold, 'merkleThreshold', 1); + assertInt(config.maxRestoreBufferSize, 'maxRestoreBufferSize', 1024); + validateCdc(config); +} + /** * Loads `.casrc` from the given directory, returning an empty object if not found. * @@ -48,6 +103,7 @@ export function loadConfig(cwd) { if (typeof config !== 'object' || config === null || Array.isArray(config)) { throw new Error(`${FILENAME}: expected a JSON object`); } + validateConfig(config); return config; } catch (err) { if (err.code === 'ENOENT') { diff --git a/bin/git-cas.js b/bin/git-cas.js index 5decc39..71c7021 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -38,6 +38,17 @@ function readKeyFile(keyFilePath) { return buf; } +/** + * Validate that a KDF algorithm string is a supported value. + * + * @param {string} alg + */ +function validateKdfAlgorithm(alg) { + if (!['pbkdf2', 'scrypt'].includes(alg)) { + throw new Error(`Invalid KDF algorithm "${alg}": must be "pbkdf2" or "scrypt"`); + } +} + /** * Create a CAS instance for the given working directory. 
* @@ -426,6 +437,7 @@ vault const initOpts = {}; const passphrase = await resolvePassphrase(opts, { confirm: true }); if (passphrase) { + validateKdfAlgorithm(opts.algorithm); initOpts.passphrase = passphrase; initOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; } @@ -576,6 +588,7 @@ vault newPassphrase, }; if (opts.algorithm) { + validateKdfAlgorithm(opts.algorithm); rotateOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; } const { commitOid, rotatedSlugs, skippedSlugs } = await cas.rotateVaultPassphrase(rotateOpts); diff --git a/test/unit/cli/config.test.js b/test/unit/cli/config.test.js index 19e6493..79500fb 100644 --- a/test/unit/cli/config.test.js +++ b/test/unit/cli/config.test.js @@ -49,6 +49,57 @@ describe('loadConfig', () => { }); }); +describe('loadConfig — validation', () => { + afterEach(teardown); + + it('rejects non-integer chunkSize', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ chunkSize: 'big' })); + expect(() => loadConfig(tmpDir)).toThrow(/chunkSize must be an integer >= 1024/); + }); + + it('rejects chunkSize below 1024', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ chunkSize: 512 })); + expect(() => loadConfig(tmpDir)).toThrow(/chunkSize must be an integer >= 1024/); + }); + + it('rejects invalid strategy', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ strategy: 'random' })); + expect(() => loadConfig(tmpDir)).toThrow(/strategy must be "fixed" or "cdc"/); + }); + + it('rejects non-positive concurrency', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ concurrency: 0 })); + expect(() => loadConfig(tmpDir)).toThrow(/concurrency must be an integer >= 1/); + }); + + it('rejects invalid codec', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ codec: 'xml' })); + expect(() => loadConfig(tmpDir)).toThrow(/codec must be "json" or 
"cbor"/); + }); + + it('accepts a fully valid config', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ + chunkSize: 65536, + strategy: 'cdc', + concurrency: 4, + codec: 'cbor', + compression: 'gzip', + merkleThreshold: 500, + maxRestoreBufferSize: 1048576, + cdc: { minChunkSize: 2048, targetChunkSize: 8192, maxChunkSize: 16384 }, + })); + const config = loadConfig(tmpDir); + expect(config.chunkSize).toBe(65536); + expect(config.cdc.targetChunkSize).toBe(8192); + }); +}); + describe('mergeConfig — CLI overrides', () => { it('CLI flags override config', () => { const { casConfig } = mergeConfig({ chunkSize: 4096, strategy: 'fixed' }, { chunkSize: 65536 }); diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index 066327c..e823d13 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -12,25 +12,25 @@ afterEach(async () => { describe('readPassphraseFile', () => { it('reads from file and trims trailing newline', async () => { - await writeFile(tmpPath, 'my-secret\n', 'utf8'); + await writeFile(tmpPath, 'my-secret\n', { mode: 0o600 }); const result = await readPassphraseFile(tmpPath); expect(result).toBe('my-secret'); }); it('preserves content without trailing newline', async () => { - await writeFile(tmpPath, 'no-newline', 'utf8'); + await writeFile(tmpPath, 'no-newline', { mode: 0o600 }); const result = await readPassphraseFile(tmpPath); expect(result).toBe('no-newline'); }); it('preserves internal newlines', async () => { - await writeFile(tmpPath, 'line1\nline2\n', 'utf8'); + await writeFile(tmpPath, 'line1\nline2\n', { mode: 0o600 }); const result = await readPassphraseFile(tmpPath); expect(result).toBe('line1\nline2'); }); it('strips trailing CRLF (Windows line ending)', async () => { - await writeFile(tmpPath, 'win-secret\r\n', 'utf8'); + await writeFile(tmpPath, 'win-secret\r\n', { mode: 0o600 }); const result = await readPassphraseFile(tmpPath); 
expect(result).toBe('win-secret'); }); @@ -38,12 +38,12 @@ describe('readPassphraseFile', () => { describe('readPassphraseFile — empty passphrase rejection', () => { it('rejects file containing only LF', async () => { - await writeFile(tmpPath, '\n', 'utf8'); + await writeFile(tmpPath, '\n', { mode: 0o600 }); await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); }); it('rejects file containing only CRLF', async () => { - await writeFile(tmpPath, '\r\n', 'utf8'); + await writeFile(tmpPath, '\r\n', { mode: 0o600 }); await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); }); }); From a5b91402f8d877ea1b4dba40a5ac5d9200e22e1b Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:11:02 -0700 Subject: [PATCH 35/41] docs: update deprecated method names and add missing error codes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - README/GUIDE: deleteAsset → inspectAsset, findOrphanedChunks → collectReferencedChunks with deprecation notes - SECURITY.md: add RESTORE_TOO_LARGE and ENCRYPTION_BUFFER_EXCEEDED error code sections - CHANGELOG: add entries for all PR #17 review fixes --- CHANGELOG.md | 12 ++++++++++++ GUIDE.md | 20 +++++++++++++------- README.md | 8 ++++---- SECURITY.md | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 78 insertions(+), 11 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 98b1386..31e500c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added +- **Vault rotate passphrase-file support** — `vault rotate` now accepts `--old-passphrase-file` and `--new-passphrase-file` flags, bringing it to parity with the store/restore passphrase-file support. 
+ +### Fixed (PR #17 review findings) +- **VaultService constructor type** — Added missing `observability?: ObservabilityPort` parameter to `index.d.ts` declaration. +- **Nullish coalescing for config merging** — `strategy` and `codec` in `mergeConfig()` now use `??` instead of `||`, so empty-string CLI values don't fall through to `.casrc` defaults. +- **Empty passphrase rejection** — `readPassphraseFile` rejects files that yield an empty string after newline stripping. `resolvePassphrase` validates `--vault-passphrase` flag and `GIT_CAS_PASSPHRASE` env var. +- **KDF algorithm validation** — `vault init` and `vault rotate` now validate `--algorithm` against the supported set (`pbkdf2`, `scrypt`) before passing to the KDF. +- **`.casrc` config validation** — `loadConfig()` now validates all config values (types, ranges, enum membership) after JSON parsing. +- **Deprecated method names in docs** — Updated `deleteAsset` → `inspectAsset` and `findOrphanedChunks` → `collectReferencedChunks` in README and GUIDE. +- **Missing error codes in SECURITY.md** — Added `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` sections. + ### Added - **CLI store flags** — `--gzip`, `--strategy `, `--chunk-size `, `--concurrency `, `--codec `, `--merkle-threshold `, `--target-chunk-size `, `--min-chunk-size `, `--max-chunk-size `. All library-level chunking, compression, codec, and concurrency options are now accessible from the CLI. - **CLI restore flags** — `--concurrency `, `--max-restore-buffer `. Parallel I/O and restore buffer limit now configurable from CLI. diff --git a/GUIDE.md b/GUIDE.md index 46d6f1a..808f8d9 100644 --- a/GUIDE.md +++ b/GUIDE.md @@ -666,14 +666,14 @@ The `verifyIntegrity` method reads each chunk blob from Git, recomputes its SHA-256 digest, and compares it against the manifest. It emits either `integrity:pass` or `integrity:fail` events (see Section 9). 
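The core of that check can be pictured as follows — an illustrative sketch of recompute-and-compare, not the library's internals:

```javascript
import { createHash } from 'node:crypto';

// Illustrative only: recompute a chunk's SHA-256 and compare it against the
// digest the manifest recorded at store time.
function chunkPasses(chunkBytes, expectedHex) {
  const actual = createHash('sha256').update(chunkBytes).digest('hex');
  return actual === expectedHex;
}

const stored = Buffer.from('chunk payload');
const recorded = createHash('sha256').update(stored).digest('hex');

console.log(chunkPasses(stored, recorded));                  // true  → integrity:pass
console.log(chunkPasses(Buffer.from('tampered'), recorded)); // false → integrity:fail
```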
-### Deleting an Asset +### Inspecting an Asset -`deleteAsset` returns logical deletion metadata for an asset without +`inspectAsset` returns logical deletion metadata for an asset without performing any destructive Git operations. The caller is responsible for removing refs and running `git gc --prune` to reclaim space: ```js -const { slug, chunksOrphaned } = await cas.deleteAsset({ treeOid }); +const { slug, chunksOrphaned } = await cas.inspectAsset({ treeOid }); console.log(`Asset "${slug}" has ${chunksOrphaned} chunks to clean up`); // Remove the ref pointing to the tree, then: @@ -683,24 +683,30 @@ console.log(`Asset "${slug}" has ${chunksOrphaned} chunks to clean up`); This is intentionally non-destructive: CAS never modifies or deletes Git objects. It only tells you what would become unreachable. -### Finding Orphaned Chunks +> **Deprecation note:** `deleteAsset()` is a deprecated alias for +> `inspectAsset()`. It will be removed in a future major version. + +### Collecting Referenced Chunks When you store the same file multiple times with different chunk sizes, or store overlapping files, some chunk blobs may no longer be referenced by any -manifest. `findOrphanedChunks` aggregates all referenced chunk blob OIDs +manifest. `collectReferencedChunks` aggregates all referenced chunk blob OIDs across multiple assets: ```js -const { referenced, total } = await cas.findOrphanedChunks({ +const { referenced, total } = await cas.collectReferencedChunks({ treeOids: [treeOid1, treeOid2, treeOid3] }); console.log(`${referenced.size} unique blobs across ${total} total chunk references`); ``` If any `treeOid` lacks a manifest, the call throws -`CasError('MANIFEST_NOT_FOUND')` (fail closed). This is analysis only -- no +`CasError('MANIFEST_NOT_FOUND')` (fail closed). This is analysis only — no objects are deleted or modified. +> **Deprecation note:** `findOrphanedChunks()` is a deprecated alias for +> `collectReferencedChunks()`. 
It will be removed in a future major version. + ### Working with Multiple Assets A common pattern is to store multiple assets and assemble their trees into diff --git a/README.md b/README.md index 4ecad3f..6286304 100644 --- a/README.md +++ b/README.md @@ -29,7 +29,7 @@ We use the object database. - **Manifests** a tiny explicit index of chunks + metadata (JSON/CBOR). - **Tree output** generates standard Git trees so assets snap into commits cleanly. - **Full round-trip** store, tree, and restore — get your bytes back, verified. -- **Lifecycle management** `readManifest`, `deleteAsset`, `findOrphanedChunks` — inspect trees, plan deletions, audit storage. +- **Lifecycle management** `readManifest`, `inspectAsset`, `collectReferencedChunks` — inspect trees, plan deletions, audit storage. - **Vault** GC-safe ref-based storage. One ref (`refs/cas/vault`) indexes all assets by slug. No more silent data loss from `git gc`. - **Interactive dashboard** `git cas inspect` with chunk heatmap, animated progress bars, and rich manifest views. - **Verify & JSON output** `git cas verify` checks integrity; `--json` on all commands for CI/scripting. 
@@ -229,9 +229,9 @@ await cas.restoreFile({ manifest, outputPath: './restored.png' }); // Read the manifest back from a tree OID const m = await cas.readManifest({ treeOid }); -// Lifecycle: inspect deletion impact, find orphaned chunks -const { slug, chunksOrphaned } = await cas.deleteAsset({ treeOid }); -const { referenced, total } = await cas.findOrphanedChunks({ treeOids: [treeOid] }); +// Lifecycle: inspect deletion impact, collect referenced chunks +const { slug, chunksOrphaned } = await cas.inspectAsset({ treeOid }); +const { referenced, total } = await cas.collectReferencedChunks({ treeOids: [treeOid] }); // v2.0.0: Compressed + passphrase-encrypted store const manifest2 = await cas.storeFile({ diff --git a/SECURITY.md b/SECURITY.md index 00a5b5e..5537384 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -635,6 +635,55 @@ throw new CasError( - Verify the encryption key is available and passed to `restore()`. - If the key is lost, the content is permanently inaccessible. +### `RESTORE_TOO_LARGE` + +**Thrown when**: +- An encrypted or compressed restore would exceed the configured `maxRestoreBufferSize` limit. +- The post-decompression size exceeds the limit (checked after gunzip). + +**Example**: +```javascript +throw new CasError( + 'Restore buffer exceeds limit', + 'RESTORE_TOO_LARGE', + { size: 1073741824, limit: 536870912 }, +); +``` + +**Possible causes**: +- The asset is larger than the configured buffer limit (default 512 MiB). +- A compressed asset inflates beyond the limit after decompression. + +**Recommended action**: +- Increase `maxRestoreBufferSize` in the `CasService` constructor or `.casrc`. +- For very large assets, consider storing without encryption to enable streaming restore. + +--- + +### `ENCRYPTION_BUFFER_EXCEEDED` + +**Thrown when**: +- Web Crypto AES-GCM encryption is attempted on data exceeding the configured `maxEncryptionBufferSize`. +- Web Crypto is a one-shot API — it cannot stream, so the entire plaintext must fit in memory. 
+ +**Example**: +```javascript +throw new CasError( + 'Encryption buffer exceeds limit', + 'ENCRYPTION_BUFFER_EXCEEDED', + { size: 1073741824, limit: 536870912 }, +); +``` + +**Possible causes**: +- Large chunks combined with `WebCryptoAdapter` (used in Bun/Deno). +- `NodeCryptoAdapter` uses true streaming and is not affected by this limit. + +**Recommended action**: +- Increase `maxEncryptionBufferSize` in the `WebCryptoAdapter` constructor. +- Switch to `NodeCryptoAdapter` if streaming encryption is needed. +- Reduce chunk size to keep individual encryption operations within the limit. + --- ## Conclusion From ca961dc00b55a119681a33f74a302ddd42fe5b01 Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:20:10 -0700 Subject: [PATCH 36/41] fix: resolve self-review findings from PR #17 pre-flight - SECURITY.md: fix ENCRYPTION_BUFFER_EXCEEDED meta field name (accumulated, not size) to match WebCryptoAdapter implementation - vault rotate: reject double-stdin (--old-passphrase-file - and --new-passphrase-file - simultaneously); extract helper to keep action complexity within lint threshold - CHANGELOG.md: merge duplicate ### Added headings under [Unreleased] - passphrase-prompt tests: add 0-byte empty file rejection test --- CHANGELOG.md | 18 ++++++--------- SECURITY.md | 4 ++-- bin/git-cas.js | 30 ++++++++++++++++++------- test/unit/cli/passphrase-prompt.test.js | 5 +++++ 4 files changed, 36 insertions(+), 21 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 31e500c..1f725c5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,17 +9,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - **Vault rotate passphrase-file support** — `vault rotate` now accepts `--old-passphrase-file` and `--new-passphrase-file` flags, bringing it to parity with the store/restore passphrase-file support. 
- -### Fixed (PR #17 review findings) -- **VaultService constructor type** — Added missing `observability?: ObservabilityPort` parameter to `index.d.ts` declaration. -- **Nullish coalescing for config merging** — `strategy` and `codec` in `mergeConfig()` now use `??` instead of `||`, so empty-string CLI values don't fall through to `.casrc` defaults. -- **Empty passphrase rejection** — `readPassphraseFile` rejects files that yield an empty string after newline stripping. `resolvePassphrase` validates `--vault-passphrase` flag and `GIT_CAS_PASSPHRASE` env var. -- **KDF algorithm validation** — `vault init` and `vault rotate` now validate `--algorithm` against the supported set (`pbkdf2`, `scrypt`) before passing to the KDF. -- **`.casrc` config validation** — `loadConfig()` now validates all config values (types, ranges, enum membership) after JSON parsing. -- **Deprecated method names in docs** — Updated `deleteAsset` → `inspectAsset` and `findOrphanedChunks` → `collectReferencedChunks` in README and GUIDE. -- **Missing error codes in SECURITY.md** — Added `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` sections. - -### Added - **CLI store flags** — `--gzip`, `--strategy`, `--chunk-size`, `--concurrency`, `--codec`, `--merkle-threshold`, `--target-chunk-size`, `--min-chunk-size`, `--max-chunk-size`. All library-level chunking, compression, codec, and concurrency options are now accessible from the CLI. - **CLI restore flags** — `--concurrency`, `--max-restore-buffer`. Parallel I/O and restore buffer limit now configurable from CLI. - **`.casrc` config file** — JSON config file at the repository root provides default values for CLI flags. CLI flags always take precedence. Supports: `chunkSize`, `strategy`, `concurrency`, `codec`, `compression`, `merkleThreshold`, `maxRestoreBufferSize`, and `cdc.*` sub-keys. 
@@ -56,6 +45,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.9 — Pre-commit hook + hooks directory** — `scripts/git-hooks/` renamed to `scripts/hooks/` per CLAUDE.md convention. New `pre-commit` hook runs lint gate. `install-hooks.sh` updated accordingly. - **16.6 — Chunk size upper bound** — CasService, FixedChunker, and CdcChunker now reject chunk sizes exceeding 100 MiB. CasService logs a warning when chunk size exceeds 10 MiB. - **ROADMAP.md M16 summary** — Corrected LoC/hours from `~430/~28h` to `~698/~21h` to match the detailed task breakdown. +- **VaultService constructor type** — Added missing `observability?: ObservabilityPort` parameter to `index.d.ts` declaration. +- **Nullish coalescing for config merging** — `strategy` and `codec` in `mergeConfig()` now use `??` instead of `||`, so empty-string CLI values don't fall through to `.casrc` defaults. +- **Empty passphrase rejection** — `readPassphraseFile` rejects files that yield an empty string after newline stripping. `resolvePassphrase` validates `--vault-passphrase` flag and `GIT_CAS_PASSPHRASE` env var. +- **KDF algorithm validation** — `vault init` and `vault rotate` now validate `--algorithm` against the supported set (`pbkdf2`, `scrypt`) before passing to the KDF. +- **`.casrc` config validation** — `loadConfig()` now validates all config values (types, ranges, enum membership) after JSON parsing. +- **Deprecated method names in docs** — Updated `deleteAsset` → `inspectAsset` and `findOrphanedChunks` → `collectReferencedChunks` in README and GUIDE. +- **Missing error codes in SECURITY.md** — Added `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` sections. 
## [5.2.4] — Prism polish (2026-03-03) diff --git a/SECURITY.md b/SECURITY.md index 5537384..1f804cf 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -669,9 +669,9 @@ throw new CasError( **Example**: ```javascript throw new CasError( - 'Encryption buffer exceeds limit', + 'Streaming encryption buffered 1073741824 bytes (limit: 536870912)...', 'ENCRYPTION_BUFFER_EXCEEDED', - { size: 1073741824, limit: 536870912 }, + { accumulated: 1073741824, limit: 536870912 }, ); ``` diff --git a/bin/git-cas.js b/bin/git-cas.js index 71c7021..076e8b0 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -563,6 +563,27 @@ vault // --------------------------------------------------------------------------- // vault rotate // --------------------------------------------------------------------------- +/** + * Resolve old and new passphrases for vault rotate from flags/files. + * + * @param {Record<string, any>} opts + * @returns {Promise<{ oldPassphrase: string, newPassphrase: string }>} + */ +async function resolveRotatePassphrases(opts) { + if (opts.oldPassphraseFile === '-' && opts.newPassphraseFile === '-') { + throw new Error('Cannot read both old and new passphrase from stdin'); + } + const oldPassphrase = opts.oldPassphraseFile + ? await readPassphraseFile(opts.oldPassphraseFile) + : opts.oldPassphrase; + const newPassphrase = opts.newPassphraseFile + ? 
await readPassphraseFile(opts.newPassphraseFile) + : opts.newPassphrase; + if (!oldPassphrase) { throw new Error('Old passphrase required (--old-passphrase or --old-passphrase-file)'); } + if (!newPassphrase) { throw new Error('New passphrase required (--new-passphrase or --new-passphrase-file)'); } + return { oldPassphrase, newPassphrase }; +} + vault .command('rotate') .description('Rotate vault-level encryption passphrase') @@ -573,14 +594,7 @@ vault .option('--algorithm ', 'KDF algorithm (pbkdf2 or scrypt)') .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { - const oldPassphrase = opts.oldPassphraseFile - ? await readPassphraseFile(opts.oldPassphraseFile) - : opts.oldPassphrase; - const newPassphrase = opts.newPassphraseFile - ? await readPassphraseFile(opts.newPassphraseFile) - : opts.newPassphrase; - if (!oldPassphrase) { throw new Error('Old passphrase required (--old-passphrase or --old-passphrase-file)'); } - if (!newPassphrase) { throw new Error('New passphrase required (--new-passphrase or --new-passphrase-file)'); } + const { oldPassphrase, newPassphrase } = await resolveRotatePassphrases(opts); const cas = createCas(opts.cwd); /** @type {{ oldPassphrase: string, newPassphrase: string, kdfOptions?: { algorithm: 'pbkdf2' | 'scrypt' } }} */ const rotateOpts = { diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index e823d13..6f81900 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -37,6 +37,11 @@ describe('readPassphraseFile', () => { }); describe('readPassphraseFile — empty passphrase rejection', () => { + it('rejects empty (0-byte) file', async () => { + await writeFile(tmpPath, '', { mode: 0o600 }); + await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); + }); + it('rejects file containing only LF', async () => { await writeFile(tmpPath, '\n', { mode: 0o600 }); 
await expect(readPassphraseFile(tmpPath)).rejects.toThrow('Passphrase must not be empty'); From a0040863e39e885be40bc380a589e95d155dcaac Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:28:57 -0700 Subject: [PATCH 37/41] fix: address CodeRabbit review round 2 findings - GUIDE.md FAQ: mention 100 MiB upper bound and >10 MiB warning for chunk size, not just the lower bound - passphrase-prompt tests: use chmod() after writeFile() instead of relying on mode parameter, which is filtered by process umask and fails on hardened CI systems with restrictive umask --- GUIDE.md | 4 ++++ test/unit/cli/passphrase-prompt.test.js | 8 +++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/GUIDE.md b/GUIDE.md index 808f8d9..1d8783e 100644 --- a/GUIDE.md +++ b/GUIDE.md @@ -1620,6 +1620,10 @@ to hold the entire file. The minimum chunk size is 1 KiB. This prevents pathologically small chunks that would create excessive Git objects. Increase your `chunkSize` parameter. +There is also a hard cap at 100 MiB — values above this are rejected outright. +Setting `chunkSize` above 10 MiB will trigger a warning, since very large +chunks reduce deduplication benefit and increase memory pressure. + ### Q: I get "Encryption key must be 32 bytes, got N" AES-256 requires exactly a 256-bit (32-byte) key. 
Ensure your key file diff --git a/test/unit/cli/passphrase-prompt.test.js b/test/unit/cli/passphrase-prompt.test.js index 6f81900..2c2866e 100644 --- a/test/unit/cli/passphrase-prompt.test.js +++ b/test/unit/cli/passphrase-prompt.test.js @@ -1,5 +1,5 @@ import { describe, it, expect, afterEach } from 'vitest'; -import { writeFile, unlink } from 'node:fs/promises'; +import { writeFile, unlink, chmod } from 'node:fs/promises'; import { tmpdir } from 'node:os'; import { join } from 'node:path'; import { readPassphraseFile } from '../../../bin/ui/passphrase-prompt.js'; @@ -59,7 +59,8 @@ describe('readPassphraseFile — permission warnings', () => { const origWrite = process.stderr.write; process.stderr.write = (/** @type {any} */ chunk) => { writeSpy.push(String(chunk)); return true; }; try { - await writeFile(tmpPath, 'secret\n', { mode: 0o644 }); + await writeFile(tmpPath, 'secret\n'); + await chmod(tmpPath, 0o644); await readPassphraseFile(tmpPath); expect(writeSpy.some((s) => s.includes('permissions'))).toBe(true); } finally { @@ -72,7 +73,8 @@ describe('readPassphraseFile — permission warnings', () => { const origWrite = process.stderr.write; process.stderr.write = (/** @type {any} */ chunk) => { writeSpy.push(String(chunk)); return true; }; try { - await writeFile(tmpPath, 'secret\n', { mode: 0o600 }); + await writeFile(tmpPath, 'secret\n'); + await chmod(tmpPath, 0o600); await readPassphraseFile(tmpPath); expect(writeSpy.some((s) => s.includes('permissions'))).toBe(false); } finally { From ab3010456da545ce54d3154bfe378ac0f98d129a Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:40:11 -0700 Subject: [PATCH 38/41] refactor(cli): inject delay dependency into runAction Replace hardcoded setTimeout in runAction with an injectable delay parameter. Tests now inject a spy instead of patching globals with vi.useFakeTimers(), making the INTEGRITY_ERROR rate-limit tests deterministic across all supported runtimes. 
Add test/CONVENTIONS.md documenting cross-runtime testing rules: inject time dependencies, use chmod() over writeFile({ mode }), prefer dependency injection over global monkey-patching. --- CHANGELOG.md | 2 ++ bin/actions.js | 7 +++-- test/CONVENTIONS.md | 57 +++++++++++++++++++++++++++++++++++ test/unit/cli/actions.test.js | 18 +++++------ 4 files changed, 72 insertions(+), 12 deletions(-) create mode 100644 test/CONVENTIONS.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 1f725c5..e255558 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **16.7 — Lifecycle method naming** — Added `inspectAsset()` (replaces `deleteAsset()`) and `collectReferencedChunks()` (replaces `findOrphanedChunks()`) as canonical names on both `CasService` and the facade. Old names are preserved as deprecated aliases that emit observability warnings. Type definitions updated with `@deprecated` JSDoc. ### Changed +- **`runAction` injectable delay** — `runAction()` now accepts an optional `{ delay }` dependency, replacing the hardcoded `setTimeout` call. Tests inject a spy instead of using `vi.useFakeTimers()`, making INTEGRITY_ERROR rate-limit tests deterministic across Node, Bun, and Deno. +- **Test conventions** — Added `test/CONVENTIONS.md` documenting rules for deterministic, cross-runtime tests: inject time dependencies, use `chmod()` instead of `writeFile({ mode })`, avoid global state patching. - **VaultService test observability wiring** — `VaultService.test.js` now passes a `mockObservability()` port to all tests instead of relying on the silent no-op default. `rotateVaultPassphrase.test.js` now passes `SilentObserver` explicitly. If observability wiring breaks, the test suite will catch it. - **`NodeCryptoAdapter.encryptBuffer` JSDoc** — `@returns` annotation corrected to `Promise<...>`, matching the async implementation. 
- **`maxRestoreBufferSize` documented** — constructor JSDoc and `#config` type in `ContentAddressableStore` now include the parameter. diff --git a/bin/actions.js b/bin/actions.js index c6388b8..2a28ce4 100644 --- a/bin/actions.js +++ b/bin/actions.js @@ -57,11 +57,11 @@ function getHint(code) { } /** - * Delay utility for rate-limiting after sensitive failures. + * Default delay — real setTimeout for production use. * @param {number} ms * @returns {Promise<void>} */ -function delay(ms) { +function defaultDelay(ms) { return new Promise((resolve) => { setTimeout(resolve, ms); }); } @@ -70,9 +70,10 @@ function delay(ms) { * * @param {(...args: any[]) => Promise<void>} fn - The async action function. * @param {() => boolean} getJson - Lazy getter for --json flag value. + * @param {{ delay?: (ms: number) => Promise<void> }} [options] - Injectable dependencies. * @returns {(...args: any[]) => Promise<void>} Wrapped action. */ -export function runAction(fn, getJson) { +export function runAction(fn, getJson, { delay = defaultDelay } = {}) { return async (/** @type {any[]} */ ...args) => { try { await fn(...args); diff --git a/test/CONVENTIONS.md b/test/CONVENTIONS.md new file mode 100644 index 0000000..f723294 --- /dev/null +++ b/test/CONVENTIONS.md @@ -0,0 +1,57 @@ +# Test Conventions + +Rules for writing deterministic, cross-runtime tests. All tests must pass +on Node.js, Bun, and Deno. + +## Time and Scheduling + +**Never assert wall-clock timing.** `Date.now()` deltas are +nondeterministic — they flake under CI load and vary across runtimes. + +**Inject delay/timer dependencies.** If production code uses `setTimeout` +or similar scheduling, accept the delay function as a parameter: + +```js +// production: injectable dependency with a real default +export function runAction(fn, getJson, { delay = defaultDelay } = {}) { ... 
} + +// test: inject a spy — no global patching needed +const delaySpy = vi.fn().mockResolvedValue(undefined); +const action = runAction(fn, getJson, { delay: delaySpy }); +await action(); +expect(delaySpy).toHaveBeenCalledWith(1000); +``` + +**Avoid `vi.useFakeTimers()`.** Vitest fake timers rely on +`@sinonjs/fake-timers`, which patches globals differently across runtimes. +Prefer dependency injection over global monkey-patching. + +## File Permissions + +**Use `chmod()` after `writeFile()`, not `writeFile({ mode })`.** The +`mode` parameter is filtered through `process.umask()`. A restrictive +umask (e.g., `0o077`) silently strips the bits you requested, making +permission-sensitive tests environment-dependent. + +```js +// wrong — umask can mask the requested mode +await writeFile(path, 'data', { mode: 0o644 }); + +// correct — chmod sets the exact mode regardless of umask +await writeFile(path, 'data'); +await chmod(path, 0o644); +``` + +This applies to macOS and Linux (our supported platforms). Permission +bits are a Unix concept — `chmod` is a no-op on Windows. + +## General Principles + +- **Test behavior, not timing.** Assert that a function was called, not + how long it took. +- **Inject infrastructure.** Clocks, filesystems, network — anything that + varies across environments should be injectable through constructor + parameters or function arguments. +- **No global state patching when injection is available.** If you control + the code under test, add a parameter. Only patch globals for third-party + code you cannot modify. 
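The umask pitfall documented above is easy to demonstrate end to end. A minimal sketch, assuming a POSIX filesystem and Node's `node:fs/promises` API (the temp-file name is hypothetical, not part of the repository under audit):

```js
// Demonstrates: writeFile's mode is AND-ed with ~umask, chmod sets bits exactly.
import { writeFile, chmod, stat, unlink } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const path = join(tmpdir(), `umask-demo-${process.pid}`);
const previousUmask = process.umask(0o077); // restrictive umask, as on hardened CI

// Requested 0o644, but 0o644 & ~0o077 === 0o600: group/other bits are stripped.
await writeFile(path, 'secret\n', { mode: 0o644 });
const masked = (await stat(path)).mode & 0o777;

// chmod applies the exact mode, regardless of umask.
await chmod(path, 0o644);
const exact = (await stat(path)).mode & 0o777;

console.log(masked.toString(8), exact.toString(8)); // prints "600 644"

process.umask(previousUmask);
await unlink(path);
```

This is exactly the environment-dependence the patch removes: with the default `0o022` umask the two approaches happen to agree, so the `writeFile({ mode })` version only fails on hardened CI.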
diff --git a/test/unit/cli/actions.test.js b/test/unit/cli/actions.test.js index 9019b82..b635d93 100644 --- a/test/unit/cli/actions.test.js +++ b/test/unit/cli/actions.test.js @@ -116,29 +116,29 @@ describe('runAction — INTEGRITY_ERROR rate-limiting', () => { beforeEach(() => { process.exitCode = undefined; stderrSpy = vi.spyOn(process.stderr, 'write').mockImplementation(() => true); - vi.useFakeTimers(); }); afterEach(() => { - vi.useRealTimers(); process.exitCode = originalExitCode; stderrSpy.mockRestore(); }); - it('delays ~1s on INTEGRITY_ERROR before writing output', async () => { + it('calls delay(1000) on INTEGRITY_ERROR', async () => { + const delaySpy = vi.fn().mockResolvedValue(undefined); const err = Object.assign(new Error('bad key'), { code: 'INTEGRITY_ERROR' }); - const action = runAction(async () => { throw err; }, () => false); - const promise = action(); - await vi.advanceTimersByTimeAsync(1000); - await promise; + const action = runAction(async () => { throw err; }, () => false, { delay: delaySpy }); + await action(); + expect(delaySpy).toHaveBeenCalledWith(1000); expect(process.exitCode).toBe(1); expect(stderrSpy).toHaveBeenCalled(); }); - it('no delay for non-INTEGRITY_ERROR codes', async () => { + it('does not call delay for non-INTEGRITY_ERROR codes', async () => { + const delaySpy = vi.fn().mockResolvedValue(undefined); const err = Object.assign(new Error('gone'), { code: 'MISSING_KEY' }); - const action = runAction(async () => { throw err; }, () => false); + const action = runAction(async () => { throw err; }, () => false, { delay: delaySpy }); await action(); + expect(delaySpy).not.toHaveBeenCalled(); expect(process.exitCode).toBe(1); }); }); From ed746793293ba0e08a53f5e8adf99b13303a3c18 Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:46:26 -0700 Subject: [PATCH 39/41] fix: address CodeRabbit review round 3 findings - parseIntFlag: reject partial numeric strings (e.g., "64MiB") using regex validation and 
Number.isSafeInteger check - .casrc validation: add 100 MiB upper bound for chunkSize and CDC sub-keys, matching CasService runtime constraints - Error message: update recipient conflict text to mention all passphrase sources (--vault-passphrase-file, GIT_CAS_PASSPHRASE) - SECURITY.md: remove incorrect recommendation to reduce chunk size for ENCRYPTION_BUFFER_EXCEEDED (WebCrypto buffers the entire stream, not individual chunks) --- SECURITY.md | 2 +- bin/config.js | 18 +++++++++++------- bin/git-cas.js | 7 ++++--- test/unit/cli/config.test.js | 12 +++++++++++- 4 files changed, 27 insertions(+), 12 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index 1f804cf..12fd81d 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -682,7 +682,7 @@ throw new CasError( **Recommended action**: - Increase `maxEncryptionBufferSize` in the `WebCryptoAdapter` constructor. - Switch to `NodeCryptoAdapter` if streaming encryption is needed. -- Reduce chunk size to keep individual encryption operations within the limit. +- Split the asset before storing, or store without encryption on the Web Crypto path for very large files. 
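The `parseIntFlag` fix described in this commit message guards against a real `parseInt` footgun. A minimal sketch, assuming only standard JavaScript semantics (`strictIntFlag` is a hypothetical stand-in for the CLI's `parseIntFlag`):

```js
// parseInt stops at the first non-digit, so a unit suffix is silently dropped:
const naive = parseInt('64MiB', 10);
console.log(naive); // prints 64, with "MiB" ignored and no error

// Strict variant: full-string regex match plus a safe-integer guard, so
// "64MiB" and integers beyond 2^53 - 1 are rejected instead of mangled.
function strictIntFlag(v) {
  if (!/^-?\d+$/.test(v)) { throw new Error(`Expected an integer, got "${v}"`); }
  const n = Number(v);
  if (!Number.isSafeInteger(n)) { throw new Error(`Expected a safe integer, got "${v}"`); }
  return n;
}
```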
--- diff --git a/bin/config.js b/bin/config.js index 0ce38f5..b30dec0 100644 --- a/bin/config.js +++ b/bin/config.js @@ -21,6 +21,7 @@ import { readFileSync } from 'node:fs'; import { resolve } from 'node:path'; const FILENAME = '.casrc'; +const MAX_CHUNK_SIZE = 100 * 1024 * 1024; /** * @typedef {Object} CasConfig @@ -37,13 +38,16 @@ const FILENAME = '.casrc'; /** * @param {any} value * @param {string} name - * @param {number} min + * @param {{ min: number, max?: number }} range */ -function assertInt(value, name, min) { +function assertInt(value, name, { min, max }) { if (value === undefined) { return; } if (!Number.isInteger(value) || value < min) { throw new Error(`${FILENAME}: ${name} must be an integer >= ${min}`); } + if (max !== undefined && value > max) { + throw new Error(`${FILENAME}: ${name} must not exceed ${max}`); + } } /** @@ -67,7 +71,7 @@ function validateCdc(config) { throw new Error(`${FILENAME}: cdc must be an object`); } for (const key of ['minChunkSize', 'targetChunkSize', 'maxChunkSize']) { - assertInt(config.cdc[key], `cdc.${key}`, 1); + assertInt(config.cdc[key], `cdc.${key}`, { min: 1, max: MAX_CHUNK_SIZE }); } } @@ -77,15 +81,15 @@ function validateCdc(config) { * @param {Record} config */ function validateConfig(config) { - assertInt(config.chunkSize, 'chunkSize', 1024); + assertInt(config.chunkSize, 'chunkSize', { min: 1024, max: MAX_CHUNK_SIZE }); assertEnum(config.strategy, 'strategy', ['fixed', 'cdc']); - assertInt(config.concurrency, 'concurrency', 1); + assertInt(config.concurrency, 'concurrency', { min: 1 }); assertEnum(config.codec, 'codec', ['json', 'cbor']); if (config.compression !== undefined && config.compression !== false) { assertEnum(config.compression, 'compression', ['gzip']); } - assertInt(config.merkleThreshold, 'merkleThreshold', 1); - assertInt(config.maxRestoreBufferSize, 'maxRestoreBufferSize', 1024); + assertInt(config.merkleThreshold, 'merkleThreshold', { min: 1 }); + assertInt(config.maxRestoreBufferSize, 
'maxRestoreBufferSize', { min: 1024 }); validateCdc(config); } diff --git a/bin/git-cas.js b/bin/git-cas.js index 076e8b0..1c4e7b5 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -229,8 +229,9 @@ function parseRecipient(value, previous) { /** @param {string} v */ const parseIntFlag = (v) => { - const n = parseInt(v, 10); - if (Number.isNaN(n)) { throw new Error(`Expected an integer, got "${v}"`); } + if (!/^-?\d+$/.test(v)) { throw new Error(`Expected an integer, got "${v}"`); } + const n = Number(v); + if (!Number.isSafeInteger(n)) { throw new Error(`Expected a safe integer, got "${v}"`); } return n; }; @@ -256,7 +257,7 @@ program .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {string} */ file, /** @type {Record} */ opts) => { if (opts.recipient && (opts.keyFile || hasPassphraseSource(opts))) { - throw new Error('Provide --key-file/--vault-passphrase or --recipient, not both'); + throw new Error('Provide --key-file or a vault passphrase source (--vault-passphrase, --vault-passphrase-file, GIT_CAS_PASSPHRASE), or --recipient — not both'); } if (opts.force && !opts.tree) { throw new Error('--force requires --tree'); diff --git a/test/unit/cli/config.test.js b/test/unit/cli/config.test.js index 79500fb..205b903 100644 --- a/test/unit/cli/config.test.js +++ b/test/unit/cli/config.test.js @@ -49,7 +49,7 @@ describe('loadConfig', () => { }); }); -describe('loadConfig — validation', () => { +describe('loadConfig — chunkSize validation', () => { afterEach(teardown); it('rejects non-integer chunkSize', () => { @@ -64,6 +64,16 @@ describe('loadConfig — validation', () => { expect(() => loadConfig(tmpDir)).toThrow(/chunkSize must be an integer >= 1024/); }); + it('rejects chunkSize above 100 MiB', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ chunkSize: 200 * 1024 * 1024 })); + expect(() => loadConfig(tmpDir)).toThrow(/chunkSize must not exceed/); + }); +}); + +describe('loadConfig — field validation', () 
=> { + afterEach(teardown); + it('rejects invalid strategy', () => { setup(); writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ strategy: 'random' })); From 4928ddece3d9788e4c6f3df5f692bd0605a4e41d Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 11:57:19 -0700 Subject: [PATCH 40/41] refactor(cli): use Commander.js choices() for enum validation Replace hand-rolled validateKdfAlgorithm() and free-text --strategy, --codec, --algorithm options with Commander's built-in Option.choices() API. Commander rejects invalid values at parse time with a clear error message, eliminating custom validation code. --- bin/git-cas.js | 23 +++++------------------ 1 file changed, 5 insertions(+), 18 deletions(-) diff --git a/bin/git-cas.js b/bin/git-cas.js index 1c4e7b5..d034386 100755 --- a/bin/git-cas.js +++ b/bin/git-cas.js @@ -1,7 +1,7 @@ #!/usr/bin/env node import { readFileSync } from 'node:fs'; -import { program } from 'commander'; +import { program, Option } from 'commander'; import GitPlumbing, { ShellRunnerFactory } from '@git-stunts/plumbing'; import ContentAddressableStore, { EventEmitterObserver, CborCodec } from '../index.js'; import Manifest from '../src/domain/value-objects/Manifest.js'; @@ -38,17 +38,6 @@ function readKeyFile(keyFilePath) { return buf; } -/** - * Validate that a KDF algorithm string is a supported value. - * - * @param {string} alg - */ -function validateKdfAlgorithm(alg) { - if (!['pbkdf2', 'scrypt'].includes(alg)) { - throw new Error(`Invalid KDF algorithm "${alg}": must be "pbkdf2" or "scrypt"`); - } -} - /** * Create a CAS instance for the given working directory. 
* @@ -246,10 +235,10 @@ program .option('--vault-passphrase ', 'Vault-level passphrase for encryption (prefer GIT_CAS_PASSPHRASE env var)') .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') .option('--gzip', 'Enable gzip compression') - .option('--strategy ', 'Chunking strategy: fixed or cdc') + .addOption(new Option('--strategy ', 'Chunking strategy').choices(['fixed', 'cdc'])) .option('--chunk-size ', 'Chunk size in bytes', parseIntFlag) .option('--concurrency ', 'Parallel chunk I/O operations', parseIntFlag) - .option('--codec ', 'Manifest codec: json or cbor') + .addOption(new Option('--codec ', 'Manifest codec').choices(['json', 'cbor'])) .option('--target-chunk-size ', 'CDC target chunk size', parseIntFlag) .option('--min-chunk-size ', 'CDC minimum chunk size', parseIntFlag) .option('--max-chunk-size ', 'CDC maximum chunk size', parseIntFlag) @@ -430,7 +419,7 @@ vault .description('Initialize the vault') .option('--vault-passphrase ', 'Passphrase for vault-level encryption (prefer GIT_CAS_PASSPHRASE env var)') .option('--vault-passphrase-file ', 'Read vault passphrase from file (use - for stdin)') - .option('--algorithm ', 'KDF algorithm (pbkdf2 or scrypt)', 'pbkdf2') + .addOption(new Option('--algorithm ', 'KDF algorithm').choices(['pbkdf2', 'scrypt']).default('pbkdf2')) .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { const cas = createCas(opts.cwd); @@ -438,7 +427,6 @@ vault const initOpts = {}; const passphrase = await resolvePassphrase(opts, { confirm: true }); if (passphrase) { - validateKdfAlgorithm(opts.algorithm); initOpts.passphrase = passphrase; initOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; } @@ -592,7 +580,7 @@ vault .option('--new-passphrase ', 'New vault passphrase') .option('--old-passphrase-file ', 'Read old passphrase from file (- for stdin)') .option('--new-passphrase-file ', 'Read new passphrase from 
file (- for stdin)') - .option('--algorithm ', 'KDF algorithm (pbkdf2 or scrypt)') + .addOption(new Option('--algorithm ', 'KDF algorithm').choices(['pbkdf2', 'scrypt'])) .option('--cwd ', 'Git working directory', '.') .action(runAction(async (/** @type {Record} */ opts) => { const { oldPassphrase, newPassphrase } = await resolveRotatePassphrases(opts); @@ -603,7 +591,6 @@ vault newPassphrase, }; if (opts.algorithm) { - validateKdfAlgorithm(opts.algorithm); rotateOpts.kdfOptions = { algorithm: /** @type {'pbkdf2' | 'scrypt'} */ (opts.algorithm) }; } const { commitOid, rotatedSlugs, skippedSlugs } = await cas.rotateVaultPassphrase(rotateOpts); From 4c13b33ed4a753904a468d04daf48fb94507b87e Mon Sep 17 00:00:00 2001 From: James Ross Date: Sun, 8 Mar 2026 12:05:19 -0700 Subject: [PATCH 41/41] fix: address CodeRabbit review round 4 nitpicks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - CDC inter-field ordering validation in .casrc (min ≤ target ≤ max) - Tighten delay test to verify runAction awaits the delay via deferred - Clarify CHANGELOG C8–C10 wording (existing audit findings, not new) --- CHANGELOG.md | 2 +- bin/config.js | 20 ++++++++++++++++++++ test/unit/cli/actions.test.js | 18 +++++++++++++----- test/unit/cli/config.test.js | 28 ++++++++++++++++++++++++++++ 4 files changed, 62 insertions(+), 6 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e255558..10f3986 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,7 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **`.casrc` config file** — JSON config file at the repository root provides default values for CLI flags. CLI flags always take precedence. Supports: `chunkSize`, `strategy`, `concurrency`, `codec`, `compression`, `merkleThreshold`, `maxRestoreBufferSize`, and `cdc.*` sub-keys. 
- **CODE-EVAL.md** — Forensic architectural audit (zero-knowledge code extraction, critical assessment, roadmap reconciliation, prescriptive blueprint). - **M16 Capstone** — New milestone in ROADMAP.md addressing all 9 audit flaws and 10 concerns (C1–C10). 13 task cards, ~698 LoC, ~21h estimated. -- **Concerns C8–C10** — Three new architectural concerns identified by the audit: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10). +- **Concerns C8–C10** — Three architectural concerns from the CODE-EVAL.md audit now documented: crypto adapter LSP violation (C8), FixedChunker quadratic allocation (C9), encrypt-then-chunk dedup loss (C10). - **CasError codes** — `RESTORE_TOO_LARGE` and `ENCRYPTION_BUFFER_EXCEEDED` registered in canonical error code table. - **16.2 — Memory restore guard** — `CasService` accepts `maxRestoreBufferSize` (default 512 MiB). `_restoreBuffered` throws `RESTORE_TOO_LARGE` with `{ size, limit }` meta when an encrypted/compressed restore would exceed the limit. Unencrypted streaming restore is unaffected. - **16.3 — Web Crypto encryption buffer guard** — `WebCryptoAdapter` accepts `maxEncryptionBufferSize` (default 512 MiB). Throws `ENCRYPTION_BUFFER_EXCEEDED` when streaming encryption exceeds the limit, since Web Crypto AES-GCM is a one-shot API. NodeCryptoAdapter uses true streaming and is unaffected. 
diff --git a/bin/config.js b/bin/config.js index b30dec0..e92ef3a 100644 --- a/bin/config.js +++ b/bin/config.js @@ -62,6 +62,25 @@ function assertEnum(value, name, allowed) { } } +/** + * @param {{ minChunkSize?: number, targetChunkSize?: number, maxChunkSize?: number }} cdc + */ +function assertCdcOrdering(cdc) { + const { minChunkSize, targetChunkSize, maxChunkSize } = cdc; + if (minChunkSize !== undefined && maxChunkSize !== undefined && minChunkSize > maxChunkSize) { + throw new Error(`${FILENAME}: cdc.minChunkSize must not exceed cdc.maxChunkSize`); + } + if (targetChunkSize !== undefined && minChunkSize !== undefined && targetChunkSize < minChunkSize) { + throw new Error(`${FILENAME}: cdc.targetChunkSize must be >= cdc.minChunkSize`); + } + if (targetChunkSize !== undefined && maxChunkSize !== undefined && targetChunkSize > maxChunkSize) { + throw new Error(`${FILENAME}: cdc.targetChunkSize must be <= cdc.maxChunkSize`); + } +} + /** * @param {Record<string, any>} config */ @@ -73,6 +92,7 @@ function validateCdc(config) { for (const key of ['minChunkSize', 'targetChunkSize', 'maxChunkSize']) { assertInt(config.cdc[key], `cdc.${key}`, { min: 1, max: MAX_CHUNK_SIZE }); } + assertCdcOrdering(config.cdc); } /** diff --git a/test/unit/cli/actions.test.js b/test/unit/cli/actions.test.js index b635d93..6bd4ab4 100644 --- a/test/unit/cli/actions.test.js +++ b/test/unit/cli/actions.test.js @@ -123,14 +123,22 @@ describe('runAction — INTEGRITY_ERROR rate-limiting', () => { stderrSpy.mockRestore(); }); - it('calls delay(1000) on INTEGRITY_ERROR', async () => { - const delaySpy = vi.fn().mockResolvedValue(undefined); + it('awaits delay(1000) before writing INTEGRITY_ERROR output', async () => { + let releaseDelay = () => {}; + const delaySpy = vi.fn().mockImplementation(() => new Promise((resolve) => { + releaseDelay = resolve; + })); const err = Object.assign(new Error('bad key'), { code: 'INTEGRITY_ERROR' }); - const action = runAction(async () 
=> { throw err; }, () => false, { delay: delaySpy }); - await action(); + const actionPromise = runAction(() => { throw err; }, () => false, { delay: delaySpy })(); + expect(delaySpy).toHaveBeenCalledWith(1000); - expect(process.exitCode).toBe(1); + expect(stderrSpy).not.toHaveBeenCalled(); + expect(process.exitCode).toBeUndefined(); + + releaseDelay(); + await actionPromise; expect(stderrSpy).toHaveBeenCalled(); + expect(process.exitCode).toBe(1); }); it('does not call delay for non-INTEGRITY_ERROR codes', async () => { diff --git a/test/unit/cli/config.test.js b/test/unit/cli/config.test.js index 205b903..6a81879 100644 --- a/test/unit/cli/config.test.js +++ b/test/unit/cli/config.test.js @@ -110,6 +110,34 @@ describe('loadConfig — field validation', () => { }); }); +describe('loadConfig — CDC inter-field ordering', () => { + afterEach(teardown); + + it('rejects cdc.minChunkSize > cdc.maxChunkSize', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ + cdc: { minChunkSize: 16384, targetChunkSize: 8192, maxChunkSize: 4096 }, + })); + expect(() => loadConfig(tmpDir)).toThrow(/cdc\.minChunkSize must not exceed cdc\.maxChunkSize/); + }); + + it('rejects cdc.targetChunkSize < cdc.minChunkSize', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ + cdc: { minChunkSize: 8192, targetChunkSize: 4096, maxChunkSize: 16384 }, + })); + expect(() => loadConfig(tmpDir)).toThrow(/cdc\.targetChunkSize must be >= cdc\.minChunkSize/); + }); + + it('rejects cdc.targetChunkSize > cdc.maxChunkSize', () => { + setup(); + writeFileSync(join(tmpDir, '.casrc'), JSON.stringify({ + cdc: { minChunkSize: 2048, targetChunkSize: 32768, maxChunkSize: 16384 }, + })); + expect(() => loadConfig(tmpDir)).toThrow(/cdc\.targetChunkSize must be <= cdc\.maxChunkSize/); + }); +}); + describe('mergeConfig — CLI overrides', () => { it('CLI flags override config', () => { const { casConfig } = mergeConfig({ chunkSize: 4096, strategy: 'fixed' }, { chunkSize: 
65536 });
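The precedence behavior this test exercises, together with the `??` versus `||` distinction called out in the changelog entries earlier in this series, reduces to a small sketch. `mergeConfigSketch` below is a hypothetical stand-in for the CLI's `mergeConfig`, not its actual implementation:

```js
// CLI flags win over .casrc values whenever the flag was provided at all,
// including when its value is the empty string.
function mergeConfigSketch(cliFlags, rcDefaults) {
  return {
    strategy: cliFlags.strategy ?? rcDefaults.strategy,
    codec: cliFlags.codec ?? rcDefaults.codec,
  };
}

// With ||, any falsy flag value falls through to the .casrc default:
console.log('' || 'cdc'); // prints "cdc": the CLI value is lost
// With ??, only null/undefined fall through:
console.log('' ?? 'cdc'); // prints "": the CLI value is preserved
```

This is why the PR #17 fix switched `strategy` and `codec` merging from `||` to `??`: an explicitly supplied empty-string flag should be surfaced (and rejected by validation), not silently replaced by a config-file default.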