diff --git a/CHANGELOG.md b/CHANGELOG.md index 5b5cde4a..af4c5ebb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +## [4.0.1] - 2026-02-22 + +### Fixed + +- **OID guard in `writeContent()`** — Throw if `getContentOid()` returns null after a successful write, enforcing the `WriteContentResult.sha: string` contract (#284) +- **Error cause chain in `readContent()`** — Capture original error via `{ cause: err }` so callers can distinguish blob-not-found from infrastructure failures (#284) +- **Null guard in `readContent()`** — Changed `!contentBuf` to explicit `contentBuf == null` for clearer null/undefined intent (#284) +- **Null guard in `hasContent()`** — Changed `sha !== null` to `sha != null` to catch both null and undefined from `getContentOid()`, consistent with other callers (#284) +- **ROADMAP stale integrity-check language** — Corrected binary content backlog item to reflect WARP-native blob storage; reframed `--verify` flag as OID existence check (#284) +- **JSDoc typedef terminology** — Changed "Git blob SHA" / "Written blob SHA" to "Git blob OID" in `ContentMeta` and `WriteContentResult` typedefs (#284) +- **Dead `execSync` import** — Removed unused `execSync` from `test/content.test.js`; only `execFileSync` is used (#284) + +### Added + +- **Empty content edge-case test** — Verifies `writeContent()` handles empty string input correctly (size 0, round-trip intact) (#284) + +## [4.0.0] - 2026-02-22 + +### Changed + +- **BREAKING: Migrate content system to git-warp native API** — Replaced custom CAS layer (`git hash-object` / `git cat-file`) with `@git-stunts/git-warp` native `setContent()` / `getContent()` API. Content properties now use WARP's `CONTENT_PROPERTY_KEY` instead of custom `_content.sha`. Removes all direct git subprocess calls from content module (#284) +- **Test count** — 577 tests across 29 files (was 571) + ## [3.3.0] - 2026-02-22 ### Added diff --git a/ROADMAP.md b/ROADMAP.md index 8f1b9891..27ee8a56 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -2189,13 +2189,20 @@ Two issues were filed during the M12 extension polish pass and intentionally def ### Content system enhancements (from M13 VESSEL review) -- **`git mind content list`** — Query all nodes that have `_content.sha` properties. Currently there's no way to discover which nodes carry content without inspecting each one individually. -- **Binary content support** — Add base64 encoding for non-text MIME types. Currently the content system is text-only (UTF-8); non-UTF-8 blobs fail the integrity check by design. Requires reintroducing encoding metadata and updating `readContent()` to handle buffer round-trips. -- **`content meta --verify` flag** — Run the SHA integrity check without dumping the full content body. Useful for bulk health checks across all content-bearing nodes. +- **`git mind content list`** — Query all nodes that have `_content` properties. Currently there's no way to discover which nodes carry content without inspecting each one individually. +- **Binary content support** — Add base64 encoding for non-text MIME types. Currently the content system is text-only (UTF-8); WARP stores blobs natively but `readContent()` always calls `.toString('utf-8')`, so binary round-trips are lossy. Requires returning a `Buffer` for non-text MIME types and reintroducing encoding metadata. +- **`content meta --verify` flag** — Verify the content blob OID exists in the git object store without dumping the full body. Useful for bulk health checks across all content-bearing nodes. ### Codebase hardening (from M13 VESSEL review) -- **Standardize all git subprocess calls to `execFileSync`** — `src/content.js` now uses `execFileSync` exclusively, but other modules (e.g. `processCommitCmd` in `commands.js`) still use `execSync` with string interpolation. Audit and migrate for consistency and defense-in-depth. +- **Standardize all git subprocess calls to `execFileSync`** — `src/content.js` eliminated all subprocess calls via WARP-native migration, but other modules (e.g. `processCommitCmd` in `commands.js`) still use `execSync` with string interpolation. Audit and migrate for consistency and defense-in-depth. + +### Content module error handling (from #284 CodeRabbit review) + +- **Establish error message conventions** — Content module errors lack a consistent prefix/format. Consider a `[content] failed: ` convention or a `ContentError` class hierarchy for typed catch handling. +- **Audit `try/catch` blocks for cause preservation** — The `readContent()` catch now chains `{ cause: err }`, but other modules may swallow root causes. Audit all catch blocks in domain code for cause propagation. +- **Integration test for `error.cause` chain** — Verify callers of `readContent()` can access `error.cause` when blob retrieval fails. Currently only the error message is tested. +- **`--verbose` flag for content CLI** — Dump the full `error.cause` chain when content operations fail. Helps diagnose infrastructure vs. missing-blob issues. ### Other backlog items diff --git a/package-lock.json b/package-lock.json index b79cd2f5..461647fa 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "@neuroglyph/git-mind", - "version": "3.2.0", + "version": "4.0.1", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "@neuroglyph/git-mind", - "version": "3.2.0", + "version": "4.0.1", "license": "Apache-2.0", "dependencies": { "@git-stunts/git-warp": "^11.5.0", diff --git a/package.json b/package.json index e41a74cd..7e3e0c5b 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@neuroglyph/git-mind", - "version": "3.3.0", + "version": "4.0.1", "description": "A project knowledge graph tool built on git-warp", "type": "module", "license": "Apache-2.0", diff --git a/src/cli/commands.js b/src/cli/commands.js index a0a82e45..92025f3e 100644 --- a/src/cli/commands.js +++ b/src/cli/commands.js @@ -838,7 +838,7 @@ export async function contentSet(cwd, nodeId, filePath, opts = {}) { const mime = opts.mime ?? MIME_MAP[extname(filePath).toLowerCase()] ?? 'application/octet-stream'; const graph = await loadGraph(cwd); - const result = await writeContent(cwd, graph, nodeId, buf, { mime }); + const result = await writeContent(graph, nodeId, buf, { mime }); if (opts.json) { outputJson('content-set', result); @@ -861,7 +861,7 @@ export async function contentSet(cwd, nodeId, filePath, opts = {}) { export async function contentShow(cwd, nodeId, opts = {}) { try { const graph = await loadGraph(cwd); - const { content, meta } = await readContent(cwd, graph, nodeId); + const { content, meta } = await readContent(graph, nodeId); if (opts.json) { outputJson('content-show', { nodeId, content, ...meta }); diff --git a/src/content.js b/src/content.js index 2ba1849d..293faa18 100644 --- a/src/content.js +++ b/src/content.js @@ -1,42 +1,25 @@ /** * @module content * Content-on-node: attach rich content (markdown, text, etc.) to graph nodes - * using git's native content-addressed storage. + * using git-warp's native content-addressed storage. * - * Content is stored as git blobs via `git hash-object -w`. The blob SHA and - * metadata are recorded as WARP node properties under the `_content.` prefix. + * Content is stored as git blobs via WARP's patch.attachContent(). The blob OID + * and metadata are recorded as WARP node properties. * * Property convention: - * _content.sha — git blob SHA - * _content.mime — MIME type (e.g. "text/markdown") - * _content.size — byte count + * _content — git blob OID (managed by WARP via CONTENT_PROPERTY_KEY) + * _content.mime — MIME type (e.g. "text/markdown") + * _content.size — byte count */ -import { execFileSync } from 'node:child_process'; +import { CONTENT_PROPERTY_KEY } from '@git-stunts/git-warp'; -/** Property key prefix for content metadata. */ -const PREFIX = '_content.'; - -/** Known content property keys. */ -const KEYS = { - sha: `${PREFIX}sha`, - mime: `${PREFIX}mime`, - size: `${PREFIX}size`, -}; - -/** Validates a string is a 40- or 64-hex-char git object hash (SHA-1 or SHA-256). */ -const SHA_RE = /^[0-9a-f]{40,64}$/; - -/** @throws {Error} if sha is not a valid git object hash (40 or 64 hex chars). */ -function assertValidSha(sha) { - if (typeof sha !== 'string' || !SHA_RE.test(sha)) { - throw new Error(`Invalid content SHA: ${sha}`); - } -} +const MIME_KEY = '_content.mime'; +const SIZE_KEY = '_content.size'; /** * @typedef {object} ContentMeta - * @property {string} sha - Git blob SHA + * @property {string} sha - Git blob OID * @property {string} mime - MIME type * @property {number} size - Content size in bytes */ @@ -44,23 +27,22 @@ function assertValidSha(sha) { /** * @typedef {object} WriteContentResult * @property {string} nodeId - Target node - * @property {string} sha - Written blob SHA + * @property {string} sha - Written blob OID * @property {string} mime - MIME type * @property {number} size - Byte count */ /** - * Write content to a graph node. Stores the content as a git blob and records - * metadata as node properties. + * Write content to a graph node. Stores the content as a git blob via WARP's + * native content API and records metadata as node properties. * - * @param {string} cwd - Repository working directory * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance * @param {string} nodeId - Target node ID * @param {Buffer|string} content - Content to store * @param {{ mime?: string }} [opts] * @returns {Promise} */ -export async function writeContent(cwd, graph, nodeId, content, opts = {}) { +export async function writeContent(graph, nodeId, content, opts = {}) { const exists = await graph.hasNode(nodeId); if (!exists) { throw new Error(`Node not found: ${nodeId}`); @@ -70,70 +52,52 @@ export async function writeContent(cwd, graph, nodeId, content, opts = {}) { const mime = opts.mime ?? 'text/plain'; const size = buf.length; - // Write blob to git object store - const sha = execFileSync('git', ['hash-object', '-w', '--stdin'], { - cwd, - input: buf, - encoding: 'utf-8', - }).trim(); - - // Record metadata as node properties const patch = await graph.createPatch(); - patch.setProperty(nodeId, KEYS.sha, sha); - patch.setProperty(nodeId, KEYS.mime, mime); - patch.setProperty(nodeId, KEYS.size, size); + await patch.attachContent(nodeId, buf); + patch.setProperty(nodeId, MIME_KEY, mime); + patch.setProperty(nodeId, SIZE_KEY, size); await patch.commit(); + const sha = await graph.getContentOid(nodeId); + + if (sha == null) { + throw new Error(`Failed to retrieve OID after writing content to node: ${nodeId}`); + } + return { nodeId, sha, mime, size }; } /** - * Read content attached to a graph node. Retrieves the blob from git's object - * store and verifies SHA integrity. + * Read content attached to a graph node. Retrieves the blob from WARP's + * native content store. * - * @param {string} cwd - Repository working directory * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance * @param {string} nodeId - Target node ID * @returns {Promise<{ content: string, meta: ContentMeta }>} */ -export async function readContent(cwd, graph, nodeId) { +export async function readContent(graph, nodeId) { const meta = await getContentMeta(graph, nodeId); if (!meta) { throw new Error(`No content attached to node: ${nodeId}`); } - // Validate SHA before passing to git - assertValidSha(meta.sha); - - // Retrieve blob from git object store - let content; + let contentBuf; try { - content = execFileSync('git', ['cat-file', 'blob', meta.sha], { - cwd, - encoding: 'utf-8', - }); - } catch { + contentBuf = await graph.getContent(nodeId); + } catch (err) { throw new Error( - `Content blob ${meta.sha} not found in git object store for node: ${nodeId}`, + `Failed to retrieve content blob ${meta.sha} for node: ${nodeId}`, + { cause: err }, ); } - // Verify integrity: re-hash and compare - const verifyBuf = Buffer.from(content, 'utf-8'); - const verifySha = execFileSync('git', ['hash-object', '--stdin'], { - cwd, - input: verifyBuf, - encoding: 'utf-8', - }).trim(); - - if (verifySha !== meta.sha) { + if (contentBuf == null || (contentBuf.length === 0 && meta.size > 0)) { throw new Error( - `Content integrity check failed for node ${nodeId}: ` + - `expected ${meta.sha}, got ${verifySha}`, + `Failed to retrieve content blob ${meta.sha} for node: ${nodeId}`, ); } - return { content, meta }; + return { content: contentBuf.toString('utf-8'), meta }; } /** @@ -150,14 +114,15 @@ export async function getContentMeta(graph, nodeId) { throw new Error(`Node not found: ${nodeId}`); } - const propsMap = await graph.getNodeProps(nodeId); - const sha = propsMap?.get(KEYS.sha) ?? null; + const sha = await graph.getContentOid(nodeId); if (!sha) return null; + const propsMap = await graph.getNodeProps(nodeId); + return { sha, - mime: propsMap.get(KEYS.mime) ?? 'text/plain', - size: propsMap.get(KEYS.size) ?? 0, + mime: propsMap?.get(MIME_KEY) ?? 'text/plain', + size: propsMap?.get(SIZE_KEY) ?? 0, }; } @@ -172,13 +137,12 @@ export async function hasContent(graph, nodeId) { const exists = await graph.hasNode(nodeId); if (!exists) return false; - const propsMap = await graph.getNodeProps(nodeId); - const sha = propsMap?.get(KEYS.sha) ?? null; - return sha !== null; + const sha = await graph.getContentOid(nodeId); + return sha != null; } /** - * Delete content from a node by clearing the `_content.*` properties. + * Delete content from a node by clearing the content properties. * The git blob remains in the object store (cleaned up by git gc). * * @param {import('@git-stunts/git-warp').default} graph - WARP graph instance @@ -191,17 +155,16 @@ export async function deleteContent(graph, nodeId) { throw new Error(`Node not found: ${nodeId}`); } - const propsMap = await graph.getNodeProps(nodeId); - const previousSha = propsMap?.get(KEYS.sha) ?? null; + const previousSha = await graph.getContentOid(nodeId); if (!previousSha) { return { nodeId, removed: false, previousSha: null }; } const patch = await graph.createPatch(); - patch.setProperty(nodeId, KEYS.sha, null); - patch.setProperty(nodeId, KEYS.mime, null); - patch.setProperty(nodeId, KEYS.size, null); + patch.setProperty(nodeId, CONTENT_PROPERTY_KEY, null); + patch.setProperty(nodeId, MIME_KEY, null); + patch.setProperty(nodeId, SIZE_KEY, null); await patch.commit(); return { nodeId, removed: true, previousSha }; diff --git a/test/content.test.js b/test/content.test.js index 3194aa5f..9ed81579 100644 --- a/test/content.test.js +++ b/test/content.test.js @@ -7,8 +7,9 @@ import { describe, it, expect, beforeEach, afterEach, beforeAll } from 'vitest'; import { mkdtemp, rm, readFile, writeFile } from 'node:fs/promises'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; -import { execFileSync, execSync } from 'node:child_process'; +import { execFileSync } from 'node:child_process'; import Ajv from 'ajv/dist/2020.js'; +import { CONTENT_PROPERTY_KEY } from '@git-stunts/git-warp'; import { initGraph } from '../src/graph.js'; import { writeContent, readContent, getContentMeta, hasContent, deleteContent } from '../src/content.js'; @@ -60,7 +61,7 @@ describe('content store core', () => { }); it('writeContent stores blob and sets properties', async () => { - const result = await writeContent(tempDir, graph, 'doc:readme', '# Hello World\n', { + const result = await writeContent(graph, 'doc:readme', '# Hello World\n', { mime: 'text/markdown', }); @@ -73,46 +74,26 @@ describe('content store core', () => { it('readContent retrieves correct content', async () => { const body = '# Hello World\n\nThis is a test document.\n'; - await writeContent(tempDir, graph, 'doc:readme', body, { mime: 'text/markdown' }); + await writeContent(graph, 'doc:readme', body, { mime: 'text/markdown' }); - const { content, meta } = await readContent(tempDir, graph, 'doc:readme'); + const { content, meta } = await readContent(graph, 'doc:readme'); expect(content).toBe(body); expect(meta.mime).toBe('text/markdown'); }); it('readContent throws when blob is missing from object store', async () => { - await writeContent(tempDir, graph, 'doc:readme', 'original', { mime: 'text/plain' }); + await writeContent(graph, 'doc:readme', 'original', { mime: 'text/plain' }); // Point to a valid-looking SHA that doesn't exist in the object store const patch = await graph.createPatch(); - patch.setProperty('doc:readme', '_content.sha', 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef'); + patch.setProperty('doc:readme', CONTENT_PROPERTY_KEY, 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef'); await patch.commit(); - await expect(readContent(tempDir, graph, 'doc:readme')).rejects.toThrow(/not found in git object store/); - }); - - it('readContent detects integrity mismatch on non-UTF-8 blob', async () => { - // Write a blob with non-UTF-8 bytes directly via git — the UTF-8 - // round-trip in readContent will corrupt the data, producing a - // different hash and triggering the integrity check. - const binaryBuf = Buffer.from([0x80, 0x81, 0x82, 0xFF, 0xFE]); - const sha = execFileSync('git', ['hash-object', '-w', '--stdin'], { - cwd: tempDir, - input: binaryBuf, - encoding: 'utf-8', - }).trim(); - - const patch = await graph.createPatch(); - patch.setProperty('doc:readme', '_content.sha', sha); - patch.setProperty('doc:readme', '_content.mime', 'application/octet-stream'); - patch.setProperty('doc:readme', '_content.size', 5); - await patch.commit(); - - await expect(readContent(tempDir, graph, 'doc:readme')).rejects.toThrow(/integrity check failed/); + await expect(readContent(graph, 'doc:readme')).rejects.toThrow(/Failed to retrieve content blob/); }); it('getContentMeta returns correct metadata', async () => { - await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' }); + await writeContent(graph, 'doc:readme', 'test', { mime: 'text/plain' }); const meta = await getContentMeta(graph, 'doc:readme'); expect(meta).not.toBeNull(); @@ -128,7 +109,7 @@ describe('content store core', () => { }); it('hasContent returns true for node with content', async () => { - await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' }); + await writeContent(graph, 'doc:readme', 'test', { mime: 'text/plain' }); expect(await hasContent(graph, 'doc:readme')).toBe(true); }); @@ -141,7 +122,7 @@ describe('content store core', () => { }); it('deleteContent removes properties', async () => { - await writeContent(tempDir, graph, 'doc:readme', 'test', { mime: 'text/plain' }); + await writeContent(graph, 'doc:readme', 'test', { mime: 'text/plain' }); const result = await deleteContent(graph, 'doc:readme'); expect(result.removed).toBe(true); @@ -157,13 +138,13 @@ describe('content store core', () => { it('writeContent fails on non-existent node', async () => { await expect( - writeContent(tempDir, graph, 'doc:nonexistent', 'test', { mime: 'text/plain' }), + writeContent(graph, 'doc:nonexistent', 'test', { mime: 'text/plain' }), ).rejects.toThrow(/Node not found/); }); it('readContent fails on node without content', async () => { await expect( - readContent(tempDir, graph, 'doc:readme'), + readContent(graph, 'doc:readme'), ).rejects.toThrow(/No content attached/); }); @@ -180,21 +161,32 @@ describe('content store core', () => { }); it('overwrite replaces content cleanly', async () => { - await writeContent(tempDir, graph, 'doc:readme', 'version 1', { mime: 'text/plain' }); - await writeContent(tempDir, graph, 'doc:readme', 'version 2', { mime: 'text/markdown' }); + await writeContent(graph, 'doc:readme', 'version 1', { mime: 'text/plain' }); + await writeContent(graph, 'doc:readme', 'version 2', { mime: 'text/markdown' }); - const { content, meta } = await readContent(tempDir, graph, 'doc:readme'); + const { content, meta } = await readContent(graph, 'doc:readme'); expect(content).toBe('version 2'); expect(meta.mime).toBe('text/markdown'); }); it('handles Buffer input', async () => { const buf = Buffer.from('binary-safe content', 'utf-8'); - await writeContent(tempDir, graph, 'doc:readme', buf, { mime: 'application/octet-stream' }); + await writeContent(graph, 'doc:readme', buf, { mime: 'application/octet-stream' }); - const { content } = await readContent(tempDir, graph, 'doc:readme'); + const { content } = await readContent(graph, 'doc:readme'); expect(content).toBe('binary-safe content'); }); + + it('stores empty string content', async () => { + const result = await writeContent(graph, 'doc:readme', '', { mime: 'text/plain' }); + + expect(result.size).toBe(0); + const meta = await getContentMeta(graph, 'doc:readme'); + expect(meta.size).toBe(0); + + const { content } = await readContent(graph, 'doc:readme'); + expect(content).toBe(''); + }); }); describe('content CLI commands', () => {