1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -33,6 +33,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Fixed

- **Missing content blob OIDs now throw instead of reading as empty bytes** — `GitGraphAdapter.readBlob()` now disambiguates real zero-byte blobs from swallowed missing-object reads by checking object existence when a blob stream collects to zero bytes. Corrupted `_content` / edge-content references now surface `PersistenceError(E_MISSING_OBJECT)` through `getContent()` / `getEdgeContent()` instead of returning a truthy empty buffer.
- **Deno CI resolver drift** — The Deno test image now imports a Node 22 npm toolchain from `node:22-slim`, installs dependencies with `npm ci`, and runs tests with `--node-modules-dir=manual`, avoiding runtime npm re-resolution of `cbor-extract` optional platform packages while keeping the container on the repo’s supported Node engine line.
- **Markdown code-sample linter edge cases** — The Markdown JS/TS sample linter now recognizes fenced code blocks indented by up to three spaces, rejects malformed mixed-marker fences, fails on unterminated JS/TS fences, and parses snippets with the repository’s configured TypeScript target from `tsconfig.base.json`.
- **B87 review follow-ups** — Clarified the ADR folds snippet as a wholly proposed `graph.view()` sketch, corrected the pre-push quick-mode gate label to Gate 8, aligned the local hook’s gate numbers with CI for faster failure triage, and removed the self-expiring `pending merge` wording from the completed-roadmap archive entry.
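
For callers, the first entry above means a dangling `_content` reference now shows up as a rejected promise rather than as an empty buffer. The sketch below is one possible way a consumer might tell the three outcomes apart (no content, real bytes including a zero-byte blob, missing blob). It does not use the real library: the `PersistenceError` class, its string value for `E_MISSING_OBJECT`, `makeStubGraph()`, and `readText()` are all hypothetical stand-ins written for this illustration; only the `getContent()` name, the `Uint8Array | null` return shape, and the `code` / `context.oid` fields are taken from this change.

```javascript
// Hypothetical stand-in for the project's PersistenceError. The constant's
// string value is an assumption; callers should compare against the constant.
class PersistenceError extends Error {
  static E_MISSING_OBJECT = 'E_MISSING_OBJECT';

  constructor(message, code, options = {}) {
    super(message);
    this.name = 'PersistenceError';
    this.code = code;
    this.context = options.context ?? {};
  }
}

// Hypothetical in-memory graph: `refs` maps node IDs to blob OIDs and
// `blobs` maps OIDs to bytes. It mimics only the documented read semantics.
function makeStubGraph(blobs, refs) {
  return {
    async getContent(nodeId) {
      const oid = refs.get(nodeId);
      if (oid === undefined) {
        return null; // node has no attached content
      }
      const bytes = blobs.get(oid);
      if (bytes === undefined) {
        // Dangling reference: throw instead of returning empty bytes.
        throw new PersistenceError(
          `Missing Git object: ${oid}`,
          PersistenceError.E_MISSING_OBJECT,
          { context: { oid } },
        );
      }
      return bytes;
    },
  };
}

// Consumer-side helper (hypothetical): decode content as UTF-8 and report
// a missing blob as a distinct status instead of an empty string.
async function readText(graph, nodeId) {
  try {
    const bytes = await graph.getContent(nodeId);
    if (bytes === null) {
      return { status: 'none' };
    }
    return { status: 'ok', text: new TextDecoder().decode(bytes) };
  } catch (err) {
    if (err?.code === PersistenceError.E_MISSING_OBJECT) {
      return { status: 'missing', oid: err.context?.oid };
    }
    throw err; // anything else is not ours to swallow
  }
}
```

Matching on `err.code` rather than on the message text keeps the handler independent of the exact wording of the error, which this change also rewrote in `CasBlobAdapter`.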
4 changes: 2 additions & 2 deletions README.md
@@ -462,7 +462,7 @@ await patch.attachContent('adr:0007', '# ADR 0007\n\nDecision text...'); // asyn
await patch.commit();

// Read content back
const buffer = await graph.getContent('adr:0007'); // Buffer | null
const buffer = await graph.getContent('adr:0007'); // Uint8Array | null
const oid = await graph.getContentOid('adr:0007'); // hex SHA or null

// Edge content works the same way (assumes nodes and edge already exist)
@@ -472,7 +472,7 @@ await patch2.commit();
const edgeBuf = await graph.getEdgeContent('a', 'b', 'rel');
```

Content blobs survive `git gc` — their OIDs are embedded in the patch commit tree and checkpoint tree, keeping them reachable.
Content blobs survive `git gc` — their OIDs are embedded in the patch commit tree and checkpoint tree, keeping them reachable. If a live `_content` reference points at a missing blob anyway (for example due to manual corruption), `getContent()` / `getEdgeContent()` throw instead of silently returning empty bytes.

### Writer API

6 changes: 4 additions & 2 deletions docs/specs/CONTENT_ATTACHMENT.md
@@ -104,15 +104,17 @@ Both methods are async (they call `writeBlob()` internally) and return the build
#### Read API (WarpGraph)

```javascript
const buffer = await graph.getContent('adr:0007'); // Buffer | null
const buffer = await graph.getContent('adr:0007'); // Uint8Array | null
const oid = await graph.getContentOid('adr:0007'); // string | null

// Edge content
const edgeBuf = await graph.getEdgeContent('a', 'b', 'rel');
const edgeOid = await graph.getEdgeContentOid('a', 'b', 'rel');
```

`getContent()` returns a raw `Buffer`. Consumers wanting text call `.toString('utf8')`.
`getContent()` returns raw `Uint8Array` bytes. Consumers wanting text should decode with `new TextDecoder().decode(buffer)`.
If `_content` points at a missing blob OID, `getContent()` throws instead of silently returning empty bytes.
`getEdgeContent()` has the same byte-decoding and missing-blob semantics for edge `_content` references.

#### Constant

46 changes: 36 additions & 10 deletions scripts/hooks/pre-push
@@ -14,6 +14,32 @@ if [ -z "$ROOT" ]; then
fi
cd "$ROOT"

command_exists() {
launcher="$1"
cmd="$2"
if [ -n "$launcher" ]; then
command -v "$launcher" >/dev/null 2>&1 && [ -f "$cmd" ] && [ -r "$cmd" ]
else
command -v "$cmd" >/dev/null 2>&1
fi
}

run_tool() {
launcher="$1"
cmd="$2"
shift 2
if [ -n "$launcher" ]; then
"$launcher" "$cmd" "$@"
else
"$cmd" "$@"
fi
}

NPM_BIN="${WARP_NPM_BIN:-npm}"
NPM_LAUNCHER="${WARP_NPM_LAUNCHER:-}"
LINKCHECK_BIN="${WARP_LINKCHECK_BIN:-lychee}"
LINKCHECK_LAUNCHER="${WARP_LINKCHECK_LAUNCHER:-}"

# ── Quick mode: skip unit tests when WARP_QUICK_PUSH=1 or true ──────────
QUICK=0
if [ "$WARP_QUICK_PUSH" = "1" ] || [ "$WARP_QUICK_PUSH" = "true" ]; then
@@ -26,29 +52,29 @@ echo " IRONCLAD M9 — pre-push type firewall"
echo "══════════════════════════════════════════════════════════"

# ── Link check (optional) ──────────────────────────────────────────────────
if command -v lychee >/dev/null 2>&1; then
if command_exists "$LINKCHECK_LAUNCHER" "$LINKCHECK_BIN"; then
echo "[Gate 0] Link check..."
lychee --config .lychee.toml '**/*.md'
run_tool "$LINKCHECK_LAUNCHER" "$LINKCHECK_BIN" --config .lychee.toml '**/*.md'
else
echo "[Gate 0] Link check skipped (lychee not installed)"
fi

# ── Gates 1-7 in parallel (all are read-only) ─────────────────────────────
echo "[Gates 1-7] Running lint + typecheck + policy + consumer type test + surface validator + markdown gates..."

npm run lint &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run lint &
LINT_PID=$!
npm run typecheck &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run typecheck &
TC_PID=$!
npm run typecheck:policy &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run typecheck:policy &
POLICY_PID=$!
npm run typecheck:consumer &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run typecheck:consumer &
CONSUMER_PID=$!
npm run typecheck:surface &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run typecheck:surface &
SURFACE_PID=$!
npm run lint:md &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run lint:md &
MD_PID=$!
npm run lint:md:code &
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run lint:md:code &
MD_CODE_PID=$!

wait $LINT_PID || { echo ""; echo "BLOCKED — Gate 4 FAILED: ESLint (includes no-explicit-any, no-unsafe-*)"; exit 1; }
@@ -66,7 +92,7 @@ if [ "$QUICK" = "1" ]; then
echo "[Gate 8] Skipped (WARP_QUICK_PUSH quick mode)"
else
echo "[Gate 8] Running unit tests..."
npm run test:local || { echo ""; echo "BLOCKED — Gate 8 FAILED: Unit tests"; exit 1; }
run_tool "$NPM_LAUNCHER" "$NPM_BIN" run test:local || { echo ""; echo "BLOCKED — Gate 8 FAILED: Unit tests"; exit 1; }
fi

echo "══════════════════════════════════════════════════════════"
14 changes: 8 additions & 6 deletions src/domain/warp/query.methods.js
@@ -368,9 +368,10 @@ export async function getContentOid(nodeId) {
* @this {import('../WarpGraph.js').default}
* @param {string} nodeId - The node ID to get content for
* @returns {Promise<Uint8Array|null>} Content bytes or null
* @throws {Error} If the referenced blob OID is not in the object store
* (e.g., garbage-collected despite anchoring). Callers should handle this
* if operating on repos with aggressive GC or partial clones.
* @throws {import('../errors/PersistenceError.js').default} If the referenced
* blob OID is not in the object store (code: `E_MISSING_OBJECT`), such as
* after repository corruption, aggressive GC, or a partial clone missing the
* blob object.
*/
export async function getContent(nodeId) {
const oid = await getContentOid.call(this, nodeId);
@@ -414,9 +415,10 @@ export async function getEdgeContentOid(from, to, label) {
* @param {string} to - Target node ID
* @param {string} label - Edge label
* @returns {Promise<Uint8Array|null>} Content bytes or null
* @throws {Error} If the referenced blob OID is not in the object store
* (e.g., garbage-collected despite anchoring). Callers should handle this
* if operating on repos with aggressive GC or partial clones.
* @throws {import('../errors/PersistenceError.js').default} If the referenced
* blob OID is not in the object store (code: `E_MISSING_OBJECT`), such as
* after repository corruption, aggressive GC, or a partial clone missing the
* blob object.
*/
export async function getEdgeContent(from, to, label) {
const oid = await getEdgeContentOid.call(this, from, to, label);
7 changes: 5 additions & 2 deletions src/infrastructure/adapters/CasBlobAdapter.js
@@ -13,6 +13,7 @@
*/

import BlobStoragePort from '../../ports/BlobStoragePort.js';
import PersistenceError from '../../domain/errors/PersistenceError.js';
import { createLazyCas } from './lazyCasInit.js';
import LoggerObservabilityBridge from './LoggerObservabilityBridge.js';
import { Readable } from 'node:stream';
@@ -148,8 +149,10 @@ export default class CasBlobAdapter extends BlobStoragePort {
}
const blob = await this._persistence.readBlob(oid);
if (blob === null || blob === undefined) {
throw new Error(
`Blob not found: OID "${oid}" is neither a CAS manifest nor a readable Git blob`,
throw new PersistenceError(
`Missing Git object: ${oid}`,
PersistenceError.E_MISSING_OBJECT,
{ context: { oid } },
);
}
return blob;
50 changes: 49 additions & 1 deletion src/infrastructure/adapters/GitGraphAdapter.js
@@ -76,7 +76,7 @@ const TRANSIENT_ERROR_PATTERNS = [
];

/**
* @typedef {Error & { details?: { stderr?: string, code?: number }, exitCode?: number, code?: number }} GitError
* @typedef {Error & { details?: { stderr?: string, stdout?: string, code?: number }, exitCode?: number, code?: number }} GitError
*/

/**
@@ -185,6 +185,19 @@ function errorSearchText(err) {
return `${message} ${stderr}`;
}

/**
* Returns stderr/stdout diagnostic text from a Git error, ignoring wrapper
* messages like "Git command failed with code 1" that do not carry object
* lookup semantics on their own.
* @param {GitError} err
* @returns {string}
*/
function gitDiagnosticText(err) {
const stderr = String(err?.details?.stderr || '');
const stdout = String(err?.details?.stdout || '');
return `${stderr} ${stdout}`.trim().toLowerCase();
}

/**
* Checks if a Git error indicates a missing object (commit, blob, tree).
* Covers exit code 128 with object-related stderr patterns.
@@ -358,6 +371,35 @@ export default class GitGraphAdapter extends GraphPersistencePort {
return await retry(() => this.plumbing.execute(options), this._retryOptions);
}

/**
* Distinguishes a legitimate zero-byte blob from a missing object when a
* blob stream returns no bytes. Some plumbing implementations surface the
* missing object case as an empty collect result instead of throwing.
*
* @param {string} oid
* @returns {Promise<void>}
* @private
*/
async _assertBlobExistsForEmptyRead(oid) {
try {
await this._executeWithRetry({ args: ['cat-file', '-e', oid] });
} catch (err) {
const gitErr = /** @type {GitError} */ (err);
const wrapped = wrapGitError(gitErr, { oid });
const exitCode = getExitCode(gitErr);
const diagnostics = gitDiagnosticText(gitErr);
const ambiguousMissingObject = exitCode === 1 && diagnostics === '';
if (wrapped === gitErr && ambiguousMissingObject) {
throw new PersistenceError(
`Missing Git object: ${oid}`,
PersistenceError.E_MISSING_OBJECT,
{ cause: /** @type {Error} */ (gitErr), context: { oid } },
);
}
throw wrapped;
}
}

/**
* The well-known SHA for Git's empty tree object.
* @type {string}
@@ -651,6 +693,12 @@ export default class GitGraphAdapter extends GraphPersistencePort {
args: ['cat-file', 'blob', oid]
});
const raw = await stream.collect({ asString: false });
// Some executeStream implementations can surface a missing object as an
// empty collect result instead of throwing. Distinguish that from a real
// zero-byte blob with an explicit existence check.
if (raw.length === 0) {
await this._assertBlobExistsForEmptyRead(oid);
}
// Return as-is — plumbing returns Buffer (which IS-A Uint8Array)
return /** @type {Uint8Array} */ (raw);
} catch (err) {
41 changes: 41 additions & 0 deletions test/integration/api/content-attachment.test.js
@@ -1,6 +1,7 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { execSync } from 'node:child_process';
import { createTestRepo } from './helpers/setup.js';
import PersistenceError from '../../../src/domain/errors/PersistenceError.js';

describe('API: Content Attachment', () => {
/** @type {any} */
@@ -223,4 +224,44 @@ describe('API: Content Attachment', () => {
expect(content).toBeInstanceOf(Uint8Array);
expect(content).toEqual(binary);
});

it('throws when _content points at a missing blob OID', async () => {
const graph = await repo.openGraph('test', 'alice');

const patch = await graph.createPatch();
patch.addNode('doc:1');
await patch.attachContent('doc:1', 'hello');
await patch.commit();

await graph.materialize();

const patch2 = await graph.createPatch();
patch2.setProperty('doc:1', '_content', 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef');
await patch2.commit();

await graph.materialize();

await expect(graph.getContent('doc:1'))
.rejects.toMatchObject({ code: PersistenceError.E_MISSING_OBJECT });
});

it('throws when edge _content points at a missing blob OID', async () => {
const graph = await repo.openGraph('test', 'alice');

const patch = await graph.createPatch();
patch.addNode('a').addNode('b').addEdge('a', 'b', 'rel');
await patch.attachEdgeContent('a', 'b', 'rel', 'edge payload');
await patch.commit();

await graph.materialize();

const patch2 = await graph.createPatch();
patch2.setEdgeProperty('a', 'b', 'rel', '_content', 'deadbeefdeadbeefdeadbeefdeadbeefdeadbeef');
await patch2.commit();

await graph.materialize();

await expect(graph.getEdgeContent('a', 'b', 'rel'))
.rejects.toMatchObject({ code: PersistenceError.E_MISSING_OBJECT });
});
});
52 changes: 52 additions & 0 deletions test/unit/domain/WarpGraph.content.test.js
@@ -4,6 +4,7 @@ import { createEmptyStateV5, encodeEdgeKey, encodeEdgePropKey } from '../../../s
import { orsetAdd } from '../../../src/domain/crdt/ORSet.js';
import { createDot } from '../../../src/domain/crdt/Dot.js';
import { encodePropKey } from '../../../src/domain/services/KeyCodec.js';
import PersistenceError from '../../../src/domain/errors/PersistenceError.js';

function setupGraphState(/** @type {any} */ graph, /** @type {any} */ seedFn) {
const state = createEmptyStateV5();
@@ -156,6 +157,29 @@ describe('WarpGraph content attachment (query methods)', () => {
expect(content).toEqual(rawBuf);
expect(mockPersistence.readBlob).toHaveBeenCalledWith('raw-oid');
});

it('preserves E_MISSING_OBJECT from blobStorage.retrieve()', async () => {
const blobStorage = {
store: vi.fn(),
retrieve: vi.fn().mockRejectedValue(
new PersistenceError(
'Missing Git object: cas-tree-oid',
PersistenceError.E_MISSING_OBJECT,
{ context: { oid: 'cas-tree-oid' } },
),
),
};
/** @type {any} */ (graph)._blobStorage = blobStorage;

setupGraphState(graph, (/** @type {any} */ state) => {
addNode(state, 'doc:1', 1);
const propKey = encodePropKey('doc:1', '_content');
state.prop.set(propKey, { eventId: null, value: 'cas-tree-oid' });
});

await expect(graph.getContent('doc:1'))
.rejects.toMatchObject({ code: PersistenceError.E_MISSING_OBJECT });
});
});

describe('getEdgeContent() with blobStorage', () => {
@@ -181,6 +205,34 @@ describe('WarpGraph content attachment (query methods)', () => {
expect(blobStorage.retrieve).toHaveBeenCalledWith('cas-edge-oid');
expect(mockPersistence.readBlob).not.toHaveBeenCalled();
});

it('preserves E_MISSING_OBJECT from blobStorage.retrieve()', async () => {
const blobStorage = {
store: vi.fn(),
retrieve: vi.fn().mockRejectedValue(
new PersistenceError(
'Missing Git object: cas-edge-oid',
PersistenceError.E_MISSING_OBJECT,
{ context: { oid: 'cas-edge-oid' } },
),
),
};
/** @type {any} */ (graph)._blobStorage = blobStorage;

setupGraphState(graph, (/** @type {any} */ state) => {
addNode(state, 'a', 1);
addNode(state, 'b', 2);
addEdge(state, 'a', 'b', 'rel', 3);
const propKey = encodeEdgePropKey('a', 'b', 'rel', '_content');
state.prop.set(propKey, {
eventId: { lamport: 2, writerId: 'w1', patchSha: 'aabbccdd', opIndex: 0 },
value: 'cas-edge-oid',
});
});

await expect(graph.getEdgeContent('a', 'b', 'rel'))
.rejects.toMatchObject({ code: PersistenceError.E_MISSING_OBJECT });
});
});

describe('getEdgeContentOid()', () => {