WarpGraph is a graph database built on Git. It uses a patch-based CRDT model where Git commits represent patch objects containing graph operations, with commit messages encoding patch metadata and parent relationships linking patch history.
This architecture enables:
- Content-addressable storage with built-in deduplication
- Git's proven durability and integrity guarantees
- Standard Git tooling compatibility
- Distributed replication via git push/pull
The codebase follows hexagonal architecture to isolate domain logic from infrastructure concerns:
- Ports define abstract interfaces for external dependencies
- Adapters implement ports for specific technologies (Git, console, etc.)
- Domain services contain pure business logic with injected dependencies
This enables:
- Easy testing via mock adapters
- Swappable infrastructure (different Git implementations, logging backends)
- Clear separation of concerns
git-warp now includes a thin Time Travel Debugger (TTD) command family in the main CLI, but the architectural boundary remains strict:
- Domain/core owns substrate analyzers, receipts, provenance, and materialization facts.
- CLI adapters expose those facts through `seek`, `debug coordinate`, `debug timeline`, `debug conflicts`, `debug provenance`, and `debug receipts`.
- Presenters format the results for text, JSON, or NDJSON.
- Human-facing debugger applications do not live in git-warp.
This keeps git-warp as substrate plus thin inspection tooling rather than turning the package into a TUI/web application shell. See docs/TTD.md for the dedicated debugger architecture note.
Read-side worldline awareness lives here, not in the reducer:
- supported debug topics can inspect either the live frontier or a pinned working-set patch universe
- analyzers and materializers decide which patches are visible
- `reduceV5` and generic join semantics remain deterministic and worldline-blind
git-warp now also includes a separate working-set substrate family. It is intentionally not part of TTD because it creates durable descriptor refs.
- Domain/core owns explicit coordinate materialization and working-set descriptors.
- CLI adapters expose those capabilities through `working-set create`, `working-set braid`, `working-set list`, `working-set show`, `working-set materialize`, `working-set compare`, and `working-set drop`.
- Presenters render descriptor and materialization summaries without inventing higher-level product semantics.
- Descriptors pin coordinates; caches do not become truth.
The active substrate model is still narrow and substrate-first:
- a working set pins an explicit frontier snapshot plus an optional Lamport ceiling
- the target working set keeps its own overlay patch-log ref and can mark that overlay writable or read-only
- zero or more braided read-only overlays can be pinned onto the same base observation
- target-owned braid refs keep those pinned support-overlay heads reachable as durable substrate facts
- read-side helpers such as `getWorkingSetPatches()`, `patchesForWorkingSet()`, `projectStateV5()`, `createStateReaderV5()`, `compareVisibleStateV5()`, `compareWorkingSet()`, `compareCoordinates()`, and working-set-aware conflict analysis operate on the visible `base + braided overlays + active overlay` patch universe
- comparison helpers report substrate facts such as visible patch divergence plus node/edge/property deltas instead of higher-layer review or governance meaning
- materialized state is derived/cache only
- no Git worktree assumption leaks into the API
Composition terminology is fixed as braid:
- git-warp uses braid language for co-present working-set composition
- braid changes the visible patch universe, not reducer rules
- braid is not a synonym for merge or rebase
This gives higher layers an honest substrate for future worldline/fork behavior without teaching git-warp about XYPH governance or UI concepts. See docs/WORKING_SETS.md for the dedicated working-set note.
The domain layer models the graph database concepts:
- `GraphNode` - Immutable value object representing a node
- `WarpGraph` - Node CRUD operations (the main API class)
- `GraphTraversal` - Graph algorithms (11 algorithms: BFS, DFS, shortest path, A*, topological sort, etc.)
- `BitmapIndexBuilder` / `BitmapIndexReader` - High-performance indexing
- `MaterializedViewService` - Orchestrate materialized view lifecycle
- `NeighborProviderPort` - Abstract neighbor lookup interface
GraphNode instances are frozen after construction. The parents array is also frozen to prevent accidental mutation. This aligns with Git's immutable commit model.
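The freeze pattern can be sketched as follows (a minimal illustration of the idea; the real `GraphNode` carries additional fields and validation):

```javascript
// Minimal sketch of an immutable node value object: copy the parents array
// defensively, freeze it, then freeze the instance itself.
class GraphNode {
  constructor({ sha, message, parents = [] }) {
    this.sha = sha;
    this.message = message;
    this.parents = Object.freeze([...parents]); // defensive copy, then freeze
    Object.freeze(this);                        // no further mutation possible
  }
}

const node = new GraphNode({ sha: 'abc123', message: 'hello', parents: ['def456'] });
// node.sha = 'other';     // ignored (TypeError in strict mode)
// node.parents.push('x'); // throws TypeError
```

Freezing both the object and its nested array matters: `Object.freeze` is shallow, so without the second freeze the `parents` array would remain mutable.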
All services accept their dependencies via constructor options:
- Persistence adapters
- Loggers
- Clocks
- Parsers
This enables testing with mocks and flexible runtime configuration.
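The pattern looks roughly like this (`CountService` and its fakes are hypothetical names used only for illustration):

```javascript
// Constructor-injection sketch: a service receives its collaborators via an
// options object, so tests can substitute trivial in-memory fakes.
class CountService {
  constructor({ persistence, logger, clock }) {
    this.persistence = persistence;
    this.logger = logger;
    this.clock = clock;
  }

  async countNodes(ref) {
    const start = this.clock.now();
    const count = await this.persistence.countNodes(ref);
    this.logger.info('counted nodes', { ref, ms: this.clock.now() - start });
    return count;
  }
}

// Every port is mocked with a plain object -- no framework required:
const logs = [];
const service = new CountService({
  persistence: { countNodes: async () => 42 },
  logger: { info: (msg, ctx) => logs.push({ msg, ctx }) },
  clock: { now: () => 0 },
});
```

Because each dependency is just an object with the right method shape, swapping the Git adapter for a fake, or the console logger for a silent one, requires no changes to the service itself.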
+--------------------------------------------------------------+
|                          WarpGraph                           | <- Main API
|                        (WarpGraph.js)                        |
+--------------------------------------------------------------+
| Supporting Services |
| +---------------+ +--------------------+ |
| | IndexRebuild | | GraphTraversal | |
| | Service | | | |
| +---------------+ +--------------------+ |
| +-------------+ +---------------+ +--------------------+ |
| | HealthCheck | | BitmapIndex | | BitmapIndex | |
| | Service | | Builder | | Reader | |
| +-------------+ +---------------+ +--------------------+ |
| +---------------+ +--------------------+ |
| | GitLogParser | | Streaming | |
| | | | BitmapIndexBuilder | |
| +---------------+ +--------------------+ |
| +---------------------+ +----------------------------+ |
| | MaterializedView | | IncrementalIndexUpdater | |
| | Service | | | |
| +---------------------+ +----------------------------+ |
| +---------------------+ +----------------------------+ |
| | LogicalIndex | | LogicalIndex | |
| | BuildService | | Reader | |
| +---------------------+ +----------------------------+ |
+--------------------------------------------------------------+
| Ports |
| +-------------------+ +---------------------------+ |
| | GraphPersistence | | IndexStoragePort | |
| | Port | | | |
| +-------------------+ +---------------------------+ |
| +-------------------+ +---------------------------+ |
| | LoggerPort | | ClockPort | |
| +-------------------+ +---------------------------+ |
| +-------------------+ |
| | NeighborProvider | |
| | Port | |
| +-------------------+ |
+--------------------------------------------------------------+
| Adapters |
| +-------------------+ +---------------------------+ |
| | GitGraphAdapter | | ConsoleLogger | |
| | | | NoOpLogger | |
| +-------------------+ +---------------------------+ |
| +-------------------+ |
| | PerformanceClock | |
| | GlobalClock | |
| +-------------------+ |
+--------------------------------------------------------------+
src/
+-- domain/
| +-- entities/ # Immutable domain objects
| | +-- GraphNode.js # Node value object (sha, message, parents)
| +-- services/ # Business logic
| | +-- WarpGraph.js # Main API - Node CRUD operations
| | +-- IndexRebuildService.js # Index orchestration
| | +-- BitmapIndexBuilder.js # In-memory index construction
| | +-- BitmapIndexReader.js # O(1) index queries
| | +-- StreamingBitmapIndexBuilder.js # Memory-bounded building
| | +-- LogicalTraversal.js # Graph algorithms (deprecated facade)
| | +-- GraphTraversal.js # Unified traversal engine (11 algorithms)
| | +-- MaterializedViewService.js # Materialized view lifecycle
| | +-- LogicalIndexBuildService.js # Build bitmap indexes from state
| | +-- LogicalIndexReader.js # Hydrate indexes from tree
| | +-- IncrementalIndexUpdater.js # O(diff) index updates
| | +-- AdjacencyNeighborProvider.js # In-memory neighbor provider
| | +-- BitmapNeighborProvider.js # Bitmap-backed neighbor provider
| | +-- HealthCheckService.js # K8s-style probes
| | +-- GitLogParser.js # Binary stream parsing
| | +-- WorkingSetService.js # Pinned coordinates + working-set overlays
| +-- errors/ # Domain-specific errors
| | +-- IndexError.js
| | +-- ShardLoadError.js
| | +-- ShardCorruptionError.js
| | +-- ShardValidationError.js
| | +-- TraversalError.js
| | +-- OperationAbortedError.js
| | +-- EmptyMessageError.js
| | +-- WorkingSetError.js
| +-- utils/ # Domain utilities
| +-- LRUCache.js # Shard caching
| +-- MinHeap.js # Priority queue for A*
| +-- CachedValue.js # TTL-based caching
| +-- cancellation.js # AbortSignal utilities
| +-- parseWorkingSetBlob.js # Working-set descriptor validation
+-- infrastructure/
| +-- adapters/ # Port implementations
| +-- GitGraphAdapter.js # Git operations via @git-stunts/plumbing
| +-- ConsoleLogger.js # Structured JSON logging
| +-- NoOpLogger.js # Silent logger for tests
| +-- PerformanceClockAdapter.js # Node.js timing
| +-- GlobalClockAdapter.js # Bun/Deno/Browser timing
+-- ports/ # Abstract interfaces
+-- GraphPersistencePort.js # Git commit/ref operations
+-- IndexStoragePort.js # Blob/tree storage
+-- LoggerPort.js # Structured logging contract
+-- ClockPort.js # Timing abstraction
+-- NeighborProviderPort.js # Abstract neighbor lookup interface
+-- SeekCachePort.js # Persistent seek cache interface
The main entry point (WarpGraph.js) provides:
- Direct graph database API
- `open()` factory for managed mode with automatic durability
- Batch API for efficient bulk writes
- Health check endpoints (K8s liveness/readiness)
- Index management (rebuild, load, save)
Core node operations:
- `createNode()` - Create a single node
- `createNodes()` - Bulk creation with placeholder references (`$0`, `$1`)
- `readNode()` / `getNode()` - Retrieve node data
- `hasNode()` - Existence check
- `iterateNodes()` - Streaming iterator for large graphs
- `countNodes()` - Efficient count via `git rev-list --count`
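The placeholder mechanism for bulk creation can be illustrated with a toy resolver (this shows the idea only; it is not git-warp's implementation, and `resolveBatch` is a hypothetical name):

```javascript
// Toy sketch of $0/$1 placeholder resolution: each spec's parents may
// reference earlier nodes in the same batch by position.
function resolveBatch(specs, createOne) {
  const shas = [];
  for (const spec of specs) {
    const parents = (spec.parents ?? []).map((p) => {
      const m = /^\$(\d+)$/.exec(p);
      if (!m) return p;                         // already a concrete SHA
      const idx = Number(m[1]);
      if (idx >= shas.length) throw new Error(`forward reference: ${p}`);
      return shas[idx];                         // substitute the earlier result
    });
    shas.push(createOne(spec.message, parents));
  }
  return shas;
}

// Fake createOne that mints sequential ids instead of committing to Git:
let seq = 0;
const created = [];
const mint = (message, parents) => {
  const sha = `sha${seq++}`;
  created.push({ sha, message, parents });
  return sha;
};
const ids = resolveBatch(
  [{ message: 'root' }, { message: 'child', parents: ['$0'] }],
  mint
);
```

Rejecting forward references keeps resolution single-pass: a spec can only point at nodes created earlier in the batch.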
Message validation enforces size limits (default 1MB) and non-empty content.
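A sketch of that rule, assuming a UTF-8 byte count against the 1 MB default (the real validation lives inside WarpGraph and throws typed domain errors such as `EmptyMessageError`):

```javascript
// Illustrative message validation: non-empty content plus a configurable
// byte-size ceiling (default 1 MB, matching the documented default).
const DEFAULT_MAX_BYTES = 1024 * 1024;

function validateMessage(message, maxBytes = DEFAULT_MAX_BYTES) {
  if (typeof message !== 'string' || message.length === 0) {
    throw new Error('empty message rejected');
  }
  const bytes = Buffer.byteLength(message, 'utf8'); // size in bytes, not chars
  if (bytes > maxBytes) {
    throw new Error(`message exceeds ${maxBytes} bytes (got ${bytes})`);
  }
  return bytes;
}
```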
Orchestrates index creation:
- In-memory mode: Fast, O(N) memory, single serialization pass
- Streaming mode: Memory-bounded, flushes to storage periodically
Supports cancellation via AbortSignal and progress callbacks.
Unified traversal engine with 11 algorithms, operating over a NeighborProviderPort abstraction (in-memory via AdjacencyNeighborProvider or bitmap-backed via BitmapNeighborProvider):
- `bfs()` / `dfs()` - Deterministic traversals returning ordered node arrays
- `shortestPath()` - Unweighted shortest path (BFS-based)
- `weightedShortestPath()` - Dijkstra with custom `weightFn` or `nodeWeightFn`
- `aStarSearch()` - A* with heuristic guidance
- `bidirectionalAStar()` - Bidirectional A*
- `topologicalSort()` - Kahn's algorithm with cycle detection
- `weightedLongestPath()` - Longest path on DAGs (critical path)
- `connectedComponent()` - All reachable nodes
- `isReachable()` - Fast reachability check
- `commonAncestors()` - Multi-source ancestor intersection
All traversals support:
- `maxNodes` / `maxDepth` limits
- Cancellation via `AbortSignal`
- Direction control (forward/reverse)
- `nodeWeightFn` - Per-node weight function for weighted algorithms
Note: `LogicalTraversal` remains as a deprecated facade that delegates to `GraphTraversal`.
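To show how traversal decouples from storage through the provider abstraction, here is a simplified, synchronous BFS over a `getNeighbors`-shaped provider (a sketch of the pattern, not the library's `bfs()`):

```javascript
// BFS over a provider exposing getNeighbors(nodeId, direction), mirroring
// the NeighborProviderPort shape. Simplified: synchronous, no labels filter.
function bfs(provider, startId, { direction = 'forward', maxNodes = Infinity } = {}) {
  const visited = new Set([startId]);
  const order = [];
  const queue = [startId];
  while (queue.length > 0 && order.length < maxNodes) {
    const id = queue.shift();
    order.push(id);
    for (const neighbor of provider.getNeighbors(id, direction)) {
      if (!visited.has(neighbor)) {
        visited.add(neighbor);
        queue.push(neighbor);
      }
    }
  }
  return order; // deterministic: discovery order
}

// In-memory adjacency fake standing in for AdjacencyNeighborProvider:
const edges = { a: ['b', 'c'], b: ['d'], c: ['d'], d: [] };
const provider = { getNeighbors: (id) => edges[id] ?? [] };
const order = bfs(provider, 'a'); // ['a', 'b', 'c', 'd']
```

Because the algorithm only sees `getNeighbors`, the same traversal runs unchanged over in-memory adjacency lists or bitmap-backed lookups.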
Roaring bitmap-based indexes for O(1) neighbor lookups:
Builder:
- `registerNode()` - Assign numeric ID to SHA
- `addEdge()` - Record parent/child relationships
- `serialize()` - Output sharded JSON structure
Reader:
- `setup()` - Configure with shard OID mappings
- `getParents()` / `getChildren()` - O(1) lookups
- Lazy loading with LRU cache for bounded memory
- Checksum validation with strict/non-strict modes
Memory-bounded variant of BitmapIndexBuilder:
- Flushes bitmap data to storage when threshold exceeded
- SHA-to-ID mappings remain in memory (required for consistency)
- Merges chunks at finalization via bitmap OR operations
Orchestrates the full materialized view lifecycle — build, persist, and load:
- Coordinates `JoinReducer` (patch replay), `LogicalIndexBuildService` (bitmap index construction), and `CheckpointService` (state snapshots)
- Supports checkpoint schema 4 with embedded bitmap indexes
- `IncrementalIndexUpdater` enables O(diff) bitmap index updates when only a few patches have arrived since the last checkpoint
- Lifecycle: `build()` → `persistIndexTree()` → `loadFromOids()`, with incremental `applyDiff()` for hot paths and integrity checks via `verifyIndex()`
Abstract interface for neighbor lookups, decoupling traversal algorithms from storage:
- `getNeighbors(nodeId, direction, options?)` — returns neighbor edges; `options.labels` (a `Set`) filters by label
- Two implementations:
  - `AdjacencyNeighborProvider` — builds adjacency lists from materialized state (in-memory, O(E) build)
  - `BitmapNeighborProvider` — delegates to `LogicalIndexReader` for O(1) bitmap lookups
Git operations contract:
- `commitNode()` - Create commit pointing to empty tree
- `showNode()` / `getNodeInfo()` - Retrieve commit data
- `logNodesStream()` - Stream commit history
- `updateRef()` / `readRef()` / `deleteRef()` - Ref management
- `isAncestor()` - Ancestry testing for fast-forward detection
- `countNodes()` - Efficient count
- `ping()` - Health check
Also includes blob/tree operations for index storage.
Index persistence contract:
- `writeBlob()` / `readBlob()` - Blob I/O
- `writeTree()` / `readTreeOids()` - Tree I/O
- `updateRef()` / `readRef()` - Index ref management
Structured logging contract:
- `debug()`, `info()`, `warn()`, `error()` - Log levels
- `child()` - Create scoped logger with inherited context
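A logger conforming to this contract, with `child()` merging inherited context into every entry, might look like the following sketch (illustrative only; this is not the library's `ConsoleLogger`):

```javascript
// LoggerPort-style sketch: each child logger carries the parent's context
// fields and merges per-call fields into a structured JSON line.
function createLogger(context = {}, sink = console) {
  const emit = (level, msg, fields = {}) =>
    sink.log(JSON.stringify({ level, msg, ...context, ...fields }));
  return {
    debug: (msg, f) => emit('debug', msg, f),
    info: (msg, f) => emit('info', msg, f),
    warn: (msg, f) => emit('warn', msg, f),
    error: (msg, f) => emit('error', msg, f),
    child: (extra) => createLogger({ ...context, ...extra }, sink),
  };
}

// Capture output in an array instead of stdout:
const lines = [];
const root = createLogger({ app: 'warp' }, { log: (l) => lines.push(l) });
root.child({ component: 'index' }).info('rebuilt', { shards: 256 });
// lines[0] is JSON containing level, msg, app, component, and shards
```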
Timing abstraction:
- `now()` - High-resolution timestamp (ms)
- `timestamp()` - ISO 8601 wall-clock time
Neighbor lookup abstraction:
- `getNeighbors(nodeId, direction, options?)` — returns neighbor edges; `options.labels` (a `Set`) filters by label
- Implementations: `AdjacencyNeighborProvider` (in-memory), `BitmapNeighborProvider` (bitmap-backed)
Persistent seek-cache abstraction used by `materializeAt` and cursor-bound materialization:
- `get(key)` — returns `{ buffer, indexTreeOid? } | null` for a cached state snapshot
- `set(key, buffer, { indexTreeOid? })` — stores a snapshot with optional index-tree metadata
- `delete(key)` / `clear()` — invalidates stale seek-cache entries
Implementations must preserve the optional indexTreeOid metadata so index hydration can skip full rebuilds on cache hits.
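A minimal in-memory implementation of the contract, useful for tests, could look like this (a sketch honoring the shape above, including round-tripping the optional `indexTreeOid`):

```javascript
// In-memory SeekCachePort sketch: async methods, null on miss, and the
// optional indexTreeOid metadata preserved alongside the buffer.
class InMemorySeekCache {
  #entries = new Map();

  async get(key) {
    return this.#entries.get(key) ?? null;
  }

  async set(key, buffer, { indexTreeOid } = {}) {
    this.#entries.set(key, { buffer, indexTreeOid });
  }

  async delete(key) {
    this.#entries.delete(key);
  }

  async clear() {
    this.#entries.clear();
  }
}

const cache = new InMemorySeekCache();
```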
Implements both GraphPersistencePort and IndexStoragePort:
- Uses `@git-stunts/plumbing` for git command execution
- Retry logic with exponential backoff for transient errors
- Input validation to prevent command injection
- NUL-terminated output parsing for reliability
- `ConsoleLogger`: Structured JSON output with configurable levels
- `NoOpLogger`: Zero-overhead silent logger for tests
- `PerformanceClockAdapter`: Uses Node.js `perf_hooks`
- `GlobalClockAdapter`: Uses global `performance` for Bun/Deno/browsers
createNode() -> WarpGraph.createNode()
-> persistence.commitNode()
-> persistence.updateRef()
getParents() -> BitmapIndexReader._getEdges()
-> _getOrLoadShard() (lazy load)
-> storage.readBlob()
-> Validate checksum
-> RoaringBitmap32.deserialize()
-> Map IDs to SHAs
rebuildIndex() -> IndexRebuildService.rebuild()
-> WarpGraph.iterateNodes()
-> BitmapIndexBuilder.registerNode() / addEdge()
-> builder.serialize()
-> storage.writeBlob() (per shard, parallel)
-> storage.writeTree()
All WarpGraph nodes are Git commits pointing to the empty tree:
SHA: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
This is the well-known SHA of an empty Git tree, automatically available in every repository.
How it works:
- Data: Stored in commit message (arbitrary payload up to 1MB default)
- Edges: Commit parent relationships (directed, multi-parent supported)
- Identity: Commit SHA (content-addressable)
Benefits:
- Introduces no files into the repository working tree
- Content-addressable with automatic deduplication
- Git's proven durability and integrity (SHA verification)
- Standard tooling compatibility (`git log`, `git show`, etc.)
- Distributed replication via `git push` / `git pull`
The bitmap index enables O(1) neighbor lookups. It is stored as a Git tree with sharded JSON blobs:
index-tree/
+-- meta_00.json # SHA->ID mappings for prefix "00"
+-- meta_01.json # SHA->ID mappings for prefix "01"
+-- ...
+-- meta_ff.json # SHA->ID mappings for prefix "ff"
+-- shards_fwd_00.json # Forward edges (parent->children) for prefix "00"
+-- shards_rev_00.json # Reverse edges (child->parents) for prefix "00"
+-- ...
+-- shards_fwd_ff.json
+-- shards_rev_ff.json
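The two-hex-character prefix routing implied by this layout can be sketched as follows (`shardNames` is a hypothetical helper for illustration, not a git-warp export):

```javascript
// Route a node SHA to its meta/forward/reverse shard files by the first
// byte (two hex characters) of the SHA, giving up to 256 shards per kind.
function shardNames(sha) {
  const prefix = sha.slice(0, 2).toLowerCase();
  return {
    meta: `meta_${prefix}.json`,
    fwd: `shards_fwd_${prefix}.json`,
    rev: `shards_rev_${prefix}.json`,
  };
}

const names = shardNames('00a1b2c3d4e5f6789');
// names.meta === 'meta_00.json'
```

Sharding by prefix bounds the size of any single blob and lets the reader lazily load only the shards a query actually touches.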
Shard envelope format:

```json
{
  "version": 2,
  "checksum": "sha256-hex-of-data",
  "data": { ... actual content ... }
}
```

Meta shard content:

```json
{
  "00a1b2c3d4e5f6789...": 0,
  "00d4e5f6a7b8c9012...": 42
}
```

Edge shard content:

```json
{
  "00a1b2c3d4e5f6789...": "OjAAAAEAAAAAAAEAEAAAABAAAA=="
}
```

Values are base64-encoded Roaring bitmaps containing numeric IDs of connected nodes.
This section defines the official durability contract for WarpGraph and the mechanisms used to enforce it.
A write is durable if and only if it becomes reachable from the graph ref.
Git garbage collection (GC) prunes commits that are not reachable from any ref. Since WarpGraph patches are Git commits, without careful ref management, data can be silently deleted. WarpGraph provides mechanisms to ensure writes remain reachable.
In managed mode, WarpGraph guarantees durability for all writes.
- Every write operation updates the graph ref (or creates an anchor commit).
- Reachability from the ref is maintained automatically.
- Users do not need to manage refs or call sync manually.
In manual mode, WarpGraph provides no automatic ref management.
- Writes create commits but do not update refs.
- User is responsible for calling `sync()` to persist reachability.
- User may manage refs directly via Git commands.
- Warning: Uncommitted writes are subject to garbage collection.
Anchor commits are the mechanism used to maintain reachability for disconnected graphs (e.g., disjoint roots or imported history).
In a linear history, every new commit points to the previous tip, maintaining a single chain reachable from the ref. However, graph operations can create disconnected commits:
- Creating a new root node (no parents).
- Merging unrelated graph histories.
- Importing commits from external sources.
If the ref simply moves to the new commit, the old history becomes unreachable and will be GC'd.
An anchor commit is a special infrastructure commit that:
- Has multiple parents: The previous ref tip AND the new commit(s).
- Has an Empty Tree (like all WarpGraph nodes).
- Has a specific Payload/Trailer:
  - v4+ format: Trailer-typed (`eg-kind: anchor`, `eg-schema: 1`).
  - Legacy v3: JSON `{"_type":"anchor"}`.
- Is filtered out during graph traversal (invisible to domain logic).
1. Chained Anchors (per-write sync)
Used by autoSync: 'onWrite'.
- Each disconnected write creates one anchor with 2 parents.
- Pro: Simple, stateless, works for incremental writes.
- Con: O(N) anchor commits for N disconnected tips.
2. Octopus Anchors (batch mode)
Used by Batch.commit().
- Single anchor with N parents for all tips.
- Pro: O(1) anchor overhead.
- Con: Requires knowing all tips upfront.
3. Hybrid
WarpGraph defaults to chained anchors for individual writes but uses octopus anchors for batch operations. compactAnchors() can be called to rewrite chains into octopus anchors for cleanup.
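The cost difference between the two strategies can be modeled with plain objects, no Git required (a toy model of the commit topology, not the library's sync code):

```javascript
// Toy model of anchoring strategies. Chained: one 2-parent anchor per
// disconnected tip, each chaining off the previous anchor. Octopus: a
// single anchor whose parents are the previous tip plus all new tips.
function chainedAnchors(prevTip, tips) {
  const anchors = [];
  let head = prevTip;
  for (const tip of tips) {
    const anchor = { parents: [head, tip] }; // one anchor per write
    anchors.push(anchor);
    head = anchor;
  }
  return anchors; // O(N) anchors for N tips
}

function octopusAnchor(prevTip, tips) {
  return { parents: [prevTip, ...tips] };    // O(1) anchors, N+1 parents
}

const tips = ['t1', 't2', 't3'];
const chain = chainedAnchors('prev', tips);  // 3 anchors, 2 parents each
const octopus = octopusAnchor('prev', tips); // 1 anchor with 4 parents
```

Both models keep every tip reachable from the final head; they differ only in how many infrastructure commits it takes, which is exactly the trade-off `compactAnchors()` exploits when rewriting chains.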
- In managed mode, any successfully returned write is durable.
- Anchor commits preserve all previously reachable history.
- The sync algorithm is idempotent for the same inputs.
- Graph semantics are unaffected by anchor commits (they are transparent to traversal).
In V7, logical graph traversal uses the Bitmap Index (built from materialized state) and is O(1) regardless of the underlying commit topology. Anchor commits do not appear in the logical graph.
However, anchors do impact Materialization (scanning Git history to build state) and Git Storage (number of objects).
| Metric | Chained Anchors | Octopus Anchors |
|---|---|---|
| Logical Traversal | O(1) (Index) | O(1) (Index) |
| Materialization | N patches + O(N) anchors | N patches + O(1) anchors |
| Git Object Count | Higher | Lower |
Chained Anchors (linear history enforcement) add overhead to the git rev-list walk required during materialization.
Octopus Anchors (used by syncCoverage) are more efficient for bulk operations, keeping the commit depth shallow.
In V7 Multi-Writer mode:
- Each writer maintains its own ref (`refs/.../writers/<id>`), pointing to a chain of Patch Commits.
- Durability is ensured because writes update these refs.
- Global Reachability (optional) is maintained via `syncCoverage()`, which creates an Octopus Anchor commit pointed to by `refs/.../coverage/head`. This anchor has all writer tips as parents, ensuring they aren't GC'd even if individual writer refs are deleted (e.g. during a clone).
| Operation | Complexity | Notes |
|---|---|---|
| Write (createNode) | O(1) | Append-only commit |
| Read (readNode) | O(1) | Direct SHA lookup |
| Unindexed traversal | O(N) | Linear scan via git log |
| Indexed lookup | O(1) | Bitmap query + ID resolution |
| Index rebuild | O(N) | One-time scan |
| Index load | O(1) initial | Lazy shard loading |
Memory characteristics:
| Scenario | Approximate Memory |
|---|---|
| Cold start (no index) | Near-zero |
| Single shard loaded | 0.5-2 MB per prefix |
| Full index (1M nodes) | 150-200 MB |
Domain-specific error types enable precise error handling:
- `ShardLoadError` - Storage I/O failure
- `ShardCorruptionError` - Invalid shard format
- `ShardValidationError` - Version/checksum mismatch
- `TraversalError` - Algorithm failures (no path, cycle detected)
- `OperationAbortedError` - Cancellation via AbortSignal
- `EmptyMessageError` - Empty message validation failure
Long-running operations support AbortSignal for cooperative cancellation:
```javascript
const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);

for await (const node of graph.iterateNodes({
  ref: 'HEAD',
  signal: controller.signal
})) {
  // Process node
}
```

Supported operations:
- `iterateNodes()`
- `rebuildIndex()`
- All traversal methods (BFS, DFS, shortest path, etc.)