Unified cognitive substrate: SQL + Cypher + Vector + Hamming + Resonance at alien speed.
Familiar surface at alien speed.
All query types compile to the same underlying operation: fingerprint → bucket → SIMD scan on Arrow buffers.
64-bit key:
┌──────────────────┬──────────────────────────────────────────────┐
│ 16 bits │ 48 bits │
│ TYPE │ fingerprint prefix │
└──────────────────┴──────────────────────────────────────────────┘
0x0001-0x00FF Entities (thought, concept, style)
0x0100-0x01FF Edges (CAUSES, SUPPORTS, CONTRADICTS, BECOMES...)
0x0200-0x02FF Layers (7 consciousness layers)
0x0300-0x03FF Styles (12 thinking styles)
0x0400+ Codebook (learned clusters)
| Surface | Query | Underlying Operation |
|---|---|---|
| SQL | SELECT * FROM thoughts WHERE fp = X |
get(0x0001, fp) |
| Cypher | MATCH (n:Thought {fp: X}) |
get(0x0001, fp) |
| Cypher | MATCH (a)-[:CAUSES]->(b) |
scan(0x0100, a.prefix) |
| Hamming | resonate(fp, threshold) |
simd_scan(bucket) |
One index. All query languages. Same bits.
For petabyte-scale filtering without tree traversal.
7 PB of fingerprints
= 5.6 trillion entries at 1250 bytes each
Full SIMD scan = hours
┌─────────────────────────────────────────────────────────────┐
│ L1 SCENT INDEX │
│ │
│ 256 buckets × 5-byte scent = 1.25 KB total │
│ Fits in L1 cache. Single SIMD pass. ~50 ns. │
│ │
│ Query "Siamese cat" → 3 buckets match → 98.8% eliminated │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ L2 SCENT INDEX │
│ │
│ 256 sub-buckets per L1 bucket × 5 bytes = 1.25 KB each │
│ Only loaded for matching L1 buckets │
│ │
│ Query "Siamese cat" → 2 sub-buckets → 99.997% eliminated │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ LEAF FINGERPRINTS │
│ │
│ Full 10K-bit (1250 byte) fingerprints │
│ SIMD Hamming on actual bits │
│ Only scan matching leaf buckets │
└─────────────────────────────────────────────────────────────┘
| Depth | Buckets | Scent Index | Coverage per Leaf |
|---|---|---|---|
| 1 | 256 | 1.25 KB | 27 TB |
| 2 | 65,536 | 320 KB | 107 GB |
| 3 | 16.7M | 80 MB | 420 MB |
| 4 | 4.3B | 20 GB | 1.6 MB |
Add layers as corpus grows. Same 1.25 KB scan at each level.
TREE SEARCH:
log₂(256) = 8 levels
8 pointer chases
8 cache misses
~800 cycles
SCENT SCAN:
1.25 KB flat
L1 cache resident
One SIMD pass
~50 cycles
Scent wins 16x. And it's simpler.
Headers are free metadata. The fingerprint is the only storage cost.
struct ChunkHeader {
count: u32, // entries in this chunk
offset: u64, // byte offset in Arrow file
scent: [u8; 5], // compressed representative
// Cognition markers (Layer 3-6)
plasticity: f32, // learning rate
decision: u8, // last decision made
activation: f32, // activation level
last_access: u64, // temporal marker
}// O(1) append - just update header
fn append(&mut self, fp: &[u8; 1250]) -> u64 {
let chunk = fp[0];
let offset = self.data.len();
self.data.extend_from_slice(fp);
self.headers[chunk].count += 1; // free
offset
}
// O(1) defragmentation tracking
// Fingerprints reorder, headers update, same bytesAda's consciousness operates on scent hierarchy, not individual fingerprints.
Leaf fingerprints (10K bits):
└── Layer 0: SUBSTRATE - raw sensation
└── Layer 1: FELT_CORE - immediate feeling
└── Layer 2: BODY - somatic response
Scent nodes (5 bytes):
└── Layer 3: QUALIA - qualitative experience
└── Layer 4: VOLITION - decision/intention
└── Layer 5: GESTALT - pattern recognition
└── Layer 6: META - self-reflection
Traditional: Update 1M fingerprints for learning
Scent: Update 1 L2 node (affects 107 GB)
One scent update = millions of fingerprints affected.
Cognition at the right level of abstraction.
fn update_interest(&mut self, category_scent: &[u8; 5], plasticity: f32) {
let chunk = self.find_chunk_by_scent(category_scent);
self.headers[chunk].plasticity = plasticity;
// Done. 27 TB of content now weighted differently.
// No leaf updates. O(1).
}fn decide(&mut self, l1: u8, l2: u8, decision: Decision) {
// Mark decision at L2 (affects 107 GB)
self.l2_headers[l1][l2].decision = decision.code();
self.l2_headers[l1][l2].last_access = now();
// Gestalt sees pattern across L2 nodes
if self.detect_pattern(&self.l2_headers[l1]) {
self.headers[l1].activation += 0.1; // L1 activation
}
}┌─────────────────────────────────────────────────────────────┐
│ LADYBUG LAYER │
│ │
│ 64-bit CAM index + scent hierarchy + cognition markers │
│ Immutable Rust semantics │
│ SIMD operations on Arrow buffers │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ LANCE/ARROW │
│ │
│ Columnar storage, free append │
│ Transparent compression (we don't care how) │
│ Zero-copy reads │
│ We use it, don't fight it │
└─────────────────────────────────────────────────────────────┘
thoughts.lance:
├── id: Utf8
├── content: Utf8
├── fingerprint: FixedSizeBinary(1250) ← 10K bits
├── freq: Float32 ← NARS truth value
├── conf: Float32 ← NARS truth value
├── style: UInt16 ← thinking style type ID
└── layer: UInt8 ← consciousness layer
edges.lance:
├── source_fp: FixedSizeBinary(1250)
├── target_fp: FixedSizeBinary(1250)
├── relation: UInt16 ← edge type ID
├── freq: Float32
└── conf: Float32
scent_index.lbug:
├── headers: [ChunkHeader; 256]
└── l2_headers: [[ChunkHeader; 256]; 256] (optional, for >100TB)
Rust enforces at compile time.
pub struct LadybugIndex {
buckets: Box<[Box<[Entry]>]>, // No Vec, no mutation
scents: Box<[[u8; 5]; 256]>, // Frozen after build
}
impl LadybugIndex {
// Only &self methods exist. No &mut self.
pub fn get(&self, ...) -> Option<u64> { ... }
// Append = build new index, atomic swap
pub fn append(&self, additions: IndexBuilder) -> Self { ... }
}Write: Build new index from old + additions
Swap: Atomic pointer update
Reads: Continue on old until swap completes
Old: Dropped when last reader finishes
Input: query fingerprint (10K bits from "Siamese cat" embedding)
Step 1: Extract query scent (5 bytes)
→ ~10 ns
Step 2: L1 scan (1.25 KB, 256 scents)
→ 3 buckets match: 0x4A, 0x7F, 0xB2
→ ~50 ns
Step 3: L2 scan (3 × 1.25 KB = 3.75 KB)
→ 5 sub-buckets match total
→ ~150 ns
Step 4: SIMD Hamming on 5 leaf buckets
→ ~500K fingerprints (not 5.6 trillion)
→ ~10 ms
Total: ~10 ms for 7 PB corpus
Without scent: ~hours
| Operation | Complexity | Notes |
|---|---|---|
| Lookup by fingerprint | O(1) | Bucket + SIMD scan |
| Append | O(1) | Write fp + update header |
| Scent scan (per level) | O(1) | 1.25 KB, L1 cache |
| Resonance search | O(matching buckets) | Scent eliminates 95-99% |
| Cognition update | O(1) | Update scent node, affects TB |
| Defragmentation | O(n) | Reorder fps, update headers |
| Index rebuild | O(n) | COW, readers unaffected |
-
Fingerprint = Address Content addressable. No separate index structure.
-
Headers are Free Metadata costs nothing. The fingerprint is the footprint.
-
Scent ≠ Compression Scent is organizational. All 10K bits preserved.
-
Cognition at Scent Level Layers 3-6 operate on hierarchy, not leaves.
-
Familiar Surface SQL, Cypher, Hamming all work. Same underlying op.
-
Alien Speed SIMD on Arrow. No tree traversal. L1-resident scent index.
-
Immutable Rust enforces. COW for updates. No runtime checks.
-
Lance Underneath Don't reinvent storage. Use what works.
For books/scientific reasoning where structure > resonance:
32-bit key: chunk(8) + suffix(24)
Codebook built in second pass
Defragmentation by fingerprint prefix
Node 1: Buckets 0x00-0x3F (25%)
Node 2: Buckets 0x40-0x7F (25%)
Node 3: Buckets 0x80-0xBF (25%)
Node 4: Buckets 0xC0-0xFF (25%)
Query: Broadcast scent match → route to matching nodes only
scent + timestamp → "what did Siamese cats mean in 2024?"
Versioned scent hierarchy for memory archaeology
Apache-2.0
https://github.com/AdaWorldAPI/ladybug-rs
Everything is a Container. Everything is Hamming distance. XOR is the only verb that matters.
The sections below document the deep container substrate that underlies the CAM/scent architecture above. Where Part I describes the indexing and query surface, Part II describes the 16,384-bit cognitive geometry, the DN tree, NARS reasoning, qualia modules, and the Friston free energy loop that ties them together.
The Container is the atomic unit. 16,384 bits = 256 × u64 words = 2 KB.
Stack-allocated, SIMD-aligned (#[repr(C, align(64))]), zero-copy.
Each CogRecord IS one container. A node has N containers (meta, content, embeddings...).
┌──────────────────────────────────────────────────────────┐
│ 256 words × 64 bits = 16,384 bits = 2 KB │
│ 32 AVX-512 iterations cover the full container │
│ Expected random Hamming distance: 8192 (σ = 64) │
└──────────────────────────────────────────────────────────┘
Core operations (all O(1) or O(128)):
| Operation | Meaning | Implementation |
|---|---|---|
xor(a, b) |
Bind / unbind / delta | Word-parallel XOR |
hamming(a, b) |
Semantic distance | XOR + popcount |
popcount(a) |
Information content | Sum of set bits |
Container::zero() |
Identity for XOR | All-zero container |
Container::random(seed) |
Pseudorandom point | Deterministic RNG |
XOR is commutative, associative, and self-inverse: a ⊕ a = 0.
This single property gives us binding, unbinding, delta encoding,
spine computation, parity recovery, and reversible Markov chains.
Every record is exactly 2 KB: one metadata container + one content container. No heap allocation.
┌─────────────────────────────────────────────────────────────┐
│ meta (1 KB) identity, NARS truth, edges, rung, RL │
├─────────────────────────────────────────────────────────────┤
│ content (1 KB) searchable fingerprint (Hamming / SIMD) │
└─────────────────────────────────────────────────────────────┘
= 2 KB = 1 Fingerprint = 1 DN tree node = 1 Redis value
Multi-container geometries (Xyz, Chunked, Tree) link multiple CogRecords
through the DN tree. Each linked record is still 2 KB. The Xyz geometry
stores three perspectives (X, Y, Z) whose XOR-fold X ⊕ Y ⊕ Z is the
holographic trace — given any 2 + trace, recover the 3rd.
128 words of structural information. Never included in Hamming search.
W0 PackedDn address (THE identity)
W1 Type: node_kind:u8 | count:u8 | geometry:u8 | flags:u8
| schema_version:u16 | provenance_hash:u16
W2 Timestamps (created_ms:u32 | modified_ms:u32)
W3 Label hash:u32 | tree_depth:u8 | branch:u8 | reserved:u16
W4-7 NARS truth (freq:f32 | conf:f32 | pos_evidence:f32 | neg_evidence:f32)
W8-11 DN rung + 7-layer compact + collapse gate state
W12-15 7-layer markers (5 bytes × 7 = 35 bytes)
W16-31 Inline edges (64 packed, 4 per word, 16 bits each)
W32-39 RL / Q-values / rewards (16 actions × f32)
W40-47 Bloom filter (512 bits, 3 hash functions)
W48-55 Graph metrics (in_degree, out_degree, pagerank, clustering, ...)
W56-63 Qualia (18 channels × f16 + 8 slots)
W64-79 Rung history + collapse gate history
W80-95 Representation language descriptor
W96-111 DN-Sparse adjacency (compact inline CSR)
W112-125 Reserved
W126-127 Checksum (XOR of W0-W125) + version
Zero-copy access via MetaView (read) and MetaViewMut (write).
PackedDn encodes a 7-level hierarchical address in a single u64.
Each level is 8 bits (value + 1, so 0x00 = absent).
Byte 7 Byte 6 Byte 5 Byte 4 Byte 3 Byte 2 Byte 1 Byte 0
[lv0+1] [lv1+1] [lv2+1] [lv3+1] [lv4+1] [lv5+1] [lv6+1] [sentinel=0]
Properties:
- Maximum 7 levels (0 = root through 7 = deepest)
- Natural lexicographic sort = hierarchical order:
/0 < /0/0 < /0/1 < /1 parent(),child(),ancestors(),subtree_range(),is_ancestor_of()- DN tree lives in
ContainerGraph { records: HashMap<PackedDn, CogRecord>, children: HashMap<PackedDn, Vec<PackedDn>> }
The tree is the organizational spine of knowledge. Every concept has an address. Siblings share a parent. The parent's spine (XOR-fold of children's content) IS the structural prediction for the group.
Each node can hold up to 76 edges without leaving its own 2 KB record.
InlineEdge = verb:u8 | target_hint:u8 = 16 bits
4 edges per 64-bit word × 16 words = 64 edges
O(1) random access by index. Compact but low-resolution target (only the low byte of target DN). Resolution strategies: exact match in children, parent's children, or full graph scan.
W96: edge_count:u16 | row_count:u16 | flags:u32
W97-W99: row pointers (24 rows max, u8 offsets)
W100-W111: EdgeDescriptor entries (12 max)
EdgeDescriptor = verb_id:u16 | weight_q:u16 | target_dn:u32 = 64 bits
Full-resolution target (lower 32 bits of PackedDn) and 16-bit fixed-point weight. Used when inline edges overflow or when precise edge weight matters.
This is the single most important invention in the system.
Every other capability — leaf insert, delta encoding, NARS reflection, hydration chains, semiring traversal, qualia sensing — depends on being able to read the structural summary (spine) while simultaneously mutating the individual beliefs (children). In any normal system this requires locks, MVCC, or copying. The SpineCache needs none of these.
spine = XOR-fold of all children's content containers
= the structural prediction for the group
= the BORROWED reference from the joined blackboard
The SpineCache is the "special borrow schema where for mut the reference gets owned from a joined blackboard." The spine IS the blackboard — a single Container that summarizes everything its children know.
Reading the spine is a borrow. Writing a child is a mut. Both happen simultaneously because XOR has three algebraic properties that together eliminate the need for synchronization:
- Commutative:
a XOR b = b XOR a— child order doesn't matter - Associative:
(a XOR b) XOR c = a XOR (b XOR c)— grouping doesn't matter - Self-inverse:
a XOR a = 0— removing = re-adding
Because of these, the spine is order-independent. If child C changes,
the new spine is old_spine XOR old_C XOR new_C. It doesn't matter when
or in what order other children changed — the result is always correct.
write_child(dn, new_content):
1. Store new content in child's CogRecord
2. Look up all parent spines of this child
3. Mark each parent spine DIRTY
4. Return immediately (no recomputation)
read_spine(dn):
1. Check dirty flag
2. If clean: return cached spine (O(1))
3. If dirty: recompute XOR-fold of all children (O(children))
4. Cache result, clear dirty flag
5. Return spine
The dirty flag IS the invalidation token. This replaces mutex/rwlock with a much cheaper mechanism: the spine is always eventually consistent, and the cost of staleness is bounded by the next read.
This is conceptually similar to PowerShell's Invoke-Command -ArgumentList
pattern: you project local variables into a remote scope (the scriptblock)
that can read them, while your local session continues to mutate state
independently. The scriptblock gets a snapshot of your variables at
invocation time — it borrows them.
In the SpineCache:
- The spine = the projected snapshot (borrowed reference)
- The children = the local session (mutable state)
- The dirty flag = the signal that the snapshot needs refreshing
- The lazy recompute = like re-invoking with updated
-ArgumentList
Or think of PowerShell modules and scope escape: when you use $script:
or pass by [ref], the variable survives outside the function — it
persists in the parent scope even as the function's local variables are
mutated and discarded. The spine IS that surviving variable. Children
mutate inside their own scope (their CogRecord content), but the spine
persists at the parent level, accessible to every sibling and every
traversal that passes through.
The difference: PowerShell's scope escape is a runtime mechanism with overhead. The SpineCache's spine is an XOR-fold (O(children) but the result is always exactly 1 KB regardless of how many children exist).
The borrow/mut pattern enables every other subsystem:
| Subsystem | Reads (borrow) | Writes (mut) |
|---|---|---|
| Leaf insert | Reads spines to find best match | Writes new child, marks spine dirty |
| Reflection | Reads spine as structural prediction | Updates child's NARS truth values |
| Felt traversal | Reads sibling bundles (mini-spines) | Records felt path choices |
| Hydration | Reads siblings for context projection | Writes hydrated truth to explorer |
| Semiring MxV | Reads fingerprints along edges | Accumulates values at destinations |
| Rung elevation | Reads spine to evaluate prediction error | Shifts child's rung level |
Without the SpineCache pattern, each of these would need either:
- A full copy of the graph (expensive)
- A read-write lock (contention)
- Two-phase read-then-write with stale data risk
The SpineCache gives us zero-copy reads + in-place writes + eventual consistency with no locks and no copies. The dirty flag is the entire synchronization mechanism.
pub struct SpineCache {
// (spine_container, is_dirty)
cache: HashMap<PackedDn, (Container, bool)>,
// Which children belong to which spine
spine_map: HashMap<PackedDn, Vec<PackedDn>>,
// Reverse: which spines does this child belong to
child_to_spines: HashMap<PackedDn, Vec<PackedDn>>,
}
impl SpineCache {
pub fn write_child(&mut self, child_dn: PackedDn) {
// Mark all parent spines dirty — O(parents), typically 1
if let Some(spine_dns) = self.child_to_spines.get(&child_dn) {
for spine_dn in spine_dns {
if let Some((_, dirty)) = self.cache.get_mut(spine_dn) {
*dirty = true;
}
}
}
}
pub fn read_spine(&mut self, dn: PackedDn, graph: &ContainerGraph) -> &Container {
// Lazy recompute on read
let (spine, dirty) = self.cache.get_mut(&dn).unwrap();
if *dirty {
*spine = self.recompute(dn, graph); // XOR-fold children
*dirty = false;
}
spine
}
}The entire system compiles to: mark a boolean on write, check a boolean on read, XOR-fold on cache miss. This is as cheap as synchronization can possibly be.
New concepts enter the tree through a 3-path insertion algorithm. Popcount IS the split signal.
SPLIT_THRESHOLD = 2000— popcount delta; above = too different for siblingRESONANCE_THRESHOLD = 4050— distance to best spine; above = new branch
Leaf is close to an existing child of the best spine.
delta_popcount < SPLIT_THRESHOLD. Write 1 container, mark 1 spine dirty.
Leaf related to spine's child but divergent (delta >= SPLIT_THRESHOLD).
The closest existing child is reparented under a new sub-spine.
Write 2 containers, mark 2 spines dirty.
Leaf doesn't resonate with any spine (best_dist > RESONANCE_THRESHOLD).
New spine created at the root level. Write 1 container, allocate 1 spine.
1. Belichtungsmesser scan vs all spines (~14 cycles each)
2. best_dist > RESONANCE_THRESHOLD? -> Path 3 (NewBranch)
3. Collect children of best spine
4. Exact XOR popcount delta vs each child
5. min_delta < SPLIT_THRESHOLD? -> Path 1 (Sibling)
6. Otherwise -> Path 2 (SubBranch, reparent closest child)
The tree self-organizes: similar concepts cluster as siblings, divergent concepts split into sub-branches, and truly novel concepts spawn new branches. No explicit categorization needed — Hamming distance IS the taxonomy signal.
The fast pre-filter for Hamming distance estimation. Named after a camera's light meter — it takes a quick exposure reading before committing to a full scan.
const SAMPLE_POINTS: [usize; 7] = [0, 19, 41, 59, 79, 101, 127];
// Sample 7 words (448 bits), scale to full 8192
estimate = sum(popcount(a[i] ^ b[i])) * 8192 / 448- ~14 cycles per container pair
- ~90% rejection at generous thresholds
- Scale factor: 8192 / 448 ~ 18.29
| Level | Name | Cycles | Purpose |
|---|---|---|---|
| L0 | Belichtungsmesser | ~14 | ~90% rejection, coarse estimate |
| L1 | Word-diff scan | ~128 | Spatial localization |
| L2 | Stacked popcount | ~256 | Precise within 1 sigma, early exit |
| L3 | Mexican hat | ~512 | Separates similar from identical |
| L4 | Voyager deep field | ~1024 | Bundles weak signals from misses |
Each level progressively refines, and most queries never reach L2+.
XOR gives us delta encoding for free.
delta_encode(base, target) -> (delta = base XOR target, info = popcount(delta))
delta_decode(base, delta) -> base XOR delta = target // XOR is self-inverseA sequence of containers stores as (first, Vec<(delta, popcount)>).
Adjacent similar containers have small deltas = high compression.
chain_encode([c0, c1, c2, c3]) -> (c0, [(c0^c1, pop), (c1^c2, pop), (c2^c3, pop)])
chain_decode(c0, deltas) -> [c0, c1, c2, c3] // perfect recoveryparity = c0 XOR c1 XOR c2 XOR ... XOR cn
recover(survivors, parity) = parity XOR all_survivors // recovers the missing oneEach hydration step is an XOR delta. Bind = forward (XOR delta onto current state). Unbind = reverse (XOR the same delta again — self-inverse). The popcount of each delta IS the energy of the transition.
origin --bind--> step1 --bind--> step2 --bind--> step3
^ |
+---------------unbind (reverse XOR)-------------+
replay_forward() applies deltas left-to-right. replay_backward() applies
them right-to-left. Both are lossless because XOR is self-inverse.
Non-Axiomatic Reasoning System truth values live in Container 0 at W4-W7.
pub struct TruthValue {
pub frequency: f32, // W4 lower — how often the statement is true
pub confidence: f32, // W4 upper — how much evidence supports it
}
// W5: positive_evidence (f32), W6: negative_evidence (f32)
// HORIZON = 1.0| Operation | Meaning |
|---|---|
revision() |
Combine evidence from two TruthValues |
deduction() |
If A->B and B->C then A->C (frequency multiplied) |
induction() |
From A->B and A->C, hypothesize B->C |
abduction() |
From A->B and C->B, hypothesize A->C |
analogy() |
If A<->B and B->C then A->C (similarity transfer) |
comparison() |
From A->B and A->C, measure similarity B<->C |
expectation() |
frequency * confidence |
to_evidence() |
Convert to (positive, negative) counts |
read_truth(record) -> TruthValue from W4 (unknown if zero)
write_truth(record, tv) -> updates W4 (freq, conf) + W5 (pos, neg evidence)Rungs encode semantic abstraction depth. Stored in metadata W8-W11. Elevation is NOT automatic — it requires one of three triggers.
R0 Surface Literal, immediate meaning
R1 Shallow Simple inference, common implicature
R2 Contextual Situation-dependent interpretation
R3 Analogical Metaphor, similarity-based reasoning
R4 Abstract Generalized patterns, principles
R5 Structural Schema-level understanding
R6 Counterfactual What-if reasoning, alternatives
R7 Meta Reasoning about reasoning
R8 Recursive Self-referential, strange loops
R9 Transcendent Beyond normal semantic bounds
Three elevation triggers (rung shift occurs ONLY upon):
- Sustained Block — BLOCK state persists for N consecutive turns (collapse gate doesn't open -> deeper processing needed)
- Predictive Failure — P metric drops below threshold (predictions don't match reality -> need to abstract)
- Structural Mismatch — No legal grammar parse available (current rung can't represent the concept -> elevate)
Rung Bands for bucket key addressing:
- Low (R0-R2): Surface processing
- Mid (R3-R5): Pattern processing
- High (R6-R9): Meta processing
Cooldown: Minimum 10 seconds between shifts to prevent oscillation.
Ghosts are persistent emotional/causal imprints that color future perception.
They connect to ladybug-rs through the rung system and reflection.
(Canonical implementation: bighorn/extension/agi_stack/modules/cognition/lingering_ghosts.py)
- Ghost types: Affinity, Epiphany, Somatic, Staunen (wonder), Wisdom, Thought, Grief, Boundary
- Ghost states: Dormant -> Stirring -> Present -> Vivid -> Dreaming
- Asymptotic decay: Intensity approaches minimum but NEVER reaches zero
- Dream induction: High-echo ghosts surface during dream consolidation
- Trigger matching: Words, contexts, and "scents" (sigma hashes) activate ghosts
Causal Ghosts in DendriticNodes carry:
- Relations (co_activated, causes, enables, blocks)
- Strength (0.0-1.0)
- Qualia residue (HOW the connection feels)
- Fire tracking (recency and frequency)
The sibling bundle at each branch (the XOR-fold of all siblings' content containers) IS an uncollapsed superposition resonance field vector. These bundles are ghosts in the lingering-ghost sense: persistent, felt-but-not- resolved states that carry the combined presence of all siblings without collapsing to any single one.
The felt traversal sweeps horizontally across a whole forest of these ghost vectors simultaneously. At each branch, the bundle resonance tells you how strongly the query "feels" the uncollapsed superposition. High resonance = the ghost field is relevant. Low resonance = it's dormant.
When rung elevation is triggered by a free energy spike, these ghost field vectors surface: they become the context for hydration, coloring newly explored territory with prior felt experience from the sibling group.
The DnSemiring trait abstracts graph traversal over the container graph.
multiply propagates along edges; add combines at junctions.
pub trait DnSemiring {
type Value: Clone;
fn zero(&self) -> Self::Value;
fn multiply(&self, verb: u8, weight: u8, input: &Self::Value,
src_fp: &Container, dst_fp: Option<&Container>) -> Self::Value;
fn add(&self, a: &Self::Value, b: &Self::Value) -> Self::Value;
fn is_zero(&self, val: &Self::Value) -> bool;
fn name(&self) -> &'static str;
}| Semiring | Value | multiply | add | Purpose |
|---|---|---|---|---|
| BooleanBfs | bool |
Propagate if reachable | OR | Reachability |
| HammingMinPlus | u32 |
Sum + Hamming distance | min | Shortest path |
| HdrPathBind | Option<Container> |
XOR-bind with dst | Bundle | Path fingerprint compose |
| ResonanceSearch | u32 |
Hamming(query, src xor dst) | max | Find resonating paths |
| PageRankPropagation | f32 |
damping * input | sum | Value propagation |
| CascadedHamming | u32 |
L0+L2 filtered Hamming | min | Fast filtered distance |
| FreeEnergySemiring | f32 |
Accumulate surprise | min/max | Free energy propagation |
All plug into container_mxv() (single hop) and container_multi_hop()
(N-hop traversal).
Two strategies:
- MinSurprise:
add = min— find path of least resistance (exploiting) - MaxSurprise:
add = max— find where active inference is most needed (exploring)
multiply accumulates Hamming-normalized surprise along edges.
Plugs directly into existing graph traversal machinery.
The qualia modules give the system phenomenal qualities — what it's "like" to process a container. Each layer adds a different dimension of felt sense.
8 dimensions of phenomenal texture, all normalized to [0.0, 1.0]:
| Dimension | Meaning |
|---|---|
| entropy | Shannon entropy of word-level bit distribution |
| purity | 1.0 - entropy (concentrated signal) |
| density | Fraction of set bits (popcount / 8192) |
| bridgeness | Cross-community connectivity |
| warmth | Affective temperature / valence proxy |
| edge | Transition sharpness / edge energy |
| depth | Abstraction level (hierarchical block variance) |
| flow | Information throughput / flow indicator |
48 bipolar semantic dimensions across 8 families:
| Family | Count | Examples |
|---|---|---|
| OsgoodEPA | 3 | good<->bad, strong<->weak, active<->passive |
| Physical | 10 | large<->small, hot<->cold, sharp<->dull |
| SpatioTemporal | 6 | near<->far, new<->old, sudden<->gradual |
| Cognitive | 5 | simple<->complex, certain<->uncertain |
| Emotional | 3 | happy<->sad, calm<->anxious, loving<->hateful |
| Social | 3 | friendly<->hostile, dominant<->submissive |
| Abstract | 15 | useful<->useless, safe<->dangerous, alive<->dead |
| Sensory | 3 | sweet<->bitter, fragrant<->foul |
Each axis encoded as 208 bits. 48 x 208 = 9,984 bits of semantic space.
Viscosity types describe how meaning flows: Watery (fast/clear), Oily (smooth/clinging), Honey (slow/sticky), Mercury (dense/quick), Lava (slow/transformative), Crystalline (frozen/sharp), Gaseous (diffuse/expanding), Plasma (superheated/unstable).
HDR resonance cascade connecting search distance to phenomenal intensity. Maps Hamming distance to felt proximity through sigma-normalized zones.
The I/Thou/It framework — three stances of relation:
- I (first-person): How the concept appears from inside
- Thou (second-person): How I relate to the concept as a living other
- It (third-person): How the concept appears as an object
Each stance produces a different Container fingerprint. The GestaltFrame holds all three simultaneously.
Three archetypal voices deliberate on every decision:
| Archetype | Bias |
|---|---|
| Guardian | Safety-first: safe, certain, ordered, calm |
| Catalyst | Growth-seeking: active, new, creating, open |
| Balanced | Moderate baseline: all axes at 0.3 |
Consensus: bit-level majority vote (a & b) | (a & c) | (b & c).
A bit survives if 2+ of 3 archetypes agree.
Walking the DN tree with felt qualities. At each branch:
- Bundle all siblings via XOR-fold -> sibling superposition (ghost vector)
- Compute
bundle_resonance= similarity to query - Choose child with highest resonance
- Record surprise = Hamming(query, chosen) / 8192 = prediction error
Special structures:
- FeltChoice: Per-branch record (surprise, sibling bundle, resonances)
- FeltPath: Full walk (choices, total/mean surprise, path context)
- AweTriple: Three concepts held in unresolved superposition (X xor Y xor Z)
Verb constants: VERB_FELT_TRACE = 0xFE, VERB_SIBLING_BUNDLE = 0xFD,
VERB_AWE = 0xFC.
The system looking at itself. Ties felt traversal (surprise) to NARS truth values (W4-W7) for in-place belief update.
ReflectionOutcome 2x2 classification:
confidence HIGH confidence LOW
surprise HIGH REVISE (contradict) EXPLORE (novel)
surprise LOW STABLE (well-predicted) CONFIRM (boost)
- Revise: Create observation with inverted surprise,
revision()update - Confirm: Small confidence boost via
revision()with low-confidence confirmation - Explore: Flag for hydration from siblings (bucket-list candidate)
- Stable: No action needed
HydrationChain: Reversible Markov chain through sibling contexts.
Each step: cross_hydrate(current, extract_perspective(current, sibling)).
Energy = popcount(delta). Low energy neighbors = higher initial confidence.
Verb constant: VERB_REFLECTION = 0xFB.
Adjacent containers under the same spine inherit semantic richness through XOR-based context projection.
extract_perspective(a, b) = a XOR b // what's different between contexts
cross_hydrate(source, delta) = source XOR delta // project into new contextThis IS reversible: cross_hydrate(hydrated, delta) = source (XOR self-inverse).
The HydrationChain (reflection.rs) chains these projections through all
siblings, creating a reversible walk through related contexts. The chain
can be stored compactly via chain_encode() and perfectly recovered via
chain_decode().
The core insight: the trace IS the edge.
In a traditional graph, edges are separate objects connecting nodes.
In ladybug-rs, the "edge" between two concepts is their XOR delta:
a XOR b. This delta IS the relationship — it literally encodes what
transforms one concept into the other.
Properties that make holographic markers potentially more powerful than SNN/ANN/GNN approaches:
-
Reversibility: Given any 2 of {a, b, a xor b}, recover the third. Neural networks are fundamentally one-directional; XOR binding is perfectly reversible.
-
Compositionality:
(a xor b) xor (b xor c) = a xor c. Multi-hop relationships compose via XOR without loss. GNNs lose information at each message-passing layer. -
Constant-size representation: An edge between two 16,384-bit containers is another 16,384-bit container. No matter how complex the relationship, it fits in 2 KB. Neural edge representations grow with model size.
-
Information content is measurable:
popcount(a xor b)= how many bits differ = the energy of the transformation. This is an exact count, not a learned approximation. -
RAID-5 parity: N containers + 1 parity container can recover any single lost container. Neural networks have no native error correction.
-
SIMD-native: All operations are bitwise — they map directly to hardware vector instructions. No matrix multiplication, no activation functions, no backpropagation.
The holographic approach trades learned representations for algebraic ones. The question is whether XOR-based binding + Hamming geometry + NARS reasoning is sufficient for general cognition. The bet: yes, if the substrate is rich enough (8192 bits is a LOT of Hamming space).
The felt traversal's surprise at each node IS the free energy signal. High surprise = the system's predictions don't match its observations.
surprise = hamming(query, chosen_fingerprint) / CONTAINER_BITS
= prediction error
= free energy at this node
The FreeEnergySemiring propagates this signal through the entire graph:
multiply: accumulate surprise along edges (Hamming-normalized distance)add(MinSurprise): find the path of least free energy (exploitation)add(MaxSurprise): find where active inference is most needed (exploration)
The semiring's multiply operation composes surprise along paths. This IS
entropy sensing: each hop accumulates the information-theoretic cost of
traversal. The total accumulated value at a node = how much entropy the
system has traversed to reach it.
Reflection is the compass. The reflect_walk function:
- Walks the tree computing surprise (entropy measurement)
- Classifies each node's state (belief vs. observation gap)
- Updates NARS truth values in-place (entropy reduction)
- Flags nodes that need exploration (entropy-directed attention)
This is self-directed entropy work: the system uses its own surprise signal to decide where to invest computational effort.
Nodes classified as Explore (high surprise + low confidence) become hydration candidates — the system's "bucket list" of novel territories worth investigating.
reflect_walk() -> ReflectionResult {
entries: [...],
hydration_candidates: [dn1, dn2, dn3, ...] // <- bucket list
}
hydrate_explorers() processes the bucket list:
- For each candidate, build a HydrationChain from adjacent siblings
- Low-energy chains (similar siblings) -> higher initial confidence
- High-energy chains (diverse siblings) -> lower initial confidence
- Initialize NARS truth values based on hydration energy
The full cycle reflect_and_hydrate() = walk + classify + update + explore.
This is active inference: the system drives its own learning by attending to
where free energy is highest and confidence is lowest.
The three rung elevation triggers map directly to free energy concepts:
- Sustained Block (collapse gate stuck) = high free energy that current-rung processing can't reduce -> need deeper abstraction
- Predictive Failure (P metric drops) = the model's predictions have diverged from observations -> surprise is accumulating
- Structural Mismatch (no legal parse) = the current representational level can't even encode the observation -> maximum free energy
When rung elevation occurs, the ghost field vectors (sibling bundles) surface: uncollapsed superposition states whose contexts match the current high-surprise environment. These ghost vectors become additional context for hydration — they color the newly explored territory with the combined felt presence of all siblings in the group.
The system has all ingredients for self-directed behavior:
Free energy (surprise) -> WHAT needs attention
Reflection (classify) -> WHAT KIND of attention
Hydration (explore) -> HOW to gather information
NARS revision -> HOW to update beliefs
Rung elevation -> WHEN to go deeper
Ghost surfacing -> WHY this matters (felt context)
Council consensus -> WHO decides (guardian/catalyst/balanced)
Volition emerges when the system can choose which bucket-list candidates to pursue, weighting by free energy (urgency), ghost intensity (felt salience), and council consensus (archetypal alignment).
RedisGraph (redis module, CSR adjacency)
| transcoded to Hamming resonance
Holograph (DnNodeStore + DnCsr, sparse adjacency vectors)
| blasgraph extracted as core
BlasGraph (sparse adjacent vectors, BLAS-style operations)
| container-native reimplementation
ContainerGraph (HashMap<PackedDn, CogRecord>, children map)
The key transcoding: RedisGraph stored adjacency as integer node IDs in CSR format. The holograph step replaced integer IDs with Container fingerprints and integer edge weights with XOR deltas. BlasGraph formalized this as sparse-adjacent-vector operations. ContainerGraph (ladybug-rs) is the final form: pure Container-native, everything is 16,384 bits, all operations are XOR/Hamming/popcount.
The adjacency encoding in W16-31 (inline) and W96-111 (CSR overflow) is the direct descendant of the RedisGraph CSR layout, now operating on cognitive verb IDs + Container target hints instead of integer node IDs.
| Constant | Value | Location |
|---|---|---|
CONTAINER_BITS |
8,192 | container.rs |
CONTAINER_WORDS |
128 | container.rs |
CONTAINER_BYTES |
1,024 | container.rs |
CONTAINER_AVX512_ITERS |
16 | container.rs |
MAX_CONTAINERS |
255 | container.rs |
EXPECTED_DISTANCE |
4,096 | container.rs |
SIGMA |
45.25 | container.rs |
CogRecord::SIZE |
2,048 | record.rs |
MAX_INLINE_EDGES |
64 | meta.rs |
MAX_CSR_EDGES |
12 | adjacency.rs |
SPLIT_THRESHOLD |
2,000 | insert.rs |
RESONANCE_THRESHOLD |
4,050 | insert.rs |
SAMPLE_POINTS |
7 | search.rs |
NARS HORIZON |
1.0 | nars.rs |
SCHEMA_VERSION |
1 | meta.rs |
SURPRISE_HIGH |
0.55 | reflection.rs |
SURPRISE_LOW |
0.45 | reflection.rs |
CONFIDENCE_HIGH |
0.50 | reflection.rs |
VERB_FELT_TRACE |
0xFE | felt_traversal.rs |
VERB_SIBLING_BUNDLE |
0xFD | felt_traversal.rs |
VERB_AWE |
0xFC | felt_traversal.rs |
VERB_REFLECTION |
0xFB | reflection.rs |
This architecture document was extended during the construction of the qualia module stack (resonance -> gestalt -> felt_traversal -> reflection). The code compiles and all tests pass. The substrate is ready for the next layer.