ANC — Anchor-Node Codec: Structured Reasoning Geometry for LLMs

A data-to-model encoding pipeline that makes small LLMs reason faster and more accurately by reorganizing raw data into a four-corner geometry (S/M/Q/R) before inference.

Author: Kevin T.N. — jkdkr2439@gmail.com
License: AGPL-3.0

What ANC Does

ANC takes structured data (semantic nodes from a SQLite fieldmap) and encodes each node into four corners:

  • S — Structure (type, archetype)
  • M — Meaning (text content)
  • Q — Quantity (N/M/F slot counts)
  • R — Relation (slot presence pattern)

It then collapses that representation into explicit views that a model can read directly.
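The four-corner split can be sketched as a plain function. This is a hypothetical illustration of the idea, not the actual codec in core/codec.py, and the node field names (type, archetype, text, slots) are assumed for the example:

```python
def encode_quad(node):
    """Split a semantic node dict into its four corners (illustrative sketch)."""
    slots = node.get("slots", {})  # e.g. {"N": [...], "M": [...], "F": [...]}
    return {
        "S": (node["type"], node.get("archetype")),      # Structure: type + archetype
        "M": node.get("text", ""),                       # Meaning: raw text content
        "Q": {k: len(v) for k, v in slots.items()},      # Quantity: per-slot counts
        "R": "".join(k for k, v in slots.items() if v),  # Relation: slot presence pattern
    }

node = {
    "type": "function",
    "archetype": "transformer",
    "text": "def add(a, b): return a + b",
    "slots": {"N": [1, 2], "M": [], "F": [3]},
}
quad = encode_quad(node)
```

The collapsed views mentioned above would then render these four corners as explicit text the model reads directly.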

Project Structure

ANC/
├── core/                        # Core codec and archiver
│   ├── codec.py                 #   VocabMapper, FDICEncoder, BCLSerializer, GlyphCodec, TrueEngine
│   ├── archiver.py              #   Self-extracting archive builder
│   └── __init__.py
├── anc.py                       # CLI: pack, extract, inspect, query, render-prompt
├── anc_benchmark.py             # Prompt-surface benchmark runner (Ollama)
├── benchmarks/
│   ├── prompt_surface/          # Prompt-mode benchmark results (qwen2.5, gemma4)
│   └── native_quad/             # Native architecture benchmark results + experiment code
│       └── quad_native_experiment.py
├── docs/
│   ├── PIPELINE_FULL.md         # Full 10-stage pipeline documentation
│   ├── BENCHMARK_SUMMARY.md     # Prompt-surface benchmark analysis
│   ├── NATIVE_QUAD_V1.md        # Native quad v1 results
│   └── NATIVE_QUAD_V2.md        # Native quad v2 results
├── LICENSE                      # AGPL-3.0
└── .gitignore

Current Results

Results so far are from a small-scale benchmark: 1 sample pair, 2 function nodes, 4 scored fields (node_count, same_type, first_counts, shared_archetype). This is enough to show the encoding works, but not enough to claim generality.
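Scoring awards one point per field where the model's parsed JSON answer matches ground truth. A minimal sketch of that comparison (the field values here are illustrative, not taken from the actual benchmark data):

```python
# The four scored fields named above; one point each, exact match.
SCORED_FIELDS = ["node_count", "same_type", "first_counts", "shared_archetype"]

def score_answer(answer: dict, truth: dict) -> int:
    """Return the number of scored fields whose values match ground truth exactly."""
    return sum(1 for f in SCORED_FIELDS if answer.get(f) == truth.get(f))

truth = {"node_count": 2, "same_type": True,
         "first_counts": [2, 0, 1], "shared_archetype": False}
answer = {"node_count": 2, "same_type": True,
          "first_counts": [2, 0, 1], "shared_archetype": True}
score = score_answer(answer, truth)  # 3 of 4 fields match
```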

Gemma 4 raw scores were re-parsed from the thinking model's output (the original parser missed JSON emitted inside thinking blocks).

Prompt Surface — Cross-Model Comparison (1 sample, score out of 4)

Mode                 Chars   Qwen 0.5b   Qwen 1.5b   Qwen 3b   Gemma 4 E2B
raw                  3335    1/4         0/4         2/4       3/4
semantic             1570    1/4         0/4         0/4       3/4
glyph                1637    1/4         2/4         3/4       4/4
hybrid               1424    —           3/4         0/4       4/4
dual_glyph           1307    —           3/4         0/4       †0/4
quad_collapse_full   1221    —           4/4         3/4       4/4
pyramid5             1240    —           —           2/4       4/4

†Gemma 4 dual_glyph scored 0/4 due to an output parsing failure (thinking model wraps JSON in markdown code blocks with terminal escape sequences), not because the model answered incorrectly. The raw output contains the correct answer.

Qwen 0.5b was tested on only 3 modes (raw, semantic, glyph). Qwen 3b is more prompt-sensitive than expected: small formatting changes can move a mode from good to poor between runs.

Prompt Surface — Latency

Mode                 Qwen 1.5b   Qwen 3b   Gemma 4 E2B
raw                  37.8s       80.5s     260.6s
quad_collapse_full   9.2s        13.8s     75.1s
speedup              4.1x        5.8x      3.5x

ANC reduces latency across all tested models by compressing the prompt from ~3300 chars to ~1200 chars while making structure explicit. On Gemma 4 (thinking model), this saves ~185 seconds per query.
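The speedup row is simply raw latency divided by quad_collapse_full latency for each model. A quick check against the table above (model tags as used by the benchmark CLI):

```python
# (raw, quad_collapse_full) latency in seconds, per model, from the latency table
latency = {
    "qwen2.5:1.5b": (37.8, 9.2),
    "qwen2.5:3b":   (80.5, 13.8),
    "gemma4:e2b":   (260.6, 75.1),
}
speedup = {m: round(raw / quad, 1) for m, (raw, quad) in latency.items()}
```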

Native Architecture (custom PyTorch model, CPU, 1200 test samples)

Model                                    Exact Match
text_only baseline                       0.3233
quad_native_v2 (with S/M/Q/R channels)   1.0000

The native model uses structural priors injected into logits — it is a hybrid learned+structural architecture, not a pure learned model. The 100% result reflects this design choice.
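"Structural priors injected into logits" can be read as adding a bias toward tokens the S/M/Q/R channels mark as structurally valid, before the argmax/softmax. This is a toy sketch of that design choice, not the actual quad_native_v2 code; the mask and strength value are assumptions:

```python
import torch

def apply_structural_prior(logits: torch.Tensor,
                           allowed_mask: torch.Tensor,
                           strength: float = 8.0) -> torch.Tensor:
    """Boost the logits of tokens the structural channels permit.

    logits:       (vocab,) raw scores from the learned model
    allowed_mask: (vocab,) bool, True where structure permits the token
    """
    return logits + strength * allowed_mask.float()

vocab = 6
logits = torch.zeros(vocab)                 # learned model is maximally uncertain
mask = torch.tensor([False, False, True, False, False, False])
biased = apply_structural_prior(logits, mask)
pred = int(biased.argmax())                 # the prior alone decides: token 2
```

When the structural constraint fully determines the answer, as in this toy case, the prior dominates the learned scores, which is one way a hybrid model can reach a perfect exact-match score on a narrow task.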

What Has Not Been Tested

  • Larger benchmark sets (20+ samples, diverse node types)
  • Tasks beyond the current 4-field schema
  • Integration with pretrained LLMs at the architecture level (prompt-surface tested on qwen2.5 and gemma4, native architecture only tested on a small custom model)
  • Generalization to other data domains

Usage

Render a reasoning surface

python anc.py render-prompt --db arc_fieldmap.db --mode quad_collapse --limit 2 --type function

Run prompt-surface benchmark

python anc_benchmark.py --model gemma4:e2b --type function --limit 2
python anc_benchmark.py --model qwen2.5:1.5b --type function --limit 2

Run native architecture experiment

python benchmarks/native_quad/quad_native_experiment.py --db arc_fieldmap.db --epochs 4

Pack/Extract a fieldmap DB

python anc.py pack --source arc_fieldmap.db --out arc_fieldmap_archive.py
python anc.py extract --archive arc_fieldmap_archive.py --out restored.db
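The self-extracting archive is a .py file that carries the database bytes and restores them when run. A minimal sketch of the concept, assuming a plain base64 payload (the real builder in core/archiver.py runs the full codec pipeline and is not this simple):

```python
import base64

# Template for a runnable extractor script; {payload} is filled by pack().
ARCHIVE_TEMPLATE = '''\
import base64, sys
PAYLOAD = "{payload}"
def extract(out_path):
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(PAYLOAD))
if __name__ == "__main__":
    extract(sys.argv[1])
'''

def pack(source_bytes: bytes) -> str:
    """Embed raw bytes into a self-extracting Python script."""
    return ARCHIVE_TEMPLATE.format(payload=base64.b64encode(source_bytes).decode())

# Pack the first bytes of a SQLite file header as a demo payload.
script = pack(b"SQLite format 3\x00")
```

Running the generated script with an output path as its argument writes the original bytes back to disk, which is the shape of the `extract` subcommand above.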

Requirements

  • Python 3.10+
  • torch (for native quad experiment)
  • ollama (for prompt-surface benchmarks)
  • A glyph_dict.db at ../BCL/glyph_dict.db (for glyph modes)

License

This project is licensed under the GNU Affero General Public License v3.0.

If you use this code, modify it, or run it as part of a service, you must make the complete source code available under the same license.
