
ModelCypher

See what a model is doing below the token level.

ModelCypher is a measurement and observability workbench for open-source model builders. It gives humans and frontier AI a clear way to inspect geometry, entropy, curvature, chain structure, and adapter-induced changes through workflow-first CLI surfaces instead of ad hoc activation scripts.

Current evidence state (2026-04-02): mc analyze is the clearest public entrypoint for prompt capture, prompt-family studies, and checkpoint or adapter comparison. mc train run remains shipped and geometry-derived, but the repo has not yet closed the promotable same-model, same-data, same-eval benchmark needed to claim "better than standard practice." See RESEARCH-ROADMAP.md.

The Thesis

A forward pass is a deterministic geometric map. The industry treats 15 training hyperparameters as knobs to tune — learning rate, rank, scale, warmup, clipping, schedule, decay, dropout, batch size, early stopping, target modules, weight init, epsilon, momentum, residual scaling. Every one of these has a closed-form geometric replacement derived from SVD, IEEE 754 machine precision, or a cited theorem. ModelCypher replaces all 15. See AGENTS.md for the full derivation philosophy.

Start By Measuring

poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze compare --left-model /path/to/base --right-model /path/to/base --right-adapter /path/to/adapter --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze report --bundle /path/to/bundle
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>
poetry run python scripts/run_measurement_atlas.py --model /path/to/model --manifest data/probes/measurement_atlas_casing.json --manifest data/probes/measurement_atlas_profanity_tone.json --manifest data/probes/measurement_atlas_grounded_hallucination.json --output-root results/measurement_atlas

These commands emit an observation bundle under results/analysis/<timestamp-slug>/ by default:

  • manifest.json
  • summary.json
  • REPORT.md
  • variants.jsonl
  • layer_metrics.jsonl
  • comparisons.jsonl

The prompt-family interface is explicit in phase 1. Each row includes: case_id, variant_id, text, optional tags, and optional comparison_to.
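The per-row fields above can be sketched as a minimal manifest file. Note that only the row fields (case_id, variant_id, text, tags, comparison_to) come from the documented interface; the top-level "cases" key and the filename here are illustrative assumptions, not the actual PromptFamilyManifest layout.

```python
import json

# Hypothetical minimal prompt-family manifest. The "cases" wrapper key is
# an assumption for illustration; only the per-row fields are documented.
manifest = {
    "cases": [
        {"case_id": "geodesic", "variant_id": "base",
         "text": "Explain geodesics.", "tags": ["control"]},
        {"case_id": "geodesic", "variant_id": "typo",
         "text": "Explain geodesiccs.", "tags": ["perturbed"],
         "comparison_to": "base"},  # minimal pair against the base variant
    ]
}

with open("my_family.json", "w") as f:
    json.dump(manifest, f, indent=2)
```

A minimal-pair study is then two rows sharing a case_id, with the perturbed row pointing its comparison_to at the control variant.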

For research-only generation tracing, the measurement atlas runner writes a family artifact under results/measurement_atlas/<run_id>/ with:

  • run_manifest.json
  • summary.json
  • REPORT.md
  • ledger.tsv
  • variants.jsonl
  • sequence_metrics.jsonl
  • step_metrics.jsonl
  • space_step_metrics.jsonl
  • comparisons.jsonl
  • onset_events.jsonl

The retained replay-alignment closure for the shipped 350M atlas pack lives in results/measurement_atlas/REPORT.md. Current observed atlas surfaces are replay={hidden, embedding} and live={hidden}; run_manifest.json now records requested vs observed surfaces separately, so the bundle does not overclaim unsupported replay-space coverage. mc analyze report --bundle ... can read both standard mc analyze bundles and these atlas artifact directories, while atlas generation itself remains research-only in scripts/run_measurement_atlas.py.

Train When You Want To Act On The Measurements

poetry run mc train run --model /path/to/model --data /path/to/dataset --output /path/to/adapter

No learning rate. No rank selection. No warmup schedule. No gradient clipping. The optimizer and step sizes are derived from measured geometry rather than copied recipes.
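The spectral quantities such derivations rest on are cheap to measure. As an illustration of the underlying primitive (and only that; this is a pure-Python sketch, not ModelCypher's MASS implementation), the largest singular value of a weight matrix can be estimated by power iteration on W^T W:

```python
# Minimal power-iteration estimate of sigma_max(W), the largest singular
# value -- the primitive behind spectral step-size ceilings. Illustrative
# sketch only, not the repo's MASS implementation.

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def sigma_max(W, iters=100):
    """Estimate the top singular value of W via power iteration on W^T W."""
    v = [1.0] * len(W[0])
    for _ in range(iters):
        u = matvec(transpose(W), matvec(W, v))   # W^T W v
        norm = sum(x * x for x in u) ** 0.5
        v = [x / norm for x in u]
    Wv = matvec(W, v)
    return sum(x * x for x in Wv) ** 0.5

# For a diagonal matrix the singular values are the absolute entries:
print(sigma_max([[3.0, 0.0], [0.0, 1.0]]))  # ~3.0
```

A derived step-size ceiling would then be some closed-form function of measurements like this one, rather than a hand-tuned constant.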

Need extra instrumentation? Use flags on the same command path, such as --benchmark, --topo-monitor, --dim-monitor, or --entropy-reg.

What Gets Derived

| # | What Industry Tunes | What ModelCypher Derives | Source |
|---|---------------------|--------------------------|--------|
| 1 | Learning rate (1e-4) | MASS spectral ceiling | Weyl 1912; Loizou 2020 |
| 2 | Adam epsilon (1e-8) | Spectral noise floor | IEEE 754 + SVD |
| 3 | Momentum (0.9/0.999) | Cayley-Stiefel retraction | Wen & Yin 2013; Wang 2025 |
| 4 | Weight decay (0.01) | Condition ratio sigma_k / sigma_max | SVD |
| 5 | Gradient clipping (1.0) | Removed: MASS bounds by construction | Weyl 1912 |
| 6 | Warmup (5-10% steps) | Removed: geometric LR stable from step 0 | Ma & Yarats 2021 |
| 7 | LR schedule (cosine) | Removed: MASS is per-step, no schedule needed | Defazio 2024 |
| 8 | Batch size | Gradient noise scale B_crit | McCandlish 2018 |
| 9 | Early stopping (patience) | 4 geometric criteria | SVD + IEEE 754 |
| 10 | LoRA scale (alpha/rank) | Spectral bound sigma_k(W) / \|\|BA\|\| | Weyl perturbation theory |
| 11 | LoRA rank (8) | Null-space capacity tail_dims | Shannon effective rank |
| 12 | Target modules (q+v) | Spectral decay analysis | SVD per layer |
| 13 | Dropout (0.1) | Product of two spectral ratios | Roy & Vetterli 2007 |
| 14 | Weight init (random A, zero B) | Spectrally normalized to sigma_k | SVD |
| 15 | Residual scaling (1) | Per-layer sigma_max(x) / sigma_max(f(x)) | Power iteration |

Full derivations with formulas: Geometric Hyperparameter Rosetta Stone
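To make one of these concrete: the "Shannon effective rank" cited for rank selection has a standard closed form (Roy & Vetterli 2007), the exponential of the Shannon entropy of the normalized singular-value distribution. The sketch below computes that textbook quantity; it is not the repo's tail_dims computation.

```python
import math

def effective_rank(singular_values):
    """Roy & Vetterli (2007) effective rank: exp of the Shannon entropy
    of the singular-value distribution normalized to sum to 1."""
    total = sum(singular_values)
    p = [s / total for s in singular_values if s > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return math.exp(entropy)

# A flat spectrum has effective rank equal to its length (~4.0 here)...
print(effective_rank([1.0, 1.0, 1.0, 1.0]))
# ...while a fast-decaying spectrum concentrates in few directions (~1.4):
print(effective_rank([10.0, 1.0, 0.1, 0.01]))
```

The appeal of quantities like this is that they are continuous and measurable per layer, so a rank choice becomes a readout of the spectrum rather than a fixed integer guess.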

Quick Start

git clone https://github.com/Ethyros-AI/ModelCypher.git
cd ModelCypher
poetry install          # Python 3.11+
poetry run mc --help    # Verify CLI install
# Inspect a model's per-layer geometry
poetry run mc model info /path/to/model

# Build an observation bundle from one prompt
poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."

# Build an observation bundle from a prompt family
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json

# Re-read an existing bundle and print the shared report view
poetry run mc analyze report --bundle /path/to/bundle

# Re-read a retained measurement-atlas artifact through the same report path
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>

# Layer-wise intrinsic dimension profile
poetry run mc analyze dimension-profile --model /path/to/model --samples 50

# LoRA adapter spectral analysis
poetry run mc analyze lora-svd /path/to/adapter --base /path/to/model

# Train a LoRA adapter after inspecting the model
poetry run mc train run --model /path/to/model --data /path/to/data.jsonl --output /path/to/adapter

Evidence Snapshot

| Question | What retained artifacts show | Tag |
|----------|------------------------------|-----|
| Does the measurement layer exist as a real workflow? | Yes. mc analyze capture, mc analyze family, and mc analyze compare now emit observation bundles with machine-readable artifacts plus a short report. | [EMPIRICAL] |
| Does a canonical training path exist? | Yes. mc train run is the shipped geometry-derived runtime path guarded by pipeline_gate_v1. | [EMPIRICAL] |
| Does the retained 350M validation bundle close preservation? | No. results/pipeline_validation/verdict.json reports structural pass 5/5, inference pass 3/5, all_pass = false. | [EMPIRICAL] |
| Does the retained evidence close "better than standard practice"? | No. results/nblora_vs_standard/ is retained as summary_only, and the retained single-seed LFM2-350M summary does not support superiority of nb_lora over the kept baselines. | [EMPIRICAL] |
| Does the 8B bundle close efficacy? | No. results/g5_8b_validation_multiseed/multiseed_gates.json still fails cka_ok and degenerate_ok. | [EMPIRICAL] |
| Is quantization promising? | Yes, as a measurement surface: results/quantization_frontier/20260227T235714Z/quantization_frontier.json shows PPL and degeneration improvement on all 3 retained models, but the frontier law is still open. | [EMPIRICAL] |

Falsified Training Claims

| Hypothesis | Result | Tag |
|------------|--------|-----|
| REINFORCE on 350M | Gradient orthogonal to CE; degradation monotonic with steps | [DISPROVEN] |
| SFT on reasoning traces | Format memorization: PPL drops, inference degrades | [DISPROVEN] |
| Pullback metric P = MM^T | P ≈ I throughout training (median deviation 0.001) | [DISPROVEN] |
| Stable rank predicts adapter rank | Pearson r = -0.51 vs tail_dims; measures a different property | [DISPROVEN] |
| Constrained training (paired) | Constraints monotonically hurt | [DISPROVEN] |

We publish failures because intellectual honesty is not optional. Current training blockers and exit criteria live in MISSION.md and RESEARCH-ROADMAP.md.

Measurement Toolkit

mc analyze is organized around five canonical workflows:

  • capture: measure one prompt or prompt file
  • family: run explicit minimal-pair or perturbation studies
  • compare: compare two targets on the same prompt family
  • report: read an existing bundle and render the shared high-signal view
  • probe: targeted probe and red-team workflows

Expert instruments remain directly callable when you want the underlying measurements without the bundle wrapper:

  • geometry and trajectory: reasoning-flow, geodesic-profile, entropy-trajectory, chain-profile, dimension-profile, jacobian-trace
  • probe and monitoring: calibrate-safety, jailbreak-test, probe-redteam, probe-behavioral, circuit-breaker, entropy-pattern
  • diagnostics and adapter analysis: lora-svd, benchmark, knowledge-type, curriculum-profile, crm-build, crm-compare

Full reference: CLI-REFERENCE.md

Architecture

Hexagonal (ports-and-adapters) with strict domain boundaries:

  • Core domain (core/domain/) — pure geometry and math, zero framework imports
  • Use cases (core/use_cases/) — orchestration, cannot import from adapters
  • Adapters (adapters/) — HuggingFace Hub, filesystem, model loading
  • Backends — MLX (primary, Apple Silicon), CUDA, JAX behind a protocol interface

All geometric computations are framework-agnostic. Backend selection is automatic.
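The protocol boundary behind automatic backend selection can be sketched with typing.Protocol. Every name below (GeometryBackend, svdvals, pick_backend) is hypothetical, chosen to illustrate the ports-and-adapters shape, not the repo's actual interface:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class GeometryBackend(Protocol):
    """Hypothetical backend port; names are illustrative, not the repo's API."""
    name: str
    def svdvals(self, matrix: list[list[float]]) -> list[float]: ...

class PurePythonBackend:
    """Toy adapter satisfying the port with zero framework imports."""
    name = "pure-python"
    def svdvals(self, matrix):
        # A real adapter would dispatch to MLX, CUDA, or JAX here.
        raise NotImplementedError

def pick_backend(candidates):
    """Automatic selection: return the first candidate satisfying the port."""
    for backend in candidates:
        if isinstance(backend, GeometryBackend):
            return backend
    raise RuntimeError("no usable backend")

print(pick_backend([PurePythonBackend()]).name)  # pure-python
```

The point of the protocol is the dependency direction: the core domain types against the port, and each framework lives entirely in its adapter.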

Documentation

| Document | What It Covers |
|----------|----------------|
| Start Here | Installation, first observation bundle, and downstream training |
| Observation Bundles | PromptFamilyManifest, ObservationBundle, and ready-to-run perturbation manifests |
| Geometry Guide | Interpreting CKA, intrinsic dimension, curvature, and entropy measurements |
| Training Guide | Downstream adapter workflows and dataset preparation |
| CLI Reference | Workflow-first mc analyze plus expert command examples |
| Mission | Measurement-first mission and derived training standards |
| Glossary | 60+ term definitions |
| Architecture | Hexagonal architecture and domain boundaries |
| Bibliography | All cited papers with local reference PDFs |

Research Papers

| Paper | Status | Thesis |
|-------|--------|--------|
| The Shape of Knowledge | [EMPIRICAL] | Knowledge has measurable geometric structure; inference is trajectory |
| Invariant Semantic Structure | [PROVEN] intra-model; [CONJECTURAL] cross-model | CKA alignment invariance across layers (by construction on training probes) |
| Entropy Safety Signal | [CONJECTURAL] | Behavioral drift detection via entropy differentials |
| Cross-Architecture Transfer | [CONJECTURAL] | Knowledge transfer between model families via Procrustes alignment |
| ModelCypher Toolkit | [EMPIRICAL] | Implementation methodology and CLI design |
| The Semantic Highway | [EMPIRICAL] | Layer-wise intrinsic dimension compression (15.8 → 1.8 → 9.6) |

Test Suite

6,809 tests. Includes Hypothesis property-based tests for numerical invariants (CKA symmetry, spectral bounds, null-space orthogonality).

poetry run pytest                              # Standard run
HYPOTHESIS_PROFILE=full poetry run pytest       # Full property-based testing
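The CKA-symmetry invariant mentioned above reduces to a plain assertion once linear CKA is written out (in the real suite Hypothesis generates the inputs; here the matrices are fixed toy data, and this pure-Python implementation of the standard formula is a sketch, not the repo's):

```python
# Sketch of the CKA-symmetry invariant the suite property-tests.
# Linear CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).

def gram_fro(A, B):
    """Frobenius norm of A^T B for row-major matrices A (n x p), B (n x q)."""
    total = 0.0
    for i in range(len(A[0])):
        for j in range(len(B[0])):
            entry = sum(A[k][i] * B[k][j] for k in range(len(A)))
            total += entry * entry
    return total ** 0.5

def linear_cka(X, Y):
    return gram_fro(X, Y) ** 2 / (gram_fro(X, X) * gram_fro(Y, Y))

X = [[1.0, 0.5], [0.2, -1.0], [0.3, 0.8]]
Y = [[0.9, -0.1], [0.1, -1.1], [0.4, 0.7]]
assert abs(linear_cka(X, Y) - linear_cka(Y, X)) < 1e-12  # symmetry
assert abs(linear_cka(X, X) - 1.0) < 1e-12               # self-similarity
```

A property-based test would replace the fixed X and Y with generated matrices and assert the same two invariants across thousands of cases.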

Platform Support

| Platform | Backend | Status |
|----------|---------|--------|
| macOS Apple Silicon (M1-M4) | MLX | Primary (optimized) |
| Linux + NVIDIA GPU | CUDA (PyTorch) | Supported |
| Linux + TPU | JAX | Supported |

Citation

@software{kempf2026modelcypher,
  author = {Kempf, Jason},
  title = {ModelCypher: Geometry-First LoRA Training for LLMs},
  year = {2026},
  url = {https://github.com/Ethyros-AI/ModelCypher},
  license = {AGPL-3.0}
}

License

AGPL-3.0. See LICENSE.
