
ModelCypher

See what a model is doing below the token level.

ModelCypher is a measurement and observability workbench for open-source model builders. It gives humans and frontier AI a clear way to inspect geometry, entropy, curvature, chain structure, and adapter-induced changes through workflow-first CLI surfaces instead of ad hoc activation scripts.

Current evidence state (2026-04-02): mc analyze is the clearest public entrypoint for prompt capture, prompt-family studies, and checkpoint or adapter comparison. mc train run remains shipped and geometry-derived, but the repo has not yet closed the promotable same-model, same-data, same-eval benchmark needed to claim "better than standard practice." See RESEARCH-ROADMAP.md.

The Thesis

A forward pass is a deterministic geometric map. The industry treats 15 training hyperparameters as knobs to tune — learning rate, rank, scale, warmup, clipping, schedule, decay, dropout, batch size, early stopping, target modules, weight init, epsilon, momentum, residual scaling. Every one of these has a closed-form geometric replacement derived from SVD, IEEE 754 machine precision, or a cited theorem. ModelCypher replaces all 15. See AGENTS.md for the full derivation philosophy.

Start By Measuring

poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze compare --left-model /path/to/base --right-model /path/to/base --right-adapter /path/to/adapter --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze report --bundle /path/to/bundle
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>
poetry run python scripts/run_measurement_atlas.py --model /path/to/model --manifest data/probes/measurement_atlas_casing.json --manifest data/probes/measurement_atlas_profanity_tone.json --manifest data/probes/measurement_atlas_grounded_hallucination.json --output-root results/measurement_atlas

These commands emit an observation bundle under results/analysis/<timestamp-slug>/ by default:

  • manifest.json
  • summary.json
  • REPORT.md
  • variants.jsonl
  • layer_metrics.jsonl
  • comparisons.jsonl

The prompt-family interface is explicit in phase 1. Each row includes: case_id, variant_id, text, optional tags, and optional comparison_to.
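The per-row fields above can be sketched as a minimal manifest file. Note that only the row fields (case_id, variant_id, text, tags, comparison_to) come from the documented interface; the top-level "cases" key and the filename here are illustrative assumptions, not the actual PromptFamilyManifest layout.

```python
import json

# Hypothetical minimal prompt-family manifest. The "cases" wrapper key is
# an assumption for illustration; only the per-row fields are documented.
manifest = {
    "cases": [
        {"case_id": "geodesic", "variant_id": "base",
         "text": "Explain geodesics.", "tags": ["control"]},
        {"case_id": "geodesic", "variant_id": "typo",
         "text": "Explain geodesiccs.", "tags": ["perturbed"],
         "comparison_to": "base"},  # minimal pair against the base variant
    ]
}

with open("my_family.json", "w") as f:
    json.dump(manifest, f, indent=2)
```

A minimal-pair study is then two rows sharing a case_id, with the perturbed row pointing its comparison_to at the control variant.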

For research-only generation tracing, the measurement atlas runner writes a family artifact under results/measurement_atlas/<run_id>/ with:

  • run_manifest.json
  • summary.json
  • REPORT.md
  • ledger.tsv
  • variants.jsonl
  • sequence_metrics.jsonl
  • step_metrics.jsonl
  • space_step_metrics.jsonl
  • comparisons.jsonl
  • onset_events.jsonl

The retained replay-alignment closure for the shipped 350M atlas pack lives in results/measurement_atlas/REPORT.md. Current observed atlas surfaces are replay={hidden, embedding} and live={hidden}; run_manifest.json now records requested vs observed surfaces separately, so the bundle does not overclaim unsupported replay-space coverage. mc analyze report --bundle ... can read both standard mc analyze bundles and these atlas artifact directories, while atlas generation itself remains research-only in scripts/run_measurement_atlas.py.

Train When You Want To Act On The Measurements

poetry run mc train run --model /path/to/model --data /path/to/dataset --output /path/to/adapter

No learning rate. No rank selection. No warmup schedule. No gradient clipping. The optimizer and step sizes are derived from measured geometry rather than copied recipes.
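The spectral quantities such derivations rest on are cheap to measure. As an illustration of the underlying primitive (and only that; this is a pure-Python sketch, not ModelCypher's MASS implementation), the largest singular value of a weight matrix can be estimated by power iteration on W^T W:

```python
# Minimal power-iteration estimate of sigma_max(W), the largest singular
# value -- the primitive behind spectral step-size ceilings. Illustrative
# sketch only, not the repo's MASS implementation.

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def sigma_max(W, iters=100):
    """Estimate the top singular value of W via power iteration on W^T W."""
    v = [1.0] * len(W[0])
    for _ in range(iters):
        u = matvec(transpose(W), matvec(W, v))   # W^T W v
        norm = sum(x * x for x in u) ** 0.5
        v = [x / norm for x in u]
    Wv = matvec(W, v)
    return sum(x * x for x in Wv) ** 0.5

# For a diagonal matrix the singular values are the absolute entries:
print(sigma_max([[3.0, 0.0], [0.0, 1.0]]))  # ~3.0
```

A derived step-size ceiling would then be some closed-form function of measurements like this one, rather than a hand-tuned constant.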

Need extra instrumentation? Use flags on the same command path, such as --benchmark, --topo-monitor, --dim-monitor, or --entropy-reg.

What Gets Derived

| # | What Industry Tunes | What ModelCypher Derives | Source |
|---|---------------------|--------------------------|--------|
| 1 | Learning rate (1e-4) | MASS spectral ceiling | Weyl 1912; Loizou 2020 |
| 2 | Adam epsilon (1e-8) | Spectral noise floor | IEEE 754 + SVD |
| 3 | Momentum (0.9/0.999) | Cayley-Stiefel retraction | Wen & Yin 2013; Wang 2025 |
| 4 | Weight decay (0.01) | Condition ratio sigma_k / sigma_max | SVD |
| 5 | Gradient clipping (1.0) | Removed: MASS bounds by construction | Weyl 1912 |
| 6 | Warmup (5-10% steps) | Removed: geometric LR stable from step 0 | Ma & Yarats 2021 |
| 7 | LR schedule (cosine) | Removed: MASS is per-step, no schedule needed | Defazio 2024 |
| 8 | Batch size | Gradient noise scale B_crit | McCandlish 2018 |
| 9 | Early stopping (patience) | 4 geometric criteria | SVD + IEEE 754 |
| 10 | LoRA scale (alpha/rank) | Spectral bound sigma_k(W) / \|\|BA\|\| | Weyl perturbation theory |
| 11 | LoRA rank (8) | Null-space capacity tail_dims | Shannon effective rank |
| 12 | Target modules (q+v) | Spectral decay analysis | SVD per layer |
| 13 | Dropout (0.1) | Product of two spectral ratios | Roy & Vetterli 2007 |
| 14 | Weight init (random A, zero B) | Spectrally normalized to sigma_k | SVD |
| 15 | Residual scaling (1) | Per-layer sigma_max(x) / sigma_max(f(x)) | Power iteration |

Full derivations with formulas: Geometric Hyperparameter Rosetta Stone
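To make one of these concrete: the "Shannon effective rank" cited for rank selection has a standard closed form (Roy & Vetterli 2007), the exponential of the Shannon entropy of the normalized singular-value distribution. The sketch below computes that textbook quantity; it is not the repo's tail_dims computation.

```python
import math

def effective_rank(singular_values):
    """Roy & Vetterli (2007) effective rank: exp of the Shannon entropy
    of the singular-value distribution normalized to sum to 1."""
    total = sum(singular_values)
    p = [s / total for s in singular_values if s > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return math.exp(entropy)

# A flat spectrum has effective rank equal to its length (~4.0 here)...
print(effective_rank([1.0, 1.0, 1.0, 1.0]))
# ...while a fast-decaying spectrum concentrates in few directions (~1.4):
print(effective_rank([10.0, 1.0, 0.1, 0.01]))
```

The appeal of quantities like this is that they are continuous and measurable per layer, so a rank choice becomes a readout of the spectrum rather than a fixed integer guess.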

Quick Start

git clone https://github.com/Ethyros-AI/ModelCypher.git
cd ModelCypher
poetry install          # Python 3.11+
poetry run mc --help    # Verify CLI install
# Inspect a model's per-layer geometry
poetry run mc model info /path/to/model

# Build an observation bundle from one prompt
poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."

# Build an observation bundle from a prompt family
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json

# Re-read an existing bundle and print the shared report view
poetry run mc analyze report --bundle /path/to/bundle

# Re-read a retained measurement-atlas artifact through the same report path
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>

# Layer-wise intrinsic dimension profile
poetry run mc analyze dimension-profile --model /path/to/model --samples 50

# LoRA adapter spectral analysis
poetry run mc analyze lora-svd /path/to/adapter --base /path/to/model

# Train a LoRA adapter after inspecting the model
poetry run mc train run --model /path/to/model --data /path/to/data.jsonl --output /path/to/adapter

Evidence Snapshot

| Question | What retained artifacts show | Tag |
|----------|------------------------------|-----|
| Does the measurement layer exist as a real workflow? | Yes. mc analyze capture, mc analyze family, and mc analyze compare now emit observation bundles with machine-readable artifacts plus a short report. | [EMPIRICAL] |
| Does a canonical training path exist? | Yes. mc train run is the shipped geometry-derived runtime path guarded by pipeline_gate_v1. | [EMPIRICAL] |
| Does the retained 350M validation bundle close preservation? | No. results/pipeline_validation/verdict.json reports structural pass 5/5, inference pass 3/5, all_pass = false. | [EMPIRICAL] |
| Does the retained evidence close "better than standard practice"? | No. results/nblora_vs_standard/ is retained as summary_only, and the retained single-seed LFM2-350M summary does not support superiority of nb_lora over the kept baselines. | [EMPIRICAL] |
| Does the 8B bundle close efficacy? | No. results/g5_8b_validation_multiseed/multiseed_gates.json still fails cka_ok and degenerate_ok. | [EMPIRICAL] |
| Is quantization promising? | Yes, as a measurement surface: results/quantization_frontier/20260227T235714Z/quantization_frontier.json shows PPL and degeneration improvement on all 3 retained models, but the frontier law is still open. | [EMPIRICAL] |

Falsified Training Claims

| Hypothesis | Result | Tag |
|------------|--------|-----|
| REINFORCE on 350M | Gradient orthogonal to CE; degradation monotonic with steps | [DISPROVEN] |
| SFT on reasoning traces | Format memorization: PPL drops, inference degrades | [DISPROVEN] |
| Pullback metric P = MM^T | P ≈ I throughout training (median deviation 0.001) | [DISPROVEN] |
| Stable rank predicts adapter rank | Pearson r = -0.51 vs tail_dims; measures a different property | [DISPROVEN] |
| Constrained training (paired) | Constraints monotonically hurt | [DISPROVEN] |

We publish failures because intellectual honesty is not optional. Current training blockers and exit criteria live in MISSION.md and RESEARCH-ROADMAP.md.

Measurement Toolkit

mc analyze is organized around five canonical workflows:

  • capture: measure one prompt or prompt file
  • family: run explicit minimal-pair or perturbation studies
  • compare: compare two targets on the same prompt family
  • report: read an existing bundle and render the shared high-signal view
  • probe: targeted probe and red-team workflows

Expert instruments remain directly callable when you want the underlying measurements without the bundle wrapper:

  • geometry and trajectory: reasoning-flow, geodesic-profile, entropy-trajectory, chain-profile, dimension-profile, jacobian-trace
  • probe and monitoring: calibrate-safety, jailbreak-test, probe-redteam, probe-behavioral, circuit-breaker, entropy-pattern
  • diagnostics and adapter analysis: lora-svd, benchmark, knowledge-type, curriculum-profile, crm-build, crm-compare

Full reference: CLI-REFERENCE.md

Architecture

Hexagonal (ports-and-adapters) with strict domain boundaries:

  • Core domain (core/domain/) — pure geometry and math, zero framework imports
  • Use cases (core/use_cases/) — orchestration, cannot import from adapters
  • Adapters (adapters/) — HuggingFace Hub, filesystem, model loading
  • Backends — MLX (primary, Apple Silicon), CUDA, JAX behind a protocol interface

All geometric computations are framework-agnostic. Backend selection is automatic.
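The protocol boundary behind automatic backend selection can be sketched with typing.Protocol. Every name below (GeometryBackend, svdvals, pick_backend) is hypothetical, chosen to illustrate the ports-and-adapters shape, not the repo's actual interface:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class GeometryBackend(Protocol):
    """Hypothetical backend port; names are illustrative, not the repo's API."""
    name: str
    def svdvals(self, matrix: list[list[float]]) -> list[float]: ...

class PurePythonBackend:
    """Toy adapter satisfying the port with zero framework imports."""
    name = "pure-python"
    def svdvals(self, matrix):
        # A real adapter would dispatch to MLX, CUDA, or JAX here.
        raise NotImplementedError

def pick_backend(candidates):
    """Automatic selection: return the first candidate satisfying the port."""
    for backend in candidates:
        if isinstance(backend, GeometryBackend):
            return backend
    raise RuntimeError("no usable backend")

print(pick_backend([PurePythonBackend()]).name)  # pure-python
```

The point of the protocol is the dependency direction: the core domain types against the port, and each framework lives entirely in its adapter.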

Documentation

| Document | What It Covers |
|----------|----------------|
| Start Here | Installation, first observation bundle, and downstream training |
| Observation Bundles | PromptFamilyManifest, ObservationBundle, and ready-to-run perturbation manifests |
| Geometry Guide | Interpreting CKA, intrinsic dimension, curvature, and entropy measurements |
| Training Guide | Downstream adapter workflows and dataset preparation |
| CLI Reference | Workflow-first mc analyze plus expert command examples |
| Mission | Measurement-first mission and derived training standards |
| Glossary | 60+ term definitions |
| Architecture | Hexagonal architecture and domain boundaries |
| Bibliography | All cited papers with local reference PDFs |

Research Papers

| Paper | Status | Thesis |
|-------|--------|--------|
| The Shape of Knowledge | [EMPIRICAL] | Knowledge has measurable geometric structure; inference is trajectory |
| Invariant Semantic Structure | [PROVEN] intra-model; [CONJECTURAL] cross-model | CKA alignment invariance across layers (by construction on training probes) |
| Entropy Safety Signal | [CONJECTURAL] | Behavioral drift detection via entropy differentials |
| Cross-Architecture Transfer | [CONJECTURAL] | Knowledge transfer between model families via Procrustes alignment |
| ModelCypher Toolkit | [EMPIRICAL] | Implementation methodology and CLI design |
| The Semantic Highway | [EMPIRICAL] | Layer-wise intrinsic dimension compression (15.8 → 1.8 → 9.6) |

Test Suite

6,809 tests. Includes Hypothesis property-based tests for numerical invariants (CKA symmetry, spectral bounds, null-space orthogonality).

poetry run pytest                              # Standard run
HYPOTHESIS_PROFILE=full poetry run pytest       # Full property-based testing
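The CKA-symmetry invariant mentioned above reduces to a plain assertion once linear CKA is written out (in the real suite Hypothesis generates the inputs; here the matrices are fixed toy data, and this pure-Python implementation of the standard formula is a sketch, not the repo's):

```python
# Sketch of the CKA-symmetry invariant the suite property-tests.
# Linear CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F).

def gram_fro(A, B):
    """Frobenius norm of A^T B for row-major matrices A (n x p), B (n x q)."""
    total = 0.0
    for i in range(len(A[0])):
        for j in range(len(B[0])):
            entry = sum(A[k][i] * B[k][j] for k in range(len(A)))
            total += entry * entry
    return total ** 0.5

def linear_cka(X, Y):
    return gram_fro(X, Y) ** 2 / (gram_fro(X, X) * gram_fro(Y, Y))

X = [[1.0, 0.5], [0.2, -1.0], [0.3, 0.8]]
Y = [[0.9, -0.1], [0.1, -1.1], [0.4, 0.7]]
assert abs(linear_cka(X, Y) - linear_cka(Y, X)) < 1e-12  # symmetry
assert abs(linear_cka(X, X) - 1.0) < 1e-12               # self-similarity
```

A property-based test would replace the fixed X and Y with generated matrices and assert the same two invariants across thousands of cases.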

Platform Support

| Platform | Backend | Status |
|----------|---------|--------|
| macOS Apple Silicon (M1-M4) | MLX | Primary (optimized) |
| Linux + NVIDIA GPU | CUDA (PyTorch) | Supported |
| Linux + TPU | JAX | Supported |

Citation

@software{kempf2026modelcypher,
  author = {Kempf, Jason},
  title = {ModelCypher: Geometry-First LoRA Training for LLMs},
  year = {2026},
  url = {https://github.com/Ethyros-AI/ModelCypher},
  license = {AGPL-3.0}
}

License

AGPL-3.0. See LICENSE.
