ModelCypher's training surface is a workbench, not just a single command. The workflow is:
inspect -> plan -> train -> evaluate -> compare -> export
The user-facing value is straightforward: ModelCypher derives target modules, ranks, stopping signals, and controller quantities from the model and data so you do not hand-tune folklore hyperparameters.
`mc train run` is the shipped training path.

- Its control plane is geometry-derived.
- The repo has not yet closed a promotable head-to-head advantage over standard practice on real benchmark suites.
- That is a current limitation of a shipped tool, not a reason to pretend the workbench does not exist.
Training-related commands available now:

- `mc train run`
- `mc train evaluate`
- `mc train compare`
- `mc train export`
- `mc train merge`
- `mc train status`
- `mc train validate-derived`
- `mc train star`
```
poetry run mc model info /path/to/model
poetry run mc model capacity /path/to/model --sort-by recommended-rank
```

```
poetry run mc train run \
  --model /path/to/model \
  --data /path/to/train.jsonl \
  --plan-only
```

This resolves the exact training plan without mutating model state. Use it to see the derived surface before you commit to a run.
```
poetry run mc train run \
  --model /path/to/model \
  --data /path/to/train.jsonl \
  --output /path/to/adapter
```

```
poetry run mc train evaluate \
  --model /path/to/model \
  --adapter /path/to/adapter \
  --data /path/to/validation.jsonl
```

```
poetry run mc train compare \
  --model /path/to/model \
  --adapter-a /path/to/adapter \
  --data /path/to/validation.jsonl
```

```
poetry run mc train export \
  --model /path/to/model \
  --adapter /path/to/adapter \
  --output /path/to/deployment_dir \
  --target deployment_quantized
```

`mc train run` consumes JSONL records in either of two shapes:
- `{"text": "..."}`
- `{"messages": [{"role": "...", "content": "..."}]}`

Examples:

```
{"text": "User: What is 2+2?\nAssistant: 4"}
{"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi!"}]}
```

If your source data is not already JSONL, use:

```
poetry run mc data prepare /path/to/source --output /path/to/train.jsonl
```

`mc train run` is the canonical geometry-derived LoRA training command. The goal is not to expose more knobs. The goal is to derive the plan, show it when asked, execute it without drift, and leave you with evidence about what happened.
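Before pointing a run at a file, you can sanity-check that every line matches one of the two accepted record shapes. This is a standalone sketch, not part of ModelCypher; the validation logic is an assumption about what "either shape" implies:

```python
import json

def valid_record(line: str) -> bool:
    """Return True if a JSONL line matches {"text": ...} or {"messages": [...]}."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return False
    if isinstance(obj.get("text"), str):
        return True
    msgs = obj.get("messages")
    return (
        isinstance(msgs, list)
        and len(msgs) > 0
        and all(
            isinstance(m, dict)
            and isinstance(m.get("role"), str)
            and isinstance(m.get("content"), str)
            for m in msgs
        )
    )

# Both documented shapes pass; anything else is rejected:
print(valid_record('{"text": "User: What is 2+2?\\nAssistant: 4"}'))         # True
print(valid_record('{"messages": [{"role": "user", "content": "Hello"}]}'))  # True
print(valid_record('{"prompt": "no text or messages key"}'))                 # False
```

Running this over a file before training catches malformed lines early, before any model state is touched.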
```
poetry run mc train run \
  -m /path/to/model \
  -d /path/to/data.jsonl \
  -o /path/to/adapter \
  --eval-data /path/to/eval.jsonl
```

Options:

- `--model`, `-m` (required)
- `--data`, `-d` (required)
- `--output`, `-o`
- `--eval-data`
- `--benchmark`
- `--no-save`
- `--explain`
- `--plan-only`
- `--seq-length`
- `--seed`
- `--topo-monitor` / `--no-topo-monitor`
- `--dim-monitor` / `--no-dim-monitor`
- `--target-experts`
- `--entropy-reg` / `--no-entropy-reg`
The workbench derives or resolves these surfaces from model and data state:
- target modules
- per-module ranks
- sequence length when omitted
- controller quantities used during training
- stopping and verification surfaces
- seed and eval split defaults
The controller does not expose a fixed scalar learning rate. The key statement is:

```
eta_step = min(eta_ceiling, eta_sps, eta_weyl)
```
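In other words, the step size at each update is the most conservative of three independently measured bounds. A toy illustration of the min rule (the numeric values are made up for the example):

```python
def eta_step(eta_ceiling: float, eta_sps: float, eta_weyl: float) -> float:
    """Per-step learning rate: the tightest of three measured bounds."""
    return min(eta_ceiling, eta_sps, eta_weyl)

# Whichever bound is currently tightest wins:
print(eta_step(3e-4, 1.2e-4, 5e-4))  # 0.00012 -- eta_sps binds
print(eta_step(3e-4, 9e-4, 5e-4))    # 0.0003  -- eta_ceiling binds
```

Because all three quantities are re-measured during training, the binding constraint can change from step to step without any user-tuned schedule.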
`mc train run` surfaces training state in three buckets:

- `derived_now` — fixed before training starts: seed, output path, sequence length, eval split, target modules, per-module ranks, optimizer geometry config.
- `measured_during_training` — runtime controller quantities: `eta_ceiling`, `eta_sps`, `eta_weyl`, `eta_step`, gradient-noise-derived batch size, and stopping certificate signals.
- `verified_after_training` — post-training gates: spectral bounds, CKA, degeneration, pipeline gate, and optional benchmark delta when `--benchmark` is enabled.
```
poetry run mc train run \
  -m /path/to/model \
  -d /path/to/data.jsonl \
  --plan-only
```

Use this when you want the exact resolved plan without injecting adapters or creating output directories.
Example text output:

```
Resolved training plan
Model: /path/to/model
Dataset: /path/to/data.jsonl
Eval: derived split (pilot_variance)
Seed: 123456789 (derived_from_model_dataset_hash)
Output: /path/to/adapters/model-geometric-lora-123456789
Seq length: 256 (data_derived_max_token_length)
Split: pilot_variance | train=480 eval=32
Target surface: 96 modules | ranks=4-16 | params~1,572,864
Spectral bounds: sigma_k_min=2.1e-02 | sigma_max=8.7e+00 | ceiling=RMT signal-rank
Controller: no fixed scalar LR; MASS will choose eta_step = min(eta_ceiling, eta_sps, eta_weyl) online
Measured during training: eta_sps, eta_weyl, eta_step, gradient-noise batch size, stopping certificate, preservation telemetry
Verified after training: spectral bounds, CKA, degeneration, pipeline gate, optional benchmark delta
Benchmark: opt-in only; add --benchmark quick for pre/post task scores
```
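The Seed line is tagged `derived_from_model_dataset_hash`: the same model/dataset pair always resolves to the same seed, so plans are reproducible. This sketch shows one way such a derivation could work; it is illustrative only, not ModelCypher's actual algorithm:

```python
import hashlib

def derived_seed(model_path: str, dataset_path: str) -> int:
    """Deterministic seed from a model/dataset identity (illustrative only)."""
    digest = hashlib.sha256(f"{model_path}|{dataset_path}".encode()).hexdigest()
    return int(digest[:16], 16) % (2**31)  # fold into a typical RNG seed range

# Same inputs -> same seed, so a re-run reproduces the same plan:
a = derived_seed("/path/to/model", "/path/to/data.jsonl")
b = derived_seed("/path/to/model", "/path/to/data.jsonl")
print(a == b)  # True
```

The practical consequence is that `--plan-only` output is stable: resolving the same pair twice shows the same seed, output path, and split.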
```
poetry run mc train run \
  -m /path/to/model \
  -d /path/to/data.jsonl \
  --explain \
  --benchmark quick
```

This prints the resolved summary and then continues into training.
Evaluate a trained adapter against the base model. The command supports three modes; choose exactly one per run.
```
poetry run mc train evaluate \
  -m /path/to/model \
  -a /path/to/adapter \
  --prompts /path/to/eval_prompts.jsonl
```

Use this when you want side-by-side generations on a prompt set.

```
poetry run mc train evaluate \
  -m /path/to/model \
  -a /path/to/adapter \
  -d /path/to/validation.jsonl
```

Use this when you want loss- or perplexity-style validation.

```
poetry run mc train evaluate \
  -m /path/to/model \
  -a /path/to/adapter \
  --benchmark quick
```

Use this when you want lm-eval benchmark scores.
Compare two training runs or two adapters side by side.
```
poetry run mc train compare \
  --result-a /path/to/run_a.json \
  --result-b /path/to/run_b.json
```

```
poetry run mc train compare \
  -m /path/to/model \
  --adapter-a /path/to/adapter_a \
  --adapter-b /path/to/adapter_b \
  -d /path/to/validation.jsonl
```

Use this when you want a winner call backed by measured deltas instead of impressionistic model sampling.
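The winner call itself is the easy part once both adapters have been measured on the same validation data. A minimal sketch of a delta-based decision with a tie band for noise; the threshold and metric here are assumptions, not the tool's actual logic:

```python
def winner(loss_a: float, loss_b: float, min_delta: float = 0.01) -> str:
    """Call a winner on validation loss, with a tie band for measurement noise."""
    delta = loss_b - loss_a  # positive means adapter A has lower (better) loss
    if abs(delta) < min_delta:
        return "tie"
    return "adapter_a" if delta > 0 else "adapter_b"

print(winner(2.31, 2.47))   # adapter_a
print(winner(2.31, 2.315))  # tie
```

The tie band matters: declaring a winner on a delta smaller than run-to-run noise is exactly the impressionistic judgment the measured comparison is meant to replace.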
Export saved adapters into explicit deployment targets.
```
poetry run mc train export \
  --model /path/to/model \
  --adapter /path/to/adapter \
  --output /path/to/deployment_dir \
  --target deployment_quantized
```

Available targets:

- `adapter`
- `merged_fp16`
- `deployment_quantized`
Merge learned adapter state into base weights.
```
poetry run mc train merge \
  --agent agent-001 \
  --model /path/to/model \
  --save \
  --output /path/to/merged_model
```

Show current training state for a specific agent/model pair:

```
poetry run mc train status --agent agent-001 --model /path/to/model
```

Counterexample search for derived training. This is useful when you want to stress the current control plane and capture failures systematically.
```
poetry run mc train validate-derived \
  -m /path/to/model \
  -d /path/to/data.jsonl \
  --trials 5 \
  --report-path /tmp/derived-validation.json
```

STaR loop support built on top of the training services:
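Failures captured across repeated trials are most useful aggregated. This sketch summarizes a report file; the JSON schema used here is an assumption for illustration, not the actual `--report-path` format:

```python
import json
import tempfile

def summarize(report_path: str) -> dict:
    """Count passed/failed trials in a hypothetical validation report."""
    with open(report_path) as f:
        report = json.load(f)
    trials = report.get("trials", [])
    failed = [t for t in trials if not t.get("passed", False)]
    return {
        "total": len(trials),
        "failed": len(failed),
        "failure_reasons": sorted({t.get("reason", "unknown") for t in failed}),
    }

# Synthetic report standing in for a real run:
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"trials": [
        {"passed": True},
        {"passed": False, "reason": "stopping_certificate"},
        {"passed": False, "reason": "stopping_certificate"},
    ]}, f)
    path = f.name

print(summarize(path))
# {'total': 3, 'failed': 2, 'failure_reasons': ['stopping_certificate']}
```

Grouping failures by reason is the point of the exercise: a single recurring failure mode is a concrete counterexample against the derived control plane.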
```
poetry run mc train star \
  --model /path/to/model \
  --data /path/to/base_data.jsonl \
  --output /path/to/star_run \
  --rounds 3 \
  --problems-per-round 500
```

Treat this as an advanced workflow, not the default starting point.
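The general STaR (Self-Taught Reasoner) shape is: generate rationales, keep only those whose answers verify, fine-tune on the keepers, and repeat. A schematic of that loop; the `generate`/`verify`/`train` callables are placeholders, not ModelCypher APIs:

```python
from typing import Callable, List, Tuple

def star_loop(
    problems: List[str],
    generate: Callable[[str], Tuple[str, str]],    # problem -> (rationale, answer)
    verify: Callable[[str, str], bool],            # (problem, answer) -> correct?
    train: Callable[[List[Tuple[str, str]]], None],
    rounds: int = 3,
) -> None:
    """Schematic STaR loop: train on self-generated, verified rationales."""
    for _ in range(rounds):
        keep = []
        for p in problems:
            rationale, answer = generate(p)
            if verify(p, answer):      # only correct reasoning chains survive
                keep.append((p, rationale))
        if keep:
            train(keep)                # fine-tune on the filtered set
```

Each round's training data is self-generated but filtered by verification, which is why the workflow needs a reliable correctness check per problem before it is worth running.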
If you want more visibility into model behavior after training:
```
poetry run mc analyze dimension-profile --model /path/to/model
poetry run mc analyze entropy-trajectory --model /path/to/model
poetry run mc analyze spectral-trajectory --model /path/to/model
poetry run mc analyze lora-svd /path/to/adapter --base /path/to/model
```

- The workbench is shipped, but benchmark superiority is still open.
- `--benchmark` is opt-in; it is available, but it is not yet the default path.
- Experimental surfaces like merge and STaR exist, but they are not the core promise of the product today.
If a command fails or you want the exact current signature:
```
poetry run mc train --help
poetry run mc train run --help
poetry run mc train evaluate --help
poetry run mc train compare --help
poetry run mc data --help
```