An end-to-end implementation of a Tiny Recursive Model (TRM) reasoning system inspired by:
Less is More: Recursive Reasoning with Tiny Networks
Alexia Jolicoeur-Martineau et al.
The project reproduces the core idea of TRM: a single small neural network that performs iterative reasoning by repeatedly updating an internal latent state and refining its answer over multiple steps.
It includes the full pipeline:
- Synthetic data generation
- Offline training with deep supervision
- Model checkpointing
- Inference-only serving using trained weights
Most modern reasoning systems rely on very large models (LLMs).
TRM challenges this by showing that iterative computation with small models can match or outperform much larger networks on hard reasoning tasks.
Instead of scaling parameters, TRM scales thinking time.
TRM maintains three states:
- `x`: problem input (fixed)
- `y`: current answer guess (updated over time)
- `z`: latent reasoning state (internal memory)
A single 2‑layer neural network is reused recursively:
1. **Latent reasoning**: update the internal state `z` using `(x, y, z)`.
2. **Answer refinement**: reuse the same network, but zero out `x`, forcing the model to refine `y` using only its internal reasoning state.
Repeating this loop multiple times allows the model to “think” before committing to an answer.
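The two-phase loop above can be sketched in NumPy. The dimensions, weight shapes, and step counts here are illustrative placeholders, not the repo's actual architecture; the point is the control flow: one shared network, many latent updates, then an answer refinement with `x` zeroed out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
D = 8    # width of x, y, z
H = 16   # hidden width of the shared 2-layer network

# One tiny 2-layer MLP, reused for both phases.
W1 = rng.normal(0, 0.1, (3 * D, H))
W2 = rng.normal(0, 0.1, (H, D))

def net(x, y, z):
    """The single shared network: maps (x, y, z) to an update vector."""
    h = np.tanh(np.concatenate([x, y, z]) @ W1)
    return h @ W2

def trm_cycle(x, y, z, n_latent=4):
    # Phase 1 - latent reasoning: update z several times using (x, y, z).
    for _ in range(n_latent):
        z = net(x, y, z)
    # Phase 2 - answer refinement: same network, x zeroed out,
    # so y is refined from the latent state alone.
    y = net(np.zeros_like(x), y, z)
    return y, z

x = rng.normal(size=D)   # fixed problem input
y = np.zeros(D)          # initial answer guess
z = np.zeros(D)          # initial latent state
for _ in range(3):       # three full reasoning cycles ("thinking time")
    y, z = trm_cycle(x, y, z)
```

With trained weights, extra cycles let the answer settle before it is decoded; here the weights are random, so only the shapes and the loop structure are meaningful.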
Training uses deep supervision instead of only supervising the final output:
- Run multiple full reasoning cycles
- Apply loss after each answer refinement step
- Improve training stability and convergence
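The deep-supervision pattern above can be sketched as follows. The `refine` function is a toy stand-in for one full reasoning cycle (a single random matrix rather than the repo's actual network); what matters is that a loss is recorded after *every* refinement step and summed, rather than supervising only the final answer.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
W = rng.normal(0, 0.1, (3 * D, D))  # toy stand-in for the shared network

def refine(x, y, z):
    """One reasoning cycle: update z, then refine y with x zeroed out."""
    z = np.tanh(np.concatenate([x, y, z]) @ W)
    y = np.tanh(np.concatenate([np.zeros_like(x), y, z]) @ W)
    return y, z

def deep_supervision_losses(x, target, n_cycles=4):
    """Run n_cycles reasoning cycles, applying the loss after each one."""
    y = np.zeros_like(x)
    z = np.zeros_like(x)
    losses = []
    for _ in range(n_cycles):
        y, z = refine(x, y, z)
        losses.append(float(np.mean((y - target) ** 2)))  # per-step MSE
    # Training optimizes the sum; in PyTorch, calling .backward() on this
    # sum propagates gradients through every intermediate cycle.
    return losses, sum(losses)

x = rng.normal(size=D)
target = np.tanh(x)  # hypothetical supervision target for illustration
per_step, total = deep_supervision_losses(x, target)
```

Because every intermediate answer receives a gradient signal, early cycles are pushed toward useful partial solutions instead of relying on the final step alone, which is what stabilizes training.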
The repo is organized like a small research → production ML system:
- `datasets/`: synthetic arithmetic task used for training and validation.
- `train/`: PyTorch training loop implementing TRM with deep supervision.
- `inference/`: inference-only pipeline that loads trained checkpoints and runs recursive reasoning.
- `core/`: NumPy reference implementation of the recursive reasoning logic (for clarity, ablations, and validation).
- `checkpoints/`: saved model weights.
- `tests/`: unit tests validating correctness and stability.
- NumPy – reference implementation for research clarity and ablations.
- PyTorch – main path for training and production-style inference.
Inference uses only the trained model checkpoint.
```bash
python inference/run_inference.py
```

Example behavior: the model correctly predicts sums such as 3 + 7 = 10 and 5 + 9 = 14 via iterative refinement.
- Load trained weights from disk
- Initialize empty answer and latent state
- Run recursive reasoning loops
- Decode the final answer
- Return the prediction
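The five inference steps above can be sketched end to end. The checkpoint keys, dimensions, and decoding rule here are assumptions for illustration (the repo's real checkpoint format and `run_inference.py` may differ); the structure is what the steps describe: load weights, start from empty states, loop, decode.

```python
import numpy as np

def run_inference(checkpoint, x, n_cycles=3, n_latent=4):
    """Inference-only recursive reasoning from a dict of weight arrays.
    Key names ("W1", "W2") are illustrative, not the repo's actual keys."""
    W1, W2 = checkpoint["W1"], checkpoint["W2"]

    def net(x_, y_, z_):
        return np.tanh(np.concatenate([x_, y_, z_]) @ W1) @ W2

    y = np.zeros_like(x)           # empty answer guess
    z = np.zeros_like(x)           # empty latent state
    for _ in range(n_cycles):      # recursive reasoning loops
        for _ in range(n_latent):
            z = net(x, y, z)       # latent reasoning
        y = net(np.zeros_like(x), y, z)  # answer refinement
    return int(np.argmax(y))       # decode, e.g. pick the highest-scoring class

# In the repo this would come from a saved checkpoint on disk;
# here we fabricate random weights just to exercise the pipeline.
rng = np.random.default_rng(0)
D = 8
ckpt = {"W1": rng.normal(0, 0.1, (3 * D, 16)),
        "W2": rng.normal(0, 0.1, (16, D))}
pred = run_inference(ckpt, rng.normal(size=D))
```

No training machinery is touched at serve time: the function only needs the weight arrays and the recursion hyperparameters, which is what makes the inference path lightweight.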
Correctness is validated in multiple stages:
- **Unit tests**: check shapes, convergence logic, and pipeline integrity.
- **Dynamical checks**: inspect latent state evolution and recursive stability.
- **Learning signal**: show that accuracy improves significantly beyond random guessing.
- **Ablations**: vary recursion depth and supervision steps to confirm expected behavior.
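A minimal version of the dynamical check and depth ablation can be sketched together: sweep the recursion depth and measure how much the answer still moves between consecutive depths. The toy network (one small random matrix) is an assumption for illustration; with small weights the update map contracts, so the deltas should shrink as depth grows, which is the stability signature the checks look for.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
W = rng.normal(0, 0.05, (3 * D, D))  # small weights -> contractive updates

def answer_after(x, depth):
    """Run `depth` full reasoning cycles and return the final answer vector."""
    y = np.zeros_like(x)
    z = np.zeros_like(x)
    for _ in range(depth):
        z = np.tanh(np.concatenate([x, y, z]) @ W)
        y = np.tanh(np.concatenate([np.zeros_like(x), y, z]) @ W)
    return y

x = rng.normal(size=D)
# Ablation: how much does the answer change when one more cycle is added?
deltas = [float(np.linalg.norm(answer_after(x, d + 1) - answer_after(x, d)))
          for d in range(1, 8)]
# Shrinking deltas indicate the recursion is converging rather than diverging.
```

On a trained model the same sweep would also be paired with accuracy per depth, to confirm that extra thinking time helps up to the depth used during training.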
This project includes:
- data generation
- offline training
- model artifact saving
- inference pipeline
- evaluation and diagnostics
- reproducibility and tests
- clear separation between research code and serving code
This mirrors patterns used in production ML/MLOps systems.
Paper:
Less is More: Recursive Reasoning with Tiny Networks
Alexia Jolicoeur-Martineau et al.
arXiv:2510.04871
Core insight: recursive reasoning with a single tiny network can rival much larger models by iterating computation rather than scaling parameters.