Athul0491/tiny-reasoning-model
🧠 Tiny Recursive Model (TRM) — End-to-End Reasoning System

An end-to-end implementation of a Tiny Recursive Model (TRM) reasoning system inspired by:

Less is More: Recursive Reasoning with Tiny Networks
Alexia Jolicoeur-Martineau et al.

The project reproduces the core idea of TRM: a single small neural network that performs iterative reasoning by repeatedly updating an internal latent state and refining its answer over multiple steps.

It includes the full pipeline:

  • Synthetic data generation
  • Offline training with deep supervision
  • Model checkpointing
  • Inference-only serving using trained weights

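As a concrete illustration of the first stage, a minimal synthetic addition dataset might look like the sketch below. The one-hot encoding and shapes here are assumptions for illustration; see `datasets/` for the actual task format.

```python
import numpy as np

def make_addition_dataset(n_samples, max_operand=9, seed=0):
    """Generate synthetic single-digit addition problems.

    Returns one-hot encoded operand pairs x and integer sum labels y.
    (The exact encoding used by datasets/ is an assumption.)
    """
    rng = np.random.default_rng(seed)
    a = rng.integers(0, max_operand + 1, size=n_samples)
    b = rng.integers(0, max_operand + 1, size=n_samples)
    # One-hot encode each operand and concatenate: shape (n, 2 * (max_operand + 1))
    eye = np.eye(max_operand + 1)
    x = np.concatenate([eye[a], eye[b]], axis=1)
    y = a + b  # labels in [0, 2 * max_operand]
    return x, y

x, y = make_addition_dataset(1000)
```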
🔬 Research Background

Motivation

Most modern reasoning systems rely on very large language models (LLMs).
TRM challenges this by showing that iterative computation with a small model can match or outperform much larger networks on hard reasoning tasks.

Instead of scaling parameters, TRM scales thinking time.

Core TRM Idea

TRM maintains three states:

  • x — problem input (fixed)
  • y — current answer guess (updated over time)
  • z — latent reasoning state (internal memory)

A single 2‑layer neural network is reused recursively:

  1. Latent reasoning
    Update the internal state z using (x, y, z).

  2. Answer refinement
    Reuse the same network, but zero out x, forcing the model to refine y using only its internal reasoning state.

Repeating this loop multiple times allows the model to “think” before committing to an answer.
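The loop above can be sketched in NumPy. The network width, number of latent updates, and weight initialization here are illustrative assumptions, not the repo's actual hyperparameters; the point is that one small network drives both phases.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                                         # shared width of x, y, z (an assumption)
W1 = rng.normal(scale=0.1, size=(3 * d, d))    # layer 1 of the single shared net
W2 = rng.normal(scale=0.1, size=(d, d))        # layer 2

def net(x, y, z):
    """The one 2-layer network reused for every reasoning step."""
    h = np.tanh(np.concatenate([x, y, z]) @ W1)
    return h @ W2

def trm_step(x, y, z, n_latent=3):
    # 1. Latent reasoning: update z several times using (x, y, z)
    for _ in range(n_latent):
        z = net(x, y, z)
    # 2. Answer refinement: same network, x zeroed out, so y is
    #    refined from the internal reasoning state alone
    y = net(np.zeros_like(x), y, z)
    return y, z

x = rng.normal(size=d)
y, z = np.zeros(d), np.zeros(d)
for _ in range(4):           # more outer steps = more "thinking" time
    y, z = trm_step(x, y, z)
```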

Deep Supervision

Training uses deep supervision instead of only supervising the final output:

  • Run multiple full reasoning cycles
  • Apply loss after each answer refinement step
  • Improve training stability and convergence
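A minimal PyTorch sketch of this training signal is below. The module shapes, the averaging over cycles, and the detaching of state between supervision steps are assumptions for illustration; the actual loop lives in `train/`.

```python
import torch
import torch.nn as nn

d, n_classes = 16, 19
net = nn.Sequential(nn.Linear(3 * d, d), nn.Tanh(), nn.Linear(d, d))
head = nn.Linear(d, n_classes)    # decodes the answer state y into logits
opt = torch.optim.Adam(list(net.parameters()) + list(head.parameters()), lr=1e-3)

def deep_supervision_loss(x, target, n_cycles=4, n_latent=3):
    """Apply the loss after every answer refinement, not just the last one."""
    y = torch.zeros_like(x)
    z = torch.zeros_like(x)
    loss = 0.0
    for _ in range(n_cycles):
        for _ in range(n_latent):                                # latent reasoning
            z = net(torch.cat([x, y, z], dim=-1))
        y = net(torch.cat([torch.zeros_like(x), y, z], dim=-1))  # refine answer
        loss = loss + nn.functional.cross_entropy(head(y), target)
        # Detach state so each cycle's backward pass stays short
        # (whether the repo detaches here is an assumption).
        y, z = y.detach(), z.detach()
    return loss / n_cycles

x = torch.randn(8, d)
target = torch.randint(0, n_classes, (8,))
loss = deep_supervision_loss(x, target)
loss.backward()
opt.step()
```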

📁 Project Structure

The repo is organized like a small research → production ML system:

  • datasets/
    Synthetic arithmetic task used for training and validation.

  • train/
    PyTorch training loop implementing TRM with deep supervision.

  • inference/
    Inference-only pipeline that loads trained checkpoints and runs recursive reasoning.

  • core/
    NumPy reference implementation of the recursive reasoning logic (for clarity, ablations, and validation).

  • checkpoints/
    Saved model weights.

  • tests/
    Unit tests validating correctness and stability.

NumPy vs PyTorch

  • NumPy – reference implementation for research clarity and ablations.
  • PyTorch – main path for training and production-style inference.

🚀 Inference (Production Path)

Inference uses only the trained model checkpoint.

Running Inference

python inference/run_inference.py

Example behavior: the model correctly predicts sums such as 3 + 7 = 10 and 5 + 9 = 14 via iterative refinement.

What Happens During Inference

  1. Load trained weights from disk
  2. Initialize empty answer and latent state
  3. Run recursive reasoning loops
  4. Decode the final answer
  5. Return the prediction
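The five steps above can be sketched as follows. Random weights stand in for the checkpoint loaded in step 1, and all names and sizes are assumptions rather than the repo's actual API in `inference/run_inference.py`.

```python
import torch
import torch.nn as nn

d, n_classes = 16, 19
# Stand-ins for weights that would be loaded from checkpoints/ (step 1).
net = nn.Sequential(nn.Linear(3 * d, d), nn.Tanh(), nn.Linear(d, d))
head = nn.Linear(d, n_classes)

def run_inference(x, n_cycles=4, n_latent=3):
    y = torch.zeros_like(x)                  # step 2: empty answer + latent state
    z = torch.zeros_like(x)
    with torch.no_grad():                    # step 3: recursive reasoning loops
        for _ in range(n_cycles):
            for _ in range(n_latent):
                z = net(torch.cat([x, y, z], dim=-1))
            y = net(torch.cat([torch.zeros_like(x), y, z], dim=-1))
    return head(y).argmax(dim=-1)            # steps 4-5: decode and return

pred = run_inference(torch.randn(1, d))
```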

🧠 How Correctness Is Validated

Correctness is validated in multiple stages:

  1. Unit tests
    Check shapes, convergence logic, and pipeline integrity.

  2. Dynamical checks
    Inspect latent-state evolution and recursive stability.

  3. Learning signal
    Show that accuracy improves significantly beyond random guessing.

  4. Ablations
    Vary recursion depth and supervision steps to confirm expected behavior.
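A dynamical check of the kind described in stage 2 can be written as a plain unit test. The weight scales and bound below are illustrative assumptions; the repo's actual tests live in `tests/`.

```python
import numpy as np

def test_latent_state_stays_bounded():
    """Dynamical check: z should not blow up over many recursive steps."""
    rng = np.random.default_rng(0)
    d = 16
    W1 = rng.normal(scale=0.1, size=(3 * d, d))
    W2 = rng.normal(scale=0.1, size=(d, d))
    x, y, z = rng.normal(size=d), np.zeros(d), np.zeros(d)
    for _ in range(50):
        # One latent-reasoning update of z using the shared 2-layer net
        z = np.tanh(np.concatenate([x, y, z]) @ W1) @ W2
    assert np.isfinite(z).all()
    # tanh bounds the hidden layer, so z cannot diverge
    assert np.abs(z).max() < 100.0

test_latent_state_stays_bounded()
```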

📊 System Design

This project includes:

  • data generation
  • offline training
  • model artifact saving
  • inference pipeline
  • evaluation and diagnostics
  • reproducibility and tests
  • clear separation between research code and serving code

This mirrors patterns used in production ML/MLOps systems.

📚 Reference

Paper:

Less is More: Recursive Reasoning with Tiny Networks
Alexia Jolicoeur-Martineau et al.
arXiv:2510.04871

Core insight: recursive reasoning with a single tiny network can rival much larger models by iterating computation rather than scaling parameters.

