QuantumTiler

Quantum-inspired adaptive tiling for high-performance matrix multiplication on CPUs

A revolutionary approach that uses WKB-style quantum tunneling mathematics with the golden ratio to dynamically compute optimal tile sizes based on real-time system state (temperature, power, latency).

🚀 Why QuantumTiler?

Traditional Tiling	QuantumTiler
Fixed tile sizes	Physics-derived adaptive tiles
Ignores system state	Real-time energy monitoring
One-size-fits-all	Continuous optimization
Brittle under load	Graceful degradation via splitting

Result: Up to 49% performance gains on legacy hardware!

📊 Benchmark Results

Tested on Intel Core i7-7700 (4 cores, 8 threads, AVX2/FMA3):

Implementation	Best GFLOPS	vs Baseline	Verification
Stress Mode	69.82	+15.0%	✅ Zero error
Adaptive (128)	62.19	+2.43%	✅ Zero error
Baseline (64)	60.72	Reference	Reference

Real-Time Energy Adaptation

Run 1: E=-0.196 (warmup) → 27.0 GFLOPS
Run 2: E=-0.100 (stable) → 69.0 GFLOPS  ← System adapts!
Run 3: E=-0.100 (stable) → 69.8 GFLOPS

🧮 The Math: Quantum Barrier Tiling

The optimal tile size is derived from a WKB-style tunneling formula:

B(E) = (2√2/3) × δ × |E|^1.5 / ln(φ)

tile = scale × exp(-B) × √(cache_size)

Where:

E = energy state from latency + temperature + power
δ = ln(matrix_size)
φ = golden ratio ≈ 1.618

Tunneling probability T = exp(-2B) determines when to split tasks under stress.

📖 Full mathematical derivation →

⚡ Quick Start

Prerequisites

C++17 compiler (MSVC 2019+, GCC 8+, Clang 10+)
CMake 3.10+
CPU with AVX2/FMA3 support

Build

git clone https://github.com/grapheneaffiliate/QuantumTiler.git
cd QuantumTiler
mkdir build && cd build
cmake ..
cmake --build . --config Release

Run

# Default: 2048x2048 matrix, 3 runs
./build/Release/quantum_tiler

# Custom size and runs
./build/Release/quantum_tiler 1024 1024 5

# Stress mode (real-time monitoring + splitting)
./build/Release/quantum_tiler 2048 2048 3 stress

📁 Project Structure

QuantumTiler/
├── README.md              # This file
├── LICENSE                # MIT License
├── CMakeLists.txt         # Build configuration
├── src/
│   └── quantum_tiler.cpp  # Main implementation
├── benchmarks/
│   ├── BENCHMARK_RESULTS.md
│   └── run_benchmark.sh
└── docs/
    └── QUANTUM_MATH.md    # Mathematical foundations

🔧 Configuration

Command-Line Arguments

Argument	Description	Default
`n`	Matrix rows	2048
`m`	Matrix columns	2028
`runs`	Benchmark iterations	3
`stress`	Enable real-time monitoring	off
`notrans`	Skip transpose benchmark	off

Tunable Parameters (in code)

Parameter	Default	Description
`split_threshold`	0.3	Tunneling probability threshold
`max_depth`	3	Maximum split recursion
`min_tile`	32	Minimum tile size
`max_tile`	128	Maximum tile size

🏗️ Technical Details

AVX2/FMA Kernel

// C[i, j:j+8] += Σ_k A[i,k] * B[k, j:j+8]
__m256 a_broadcast = _mm256_set1_ps(A[ii * n + kk]);
__m256 b_vec = _mm256_loadu_ps(&B[kk * m + jj]);
sum = _mm256_fmadd_ps(a_broadcast, b_vec, sum);

System Monitoring (Windows)

PDH API for CPU utilization (1ms polling)
rdtsc for cycle-accurate latency measurement
Energy derived from CPU% (proxy for temp/power)

Cache Hierarchy (i7-7700)

L1: 32 KB (4 cycles)
L2: 256 KB (12 cycles) ← Target level
L3: 8 MB (38 cycles)
DRAM: ~200 cycles

🌟 Why This is Revolutionary

First application of WKB tunneling physics to CPU scheduling
Golden ratio barrier provides smooth, natural scaling
Real-time adaptation responds to actual system state
Zero error — numerically verified correct
Works on legacy hardware — breathes new life into older CPUs

📈 Future Work

ARM NEON port for mobile/embedded
Integration with neural network frameworks
GPU kernel adaptation (CUDA/ROCm)
Linux perf_event monitoring
Auto-tuning for different cache hierarchies

🤝 Contributing

Contributions welcome! Areas of interest:

Porting to other architectures (ARM, RISC-V)
Additional benchmark comparisons (MKL, OpenBLAS)
Real sensor integration (Intel RAPL, hwmon)
Documentation improvements

📄 License

MIT License — see LICENSE for details.

👤 Author

Timothy McGirl (Pedesis TM)
📧 tim@leuklogic.com
🐙 github.com/grapheneaffiliate

⭐ Star This Repo!

If QuantumTiler helps your project or research, please star it! 🌟

                    ╔═══════════════════════════════════════╗
                    ║  Quantum tunneling meets CPU tiling!  ║
                    ║     φ^(-|2x|/δ) - 1 → optimal tile    ║
                    ╚═══════════════════════════════════════╝

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

QuantumTiler

🚀 Why QuantumTiler?

📊 Benchmark Results

Real-Time Energy Adaptation

🧮 The Math: Quantum Barrier Tiling

⚡ Quick Start

Prerequisites

Build

Run

📁 Project Structure

🔧 Configuration

Command-Line Arguments

Tunable Parameters (in code)

🏗️ Technical Details

AVX2/FMA Kernel

System Monitoring (Windows)

Cache Hierarchy (i7-7700)

🌟 Why This is Revolutionary

📈 Future Work

🤝 Contributing

📄 License

👤 Author

⭐ Star This Repo!

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
benchmarks		benchmarks
docs		docs
src		src
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

QuantumTiler

🚀 Why QuantumTiler?

📊 Benchmark Results

Real-Time Energy Adaptation

🧮 The Math: Quantum Barrier Tiling

⚡ Quick Start

Prerequisites

Build

Run

📁 Project Structure

🔧 Configuration

Command-Line Arguments

Tunable Parameters (in code)

🏗️ Technical Details

AVX2/FMA Kernel

System Monitoring (Windows)

Cache Hierarchy (i7-7700)

🌟 Why This is Revolutionary

📈 Future Work

🤝 Contributing

📄 License

👤 Author

⭐ Star This Repo!

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages