TorchScript Performance Benchmark

A comprehensive benchmarking suite for comparing PyTorch execution modes: eager execution, TorchScript JIT, and torch.compile.

Python 3.8+ PyTorch 2.0+ License: MIT

Features

  • Multiple Execution Modes: Compare eager, TorchScript, and torch.compile performance
  • CPU vs GPU: Benchmark on both CPU and GPU devices
  • Three Model Architectures: CNN, MLP, and Transformer models included
  • Profiling & Tracing: Built-in profiling utilities for performance analysis
  • Graph Analysis: Compare computation graphs between execution modes
  • Visualizations: Automatic generation of performance charts and time series plots
  • CLI Interface: Easy-to-use command-line tool for running benchmarks

Installation

# Clone the repository
git clone git@github.com:iotaaxel/torchscript-performance-bench.git
cd torchscript-performance-bench

# Install dependencies
pip install -r requirements.txt

Quick Start

# Benchmark all models
python scripts/run_benchmarks.py --all

# Benchmark a specific model
python scripts/run_benchmarks.py --model cnn

# Compare CPU vs GPU
python scripts/run_benchmarks.py --model transformer --compare-cpu-gpu

# Custom benchmark settings
python scripts/run_benchmarks.py --model mlp --num-runs 200 --warmup-runs 20

Usage

Command Line Interface

python scripts/run_benchmarks.py [OPTIONS]

Options:
  --model {cnn,mlp,transformer}  Model type to benchmark
  --all                           Run benchmarks for all models
  --device {cpu,cuda}             Device to run on (default: auto-detect)
  --num-runs INT                  Number of benchmark iterations (default: 100)
  --warmup-runs INT               Number of warmup iterations (default: 10)
  --compare-cpu-gpu               Compare CPU and GPU performance
  --no-plots                      Skip generating plots
  --no-save                       Skip saving results to JSON

Programmatic Usage

import torch

from models import create_cnn_model
from bench import BenchmarkRunner, ExecutionMode

# Create model
model = create_cnn_model()

# Create input function
def input_fn():
    return torch.randn(1, 3, 32, 32)

# Run benchmark
runner = BenchmarkRunner(warmup_runs=10, num_runs=100, device='cuda')
results = runner.benchmark(
    model, 
    input_fn, 
    modes=[ExecutionMode.EAGER, ExecutionMode.TORCHSCRIPT, ExecutionMode.COMPILE]
)

# Access results
for mode, result in results.items():
    print(f"{mode}: {result.mean_time_ms:.3f} ± {result.std_time_ms:.3f} ms")

Project Structure

torchscript-performance-bench/
├── models/              # Model definitions
│   ├── cnn.py          # Small CNN model
│   ├── mlp.py          # Multi-layer perceptron
│   └── transformer.py  # Tiny transformer block
├── bench/              # Benchmarking infrastructure
│   ├── benchmark.py    # Core benchmark runner
│   └── profiler.py     # Profiling utilities
├── scripts/            # Scripts and utilities
│   ├── run_benchmarks.py  # CLI runner
│   └── visualize.py    # Visualization tools
├── tests/              # Unit tests
├── reports/            # Generated reports and plots
└── requirements.txt    # Python dependencies

Models

Small CNN

A lightweight convolutional neural network with:

  • 3 convolutional layers with batch normalization
  • Adaptive average pooling
  • Fully connected layers with dropout
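
A minimal sketch of such a model (class name, channel widths, and dropout rate are illustrative assumptions, not taken from models/cnn.py):

import torch.nn as nn

class SmallCNN(nn.Module):
    """Illustrative 3-conv-layer CNN; the actual models/cnn.py may differ."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # adaptive average pooling to 1x1
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),           # fully connected head with dropout
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))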

MLP

A multi-layer perceptron with:

  • Configurable hidden layer sizes
  • Batch normalization and dropout
  • Automatic input flattening
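
A minimal sketch of such a model (class name, default sizes, and dropout rate are illustrative assumptions, not taken from models/mlp.py):

import torch.nn as nn

class MLP(nn.Module):
    """Illustrative MLP; the actual models/mlp.py may differ."""
    def __init__(self, input_size=3 * 32 * 32, hidden_sizes=(512, 256),
                 num_classes=10, dropout=0.5):
        super().__init__()
        layers = [nn.Flatten()]            # automatic input flattening
        in_features = input_size
        for hidden in hidden_sizes:        # configurable hidden layer sizes
            layers += [
                nn.Linear(in_features, hidden),
                nn.BatchNorm1d(hidden),    # batch normalization
                nn.ReLU(),
                nn.Dropout(dropout),       # dropout
            ]
            in_features = hidden
        layers.append(nn.Linear(in_features, num_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)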

Tiny Transformer

A minimal transformer block with:

  • Multi-head self-attention
  • Position-wise feed-forward network
  • Layer normalization and residual connections
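
A minimal sketch of such a block (class name and dimensions are illustrative assumptions, not taken from models/transformer.py):

import torch.nn as nn

class TinyTransformerBlock(nn.Module):
    """Illustrative transformer block; the actual models/transformer.py may differ."""
    def __init__(self, d_model=128, num_heads=4, d_ff=512, dropout=0.1):
        super().__init__()
        # multi-head self-attention
        self.attn = nn.MultiheadAttention(d_model, num_heads,
                                          dropout=dropout, batch_first=True)
        # position-wise feed-forward network
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # self-attention with a residual connection and layer norm
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + attn_out)
        # feed-forward with a residual connection and layer norm
        return self.norm2(x + self.ff(x))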

Benchmark Results

Results are automatically saved to:

  • JSON: reports/{model_name}_results.json - Detailed numerical results
  • Plots:
    • reports/{model_name}_{device}_comparison.png - Bar chart comparison
    • reports/{model_name}_{device}_timeseries.png - Time series plot
    • reports/{model_name}_{device}_speedup.png - Speedup relative to eager
    • reports/{model_name}_cpu_vs_gpu.png - CPU vs GPU comparison
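
The saved JSON can be inspected directly, for example after benchmarking the CNN model (the snippet below assumes only the file path above, not any particular field layout):

import json
from pathlib import Path

# Load and pretty-print the results for the CNN benchmark run
results = json.loads(Path("reports/cnn_results.json").read_text())
print(json.dumps(results, indent=2))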

Profiling

Use the profiler to analyze model execution:

from bench.profiler import Profiler

# `model` and `input_tensor` are a torch.nn.Module and a sample input,
# e.g. the CNN and torch.randn(1, 3, 32, 32) from the benchmark example above
profiler = Profiler(device='cuda')
profile_data = profiler.profile_model(model, input_tensor, execution_mode='torchscript')

# Get graph representation
graph_str = profiler.get_graph_representation(model, input_tensor)

# Compare graphs (e.g. the graphs of two different execution modes)
comparison = profiler.compare_graphs(graph1, graph2)

Testing

Run the test suite:

pytest tests/

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • matplotlib 3.7+
  • numpy 1.24+
  • tqdm 4.65+

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
