LQCD Neuron SDK

A high-performance Python SDK for running Lattice QCD simulations on AWS Trainium and Inferentia instances.

Overview

This SDK provides optimized tensor operations and algorithms designed specifically for Lattice QCD computations on AWS Neuron-powered EC2 instances. It builds on PyTorch/XLA and the AWS Neuron SDK, targeting competitive price-performance relative to traditional GPU-based solutions.

Features

  • 🚀 High Performance: Optimized for AWS Trainium/Inferentia hardware
  • 🔢 Core Components: Lattice geometry, gauge fields, fermion fields
  • ⚡ Efficient Operators: Wilson and Staggered fermion implementations
  • 🔧 Linear Solvers: Conjugate Gradient and BiCGStab iterative solvers
  • 📊 Benchmarking: Built-in performance comparison tools
  • 🔄 Fallback Support: CPU/NumPy backend for development
  • 🧪 Well Tested: Comprehensive test suite

Quick Start

Installation

  1. Clone the repository:

    git clone https://github.com/JGalego/lqcd-neuron-sdk
    cd lqcd-neuron-sdk
  2. Install dependencies:

    pip install -e .
  3. For AWS Neuron support (on Trainium/Inferentia instances):

    # Install AWS Neuron SDK
    pip install torch-neuronx neuronx-cc --extra-index-url https://pip.repos.neuron.amazonaws.com

Basic Example

from lqcd_neuron.core import Lattice, GaugeField
from lqcd_neuron.operators import WilsonOperator

# Create a 4D lattice
lattice = Lattice((8, 8, 8, 8), device="xla")  # Use "cpu" for development

# Initialize gauge field
gauge_field = GaugeField(lattice)

# Compute average plaquette
avg_plaq = gauge_field.average_plaquette()
print(f"Average plaquette: {avg_plaq}")

# Create Wilson fermion operator
wilson_op = WilsonOperator(lattice, gauge_field, mass=0.1)

Core Components

1. Lattice Geometry

from lqcd_neuron.core import Lattice

# Create 4D spacetime lattice (temporal extent Nt, spatial extents Nx, Ny, Nz)
Nt = Nx = Ny = Nz = 8
lattice = Lattice(dimensions=(Nt, Nx, Ny, Nz), device="xla")

2. SU(3) Gauge Fields

from lqcd_neuron.core import GaugeField

gauge = GaugeField(lattice)
plaquette = gauge.average_plaquette()
action = gauge.wilson_action(beta=6.0)
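
For orientation, the plaquette that `average_plaquette()` measures is the smallest closed loop of link variables, Re Tr[U_μ(x) U_ν(x+μ̂) U_μ†(x+ν̂) U_ν†(x)] / 3. A minimal NumPy sketch of one plaquette (illustration only, not the SDK's implementation; the array layout and `shift` helper are assumptions):

```python
import numpy as np

def plaquette(U, x, mu, nu):
    """Normalized Re Tr of the elementary plaquette at site x in the
    (mu, nu) plane; the 3 in the denominator is the number of colors."""
    # Assumed layout: U has shape (T, X, Y, Z, 4, 3, 3),
    # i.e. one SU(3) matrix per site and direction.
    def shift(site, d):
        s = list(site)
        s[d] = (s[d] + 1) % U.shape[d]  # periodic boundary conditions
        return tuple(s)
    P = (U[x][mu]
         @ U[shift(x, mu)][nu]
         @ U[shift(x, nu)][mu].conj().T
         @ U[x][nu].conj().T)
    return np.trace(P).real / 3.0

# Cold start: all links are the identity, so every plaquette equals 1.
shape = (4, 4, 4, 4)
U = np.zeros(shape + (4, 3, 3), dtype=complex)
U[...] = np.eye(3)
print(plaquette(U, (0, 0, 0, 0), 0, 1))  # 1.0
```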

3. Fermion Fields

from lqcd_neuron.core import FermionField

fermion = FermionField(lattice)
fermion.random_initialize()
norm = fermion.norm()

4. QCD Operators

from lqcd_neuron.operators import WilsonOperator, StaggeredOperator

# Wilson fermions
wilson_op = WilsonOperator(lattice, gauge_field, mass=0.1)
result = wilson_op.apply(fermion_field)

# Staggered fermions  
staggered_op = StaggeredOperator(lattice, gauge_field, mass=0.05)
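
To make the Wilson operator's structure concrete, here is a sketch of the free (unit-gauge-link) operator acting on a spin-only field. This is an illustration, not the SDK's implementation: color indices are omitted, and the Euclidean gamma-matrix representation is one common choice.

```python
import numpy as np

# Euclidean gamma matrices in a chiral-like representation (sketch only).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2, Z2 = np.eye(2), np.zeros((2, 2))
gamma = [np.block([[Z2, -1j * s], [1j * s, Z2]]) for s in (sx, sy, sz)]
gamma.append(np.block([[Z2, I2], [I2, Z2]]))  # gamma_4

def wilson_free(psi, mass, r=1.0):
    """Free Wilson operator D psi on a (T, X, Y, Z, 4)-shaped spinor
    field, i.e. with all gauge links set to the identity."""
    out = (mass + 4.0 * r) * psi
    for mu in range(4):
        fwd = np.roll(psi, -1, axis=mu)   # psi(x + mu_hat)
        bwd = np.roll(psi, +1, axis=mu)   # psi(x - mu_hat)
        out -= 0.5 * (np.einsum('ab,...b->...a', r * np.eye(4) - gamma[mu], fwd)
                      + np.einsum('ab,...b->...a', r * np.eye(4) + gamma[mu], bwd))
    return out

# Sanity check: on a constant field the hopping terms cancel exactly,
# so D psi = mass * psi.
psi = np.ones((4, 4, 4, 4, 4), dtype=complex)
assert np.allclose(wilson_free(psi, mass=0.1), 0.1 * psi)
```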

5. Linear Solvers

from lqcd_neuron.solvers import ConjugateGradient, BiCGStab

# CG solver for D†D * x = b
def ddag_d_operator(field):
    return wilson_op.dagger(wilson_op.apply(field))

cg = ConjugateGradient(ddag_d_operator, tolerance=1e-12)
solution = cg.solve(rhs_field)
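
The normal-equations trick above works because D†D is Hermitian and positive definite, which is exactly what CG requires. A minimal CG over NumPy arrays, shown to illustrate the algorithm rather than the SDK's solver:

```python
import numpy as np

def cg(apply_A, b, tol=1e-12, max_iter=1000):
    """Conjugate Gradient for A x = b, with A Hermitian positive
    definite and `apply_A` a callable returning A @ x."""
    x = np.zeros_like(b)
    r = b - apply_A(x)        # initial residual
    p = r.copy()              # initial search direction
    rs = np.vdot(r, r).real
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rs / np.vdot(p, Ap).real
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r).real
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Tiny symmetric positive-definite test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(lambda v: A @ v, b)
assert np.allclose(A @ x, b)
```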

Performance Benchmarking

The SDK includes comprehensive benchmarking tools:

from lqcd_neuron.benchmarks import PerformanceBenchmark

# Benchmark on different devices
benchmark = PerformanceBenchmark(device="xla")  # or "cpu", "cuda"

# Run comprehensive benchmarks
results = benchmark.run_comprehensive_benchmark([
    (8, 8, 8, 8),
    (16, 16, 16, 16),
    (32, 32, 32, 32)
])

# Compare devices
comparison = benchmark.compare_devices(["cpu", "xla", "cuda"])
benchmark.print_summary()

Running on Neuron Devices

XLA Device Backend

To use AWS Trainium/Inferentia devices, specify device="xla" when creating lattices:

# Create lattice on Neuron device
lattice = Lattice((8, 8, 8, 8), device="xla")
gauge_field = GaugeField(lattice)

# XLA compilation happens on first operation (slower)
result1 = wilson_op.apply(fermion_field)  # Compilation + execution

# Subsequent operations use compiled code (faster)
result2 = wilson_op.apply(fermion_field)  # Fast execution

Performance Optimization Tips

  • Warm-up runs: First operations trigger XLA compilation
  • Batch operations: Process multiple configurations together
  • Larger lattices: Better utilization on bigger problems (16⁴+)
  • Repeated operations: Amortize compilation cost over many runs
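
The warm-up/amortization pattern can be illustrated with a toy compile cache. XLA does something analogous per tensor shape; the 50 ms "compilation" below is just a stand-in for the real cost.

```python
import functools
import time

@functools.cache
def compile_kernel(shape):
    # Stand-in for XLA tracing + compilation: expensive, once per shape.
    time.sleep(0.05)
    return lambda field: [2 * v for v in field]

def apply_op(field):
    return compile_kernel(len(field))(field)

t0 = time.perf_counter()
apply_op([1, 2, 3])          # first call: compile + execute
first = time.perf_counter() - t0

t0 = time.perf_counter()
apply_op([4, 5, 6])          # same shape: cached, execute only
later = time.perf_counter() - t0
assert later < first
```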

Device Detection

# Check if running on Neuron-capable instance
import subprocess
try:
    result = subprocess.run(['neuron-ls'], capture_output=True)
    if result.returncode == 0:
        print("Neuron devices available!")
        # Use device="xla"
    else:
        print("No Neuron devices, using CPU fallback")
        # Use device="cpu"
except FileNotFoundError:
    print("Neuron SDK not installed")

Examples

See the examples/ directory for complete working examples:

  • basic_plaquette.py - Computing gauge observables (CPU)
  • wilson_fermion_demo.py - Fermion operators and linear solvers (CPU)
  • benchmark_demo.py - Performance benchmarking (multi-device)
  • simple_neuron_xla.py - Basic XLA operations on Neuron devices
  • neuron_device_demo.py - Full Neuron device demo with performance comparison

Run examples:

# CPU-based examples (work on any system)
python examples/basic_plaquette.py
python examples/wilson_fermion_demo.py  
python examples/benchmark_demo.py

# Neuron-specific examples (require Trainium/Inferentia instances)
python examples/simple_neuron_xla.py
python examples/neuron_device_demo.py

Testing

Run the test suite:

# Simple tests (no external dependencies)
python tests/simple_test.py

# Full test suite (requires pytest)
pytest tests/ -v

AWS Deployment

Trainium/Inferentia Setup

  1. Launch EC2 instance:

    • Use trn1 (Trainium) or inf2 (Inferentia) instance types
    • Recommended: trn1.2xlarge or larger
  2. Install Neuron SDK:

    # Configure Neuron repository
    . /etc/os-release
    sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <<EOF
    deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main
    EOF
    
    # Install Neuron packages
    sudo apt-get update -y
    sudo apt-get install aws-neuronx-dkms aws-neuronx-collectives aws-neuronx-runtime-lib aws-neuronx-tools -y
    
    # Install PyTorch Neuron
    pip install torch-neuronx neuronx-cc --extra-index-url https://pip.repos.neuron.amazonaws.com
  3. Verify installation:

    neuron-ls  # Should show available Neuron devices

Performance Optimization

  • Batch operations for maximum throughput
  • Use XLA compilation for optimal performance
  • Profile with Neuron tools for bottleneck identification
  • Compare against GPU baselines using benchmarking suite
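
Batching in practice means fusing per-configuration work into a single array operation instead of a Python loop. A NumPy illustration of the principle (the SDK's own batching API, if any, may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n_cfg, n_sites = 8, 64
# One 3x3 complex link matrix per site, for several gauge configurations.
A = rng.normal(size=(n_cfg, n_sites, 3, 3)) + 1j * rng.normal(size=(n_cfg, n_sites, 3, 3))
B = rng.normal(size=(n_cfg, n_sites, 3, 3)) + 1j * rng.normal(size=(n_cfg, n_sites, 3, 3))

# Looping in Python: one small matmul at a time.
looped = np.array([[a @ b for a, b in zip(Ac, Bc)] for Ac, Bc in zip(A, B)])

# Batched: a single broadcasted matmul over all configurations and sites,
# which is what lets the accelerator stay busy.
batched = A @ B
assert np.allclose(looped, batched)
```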

Architecture

lqcd-neuron-sdk/
├── src/lqcd_neuron/          # Main package
│   ├── core/                 # Core data structures  
│   │   ├── lattice.py        # 4D lattice geometry
│   │   ├── neuron_tensor.py  # Optimized tensor operations
│   │   ├── gauge_field.py    # SU(3) gauge fields
│   │   └── fermion_field.py  # Fermion fields with spinors
│   ├── operators/            # QCD operators
│   │   ├── wilson.py         # Wilson fermion operator
│   │   └── staggered.py      # Staggered fermion operator
│   ├── solvers/              # Linear algebra solvers
│   │   ├── cg.py             # Conjugate Gradient
│   │   └── bicgstab.py       # BiCGStab
│   └── benchmarks/           # Performance tools
│       └── performance.py    # Benchmarking suite
├── tests/                    # Test suite
├── examples/                 # Usage examples
└── docs/                     # Documentation

Development

Requirements

  • Python 3.8+
  • NumPy (always required)
  • PyTorch + torch-neuronx (for Neuron support)
  • pytest (for testing)

Development Setup

# Development install
pip install -e ".[dev]"

# Run linting
black src/ tests/
flake8 src/

# Type checking  
mypy src/

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Citation

If you use this SDK in your research, please cite:

@software{lqcd_neuron_sdk,
  title={LQCD Neuron SDK: High-Performance Lattice QCD on AWS Trainium and Inferentia},
  author={João Galego},
  year={2025},
  url={https://github.com/JGalego/lqcd-neuron-sdk}
}


Questions? Open an issue or contact the development team.
