whisprer/c-simd-rng-lib

Universal Architecture PRNG std. Replacement C++ Lib

C-Simd-Rng-Lib

A high-performance, cross-platform random number generation library with SIMD and GPU acceleration.

Overview - [rescued from weird formatting doldrums by the one and only RTC!!! thank you :) ]

universal_rng_lib is a fast, flexible RNG library written in modern C++. It supports a range of algorithms including Xoroshiro128++ and WyRand, with runtime autodetection of the best CPU vectorization (SSE2, AVX2, AVX-512, NEON) and optional OpenCL GPU support.

It can replace the C++ standard library RNGs in scientific simulations, games, real-time systems, and more; in the project's benchmarks the AVX2 path runs roughly 3–5× faster than std::mt19937.
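For orientation, the scalar Xoroshiro128++ step can be sketched as below. This follows the public reference algorithm by Blackman and Vigna; the struct name `xoroshiro128pp`, the helper names, and the SplitMix64 seeding are assumptions made for illustration and are not taken from this library's sources.

```cpp
#include <cstdint>

// Rotate left; k must be in the range 1..63.
inline std::uint64_t rotl64(std::uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

// SplitMix64: a common way to expand one 64-bit seed into generator state.
inline std::uint64_t splitmix64(std::uint64_t& x) {
    std::uint64_t z = (x += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// Hypothetical scalar generator type (not the library's own struct).
struct xoroshiro128pp {
    std::uint64_t s0, s1;

    // Seed both state words from a single 64-bit value.
    explicit xoroshiro128pp(std::uint64_t seed) {
        s0 = splitmix64(seed);
        s1 = splitmix64(seed);
    }

    // Set the raw state directly (must not be all zero).
    xoroshiro128pp(std::uint64_t a, std::uint64_t b) : s0(a), s1(b) {}

    std::uint64_t next() {
        const std::uint64_t a = s0;
        std::uint64_t b = s1;
        const std::uint64_t result = rotl64(a + b, 17) + a;  // "++" scrambler
        b ^= a;
        s0 = rotl64(a, 49) ^ b ^ (b << 21);                  // linear engine
        s1 = rotl64(b, 28);
        return result;
    }
};
```

A SIMD variant would typically run several such independent states side by side, one per vector lane.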

Features

  • ✅ Multiple PRNGs: Xoroshiro128++, WyRand
  • ✅ SIMD Acceleration: SSE2, AVX2, AVX-512, NEON (auto-detect at runtime)
  • ✅ OpenCL GPU support (optional)
  • ✅ Scalar fallback for universal compatibility
  • ✅ Batch generation for improved throughput
  • ✅ Support for 16–1024 bit generation
  • ✅ Cross-platform: Windows (MSVC, MinGW), Linux
  • ✅ MIT Licensed

Quick Start

Requirements

  • C++17-compatible compiler
  • CMake 3.15+
  • Ninja (recommended)
  • OpenCL SDK (optional)

Build Instructions

Linux/macOS (bash)

git clone https://github.com/YOUR_USERNAME/universal_rng_lib.git
cd universal_rng_lib
mkdir build && cd build
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build . --parallel
./rng_selftest

Windows (MSYS2 MinGW64 shell)

git clone https://github.com/YOUR_USERNAME/universal_rng_lib.git
cd universal_rng_lib
mkdir build && cd build
cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DRNG_WITH_OPENCL=OFF
cmake --build . --parallel
./rng_selftest.exe

Note: If the compiler supports AVX2, the AVX2 code path is compiled in automatically and selected at runtime when the CPU supports it.

Usage Example

#include "universal_rng.h"
#include <cstdint>
#include <iostream>

int main() {
    // Create RNG instance (seed, algorithm_id, bitwidth); algorithm 0 = Xoroshiro128++
    universal_rng_t* rng = universal_rng_new(1337, 0, 1);

    // Generate 64-bit random integer
    std::uint64_t val = universal_rng_next_u64(rng);
    std::cout << "Random u64: " << val << std::endl;

    // Generate double in range [0,1)
    double d = universal_rng_next_double(rng);
    std::cout << "Random double: " << d << std::endl;

    // Cleanup: release the handle created by universal_rng_new
    universal_rng_free(rng);
    return 0;
}
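The [0,1) double in the example is typically derived from a 64-bit integer. A common technique is sketched below; whether universal_rng_next_double uses exactly this mapping is an assumption, and `u64_to_unit_double` is a hypothetical helper name.

```cpp
#include <cstdint>

// Map a 64-bit integer to a double in [0, 1) using its top 53 bits,
// which is the number of significand bits a double can hold exactly.
inline double u64_to_unit_double(std::uint64_t x) {
    const double two_pow_minus_53 = 1.0 / 9007199254740992.0;  // 2^-53
    return static_cast<double>(x >> 11) * two_pow_minus_53;
}
```

Because only 53 bits are kept, every result is an exact multiple of 2^-53 and the largest possible value stays strictly below 1.0.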

Replace C++ Standard RNG

To use universal_rng_lib in place of std::mt19937 or std::default_random_engine:

Replace all instances of:

std::mt19937 rng(seed);

with:

auto* rng = universal_rng_new(seed, 0, 1);  // use algorithm 0 = Xoroshiro128++

Replace:

rng(); // or dist(rng)

with:

universal_rng_next_u64(rng);

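Use universal_rng_next_double(rng); for floating-point needs. Note that std::mt19937 returns 32-bit values while universal_rng_next_u64 returns 64-bit values, and that a standard distribution such as std::uniform_int_distribution expects a generator object, so dist(rng) cannot take the raw handle directly.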

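Replace cleanup (a stack-allocated std::mt19937 needs none; this applies only if the engine was heap-allocated):

delete rng;

with the following, which is always required for a handle created by universal_rng_new: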

universal_rng_free(rng);
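Code that passes the engine to a standard distribution needs a generator object rather than a raw handle. One possible approach is a thin adapter that meets the UniformRandomBitGenerator requirements. The sketch below is not part of the library: `urbg_adapter` and `counter_source` are hypothetical names, and `counter_source` is only a stand-in so the example compiles without the library.

```cpp
#include <cstdint>
#include <limits>
#include <random>
#include <utility>

// Hypothetical adapter: wraps any callable returning a 64-bit value so that
// it can be passed to std::uniform_int_distribution and friends.
template <class NextFn>
class urbg_adapter {
public:
    using result_type = std::uint64_t;

    explicit urbg_adapter(NextFn next) : next_(std::move(next)) {}

    static constexpr result_type min() { return 0; }
    static constexpr result_type max() {
        return std::numeric_limits<result_type>::max();
    }
    result_type operator()() { return next_(); }

private:
    NextFn next_;
};

// Stand-in 64-bit source (SplitMix64). With the library, the callable would
// instead be something like: [rng] { return universal_rng_next_u64(rng); }
struct counter_source {
    std::uint64_t state = 0;
    std::uint64_t operator()() {
        std::uint64_t z = (state += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        return z ^ (z >> 31);
    }
};

// Roll a six-sided die through a standard distribution.
inline int roll_die(urbg_adapter<counter_source>& gen) {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(gen);
}
```

The adapter does not own the handle, so universal_rng_free must still be called once the adapter is no longer in use.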

File Structure

.
├── include/                # All public headers
│   └── universal_rng.h    # Main header
├── Benchmarking/           # Benchmarking Results 
│                           # [compared against C++ std. lib]
├── src/                    # Source code
│   ├── simd_avx2.cpp
│   ├── simd_sse2.cpp
│   ├── simd_avx512.cpp
│   ├── universal_rng.cpp
│   └── runtime_detect.cpp
├── lib_files/              # Prebuilt binaries
│   ├── mingw_shared/
│   ├── msvc_shared/
│   └── linux_shared/
├── extras/                 # Environment setups and tools
│   └── windows/
├── docs/                   # In-depth design documentation
│   ├── key_SIMD-implementation_design-principles.md
│   ├── explain_of_3-7's_refactor.md
│   └── opencl-implementation-details.md
└── tests/                  # Self-test and benchmarks

SIMD & Dispatch Design

  • Auto-detects best available instruction set at runtime
  • Falls back to SSE2 or scalar code when wider instruction sets are unavailable
  • Batch generation further improves throughput
  • Detection failures are handled gracefully

Example detection result:

CPU feature detection:
  SSE2: Yes
  AVX2: Yes
  AVX512: No
Trying AVX2 implementation...
Using AVX2 implementation
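The detect-then-dispatch pattern described in this section can be sketched as follows. This is an illustration under stated assumptions, not the code in runtime_detect.cpp: the names `simd_level`, `detect_simd_level`, `backend`, and `select_backend` are hypothetical, detection relies on the GCC/Clang builtin `__builtin_cpu_supports`, and every backend slot points at a scalar stand-in where a real build would plug in its vectorised routines.

```cpp
#include <cstddef>
#include <cstdint>

enum class simd_level { scalar, sse2, avx2, avx512, neon };

// Probe the CPU once at runtime; anything unrecognised falls back to scalar.
inline simd_level detect_simd_level() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) return simd_level::avx512;
    if (__builtin_cpu_supports("avx2"))    return simd_level::avx2;
    if (__builtin_cpu_supports("sse2"))    return simd_level::sse2;
    return simd_level::scalar;
#elif defined(__aarch64__)
    // Assumption: NEON is treated as always available on AArch64 targets.
    return simd_level::neon;
#else
    return simd_level::scalar;
#endif
}

// Hypothetical batch-fill signature shared by every backend.
using fill_fn = void (*)(std::uint64_t* out, std::size_t n, std::uint64_t& state);

// Scalar stand-in (SplitMix64) used for every slot in this sketch.
inline void fill_scalar(std::uint64_t* out, std::size_t n, std::uint64_t& state) {
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t z = (state += 0x9e3779b97f4a7c15ULL);
        z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
        z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
        out[i] = z ^ (z >> 31);
    }
}

struct backend {
    const char* name;
    fill_fn fill;
};

// Map the detected level to an implementation through a function pointer.
inline backend select_backend(simd_level level) {
    switch (level) {
        case simd_level::avx512: return {"AVX-512", fill_scalar};
        case simd_level::avx2:   return {"AVX2", fill_scalar};
        case simd_level::sse2:   return {"SSE2", fill_scalar};
        case simd_level::neon:   return {"NEON", fill_scalar};
        default:                 return {"scalar", fill_scalar};
    }
}
```

Resolving the function pointer once at start-up keeps the per-call cost to a single indirect call, which batch generation amortises further.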

Benchmarking & Performance

  • Batch mode yields a 1.7×–2.5× speedup over generating one value per call
  • The AVX2 path runs ~3–5× faster than std::mt19937
  • AVX-512 versions under development
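The figures above are the project's own measurements. To check them on your own hardware, a minimal timing harness might look like the sketch below. The names `bench_result`, `time_generator`, and `time_mt19937_64` are hypothetical and not part of this repository's benchmark suite; any generator callable as `gen()` can be passed in.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

struct bench_result {
    double ns_per_value;     // average wall-clock cost of one value
    std::uint64_t checksum;  // XOR of all outputs, keeps the loop from being optimised away
};

// Time `count` calls to any generator object callable as gen().
template <class Gen>
bench_result time_generator(Gen& gen, std::size_t count) {
    using clock = std::chrono::steady_clock;
    std::uint64_t sum = 0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        sum ^= static_cast<std::uint64_t>(gen());
    }
    const auto stop = clock::now();
    const double ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return {count ? ns / static_cast<double>(count) : 0.0, sum};
}

// Baseline: the standard 64-bit Mersenne Twister.
inline bench_result time_mt19937_64(std::uint64_t seed, std::size_t count) {
    std::mt19937_64 gen(seed);
    return time_generator(gen, count);
}
```

Build with optimisations enabled (for example the Release configuration from the build instructions) and repeat each measurement several times, since single runs are noisy.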

License

MIT License – see LICENSE.md for full terms.

Reference

This library is partially inspired by:

  • David Blackman & Sebastiano Vigna, "Scrambled Linear Pseudorandom Number Generators", ACM Transactions on Mathematical Software, 2021
