Skip to content

Distributed-memory parallel C++/MPI implementation of the delta-stepping algorithm for fast single source shortest paths on large graphs.

Notifications You must be signed in to change notification settings

tanshoo/delta-stepping

Repository files navigation

Distributed Δ-Stepping Algorithm

This project implements the Δ-stepping algorithm, a high performance, distributed-memory parallel algorithm for solving the Single Source Shortest Paths (SSSP) problem on large scale graphs.

The Δ-stepping algorithm combines the work efficiency of Dijkstra’s algorithm with the parallelism of Bellman-Ford, making it well suited for distributed systems. The algorithm processes vertices in buckets based on their tentative distances, allowing parallel relaxation of edges within each bucket while maintaining correctness. This approach enables efficient weak (Gustafson) scaling on parallel systems.

This implementation is written in C++ and uses MPI for parallelism. It's based on the algorithm and optimizations described in “Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems” by V. T. Chakaravarthy, F. Checconi, F. Petrini, and Y. Sabharwal, Proceedings of the IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2014. [PDF].

Table of Contents

Report

A detailed report describing the implementation specifics of the algorithm, the impact of various optimizations and benchmark results for multiple graph sizes can be found in report.pdf

Optimizations

This implementation includes several optimizations from the referenced paper, plus additional improvements.

Edge Classification

Edges are divided into short and long groups based on a threshold parameter Δ.

  • Short edges have weights <Δ and are processed multiple times within a bucket.
  • Long edges have weights ≥Δ and are relaxed only once per bucket.

This reduces redundant relaxations, since long edges cannot produce tentative distances that fall into the current bucket’s range.

Inner-Outer Short (IOS) Heuristic

During the short edge phase, an edge is relaxed only if the new tentative distance places the target vertex within the current bucket.

  • Inner short edges satisfy this condition and are processed immediately.
  • Outer short edges are processed in the long edge phase

This avoids unnecessary relaxations on edges that cannot affect the current bucket.

Push and Pull Communication in Long Edge Phase

  • In push mode currently settled vertices send updated distances to their neighbors
  • In pull mode unsettled vertices request distances from their neighbors

For each bucket, the optimal communication mode is chosen by estimating the total push/pull communciation volume using a heuristic.

Hybridization with Bellman-Ford

After a specified fraction of vertices is settled, the algorithm switches to Bellman-Ford for all remaining buckets, which are more sparsely occupied. This reduces the number of long edge phases which require synchronization, while keeping similar total work.

Dynamic Index Width (custom optimization)

When exchanging distances, vertex indices are encoded relative to the first owned vertex of the receiver. Because of this, if each process owns ≤ 2¹⁶ vertices, the relative index can be represented as a 16-bit number instead of a 32-bit one.

Requirements

  • MPI: OpenMPI or MPICH
  • C++ Compiler: C++17 support
  • OpenMP: For intra-node parallelism
  • Python 3: To run the checker (optional)

Usage

Building

git clone https://github.com/tanshoo/delta-stepping.git
cd delta-stepping
make

Running the Algorithm

Each process <rank> reads its input from <input_dir>/$RANK.in and writes output to <output_dir>/$RANK.out.

The output directory must exist before running.

Running locally

mpiexec -n <num_processes> bash -c 'RANK=${OMPI_COMM_WORLD_LOCAL_RANK}; ./sssp <input_filepath>/$RANK.in <output_filepath>/$RANK.out'

Running on SLURM

srun -n <num_processes> bash -c 'RANK=${SLURM_PROCID}; ./sssp <input_filepath>/$RANK.in <output_filepath>/$RANK.out'

Input/Output Format

Input Format

Each MPI rank receives a file with:

First line: n first last

  • n: Total number of vertices in the graph

  • first: Index of the first vertex owned by this process

  • last: Index of the last vertex owned by this process

  • Subsequent lines: u v w

    • Edge from u to v with weight w

Output Format

Each process writes a file with the shortest distances for its vertices, one per line, sorted by vertex index.

Input Assumptions

  • The graph is undirected, with no self-loops or duplicate edges.
  • All edge weights are non-negative.
  • The source vertex is always 0.
  • Each process owns a contiguous range of vertices.

Utilities

Build all utilities:

cd utils
make

R-MAT graph generator:

./rmat_generator <scale> <edge_factor> <rmat1|rmat2> <seed> [output_file]
  • Output: Edgelist in format u v w (default filename: <rmat_type>_<scale>_<seed>.edgelist) with w in [0,255]

Sequential Dijkstra (reference):

./dijkstra <basename>
  • Input: <basename>.edgelist
  • Output: <basename>.out (distances from node 0)

Distributed testcase generator:

Splits an edgelist and output into per process files for testing.

./test_generator <basename> <num_processes>
  • Input: <basename>.edgelist and <basename>.out
  • Output: Directory <basename>_<num_processes> with files 0.in, ..., N-1.in and 0.out, ..., N-1.out

About

Distributed-memory parallel C++/MPI implementation of the delta-stepping algorithm for fast single source shortest paths on large graphs.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published