This project implements the Δ-stepping algorithm, a high performance, distributed-memory parallel algorithm for solving the Single Source Shortest Paths (SSSP) problem on large scale graphs.
The Δ-stepping algorithm combines the work efficiency of Dijkstra’s algorithm with the parallelism of Bellman-Ford, making it well suited for distributed systems. The algorithm processes vertices in buckets based on their tentative distances, allowing parallel relaxation of edges within each bucket while maintaining correctness. This approach enables efficient weak (Gustafson) scaling on parallel systems.
This implementation is written in C++ and uses MPI for parallelism. It's based on the algorithm and optimizations described in “Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems” by V. T. Chakaravarthy, F. Checconi, F. Petrini, and Y. Sabharwal, Proceedings of the IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2014. [PDF].
A detailed report describing the implementation specifics of the algorithm, the impact of various optimizations and benchmark results for multiple graph sizes can be found in report.pdf
This implementation includes several optimizations from the referenced paper, plus additional improvements.
Edges are divided into short and long groups based on a threshold parameter Δ.
- Short edges have weights <Δ and are processed multiple times within a bucket.
- Long edges have weights ≥Δ and are relaxed only once per bucket.
This reduces redundant relaxations, since long edges cannot produce tentative distances that fall into the current bucket’s range.
During the short edge phase, an edge is relaxed only if the new tentative distance places the target vertex within the current bucket.
- Inner short edges satisfy this condition and are processed immediately.
- Outer short edges are processed in the long edge phase
This avoids unnecessary relaxations on edges that cannot affect the current bucket.
- In push mode currently settled vertices send updated distances to their neighbors
- In pull mode unsettled vertices request distances from their neighbors
For each bucket, the optimal communication mode is chosen by estimating the total push/pull communciation volume using a heuristic.
After a specified fraction of vertices is settled, the algorithm switches to Bellman-Ford for all remaining buckets, which are more sparsely occupied. This reduces the number of long edge phases which require synchronization, while keeping similar total work.
When exchanging distances, vertex indices are encoded relative to the first owned vertex of the receiver. Because of this, if each process owns ≤ 2¹⁶ vertices, the relative index can be represented as a 16-bit number instead of a 32-bit one.
- MPI: OpenMPI or MPICH
- C++ Compiler: C++17 support
- OpenMP: For intra-node parallelism
- Python 3: To run the checker (optional)
git clone https://github.com/tanshoo/delta-stepping.git
cd delta-stepping
makeEach process <rank> reads its input from <input_dir>/$RANK.in and writes output to <output_dir>/$RANK.out.
The output directory must exist before running.
mpiexec -n <num_processes> bash -c 'RANK=${OMPI_COMM_WORLD_LOCAL_RANK}; ./sssp <input_filepath>/$RANK.in <output_filepath>/$RANK.out'srun -n <num_processes> bash -c 'RANK=${SLURM_PROCID}; ./sssp <input_filepath>/$RANK.in <output_filepath>/$RANK.out'Each MPI rank receives a file with:
First line: n first last
-
n: Total number of vertices in the graph -
first: Index of the first vertex owned by this process -
last: Index of the last vertex owned by this process -
Subsequent lines:
u v w- Edge from
utovwith weightw
- Edge from
Each process writes a file with the shortest distances for its vertices, one per line, sorted by vertex index.
- The graph is undirected, with no self-loops or duplicate edges.
- All edge weights are non-negative.
- The source vertex is always 0.
- Each process owns a contiguous range of vertices.
Build all utilities:
cd utils
makeR-MAT graph generator:
./rmat_generator <scale> <edge_factor> <rmat1|rmat2> <seed> [output_file]- Output: Edgelist in format
u v w(default filename:<rmat_type>_<scale>_<seed>.edgelist) with w in [0,255]
Sequential Dijkstra (reference):
./dijkstra <basename>- Input:
<basename>.edgelist - Output:
<basename>.out(distances from node 0)
Distributed testcase generator:
Splits an edgelist and output into per process files for testing.
./test_generator <basename> <num_processes>- Input:
<basename>.edgelistand<basename>.out - Output: Directory
<basename>_<num_processes>with files0.in, ...,N-1.inand0.out, ...,N-1.out