This repository contains code for our shared-memory parallel (r, s) nucleus decomposition algorithms. Note that the repository uses the Graph-Based Benchmark Suite (GBBS) for parallel primitives and benchmarks.
The benchmarks/NucleusDecomposition/ directory contains the main executable,
NucleusDecomposition_main.
Compiler:
- g++ >= 7.4.0 with pthread support for the ParlayLib scheduler
- g++ >= 7.4.0 with support for Cilk Plus
Build system:
- Bazel 2.0.0
To build, navigate to the benchmarks/NucleusDecomposition/ directory, and run:
$ bazel build :NucleusDecomposition_mainMost optionality from GBBS
applies. In particular, to compile benchmarks for graphs with
more than 2^32 edges, the LONG command-line parameter should be set.
The default compilation is the ParlayLib scheduler (Homegrown).
To compile with the Cilk Plus scheduler instead of the Homegrown scheduler, use
the Bazel configuration --config=cilk. To compile using OpenMP instead, use
the Bazel configuration --config=openmp. To compile serially instead, use the
Bazel configuration --config=serial.
Note that the default compilation mode in bazel is to build optimized binaries
(stripped of debug symbols). You can compile debug binaries by supplying -c dbg to the bazel build command.
The following command cleans the directory:
$ bazel clean # removes all executablesThe applications take as input the adjacency graph format used by GBBS.
The adjacency graph format starts with a sequence of offsets one for each vertex, followed by a sequence of directed edges ordered by their source vertex. The offset for a vertex i refers to the location of the start of a contiguous block of out edges for vertex i in the sequence of edges. The block continues until the offset of the next vertex, or the end if i is the last vertex. All vertices and offsets are 0 based and represented in decimal. The specific format is as follows:
AdjacencyGraph
<n>
<m>
<o0>
<o1>
...
<o(n-1)>
<e0>
<e1>
...
<e(m-1)>
This file is represented as plain text.
Using SNAP Graphs
Graphs from the SNAP dataset collection are used in our experiments. We use a tool from the GBBS that converts the most common SNAP graph format to the adjacency graph format that GBBS accepts. Usage example:
# Download a graph from the SNAP collection.
wget https://snap.stanford.edu/data/wiki-Vote.txt.gz
gzip --decompress ${PWD}/wiki-Vote.txt.gz
# Run the SNAP-to-adjacency-graph converter.
# Run with Bazel:
bazel run //utils:snap_converter -- -s -i ${PWD}/wiki-Vote.txt -o <output file>The applications take the input graph as input, as well as flags to specify
the parameters of the (r, s) nucleus decomposition algorithm and desired
optimizations. Note that the -s flag must be set to indicate a symmetric
(undirected) graph, and the -rounds 1 argument must be passed in.
The options for arguments are:
--rCliquefollowed by an integer specifying r--sCliquefollowed by an integer specifying s--numberOfLevelsfollowed by a string, which must beONE_LEVEL,TWO_LEVEL, orMULTI_LEVEL--numberOfMultiLevelsfollowed by an integer which must be less than or equal to r, specifying the number of levels desired if theMULTI_LEVELoption is chosen (this argument is ignored ifONE_LEVELorTWO_LEVELis chosen)--inverseIndexMapfollowed by a string, which must beBINARY_SEARCHorSTORED_POINTERS(this option is ignored ifONE_LEVELis chosen)--contiguousSpace, which specifies that contiguous space should be used for the parallel hash table T if set (this argument is ignored ifONE_LEVELorSTORED_POINTERSare chosen, both of which require contiguous space)--relabel, which specifies that the graph should be relabeled if set--updateAggregationfollowed by a string, which must beSIMPLE_ARRAY,LIST_BUFFER, orHASH_TABLE--compact, which specifies that the graph should be relabeled if set (this argument only applies to r = 2 and s = 3, assumingTWO_LEVEL,STORED_POINTERS, andHASH_TABLE)
Example Usage
The main executable is NucleusDecomposition_main in the benchmarks/NucleusDecomposition/ directory.
After navigating to the benchmarks/NucleusDecomposition/ directory, a template command is:
$ bazel run :NucleusDecomposition_main -- -s -rounds 1 --rClique 3 --sClique 4 --numberOfLevels TWO_LEVEL --inverseIndexMap STORED_POINTERS --relabel --updateAggregation LIST_BUFFER </path/to/input/graph>