Experimental OpenMP support #81

Draft
rubenhorn wants to merge 10 commits into okkevaneck:develop from rubenhorn:experimental/depth_first_bnb_parallel

Conversation

@rubenhorn (Contributor)

I implemented the naive approach mentioned in #70 using OpenMP.

Unit tests exist, but the scaling behavior and parallel efficiency of this implementation for different instances have not yet been investigated.
⚠️ This algorithm evaluates more solutions than the serial version, because the threads enumerating different subtrees cannot (yet) share information about the global optimum.
This implementation is therefore geared towards achieving real-world speed-ups by leveraging parallel CPU architectures, rather than by performing fewer evaluations.

A note on the checkpoint feature implemented in #77...

This feature is currently not supported in the parallel implementation, because every thread would use the same checkpoint file, and the order and assignment of loop indices are not deterministic.
One option is to simply disable checkpointing for the parallel version of the algorithm:

#include <cstdlib>  /* std::getenv, setenv/unsetenv (POSIX), _putenv_s (Windows) */
#include <iostream>
#include <string>

const char* varName = "PROSPR_CACHE_DIR";
const char* oldValue = std::getenv(varName);
bool existed = (oldValue != nullptr);
std::string backup;
if (existed) {
   backup = oldValue;
   std::cerr << "[Warning] Checkpointing is not supported with OpenMP.\n";
#ifdef _WIN32
   _putenv_s(varName, ""); /* An empty value removes the variable. */
#else
   unsetenv(varName);
#endif
}
/* Solve instance here. */
if (existed) {
#ifdef _WIN32
   _putenv_s(varName, backup.c_str());
#else
   setenv(varName, backup.c_str(), 1);
#endif
}

However, this is a very sloppy way of dealing with it, and it is potentially dangerous when the caller uses multiple threads to solve multiple instances, since the environment is shared process-wide.
A better idea would be to rewrite the parallel for-loop to use the master-worker pattern with OpenMP tasks. Then, only the current optimum and the remaining subtrees (identified by their index) need to be stored. Any progress on subtrees currently being solved is lost; upon restart, the subtrees are enumerated again and as many subtrees as were previously solved are removed from the start of the task queue.
I guess depth_first_bnb still needs to know not to load/create checkpoints, but this could be handled with another parameter or with modified semantics of bool is_pre_folded.

MPI and multi-node parallelism

I think MPI is not a good fit here, for two reasons.

  1. The small amount of computation per node of the tree, relative to the number of nodes, results in excessive communication/synchronization overhead. I think it is very likely that practical scalability is therefore rather limited and that a single node will suffice.
  2. Shared memory parallelism (as opposed to distributed memory in MPI) presents opportunities for nice optimization tricks. As noted in this comment, by sharing the global optimum across all threads, more aggressive pruning (close to that of the serial algorithm) could be achieved.

TODOs

@rubenhorn rubenhorn changed the base branch from master to develop November 9, 2025 21:04
@rubenhorn rubenhorn marked this pull request as ready for review November 12, 2025 19:50
@rubenhorn rubenhorn marked this pull request as draft November 12, 2025 19:50
float targetSubtreeCount = (float)(workerCount) * work_ratio;
/* Each node has up to 3 child nodes -> determine closest depth to match work_ratio */
size_t pre_fold_depth = (size_t)std::max(0ll, llround(log(targetSubtreeCount) / log(3.0)));
@rubenhorn (Contributor, Author)
Wrong! Each node has dim+dim-1 children. (Replace hardcoded 3.0!)
