
Conversation


@HaroldMargeta-Cacace commented Jul 28, 2025

The current version of cuAOA is not well-suited for limited-access HPC environments with multiple GPU types, largely because it requires CUDA 12 and C++20. To expand access to users in such environments, this PR makes the following changes:

  • Added compatibility with CUDA 11 (while maintaining CUDA 12 compatibility)
  • Altered code to compile under C++17 (important for CUDA 11 compatibility)
  • Added multi-architecture compilation
  • The install script now dynamically adjusts compile_flags.txt
  • The default build uses maturin alone (a uv build can still be done by uncommenting the appropriate lines)
  • Corrected how the Makefile resolves environment filepaths
  • Fixed an incorrect CUDA API call

Changes install.sh to build using maturin instead of uv; the option to build with uv remains available in the comments.
Adds code that dynamically updates compile_flags.txt, appending arch flags for sm_89 and sm_90 when CUDA >= 12 is detected.
Changes the Makefile to correctly resolve the current conda environment's filepaths (and thereby nvcc). It now also compiles for additional architectures: sm_70, sm_75, sm_80, and sm_86 on at least CUDA 11, plus sm_89 and sm_90 when `nvcc --version` reports CUDA 12. To stay compatible with both CUDA 11 and 12, it adds both lib64 and lib to the library paths and compiles with C++17 instead of C++20.
Changes polynomial.hpp and polynomial.cpp to enable compilation with C++17.
Changes `cudaGetDeviceProperties_v2(&prop, i);` to the CUDA API function available on both toolchains, `cudaGetDeviceProperties(&prop, i);`.
Changes compile_flags.txt to make it consistent with the others.
Changes install.sh to correctly modify compile_flags.txt.
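The dynamic compile_flags.txt update described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual install.sh code: the real script parses `nvcc --version` (here the output is simulated with a sample string), and the clang-style `--cuda-gpu-arch` flags and file path are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of the conditional arch-flag logic (hypothetical; the real
# install.sh calls `nvcc --version` instead of using a sample string).
NVCC_OUTPUT="Cuda compilation tools, release 12.4, V12.4.131"

# Extract the major CUDA version (e.g. "12") from the release line.
CUDA_MAJOR=$(printf '%s\n' "$NVCC_OUTPUT" | sed -E 's/.*release ([0-9]+)\..*/\1/')
echo "detected CUDA major version: $CUDA_MAJOR"

COMPILE_FLAGS_FILE="compile_flags.txt"   # illustrative path

# sm_89 / sm_90 (Ada / Hopper) only exist in the CUDA 12 toolchain,
# so append those arch flags only when the major version is >= 12.
if [ "$CUDA_MAJOR" -ge 12 ]; then
    printf '%s\n' '--cuda-gpu-arch=sm_89' '--cuda-gpu-arch=sm_90' >> "$COMPILE_FLAGS_FILE"
fi
```

One flag per line matches the compile_flags.txt convention used by clangd, which is presumably why the script appends rather than editing a single line.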
@JFLXB self-assigned this Jul 29, 2025
@JFLXB added the "enhancement" (New feature or request) label Jul 29, 2025
@JFLXB self-requested a review Jul 29, 2025 19:53
@JFLXB removed their assignment Jul 29, 2025
@JFLXB (Owner) left a comment


First of all, thank you for the contribution! We really appreciate the time and effort you've put into this.

Most of the comments are related to the build process and the behavior of the generated wheels with this setup. We're particularly interested in understanding the motivation behind some of the changes, and curious why the build seems to work on your setup but leads to issues on ours.

Looking forward to iterating on this with you!


# Additional include directories
# Base NVCC flags
NVCC_FLAGS := -std=c++17 -Xcompiler="-fPIC -O3" \
@JFLXB (Owner) commented:

On my system I also require the -fPIC flag in GPP_FLAGS, which is currently unset.

Adding the following lines to the Makefile fixes compilation issues on my end:

# Base GPP flags
GPP_FLAGS := -std=c++17 -fPIC -O3

-std=c++17 and -O3 are added for completeness

COMPILE_FLAGS_FILE="cuaoa/internal/compile_flags.txt"

# Detect CUDA version (major only)
CUDA_VERSION_STR=$($NVCC --version | grep "release" | sed -E 's/.*release ([0-9]+)\..*/\1/')
@JFLXB (Owner) commented:

$NVCC should be changed to just nvcc.
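A quick illustration of why the `$NVCC` spelling breaks, assuming (as the suggested fix implies) that the NVCC variable is never assigned in install.sh: an unset variable expands to nothing, so the pipeline degenerates to running `--version` as a command. The fallback shown below is a hypothetical defensive alternative, not the reviewer's suggestion:

```shell
#!/bin/sh
# If NVCC was never assigned, "$NVCC --version" expands to just "--version",
# which the shell then tries (and fails) to run as a command.
unset NVCC
echo "NVCC expands to: [$NVCC]"

# Hypothetical defensive alternative: fall back to whatever nvcc is on PATH,
# while still honoring an explicitly exported NVCC.
NVCC="${NVCC:-nvcc}"
echo "NVCC resolves to: $NVCC"
```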

def get_num_nodes(self) -> uint64: ...

# BruteFroce
# BruteForce (typo; need to fix all instances later)
@JFLXB (Owner) commented:

Thanks for the hint! 😅

Will track this in #5 and fix after merging this PR.


# execute_command "uv build" "Building the project from source... (this might take a while)" "$GEAR "
pip install maturin[patchelf]
execute_command "maturin build --release" "Building the project from source... (this might take a while)" "$GEAR "
@JFLXB (Owner) commented:

What was the reason to switch from uv back to maturin?

Executing the install.sh script with maturin as the build tool results in an incorrect/unusable wheel for me, i.e., running the example from the README.md results in unexpected values for all computations.

However, changing this line to*:

execute_command "uv sync" "Building the project from source... (this might take a while)" "$GEAR "

lets us keep L44 as is and produces a functioning wheel with respect to the usage example code.


* uv build also did not work as expected

- [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html): A crucial tool for environment and package management.
- [uv](https://docs.astral.sh/uv/): Another crucial tool for environment and package management.
- [Python >= 3.11](https://www.python.org/downloads/): Required for running the Python code. Other versions may work but have not been tested.
- [Python = 3.11](https://www.python.org/downloads/): Required for running the Python code. Other versions will not work due to incompatibility with maturin.
@JFLXB (Owner) commented:

What are the "incompatibilit[ies] with maturin" for other Python versions?

I tested again with version 3.12, building with the ./install.sh script using uv sync (see other comment), and encountered no issues when executing the README.md usage code with:

python <script_containing_usage_code.py>
