
Conversation


@HaroldMargeta-Cacace commented Jul 28, 2025

The current version of cuAOA is not well-suited for limited-access HPC environments with multiple GPU types, largely because it requires CUDA 12 and C++20. To expand access to users in such environments, this PR makes the following changes:

  • Added compatibility with CUDA 11 (while maintaining CUDA 12 compatibility)
  • Altered code to compile under C++17 (important for CUDA 11 compatibility)
  • Added multi-architecture compilation
  • The install script now dynamically adjusts compile_flags.txt
  • The default build uses maturin alone (a uv build can still be done by uncommenting the appropriate lines)
  • Corrected how the Makefile resolves environment filepaths
  • Fixed an incorrect CUDA API call

Changes install.sh to build using maturin instead of uv; the option to build with uv remains available in the comments.
Adds code that dynamically updates compile_flags.txt, appending arch flags for sm_89 and sm_90 when CUDA >= 12 is detected.
Changes the Makefile to correctly resolve the current conda environment's filepaths (and thereby nvcc). It now also compiles for additional architectures: sm_70, sm_75, sm_80, and sm_86 on at least CUDA 11, plus sm_89 and sm_90 when `nvcc --version` reports CUDA 12. To stay compatible with both CUDA 11 and 12, it adds both lib64 and lib to the library paths and compiles with C++17 instead of C++20.
Changes polynomial.hpp and polynomial.cpp to enable compilation with C++17.
Changes `cudaGetDeviceProperties_v2(&prop, i);` to the CUDA API function available on both toolchains, `cudaGetDeviceProperties(&prop, i);`.
Changes compile_flags.txt to make it consistent with the others.
Changes install.sh to correctly modify compile_flags.txt.
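The dynamic compile_flags.txt update described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual install.sh code: the real script parses `nvcc --version` (here the output is simulated with a sample string), and the clang-style `--cuda-gpu-arch` flags and file path are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of the conditional arch-flag logic (hypothetical; the real
# install.sh calls `nvcc --version` instead of using a sample string).
NVCC_OUTPUT="Cuda compilation tools, release 12.4, V12.4.131"

# Extract the major CUDA version (e.g. "12") from the release line.
CUDA_MAJOR=$(printf '%s\n' "$NVCC_OUTPUT" | sed -E 's/.*release ([0-9]+)\..*/\1/')
echo "detected CUDA major version: $CUDA_MAJOR"

COMPILE_FLAGS_FILE="compile_flags.txt"   # illustrative path

# sm_89 / sm_90 (Ada / Hopper) only exist in the CUDA 12 toolchain,
# so append those arch flags only when the major version is >= 12.
if [ "$CUDA_MAJOR" -ge 12 ]; then
    printf '%s\n' '--cuda-gpu-arch=sm_89' '--cuda-gpu-arch=sm_90' >> "$COMPILE_FLAGS_FILE"
fi
```

One flag per line matches the compile_flags.txt convention used by clangd, which is presumably why the script appends rather than editing a single line.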
@JFLXB self-assigned this Jul 29, 2025
@JFLXB added the "enhancement" (New feature or request) label Jul 29, 2025
@JFLXB self-requested a review Jul 29, 2025 19:53
@JFLXB removed their assignment Jul 29, 2025
@JFLXB (Owner) left a comment


First of all, thank you for the contribution! We really appreciate the time and effort you've put into this.

Most of the comments are related to the build process and the behavior of the generated wheels with this setup. We're particularly interested in understanding the motivation behind some of the changes, and curious why the build seems to work on your setup but leads to issues on ours.

Looking forward to iterating on this with you!


# Additional include directories
# Base NVCC flags
NVCC_FLAGS := -std=c++17 -Xcompiler="-fPIC -O3" \
@JFLXB (Owner) commented:

On my system I also require the -fPIC flag in GPP_FLAGS, which is currently unset.

Adding the following lines to the Makefile fixes compilation issues on my end:

# Base GPP flags
GPP_FLAGS := -std=c++17 -fPIC -O3

-std=c++17 and -O3 are added for completeness

COMPILE_FLAGS_FILE="cuaoa/internal/compile_flags.txt"

# Detect CUDA version (major only)
CUDA_VERSION_STR=$($NVCC --version | grep "release" | sed -E 's/.*release ([0-9]+)\..*/\1/')
@JFLXB (Owner) commented:

$NVCC should be changed to just nvcc.
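A quick illustration of why the `$NVCC` spelling breaks, assuming (as the suggested fix implies) that the NVCC variable is never assigned in install.sh: an unset variable expands to nothing, so the pipeline degenerates to running `--version` as a command. The fallback shown below is a hypothetical defensive alternative, not the reviewer's suggestion:

```shell
#!/bin/sh
# If NVCC was never assigned, "$NVCC --version" expands to just "--version",
# which the shell then tries (and fails) to run as a command.
unset NVCC
echo "NVCC expands to: [$NVCC]"

# Hypothetical defensive alternative: fall back to whatever nvcc is on PATH,
# while still honoring an explicitly exported NVCC.
NVCC="${NVCC:-nvcc}"
echo "NVCC resolves to: $NVCC"
```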

def get_num_nodes(self) -> uint64: ...

# BruteFroce
# BruteForce (typo; need to fix all instances later)
@JFLXB (Owner) commented:

Thanks for the hint! 😅

Will track this in #5 and fix after merging this PR.


# execute_command "uv build" "Building the project from source... (this might take a while)" "$GEAR "
pip install maturin[patchelf]
execute_command "maturin build --release" "Building the project from source... (this might take a while)" "$GEAR "
@JFLXB (Owner) commented:

What was the reason to switch from uv back to maturin?

Executing the install.sh script with maturin as the build tool results in an incorrect/unusable wheel for me, i.e., running the example from the README.md results in unexpected values for all computations.

However, changing this line to*:

execute_command "uv sync" "Building the project from source... (this might take a while)" "$GEAR "

lets us keep L44 as is and produces a functioning wheel with respect to the usage example code.


* uv build also did not work as expected

- [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html): A crucial tool for environment and package management.
- [uv](https://docs.astral.sh/uv/): Another crucial tool for environment and package management.
- [Python >= 3.11](https://www.python.org/downloads/): Required for running the Python code. Other versions may work but have not been tested.
- [Python = 3.11](https://www.python.org/downloads/): Required for running the Python code. Other versions will not work due to incompatibility with maturin.
@JFLXB (Owner) commented:

What are the "incompatibilit[ies] with maturin" for other Python versions?

I tested again with version 3.12, building with the ./install.sh script using uv sync (see other comment), and encountered no issues when executing the README.md usage code with:

python <script_containing_usage_code.py>
