Enable builds without direct torch.cuda availability and support sm89 / sm90 #5
Open
wanderingai wants to merge 2 commits into HazyResearch:main from
Conversation
Contributor
It looks like this PR is introducing some race conditions - when I install using this branch, some tests fail:
@wanderingai Love this PR, it's an improvement over the previous setup.py. Can this be merged, worst case with no sm_90 flags by default?
This PR allows the monarch_cuda kernel to be built for computes sm80, sm89, and sm90, which includes the following GPUs:

Additionally, the setup.py is updated to enable builds based on nvcc availability, but without direct torch.cuda availability, for flexible builds.

Update: The compiler flags have been abstracted to support both PTX and SASS builds while defaulting to the original Ampere-based PTX-only build, i.e. -gencode=arch=compute_80,code=compute_80. Successfully tested by building a docker image and running the tests under tests/:
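The flag abstraction described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual setup.py: the helper names (`nvcc_available`, `gencode_flags`) and the `MONARCH_CUDA_ARCHS` environment variable are hypothetical, but the `-gencode` flag syntax and the PTX-only compute_80 default match what the PR describes.

```python
# Sketch: build nvcc -gencode flags for a setup.py without querying
# torch.cuda, so the extension can be compiled on machines where nvcc
# is installed but no GPU (or torch CUDA runtime) is visible.
# Helper names and the MONARCH_CUDA_ARCHS env var are hypothetical.
import os
import shutil

def nvcc_available():
    """Detect nvcc on PATH instead of calling torch.cuda.is_available()."""
    return shutil.which("nvcc") is not None

def gencode_flags(archs=("80",), ptx_only=True):
    """Construct -gencode flags for the requested compute capabilities.

    ptx_only=True emits code=compute_XX only (PTX, JIT-compiled at load
    time); ptx_only=False additionally emits code=sm_XX (SASS) per arch.
    """
    flags = []
    for arch in archs:
        if not ptx_only:
            flags.append(f"-gencode=arch=compute_{arch},code=sm_{arch}")
        flags.append(f"-gencode=arch=compute_{arch},code=compute_{arch}")
    return flags

# Default matches the original Ampere-based PTX-only build:
#   -gencode=arch=compute_80,code=compute_80
default_flags = gencode_flags()

# Opt in to SASS + PTX for sm80/sm89/sm90 via a (hypothetical) env var,
# e.g. MONARCH_CUDA_ARCHS="80;89;90":
if os.environ.get("MONARCH_CUDA_ARCHS"):
    archs = tuple(os.environ["MONARCH_CUDA_ARCHS"].split(";"))
    extra_flags = gencode_flags(archs, ptx_only=False)
```

Keeping the PTX-only compute_80 default preserves the previous behavior: PTX for compute_80 is JIT-compiled by the driver for newer architectures such as sm_89 and sm_90, while the SASS path gives native binaries for each listed arch.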