ackRow/softmax-kernels

softmax-kernels

GPU kernel optimization: Softmax

Performance plot

Usage

The following code was tested using the Docker image nvidia/cuda:12.4.0-devel-ubuntu22.04 on a GeForce RTX 2070.

  • Build the Python library with CUDA bindings
cd cuda
pip install .
  • Test both implementations against the PyTorch baseline
python3 assertions.py
  • Run a benchmark
python3 benchmark.py
  • Profile both implementations
ncu --set full [-o output_path] python3 -O assertions.py
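
For readers who want a feel for what testing a softmax implementation against a baseline involves, here is a minimal pure-Python sketch. It is not the repository's assertions.py: the names softmax_ref, allclose and check_against_reference are hypothetical, the tolerances are assumed values, and the reference here is a plain numerically stable softmax rather than PyTorch.

```python
import math


def softmax_ref(row):
    """Numerically stable softmax over one row (a list of floats)."""
    m = max(row)  # subtracting the row max keeps exp() from overflowing
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def allclose(a, b, rtol=1e-5, atol=1e-8):
    """Element-wise closeness check; rtol and atol are assumed tolerances."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


def check_against_reference(candidate, rows):
    """Compare a candidate softmax (hypothetical stand-in for a kernel) row by row."""
    for row in rows:
        assert allclose(candidate(row), softmax_ref(row)), "softmax mismatch"
    return True


if __name__ == "__main__":
    rows = [[1.0, 2.0, 3.0], [1000.0, 1000.0], [-5.0, 0.0, 5.0]]
    # The reference is checked against itself here; a real test would pass
    # the implementation under test as `candidate`.
    print(check_against_reference(softmax_ref, rows))
```

Including a row with large values such as 1000.0 is deliberate: a softmax that skips the max subtraction overflows on it, so a comparison like this catches that class of bug.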

Articles
