Trainable, fast, and memory-efficient sparse attention
Topics: transformers, pytorch, english, transformer, triton, chinese, cuda-kernels, cutlass, attention-mechanism, attention-is-all-you-need, self-attention, pytorch-implementation, flash-attention, triton-kernels, dynamic-mask-attention
Updated Oct 29, 2025 - C++
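The listing above describes a trainable sparse (dynamic-mask) attention mechanism. As a rough illustration of the underlying idea, here is a minimal NumPy sketch of score-based sparse attention, where each query attends only to its top-k keys. This is an assumption-laden toy, not the repository's actual Triton/CUDA implementation, and the function name and parameters are hypothetical:

```python
import numpy as np

def sparse_attention(q, k, v, top_k=2):
    """Toy sparse attention: each query keeps only its top_k key scores.

    q, k, v: arrays of shape (seq_len, d). Illustrative only; the real
    project fuses masking into optimized Triton/CUDA kernels.
    """
    # Scaled dot-product attention logits, shape (seq_len, seq_len).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Per-row threshold: the top_k-th largest score in each row.
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]
    # Mask everything below the threshold to -inf so softmax zeroes it.
    masked = np.where(scores >= kth, scores, -np.inf)
    # Numerically stable softmax over the surviving entries.
    masked -= masked.max(axis=-1, keepdims=True)
    w = np.exp(masked)
    w /= w.sum(axis=-1, keepdims=True)
    # Weighted sum of values, shape (seq_len, d).
    return w @ v
```

In a trainable variant, the mask would be produced by a learned scoring function rather than a fixed top-k rule, so the sparsity pattern itself adapts during training.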