fix: pass Adesc.ld/Ddesc.ld as ldb/ldc for cublas grouped batched GEMM

20d37dd
Open

feat(turbomind): integrate cublasGemmGroupedBatchedEx for Qwen3.5 MoE inference on Blackwell GPUs with memory copy optimizations #4490
