Releases: JuliaGPU/KernelAbstractions.jl
v0.9.39
KernelAbstractions v0.9.39
Changes
- On Julia 1.11+, `aliasscope` has the potential to cause a miscompilation even when the user is not using `@Const`.
Merged pull requests:
- Unified memory allocations (#630) (@christiangnrd)
- SPIRVIntrinsics 0.5 support (#638) (@christiangnrd)
- CI: Build PoCL without hwloc (#639) (@christiangnrd)
- Use `macos-15-intel` (#640) (@christiangnrd)
- Aliasscope miss-compile for 1.11+ (#653) (@vchuravy)
- Add test for accumulate issue (#654) (@vchuravy)
Closed issues:
- Unified memory interface? (#601)
- Loading KA 0.10 and oneAPI/OpenCL will lead to "MethodOverwritten" errors due to MethodTable collisions (#610)
- ones function is not type-stable for CUDA (#634)
- error with @synchronize() and loads/stores with static number of items (#642)
- Using `@fastmath` for a function inlined in a kernel yields PTX compile error for Float64 arrays in Julia 1.12.0 (#643)
v0.9.38
KernelAbstractions v0.9.38
Feature changes
- Add API support for unified memory allocations
Merged pull requests:
- [0.9] Unified memory allocations (#632) (@christiangnrd)
v0.9.37
KernelAbstractions v0.9.37
Feature changes
- Support `@kernel` definition inside functions
Merged pull requests:
- Use stacked method tables (#615) (@vchuravy)
- avoid boxing when `@kernel` is used as a closure (#625) (@simeonschaub)
v0.9.36
KernelAbstractions v0.9.36
Feature changes
- `get_backend` support for StaticArrays
Merged pull requests:
- Use Printf to report errors from POCL (#592) (@vchuravy)
- use unsafe_indices for a few examples (#612) (@vchuravy)
- Switch to SPIRVIntrinsics 0.3 and the new backend (#614) (@vchuravy)
- KA.__synchronize, add GLOBAL_MEM_FENCE semantic (#618) (@vchuravy)
- add get_backend for StaticArrays (#621) (@vchuravy)
Closed issues:
- How to improve CPU performance? (#357)
v0.9.35
KernelAbstractions v0.9.35
Merged pull requests:
- Implement a CPU backend using POCL (#556) (@vchuravy)
- [0.10] Forbid divergent execution of work-group barriers (#558) (@vchuravy)
- Bump julia-actions/setup-julia from 1 to 2 (#561) (@dependabot[bot])
- Switch Format.yml to CUDA.jl style (#568) (@vchuravy)
- Test pocl#main on CI (#569) (@vchuravy)
- CompatHelper: add new compat entry for SPIRVIntrinsics at version 0.2, (keep existing compat) (#571) (@github-actions[bot])
- CompatHelper: add new compat entry for GPUCompiler at version 1, (keep existing compat) (#572) (@github-actions[bot])
- CompatHelper: add new compat entry for LLVM at version 9, (keep existing compat) (#573) (@github-actions[bot])
- Check that malformed allocations throw and don't stackoverflow (#576) (@vchuravy)
- Check that malformed allocations throw and don't stackoverflow (#576) (#577) (@vchuravy)
- Avoid callgraph recursion due to exception branch in get_global_id (#579) (@vchuravy)
- Remove CPU(static=true) test (#580) (@vchuravy)
- Set SPIR-V to 1.2 (#582) (@vchuravy)
- use POCL with fixes (#589) (@vchuravy)
- use barrier with LOCAL_MEM_FENCE (#591) (@vchuravy)
- Test correct backend in examples test (#597) (@christiangnrd)
- Switch to pocl_jll@v7 (#599) (@vchuravy)
- prevent `get_backend` from overflowing the stack (#602) (@nsajko)
- [NFC] Ignore formatting PRs in blame (#604) (@christiangnrd)
- Enable downstream CI for 0.10 (#608) (@vchuravy)
- Disable Float16 on the CPU backend (#609) (@vchuravy)
Closed issues:
v0.9.34
KernelAbstractions v0.9.34
Merged pull requests:
- Bump googleapis/code-suggester from 2 to 4 (#560) (@dependabot[bot])
- Allow opt-out of implicit bounds-checking (#563) (@vchuravy)
- [0.9] Forbid divergent execution of work-group barriers (#564) (@vchuravy)
- Update Changelog in docs (#565) (@vchuravy)
- Fix docs and test for unsafe_indicies=true (#566) (@vchuravy)
- Fix indicies->indices typo everywhere (#567) (@vchuravy)
v0.9.33
KernelAbstractions v0.9.33
Merged pull requests:
v0.9.32
KernelAbstractions v0.9.32
- Clarify the semantics of `KernelAbstractions.copyto!` and add `KernelAbstractions.pagelock!`
- Add support for multiple devices per backend
Merged pull requests:
- Run Runic after explicit return rule addition (#516) (@fredrikekre)
- Avoid the exception branch in expand (#518) (@vchuravy)
- Allow for ndims query (#551) (@vchuravy)
- Switch Runic CI (#552) (@vchuravy)
- Update quickstart.md (#553) (@Dale-Black)
- support multiple devices per backend (#554) (@vchuravy)
- Document the semantics of copyto! and add pagelock! (#555) (@vchuravy)
Closed issues:
- Add Feature to Select Devices to Execute Kernels On (#458)
v0.9.31
KernelAbstractions v0.9.31
Merged pull requests: