Releases: JuliaGPU/JACC.jl
v1.1.0
What's Changed
- Add CI Coverage by @williamfgc in #326
- Fix AMDGPU runner by @williamfgc in #327
- Fix AMDGPU runner by @williamfgc in #328
- Fix runner by @williamfgc in #329
- Add path to coverage file by @williamfgc in #330
- Fix path to codecov file by @williamfgc in #331
- Try preinstalled codecov-cli by @williamfgc in #332
- Fix path to codecov-cli by @williamfgc in #333
- Add ci failure for codecov upload by @williamfgc in #334
- Add Contributing.md by @williamfgc in #324
- Delay codecov notifications by @williamfgc in #335
- Upload to Codecov on PR only by @williamfgc in #336
- Bump coverage reports on codecov by @williamfgc in #337
- Upload coverage during merge by @williamfgc in #338
- Reduce n builds by @williamfgc in #339
- Don't update coverage when merging by @williamfgc in #340
- Add PR Info to Codecov by @williamfgc in #341
- Add JACC short demo in README by @williamfgc in #342
- Add Covecov flags by @williamfgc in #343
- Add carryforward flags by @williamfgc in #344
- Upload coverage report on merge by @williamfgc in #345
- Fix coverage-on-merge by @williamfgc in #346
- Relax coverage checks by @williamfgc in #348
- Remove kernel from coverage by @williamfgc in #347
- Add OpenSSF badge by @williamfgc in #349
- Use shmem info and launch config for CUDA and AMDGPU kernels by @PhilipFackler in #351
- Add optional static checks by @williamfgc in #350
- General improvements by @williamfgc in #352
- Update README and JACC diagram by @williamfgc in #353
- Release v1.1.0 by @PhilipFackler in #354
Full Changelog: v1.0.0...v1.1.0
v1.0.0
What's Changed
- Use explicit type for dims in ParallelReduce by @PhilipFackler in #292
- Fix backend bits by @PhilipFackler in #293
- Fix type instability causing crash in 2D parallel_for on AMDGPU by @PhilipFackler in #299
- Add to_device and create_stream by @PhilipFackler in #300
- Versions of `array()` for allocating uninitialized arrays by @PhilipFackler in #301
- Add Apple GPU CI on ExCL by @williamfgc in #305
- Fix runners in CI by @williamfgc in #307
- Metal backend by @williamfgc in #306
- Use correct arch label in CI by @williamfgc in #308
- Update README by @williamfgc in #310
- Correct dimensions for `JACC.shared` by @PhilipFackler in #309
- Use explicit type for workspace member to avoid type instability by @PhilipFackler in #313
- Add basic macro syntax by @PhilipFackler in #312
- Refactored `ParallelReduce` outer constructors into JACC.reducer by @PhilipFackler in #314
- Update AMDGPU perf-test kernel by @luraess in #315
- Custom ranges for parallel_for and parallel_reduce by @PhilipFackler in #316
- Repo GPU CI JuliaORNL to JuliaGPU by @williamfgc in #318
- Fix ReadMe by @williamfgc in #317
- Fix documentation link in badge by @williamfgc in #319
- Update deploydocs org to JuliaGPU by @williamfgc in #321
- Add API documentation by @williamfgc in #320
- Update api_usage.md by @PhilipFackler in #322
Full Changelog: v0.6.0...v1.0.0
v0.6.0
What's Changed
- Skip set_backend if passed the same backend by @PhilipFackler in #251
- Add ubuntu arm runner by @PhilipFackler in #250
- Add project documentation by @williamfgc in #252
- Add docs badge and acknowledgement by @williamfgc in #253
- Add missing `synchronize` implementation by @PhilipFackler in #256
- Improvements to "threads" backend by @PhilipFackler in #257
- Occupancy with oneAPI by @PhilipFackler in #260
- Fixed incorrect condition in AMDGPU LaunchSpec parallel_for by @PhilipFackler in #263
- Update NVIDIA CI runner by @williamfgc in #269
- Managed/Unmanaged reduce workspace for GPU backends by @PhilipFackler in #270
- Use -1 to signal default shmem_size by @PhilipFackler in #272
- Added to_host function by @PhilipFackler in #273
- Add GTX1080 CI workflow by @williamfgc in #274
- Docs badge by @williamfgc in #275
- Implement JACC.Multi for oneAPI by @PhilipFackler in #279
- Add `@inline` to default parallel_for by @williamfgc in #282
- N-dimensional versions and API update by @PhilipFackler in #276
- Allow more manipulation of backends by @PhilipFackler in #277
- Add JACC.Async implementations by @PhilipFackler in #278
- Simplify parallel_reduce implementation by @PhilipFackler in #284
- Test cleanup and benchmarks by @PhilipFackler in #246
- Add `Op` type parameter to `ParallelReduce` by @PhilipFackler in #287
- Updated version to 0.6.0 by @PhilipFackler in #288
- Fixed compat entries by @PhilipFackler in #289
Full Changelog: v0.5.0...v0.6.0
v0.5.0
What's Changed
- Added JACC.Async for CUDA backend. JACC.Async.copy does not work XD by @pedrovalerolara in #227
- Fix scoping for parallel_for in threads impl by @PhilipFackler in #229
- Fix conditions in 2d reduce kernel by @PhilipFackler in #231
- `shared(::AbstractArray)` and `sync_workgroup()` by @PhilipFackler in #202
- JACC.Multi API updates by @PhilipFackler in #228
- Use max shmem device props by @PhilipFackler in #235
- Added API functions for do-style syntax by @PhilipFackler in #241
- Scope synchronize properly for threads by @PhilipFackler in #243
- Added `fill` function by @PhilipFackler in #244
- Fix AMDGPU version by @williamfgc in #239
- Update README by @williamfgc in #238
- Refactor parallel_reduce for do syntax by @PhilipFackler in #248
- Updated version to 0.5.0 by @PhilipFackler in #249
Full Changelog: v0.4.0...v0.5.0
v0.4.0
What's Changed
- Add `LaunchSpec` versions of `parallel_reduce` by @PhilipFackler in #216
- Better occupancy for 2D parallel_for and prevent oversubscription by @PhilipFackler in #224
- Release v0.4.0 by @PhilipFackler in #225
Full Changelog: v0.3.1...v0.4.0
v0.3.1
What's Changed
- Replaced while loops in reduce kernels by @PhilipFackler in #206
- Change `shmem_size` to use thread count (like AMDGPUExt) by @PhilipFackler in #210
- Release v0.3.1 by @PhilipFackler in #211
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Bump to Julia 1.11.3 in cousteau by @williamfgc in #197
- Reorganize source code and modules by @PhilipFackler in #193
- Release v0.3.0 by @PhilipFackler in #204
- Make sure comparison is in host memory in `shared` test case by @PhilipFackler in #208
Full Changelog: v0.2.1...v0.3.0
v0.2.1
What's Changed
- Install backend on `set_backend` by @PhilipFackler in #195
- Release v0.2.1 by @PhilipFackler in #196
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- Added JACCASYNC API for threads.jl backend. Note that JACC.Async work… by @pedrovalerolara in #161
- Bump Atomix to 1.0.1 by @williamfgc in #165
- Remove rocm by @williamfgc in #166
- Bump AMDGPU CI version by @williamfgc in #168
- Update CI labels by @williamfgc in #169
- Update README info by @williamfgc in #164
- Added @init_backend for user convenience. by @PhilipFackler in #170
- Update `parallel_reduce` by @PhilipFackler in #173
- Remove `Array` type and add `array` function by @PhilipFackler in #177
- Fix bug in oneAPI 1D reduce by @PhilipFackler in #182
- WIP: AMDGPU compute occupancy by @PhilipFackler in #178
- WIP: Better blocks/threads calculations for CUDA backend by @PhilipFackler in #136
- Add parallel_for API with keyword struct by @PhilipFackler in #188
- Use computed occupancy for amdgpu parallel_reduce by @PhilipFackler in #190
- Release v0.2.0 by @PhilipFackler in #191
Full Changelog: v0.1.1...v0.2.0
v0.1.1
What's Changed
- Update to macos-latest for github actions by @PhilipFackler in #152
- Update to macos-latest for github actions by @PhilipFackler in #153
- Switched to TestItemRunner and enabled selecting tests by name or tag by @PhilipFackler in #154
- Updated JACC.BLAS test by @PhilipFackler in #155
- Update oneAPI testing by @PhilipFackler in #151
- Fixed `@maybe_threaded` to work as intended with precompilation by @PhilipFackler in #158
- Release v0.1.1 by @PhilipFackler in #159
Full Changelog: v0.1.0...v0.1.1