Conversation
Exploit that TILE_K=32 matches the I2_S group size: within a K-tile all elements share the same block and group. Each u32 read yields 4 elements (one per byte), replacing 8 element-by-element loads with 2 batch-unpack iterations per thread.

Closes #21
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
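The batch unpack can be pictured with a small CPU-side sketch. The byte layout here is an assumption, not taken from the shader: each byte is assumed to hold four 2-bit codes, one per group, with group 0 in the top two bits and codes 0, 1, 2 standing for -1, 0, +1. The function names are hypothetical.

```typescript
// Sketch only. Assumed layout (not confirmed by the PR): each byte packs four
// 2-bit codes, one per group, with group 0 in the top two bits; codes 0/1/2
// decode to -1/0/+1.
type Quad = [number, number, number, number];

// Unpack the four elements that one u32 word contributes to a given group.
function unpackWord(word: number, group: number): Quad {
  const shift = 6 - 2 * group; // same shift for every byte in the K-tile
  const out: Quad = [0, 0, 0, 0];
  for (let b = 0; b < 4; b++) {
    const byte = (word >>> (8 * b)) & 0xff; // byte b of the word
    out[b] = ((byte >>> shift) & 0x3) - 1;  // 2-bit code -> ternary value
  }
  return out;
}

// Unpack several words for the same group, e.g. 2 words = 8 elements.
function unpackWords(words: number[], group: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < words.length; i++) {
    const quad = unpackWord(words[i], group);
    out.push(quad[0], quad[1], quad[2], quad[3]);
  }
  return out;
}
```

Because the group is fixed for the whole tile, the shift is computed once per word rather than once per element, which is where the saving over element-by-element loads comes from.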
Remove 5 identical private createUniform/createUniformBuffer methods from model.ts, attention.ts, bitlinear.ts, ffn.ts, transformer.ts and replace with a shared free function in buffer-pool.ts (already imported by all callers).

Closes #22
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Diff all .wgsl files between TS and Rust shader directories to catch desynchronization early.

Closes #23
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add static unicodeToByte cache mirroring the existing byteToUnicode pattern. bytesToString() no longer rebuilds a 256-entry Map on every call (per-token hot path).

Closes #24
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
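A minimal sketch of the caching pattern follows. The class name `ByteLevelTables` and the helper `tokenToBytes` are hypothetical, and the GPT-2 style byte-to-unicode table is an assumption about the tokenizer rather than something this PR states.

```typescript
// Sketch only: names are hypothetical and the GPT-2 style table is assumed.
class ByteLevelTables {
  private static byteToUnicodeCache: Map<number, string> | null = null;
  private static unicodeToByteCache: Map<string, number> | null = null;

  // Forward table: built once, then reused.
  static byteToUnicode(): Map<number, string> {
    const cached = ByteLevelTables.byteToUnicodeCache;
    if (cached !== null) return cached;
    const table = new Map<number, string>();
    let extra = 0;
    for (let b = 0; b < 256; b++) {
      const printable =
        (b >= 33 && b <= 126) || (b >= 161 && b <= 172) || (b >= 174 && b <= 255);
      // Printable bytes map to themselves; the rest are shifted past U+00FF.
      const codePoint = printable ? b : 256 + extra++;
      table.set(b, String.fromCharCode(codePoint));
    }
    ByteLevelTables.byteToUnicodeCache = table;
    return table;
  }

  // Inverse table: also built once instead of on every decode call.
  static unicodeToByte(): Map<string, number> {
    const cached = ByteLevelTables.unicodeToByteCache;
    if (cached !== null) return cached;
    const inverse = new Map<string, number>();
    ByteLevelTables.byteToUnicode().forEach((ch, b) => inverse.set(ch, b));
    ByteLevelTables.unicodeToByteCache = inverse;
    return inverse;
  }
}

// Map a byte-level token string back to raw bytes using the cached table.
function tokenToBytes(token: string): number[] {
  const table = ByteLevelTables.unicodeToByte();
  const bytes: number[] = [];
  for (let i = 0; i < token.length; i++) {
    const b = table.get(token.charAt(i));
    if (b === undefined) throw new Error("character not in byte-level table");
    bytes.push(b);
  }
  return bytes;
}
```

Repeated calls to `unicodeToByte()` return the same Map instance, so the 256-entry build cost is paid once per process rather than once per token.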
Bucket buffers by (usage, sizeClass) where sizeClass is the next power of 2 of the aligned size. Within each bucket, buffers are exact-size matches so acquire is O(1) pop from the free list instead of O(n) scan. Add stats() for introspection and trim() to reclaim free GPU memory.

Closes #25
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
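The bucketing scheme can be sketched without a GPU. In this illustration `PooledBuffer` stands in for a real GPU buffer, `BucketedPool` and its allocator callback are hypothetical names, and the 4-byte alignment and the choice to allocate at the full size-class size are assumptions made so that every buffer in a bucket has the same size.

```typescript
// Sketch only: PooledBuffer stands in for a GPU buffer; the 4-byte alignment
// and class-sized allocation are assumptions, not the PR's actual code.
interface PooledBuffer {
  usage: number;
  size: number;
}

function nextPow2(n: number): number {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

class BucketedPool {
  private readonly buckets = new Map<string, PooledBuffer[]>();
  private readonly alloc: (usage: number, size: number) => PooledBuffer;
  private readonly align: number;

  constructor(alloc: (usage: number, size: number) => PooledBuffer, align: number = 4) {
    this.alloc = alloc;
    this.align = align;
  }

  private sizeClass(size: number): number {
    const aligned = Math.ceil(size / this.align) * this.align;
    return nextPow2(aligned);
  }

  // O(1): pop from the matching free list, or allocate at the class size.
  acquire(usage: number, size: number): PooledBuffer {
    const cls = this.sizeClass(size);
    const free = this.buckets.get(`${usage}:${cls}`);
    const reused = free ? free.pop() : undefined;
    return reused !== undefined ? reused : this.alloc(usage, cls);
  }

  // Return a buffer to its bucket; its size is already a size class.
  release(buf: PooledBuffer): void {
    const key = `${buf.usage}:${buf.size}`;
    const free = this.buckets.get(key);
    if (free) free.push(buf);
    else this.buckets.set(key, [buf]);
  }

  stats(): { buckets: number; freeBuffers: number; freeBytes: number } {
    let freeBuffers = 0;
    let freeBytes = 0;
    this.buckets.forEach((list) => {
      freeBuffers += list.length;
      list.forEach((b) => { freeBytes += b.size; });
    });
    return { buckets: this.buckets.size, freeBuffers, freeBytes };
  }

  // Drop all free buffers (a real pool would also destroy them); returns bytes freed.
  trim(): number {
    const freed = this.stats().freeBytes;
    this.buckets.clear();
    return freed;
  }
}
```

Rounding to a power of two trades some wasted bytes per buffer for constant-time reuse, which is why a `trim()` that hands idle memory back is a natural companion to this design.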
Summary
- `createUniformBuffer` helper: deduplicate 5 identical private methods across nn/ classes into a shared free function in `buffer-pool.ts`
- `diff` all `.wgsl` files between TS and Rust directories to catch desynchronization
- `unicodeToByte` cache so `bytesToString()` stops rebuilding a 256-entry Map on every token decode
- Bucket buffers by `(usage, nextPow2(alignedSize))` for O(1) acquire instead of O(n) scan; add `stats()` and `trim()` methods

Test plan
- `npm run build` passes
- `npm run lint` passes
- `cargo check --workspace` passes

🤖 Generated with Claude Code