Code improvements (#21–#25) #26

Merged
m96-chan merged 5 commits into main from code-improvements-21-25 on Feb 22, 2026

m96-chan (Owner) commented Feb 22, 2026

Summary

Test plan

🤖 Generated with Claude Code

m96-chan and others added 5 commits February 22, 2026 12:33
Exploit the fact that TILE_K=32 matches the I2_S group size: within a
K-tile, all elements share the same block and group. Each u32 read yields
4 elements (one per byte), so 2 batch-unpack iterations per thread replace
8 element-by-element loads.

Closes #21

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
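
The batch unpack described in the commit above can be sketched in TypeScript as follows. The actual kernel is a WGSL shader that this page does not show, so this is only an illustration: the names `unpackWord` and `loadThreadElements`, the bit position of each group inside a byte (`shift = 6 - 2 * group`), and the mapping of codes 0/1/2 to -1/0/+1 are all assumptions.

```typescript
// Sketch only: mirrors the idea "one u32 read yields 4 elements (one per byte)".
// ASSUMPTIONS (not confirmed by the PR): each byte packs four 2-bit codes, one
// per group, with group 0 in the top two bits; codes 0/1/2 decode to -1/0/+1.

const ELEMENTS_PER_THREAD = 8; // from the commit message: 8 loads per thread before
const ELEMENTS_PER_WORD = 4;   // one element per byte of a u32

/** Unpack the 4 elements of `group` held in one little-endian u32 word. */
function unpackWord(word: number, group: number): number[] {
  const shift = 6 - 2 * group; // assumed bit position of this group's code
  const out: number[] = [];
  for (let b = 0; b < ELEMENTS_PER_WORD; b++) {
    const byte = (word >>> (8 * b)) & 0xff; // byte b of the word
    const code = (byte >>> shift) & 0x3;    // 2-bit code for this group
    out.push(code - 1);                     // assumed mapping to {-1, 0, +1}
  }
  return out;
}

/** Load one thread's 8 elements with 2 word reads instead of 8 scalar reads. */
function loadThreadElements(words: Uint32Array, firstWord: number, group: number): number[] {
  const out: number[] = [];
  const iterations = ELEMENTS_PER_THREAD / ELEMENTS_PER_WORD; // 2
  for (let i = 0; i < iterations; i++) {
    out.push(...unpackWord(words[firstWord + i], group));
  }
  return out;
}
```

Because the group is fixed for the whole K-tile, the shift is computed once per word rather than once per element, which is where the saving would come from under these assumptions.
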
Remove 5 identical private createUniform/createUniformBuffer methods
from model.ts, attention.ts, bitlinear.ts, ffn.ts, and transformer.ts, and
replace them with a shared free function in buffer-pool.ts (already
imported by all callers).

Closes #22

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
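
A shared free function of the kind this commit describes might look like the sketch below. The real signature in buffer-pool.ts is not shown on this page, so the parameter list is hypothetical. `DeviceLike` and `BufferLike` are stand-in types introduced here so the example runs without WebGPU typings; the two usage constants are the numeric values of `GPUBufferUsage.UNIFORM` and `GPUBufferUsage.COPY_DST`.

```typescript
// Minimal structural types (assumptions) so the sketch runs without WebGPU typings.
interface BufferLike {
  readonly size: number;
}

interface DeviceLike<B extends BufferLike> {
  createBuffer(desc: { size: number; usage: number }): B;
  queue: {
    writeBuffer(buffer: B, offset: number, data: Uint32Array | Float32Array): void;
  };
}

// Numeric values of GPUBufferUsage.UNIFORM and GPUBufferUsage.COPY_DST.
const USAGE_UNIFORM = 0x40;
const USAGE_COPY_DST = 0x08;

/** Hypothetical shared helper: create a uniform buffer and upload `data` into it. */
function createUniformBuffer<B extends BufferLike>(
  device: DeviceLike<B>,
  data: Uint32Array | Float32Array,
): B {
  const buffer = device.createBuffer({
    size: data.byteLength,
    usage: USAGE_UNIFORM | USAGE_COPY_DST,
  });
  device.queue.writeBuffer(buffer, 0, data); // upload the parameters once
  return buffer;
}
```

Taking the device as an argument, rather than reading it from `this`, is what would let one function replace the five private methods.
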
Diff all .wgsl files between TS and Rust shader directories to catch
desynchronization early.

Closes #23

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
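
The comparison at the heart of such a check can be sketched as a pure function. How the PR actually implements it (a script, a CI step, a test) is not stated here, so `diffShaderSets` and `ShaderMismatch` are hypothetical names. The sketch assumes the caller has already read both shader directories into maps of file name to source text, and that files are compared byte for byte.

```typescript
/** One desynchronized shader, identified by file name. */
interface ShaderMismatch {
  file: string;
  reason: "missing-in-ts" | "missing-in-rust" | "content-differs";
}

/** Compare two shader sets (file name -> source) and list every .wgsl mismatch. */
function diffShaderSets(ts: Map<string, string>, rust: Map<string, string>): ShaderMismatch[] {
  // Union of file names from both sides, restricted to .wgsl files.
  const names: string[] = [];
  ts.forEach((_src, name) => names.push(name));
  rust.forEach((_src, name) => {
    if (!ts.has(name)) names.push(name);
  });
  const shaders = names.filter((name) => name.endsWith(".wgsl")).sort();

  const mismatches: ShaderMismatch[] = [];
  for (const file of shaders) {
    const a = ts.get(file);
    const b = rust.get(file);
    if (a === undefined) mismatches.push({ file, reason: "missing-in-ts" });
    else if (b === undefined) mismatches.push({ file, reason: "missing-in-rust" });
    else if (a !== b) mismatches.push({ file, reason: "content-differs" });
  }
  return mismatches;
}
```

A check built on this would fail whenever the returned list is non-empty, which reports a missing file as well as an edited one.
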
Add static unicodeToByte cache mirroring the existing byteToUnicode
pattern. bytesToString() no longer rebuilds a 256-entry Map on every
call (per-token hot path).

Closes #24

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
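
The caching pattern can be sketched as below. The tokenizer source is not shown on this page, so the class name `ByteLevelCodec` and the method `tokenToBytes` are hypothetical, and the table is assumed to follow the GPT-2 byte-level BPE convention (printable bytes map to themselves, the rest to code points from 256 upward).

```typescript
class ByteLevelCodec {
  // Both tables are built at most once and then reused by every call.
  private static byteToUnicode: Map<number, string> | null = null;
  private static unicodeToByte: Map<string, number> | null = null;

  /** Forward table, assuming the GPT-2 byte-level convention. */
  static getByteToUnicode(): Map<number, string> {
    let table = ByteLevelCodec.byteToUnicode;
    if (table === null) {
      table = new Map<number, string>();
      let extra = 0; // counts bytes that need a substitute code point
      for (let b = 0; b < 256; b++) {
        const printable =
          (b >= 33 && b <= 126) || (b >= 161 && b <= 172) || (b >= 174 && b <= 255);
        table.set(b, String.fromCharCode(printable ? b : 256 + extra++));
      }
      ByteLevelCodec.byteToUnicode = table;
    }
    return table;
  }

  /** Inverse table, cached the same way instead of being rebuilt per call. */
  static getUnicodeToByte(): Map<string, number> {
    let inverse = ByteLevelCodec.unicodeToByte;
    if (inverse === null) {
      const built = new Map<string, number>();
      ByteLevelCodec.getByteToUnicode().forEach((ch, b) => built.set(ch, b));
      inverse = built;
      ByteLevelCodec.unicodeToByte = inverse;
    }
    return inverse;
  }

  /** Hypothetical hot-path helper: map a token's characters back to raw bytes. */
  static tokenToBytes(token: string): Uint8Array {
    const inverse = ByteLevelCodec.getUnicodeToByte();
    const bytes = new Uint8Array(token.length);
    for (let i = 0; i < token.length; i++) {
      const byte = inverse.get(token.charAt(i));
      if (byte === undefined) throw new Error(`unmapped character at index ${i}`);
      bytes[i] = byte;
    }
    return bytes;
  }
}
```

With the inverse table held in a static field, the per-token cost drops to one lookup per character; the 256-entry build happens only on first use.
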
Bucket buffers by (usage, sizeClass), where sizeClass is the aligned size
rounded up to the next power of 2. Within each bucket, buffers are
exact-size matches, so acquire is an O(1) pop from the free list instead
of an O(n) scan.

Add stats() for introspection and trim() to reclaim free GPU memory.

Closes #25

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
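
The bucketing scheme can be sketched as follows. This is not the pool from buffer-pool.ts, which is not shown on this page: `BucketedPool`, `PooledBuffer`, and the injected `allocate` callback are hypothetical, the 4-byte alignment is an assumption, and the sketch assumes buffers are allocated at exactly the size class so that every buffer in a bucket is interchangeable.

```typescript
interface PooledBuffer {
  readonly size: number;
  readonly usage: number;
  destroy(): void;
}

type Allocate = (size: number, usage: number) => PooledBuffer;

/** Smallest power of 2 that is >= n. */
function nextPow2(n: number): number {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

class BucketedPool {
  private readonly free = new Map<string, PooledBuffer[]>();
  private readonly allocate: Allocate;
  private readonly alignment: number;

  constructor(allocate: Allocate, alignment: number = 4) {
    this.allocate = allocate;
    this.alignment = alignment; // assumed alignment, in bytes
  }

  private sizeClass(size: number): number {
    const aligned = Math.ceil(size / this.alignment) * this.alignment;
    return nextPow2(aligned);
  }

  /** O(1): pop from the matching bucket, or allocate at the size class. */
  acquire(size: number, usage: number): PooledBuffer {
    const cls = this.sizeClass(size);
    const bucket = this.free.get(`${usage}:${cls}`);
    const hit = bucket === undefined ? undefined : bucket.pop();
    return hit !== undefined ? hit : this.allocate(cls, usage);
  }

  /** Return a buffer to its bucket; its size already equals its size class. */
  release(buffer: PooledBuffer): void {
    const key = `${buffer.usage}:${buffer.size}`;
    const bucket = this.free.get(key);
    if (bucket === undefined) this.free.set(key, [buffer]);
    else bucket.push(buffer);
  }

  /** Introspection: how many free buffers are held and how many bytes they occupy. */
  stats(): { freeBuffers: number; freeBytes: number } {
    let freeBuffers = 0;
    let freeBytes = 0;
    this.free.forEach((bucket) => {
      freeBuffers += bucket.length;
      bucket.forEach((b) => { freeBytes += b.size; });
    });
    return { freeBuffers, freeBytes };
  }

  /** Destroy every free buffer and return how many were reclaimed. */
  trim(): number {
    const reclaimed = this.stats().freeBuffers;
    this.free.forEach((bucket) => bucket.forEach((b) => b.destroy()));
    this.free.clear();
    return reclaimed;
  }
}
```

Rounding up to a power of 2 trades some memory (a buffer can be nearly twice the requested size) for a lookup that never has to scan.
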
m96-chan merged commit 35e5f84 into main on Feb 22, 2026
6 checks passed