# What's new in v0.2.0
## GPU Acceleration Layer

- New `ComputeBackend` protocol covering the two O(V×D²) GEMMs that dominate encode/decode cost: `rotate`, `rotate_inverse`, `project`, `normalize_rows`, `restore_norms`
  - `NumPyBackend` — always available; delegates to the canonical implementations to prevent drift
  - `TorchBackend` — CUDA (NVIDIA) and MPS (Apple Silicon), via PyTorch; `pip install semafold[torch]`
  - `MLXBackend` — Metal (Apple Silicon), via MLX; `pip install semafold[mlx]`
- Thread-safe auto-detection registry with priority chain: CUDA → MPS → MLX → NumPy
- On-device matrix cache keyed by `(ctypes.data, shape)` — bounded to ≤ 64 entries, safe against `id()` reuse after LRU eviction
- `list_backends()` probes availability without instantiating backend objects
- Plain `pip install semafold` unchanged — NumPy remains the default, no new required dependencies
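To make the backend surface concrete, here is a minimal sketch of what a five-method protocol like the one above might look like, with a NumPy reference implementation. The method names come from the notes above; the signatures, and the assumption that rotations are orthogonal matrices, are illustrative and not semafold's documented API.

```python
# Hypothetical sketch of the ComputeBackend shape. Method names are from the
# release notes; signatures and the orthogonal-rotation assumption are mine.
from typing import Protocol

import numpy as np


class ComputeBackend(Protocol):
    name: str

    def rotate(self, x: np.ndarray, q: np.ndarray) -> np.ndarray: ...
    def rotate_inverse(self, x: np.ndarray, q: np.ndarray) -> np.ndarray: ...
    def project(self, x: np.ndarray, p: np.ndarray) -> np.ndarray: ...
    def normalize_rows(self, x: np.ndarray) -> np.ndarray: ...
    def restore_norms(self, x: np.ndarray, norms: np.ndarray) -> np.ndarray: ...


class NumPyBackend:
    """Reference backend: plain NumPy, always available."""

    name = "numpy"

    def rotate(self, x, q):
        return x @ q            # one of the two O(V×D²) GEMMs

    def rotate_inverse(self, x, q):
        return x @ q.T          # inverse of an orthogonal rotation is its transpose

    def project(self, x, p):
        return x @ p            # the other dominant GEMM

    def normalize_rows(self, x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    def restore_norms(self, x, norms):
        return x * norms.reshape(-1, 1)
```

A Torch or MLX backend would implement the same five methods on device tensors; the registry only needs this shared shape.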
## Typed Enum Surface

- `EncodeObjective`, `EncodeMetric`, `EncodingSegmentKind`, `EncodingBoundType`, and `WorkloadSuitability` added to the stable root export
- All enum fields accept both enum instances and string values — existing string-based code continues to work
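One common way to get this enum-or-string dual acceptance is a `str`-mixin enum plus a small coercion step; the sketch below illustrates that pattern with a stand-in `EncodeObjective` whose member names and values are placeholders, not semafold's real ones.

```python
# Illustrates the enum-or-string acceptance pattern. EncodeObjective's
# members here are placeholders, not semafold's actual values.
from enum import Enum


class EncodeObjective(str, Enum):
    # str mix-in: members compare equal to their raw string values,
    # so existing string-based call sites keep working unchanged.
    RETRIEVAL = "retrieval"
    CLUSTERING = "clustering"


def coerce(value: "EncodeObjective | str") -> EncodeObjective:
    # Enum(value) accepts both an existing member and its string value.
    return EncodeObjective(value)
```

For example, `coerce("retrieval")` and `coerce(EncodeObjective.RETRIEVAL)` yield the same member, which is what lets string-based code keep working while enums remain the preferred form.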
## Quality

- Windows compatibility verified
- 189 tests passing across Python 3.10–3.13
- Pyright: 0 errors
## Install

```shell
pip install semafold         # NumPy core — no GPU required
pip install semafold[torch]  # + NVIDIA CUDA / Apple MPS
pip install semafold[mlx]    # + Apple Silicon Metal
```
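The CUDA → MPS → MLX → NumPy priority chain can be pictured as a list of cheap availability probes walked in order. The probe functions below are stand-ins (a real build might check `torch.cuda.is_available()` or try importing `mlx.core`), and the return shape of `list_backends()` is an assumption, so treat this as a sketch of the idea rather than semafold's implementation.

```python
# Minimal sketch of the priority-chain auto-detection described above.
# Probe functions are stand-ins; semafold's real probes and the return
# shape of list_backends() may differ.
from typing import Callable


def _cuda_available() -> bool:
    return False  # stand-in; real code might check torch.cuda.is_available()


def _mps_available() -> bool:
    return False  # stand-in; real code might check torch.backends.mps


def _mlx_available() -> bool:
    return False  # stand-in; real code might try importing mlx.core


_PRIORITY: list[tuple[str, Callable[[], bool]]] = [
    ("cuda", _cuda_available),
    ("mps", _mps_available),
    ("mlx", _mlx_available),
    ("numpy", lambda: True),  # NumPy is always available
]


def list_backends() -> list[tuple[str, bool]]:
    # Probe availability only; no backend object is instantiated here.
    return [(name, probe()) for name, probe in _PRIORITY]


def detect_backend() -> str:
    # First available backend in priority order wins.
    return next(name for name, probe in _PRIORITY if probe())
```

With no GPU runtime installed, every probe but NumPy's returns `False`, so detection falls through to the NumPy backend, matching the "no new required dependencies" guarantee above.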
## Upgrade notes
No breaking changes. String values for enum fields continue to work — enums are the preferred form going forward.