atcuality2021

Popular repositories

  1. vllm-gb10-gemma4 (Public)

    Complete vLLM + Gemma 4 for NVIDIA DGX Spark GB10 — one command install with benchmarks

    Shell 1 1

  2. manthanquant (Public)

    3-bit Lloyd-Max KV Cache Compression for LLM Inference on NVIDIA DGX Spark GB10 — 5.12x compression, 0.983 cosine similarity, pure numpy on ARM unified memory

    Python 1

  3. vllm-gb10 (Public)

    Custom native vLLM for NVIDIA DGX Spark GB10 (ARM aarch64, Blackwell sm_121)

    Shell 1

  4. vllm-gemma4-patch (Public)

    Gemma 4 support patch for vLLM 0.18.x — backports PR #38826

    Shell 1

  5. manthanquant-x86 (Public)

    TurboQuant KV Cache Compression for vLLM on x86 GPUs — 5.12x compression, 0.983 cosine similarity (BiltIQ AI)

    Python
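
The manthanquant description names 3-bit Lloyd-Max quantization and reports a cosine similarity. As a rough illustration of what those terms mean, here is a minimal sketch of a scalar Lloyd-Max quantizer with 2^3 = 8 levels, fitted to synthetic Gaussian data. It is not the repository's code: the function names (`lloyd_max`, `quantize`, `dequantize`, `cosine_similarity`), the quantile initialisation, the iteration count, and the test data are all assumptions made for this example, and it uses only the standard library rather than numpy.

```python
import math
import random


def lloyd_max(samples, bits=3, iters=25):
    """Fit a 2**bits-level scalar codebook by Lloyd-Max iteration (illustrative)."""
    levels = 2 ** bits
    ordered = sorted(samples)
    # Assumption: initialise levels at evenly spaced quantiles of the data.
    codebook = [ordered[(2 * i + 1) * len(ordered) // (2 * levels)]
                for i in range(levels)]
    for _ in range(iters):
        # Decision boundaries sit midway between neighbouring levels.
        bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
        sums = [0.0] * levels
        counts = [0] * levels
        for x in samples:
            idx = sum(x > b for b in bounds)  # index of the cell containing x
            sums[idx] += x
            counts[idx] += 1
        # Each level moves to the mean of its cell; empty cells keep their level.
        codebook = [s / c if c else old
                    for s, c, old in zip(sums, counts, codebook)]
    return codebook


def quantize(values, codebook):
    """Map each value to the index (0 .. len(codebook) - 1) of its nearest level."""
    bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
    return [sum(x > b for b in bounds) for x in values]


def dequantize(codes, codebook):
    """Replace each code by its reproduction level."""
    return [codebook[c] for c in codes]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Synthetic stand-in for cache values; real KV cache data will differ.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(4096)]

codebook = lloyd_max(data)           # 8 reproduction levels
codes = quantize(data, codebook)     # each code fits in 3 bits
restored = dequantize(codes, codebook)
similarity = cosine_similarity(data, restored)
```

Each stored code needs only 3 bits, and `similarity` measures how closely the dequantized vector points in the same direction as the original. How the repository packs the codes, stores per-block metadata, or arrives at its reported figures is not stated in the description above.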