Neural Magic

Neural Magic (acquired by Red Hat) empowers developers to optimize and deploy LLMs at scale. Our model compression and acceleration enable top performance with vLLM.

Pinned

  1. deepsparse Public archive

    Sparsity-aware deep learning inference runtime for CPUs

    Python · 3.2k stars · 192 forks

Repositories

Showing 10 of 81 repositories
  • research Public

    Repository to enable research flows

    Python · 3 stars · 0 forks · 0 open issues · 3 open pull requests · Updated Oct 28, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 15 stars · Apache-2.0 · 10,958 forks · 0 open issues · 18 open pull requests · Updated Oct 28, 2025
  • axolotl Public Forked from axolotl-ai-cloud/axolotl

    Go ahead and axolotl questions

    Python · 0 stars · Apache-2.0 · 1,192 forks · 0 open issues · 5 open pull requests · Updated Oct 26, 2025
  • lighteval Public Forked from huggingface/lighteval

    Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

    Python · 0 stars · MIT · 369 forks · 0 open issues · 0 open pull requests · Updated Oct 22, 2025
  • pytorch Public Forked from pytorch/pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python · 0 stars · 26,339 forks · 0 open issues · 2 open pull requests · Updated Oct 21, 2025
  • sglang Public Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python · 0 stars · Apache-2.0 · 3,200 forks · 0 open issues · 1 open pull request · Updated Oct 17, 2025
  • model-validation-configs

    2 stars · 0 forks · 0 open issues · 2 open pull requests · Updated Oct 16, 2025
  • nm-actions Public

    Neural Magic GHA

    Python · 0 stars · Apache-2.0 · 0 forks · 0 open issues · 3 open pull requests · Updated Oct 8, 2025
  • arena-hard-auto Public Forked from lmarena/arena-hard-auto

    Arena-Hard-Auto: An automatic LLM benchmark.

    Python · 0 stars · Apache-2.0 · 129 forks · 0 open issues · 1 open pull request · Updated Oct 8, 2025
  • DeepEP-test Public Forked from smarterclayton/DeepEP

    DeepEP: an efficient expert-parallel communication library

    Cuda · 0 stars · MIT · 973 forks · 0 open issues · 0 open pull requests · Updated Sep 26, 2025