seanguo61/diffusion-language-model

diffusion-language-model

This repository contains a collection of resources and papers on diffusion language models.

Resources

Introductory Posts

Diffusion language models
Dieleman, Sander
[Website]

Introductory Lectures

Gemini-diffusion
Google
[Website]

Papers

Survey

Diffusion Models for Non-autoregressive Text Generation: A Survey
[https://arxiv.org/abs/2303.06574]

A Survey of Diffusion Models in Natural Language Processing
[https://arxiv.org/abs/2305.14671]

Discrete Diffusion in Large Language and Multimodal Models: A Survey
[https://arxiv.org/pdf/2506.13759]

Must-Read

Structured Denoising Diffusion Models in Discrete State-Spaces
D3PM
[https://arxiv.org/abs/2107.03006]

Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
SEDD
[https://arxiv.org/abs/2310.16834]

Simple and Effective Masked Diffusion Language Models
MDLM, NeurIPS 2024
[https://openreview.net/forum?id=L4uaAR4ArM]

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
ICLR 2025
[https://arxiv.org/abs/2503.09573]

Simplified and Generalized Masked Diffusion for Discrete Data
NeurIPS 2024, Google DeepMind
[https://github.com/google-deepmind/md4]

Energy-Based Diffusion Language Models for Text Generation
ICLR 2025, Stefano Ermon
[https://arxiv.org/abs/2410.21357]

LaViDa: A Large Diffusion Language Model for Multimodal Understanding
[https://arxiv.org/abs/2505.16839]

Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
[https://arxiv.org/pdf/2505.16990]

Diffusion Language Models Are Versatile Protein Learners
[arXiv]

NeurIPS 2025

NeurIPS 2025 papers: Most of these focus on discrete diffusion or diffusion language models, with a few covering other areas.

  • Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

  • Theoretical Benefit and Limitation of Diffusion Language Model

  • STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model

  • StateSpaceDiffuser: Bringing Long-Context Content to Diffusion World Models

  • State Size Independent Statistical Error Bound for Discrete Diffusion Models

  • Remasking Discrete Diffusion Models with Inference-Time Scaling

  • Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models

  • On Efficiency-Effectiveness Trade-off of Diffusion-based Recommenders

  • Non-Markovian Discrete Diffusion with Causal Language Models

  • Next Semantic Scale Prediction via Hierarchical Diffusion Language Models

  • MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization

  • MMaDA: Unraveling The Design Space for Multimodal Large Diffusion Language Models

  • Learnable Sampler Distillation for Discrete Diffusion Models

  • LaViDa: A Large Diffusion Model for Vision-Language Understanding

  • Language Modeling by Language Models

  • Large Language Diffusion Models

  • KLASS: KL-Adaptive Stability Sampling for Fast Inference in Masked Diffusion Models

  • Informed Correctors for Discrete Diffusion Models

  • Heterogeneous Diffusion Structure Inference for Network Cascade

  • GeoAda: Efficiently Finetune Geometric Diffusion Models with Equivariant Adapters

  • Generative Pre-trained Autoregressive Diffusion Transformer

  • Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods

  • Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms

  • Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning

  • Fading to Grow: Growing Preference Ratios via Preference Fading Discrete Diffusion for Recommendation

  • Encoder-Decoder Block Diffusion Language Models for Efficient Training and Inference

  • Don’t Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation

  • Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

  • Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking

  • Accelerating Diffusion LLMs via Adaptive Parallel Decoding

  • Ambient Diffusion Omni: Training Good Models with Bad Data

  • Ambient Proteins - Training Diffusion Models on Noisy Structures

  • Anchored Diffusion Language Model

  • Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking

  • Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

  • Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models

  • Constrained Discrete Diffusion

  • Continuous Diffusion Model for Language Modeling

  • d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning

  • Deep Compositional Phase Diffusion for Long Motion Sequence Generation

  • Diffusion Beats AR in Data-Constrained Settings

  • DINGO: Constrained Inference for Diffusion LLMs

  • Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

  • Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling

  • dKV-Cache: The Cache for Diffusion Language Models

  • Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory

  • Neural Hamiltonian Diffusions for Modeling Structured Geometric Dynamics

  • NeuralPLexer3: Accurate Biomolecular Complex Structure Prediction with Flow Models

  • Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models

  • Towards a Cascaded LLM Framework for Cost-effective Human-AI Decision-Making

  • C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning

  • LAW 2025: Bridging Language, Agent, and World Models for Reasoning and Planning

  • WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

  • World Models Should Prioritize the Unification of Physical and Social Dynamics

  • SimWorld: An Open-ended Simulator for Agents in Physical and Social Worlds

  • Social World Model-Augmented Mechanism Design Policy Learning

  • PhysDiff: A Physically-Guided Diffusion Model for Multivariate Time Series Anomaly Detection

  • SegMASt3R: Leveraging Geometric Foundation Models for Wide-Baseline Segment Matching

  • Towards Multiscale Graph-based Protein Learning with Geometric Secondary Structural Motifs
