The source code for our paper "Formulating Discrete Probability Flow Through Optimal Transport", Pengze Zhang*, Hubery Yin*, Chen Li, Xiaohua Xie, NeurIPS 2023. Video: [English]
Continuous diffusion models are commonly acknowledged to display a deterministic probability flow, whereas discrete diffusion models do not. In this paper, we aim to establish the fundamental theory for the probability flow of discrete diffusion models. Specifically, we first prove that the continuous probability flow is the Monge optimal transport map under certain conditions, and also present an equivalent evidence for discrete cases. In view of these findings, we are then able to define the discrete probability flow in line with the principles of optimal transport. Finally, drawing upon our newly established definitions, we propose a novel sampling method that surpasses previous discrete diffusion models in its ability to generate more certain outcomes. Extensive experiments on the synthetic toy dataset and the CIFAR-10 dataset have validated the effectiveness of our proposed discrete probability flow.
[2024.1.8] Our discrete probability flow can achieve the controllable generation of interpolation in the latent space.
- Python 3.9.0
- jax 0.4.8
- jaxlib 0.4.7
- CUDA 12.1
- NVIDIA A100 40GB PCIe
Open the directory for sddm.
cd sddmThe synthetic data can be divided into 7 categories: 2spirals, 8gaussians, checkerboard, circles, moons, pinwheel, swissroll. You can set 'data_name' for selection.
Binary graycode
data_name=XXX bash sddm/synthetic/data/run_binary_data_dump.shBase5 code
data_name=XXX bash sddm/synthetic/data/run_base5_data_dump.shBase10 code
data_name=XXX bash sddm/synthetic/data/run_base10_data_dump.shYou can also directly download our synthetic dataset into ./sddm/data
Binary graycode
data_name=XXX config_name=binary_graycode bash sddm/synthetic/train_binary_graycode.shBase5 code
data_name=XXX config_name=base5_code bash ./sddm/synthetic/train_base5_code.shBase10 code
data_name=XXX config_name=base10_code bash ./sddm/synthetic/train_base10_code.shYou can also directly download our pre-trained model into ./sddm/results
Please switch 'sampler_type' in 'sddm/synthetic/config/*.py' to choose lbjf or dpf sampling.
Binary graycode
data_name=XXX config_name=binary_graycode bash sddm/synthetic/binary_test_mmd.sh Base5 code
data_name=XXX config_name=base5_code bash sddm/synthetic/base5_test_mmd.shBase10 code
data_name=XXX config_name=base10_code bash sddm/synthetic/base10_test_mmd.shPlease switch 'sampler_type' in 'sddm/synthetic/config/*.py' to choose lbjf or dpf sampling.
Binary graycode
data_name=XXX config_name=binary_graycode bash sddm/synthetic/binary_test_std.sh Base5 code
data_name=XXX config_name=base5_code bash sddm/synthetic/base5_test_std.shBase10 code
data_name=XXX config_name=base10_code bash sddm/synthetic/base10_test_std.sh- Python 3.9.7
- pytorch 1.12.1
- torchvision 0.13.1
- CUDA 11.3
- NVIDIA A100 40GB PCIe
Open the directory for TauLDR.
cd TauLDRThe model is provided by TauLDR. You can directly download the model into ./TauLDR/models/cifar10
Please download the x_T into ./TauLDR for reproduction.
Switch 'DPF_type' in 'TauLDR/config/eval/cifar10.py' to 0 / 1 to choose TauLDR / DPF sampling.
python generate_test_certainty_data.pypython test_std.py --DPF_type XPlease download the pretrained Cifar10 classifier, and put it in ./TauLDR
python test_class_std.py --DPF_type Xpython test_entropy.py --DPF_type XPlease download the x_T into ./TauLDR for reproduction our Figure 12.
Switch 'DPF_type' in 'TauLDR/config/eval/cifar10.py' to 0 / 1 to choose TauLDR / DPF sampling.
python visualization.pyWe trained TauLDR on the Celeb dataset. You can directly download our pre-trained model into ./TauLDR/models/celeb_128.
Switch 'DPF_type' in 'TauLDR/config/eval/celeb.py' to 0 / 1 to choose TauLDR / DPF sampling.
python visualization_celeb.py@inproceedings{
zhang2023formulating,
title={Formulating Discrete Probability Flow Through Optimal Transport},
author={Pengze Zhang and Hubery Yin and Chen Li and Xiaohua Xie},
booktitle={Advances in Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=I9GNrInbdf}
}We build our project based on SDDM and TauLDR. We thank them for their wonderful work and code release.
