Image compression and reconstruction using pretrained autoencoder and VQGAN first-stage models from the latent diffusion and taming transformers repos. Model code and configs are copied from these repos with unnecessary parts removed.
The autoencoding model's encoded output is saved as the compressed format. This output is passed to the decoder on the receiver side to reconstruct a lossy compressed version of the original image. Depending on the chosen settings of the autoencoding model, either the encoded output or its indices (in the case of VQGAN) can be saved and used for reconstruction.
To save VRAM and avoid extra processing, only the encoder or the decoder weights are loaded, depending on whether the task is compression or decompression. Training code has been removed, but models trained with the original repos should still load.
Compressed data is saved in safetensors format. If a batch size larger than 1 is used for compression, each saved output contains the encoded tensor for the whole batch.
The vq-f4, vq-f8, kl-f4, and kl-f8 configs provide the best reconstruction results.
Compressing with a VQGAN model (removing --kl and adding --vq_ind) using vq-f4 or vq-f8 should provide the best compression ratio. Additionally running a zip program on the saved output may give better quality and a smaller file than JPEG with quality reduced to around 60 percent. A good pretrained vq-f8 reconstruction model followed by zip compression may give the best results in terms of file size.
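As a minimal sketch of the zip step described above, the bytes of a saved output file can be deflate-compressed with Python's standard library before transfer (the function name and stand-in data are illustrative, not part of this tool):

```python
import zlib

def zip_bytes(payload: bytes, level: int = 9) -> bytes:
    # Deflate-compress the raw bytes of a saved output file
    # to shrink it further; losslessly reversible with zlib.decompress.
    return zlib.compress(payload, level)

# Stand-in for the contents of a saved compression output;
# repetitive byte patterns are where deflate helps most.
data = bytes(range(256)) * 64
packed = zip_bytes(data)
print(len(data), len(packed))
```

In practice a zip program applied to the .safetensors file does the same thing; the gain depends on how much redundancy is left in the encoded tensors.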
When using --vq_ind, also setting --ind_bit to 8 gives the most compressed output, though not the best quality. It will not work with most configs, since index values must fit in the 0-255 range; only a config whose codebook has at most 256 entries and its associated model will work with --ind_bit set to 8.
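The 0-255 limit follows from the range of uint8: an index fits in 8 bits only when the codebook has at most 256 entries. A minimal sketch of that dtype choice (the function is illustrative, not the repo's actual code):

```python
def index_dtype(codebook_size: int, ind_bit: int) -> str:
    """Pick a storage dtype for VQGAN codebook indices.

    uint8 holds values 0-255, so 8-bit storage only works when the
    codebook has at most 256 entries; int16 covers larger codebooks.
    """
    if ind_bit == 8:
        if codebook_size > 256:
            raise ValueError("codebook too large for uint8 indices")
        return "uint8"
    return "int16"
```

For example, a 256-entry codebook fits uint8, while a 16384-entry codebook requires the 16-bit setting.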
Run the following command in the folder containing setup.py before using the library.
pip install -e .
kl compress:
python compression.py -s "SRC_PATH" -d "DEST_PATH" --cfg "CONFIG_YAML_PATH" --ckpt "VAE_CKPT_PATH" --kl --batch 2 --img_size 384
vq compress with indices:
python compression.py -s "SRC_PATH" -d "DEST_PATH" --cfg "CONFIG_YAML_PATH" --ckpt "VAE_CKPT_PATH" --batch 1 --img_size 512 --vq_ind --ind_bit 16
kl decompress:
python compression.py -s "SRC_PATH" -d "DEST_PATH" --cfg "CONFIG_YAML_PATH" --ckpt "VAE_CKPT_PATH" --kl --dc
vq decompress with indices:
python compression.py -s "SRC_PATH" -d "DEST_PATH" --cfg "CONFIG_YAML_PATH" --ckpt "VAE_CKPT_PATH" --dc --vq_ind
If the --dc flag is provided, decompression is run; otherwise the input is compressed.
--aspect resizes the image keeping the aspect ratio, with the smaller dimension set to --img_size. May fail for large images that do not fit in GPU memory.
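The aspect-preserving resize can be sketched as follows (the helper name is illustrative): the smaller dimension becomes --img_size and the other dimension scales proportionally.

```python
def aspect_resize(width: int, height: int, img_size: int) -> tuple:
    # Scale so the smaller dimension equals img_size,
    # keeping the input image's aspect ratio.
    scale = img_size / min(width, height)
    return round(width * scale), round(height * scale)
```

For example, a 1024x768 image with --img_size 384 becomes 512x384; a very large input stays proportionally large, which is why GPU memory can run out.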
With --ind_bit (possible values 8 or 16), VQGAN indices are saved as uint8 or int16, reducing the compressed output file size. Only needed for compression.
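To illustrate the size difference: an f8 model downsamples by 8, so a 512x512 image yields 64x64 = 4096 indices, which is 4096 bytes at 8 bits per index versus 8192 bytes at 16 bits. A sketch of that arithmetic (the helper is illustrative):

```python
def index_payload_bytes(width: int, height: int, f: int, ind_bit: int) -> int:
    # f is the downsampling factor (e.g. 8 for vq-f8); each latent
    # position stores one codebook index of ind_bit bits.
    n_indices = (width // f) * (height // f)
    return n_indices * ind_bit // 8
```

This counts only the raw index payload, not safetensors headers or any further zip compression.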
--xformers uses xformers, if available, to reduce memory consumption; it may also increase speed.
--float16 processes in float16 precision to reduce memory consumption.
Currently 3 types of data compression are available.
- With --kl, the KL autoencoder pretrained model's encode output is saved.
- If --kl is not specified, the VQGAN encode output is saved.
- If --vq_ind is specified, the indices are saved. These are used to reconstruct the image.
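The index path works because each saved index selects one latent vector from the model's codebook; the decoder then reconstructs the image from the looked-up latents. A toy sketch of the lookup step (not the repo's actual code):

```python
def lookup(indices, codebook):
    # Replace each saved codebook index with its latent vector;
    # the decoder would then run on these vectors.
    return [codebook[i] for i in indices]

# Toy 3-entry codebook of 2-dimensional latent vectors.
codebook = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
latents = lookup([2, 0, 1], codebook)
```

Since only integer indices are stored instead of floating-point latents, this mode yields the smallest compressed files.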
Original configs can be found here. More weights can be found in the latent diffusion repo. The ru-dalle vq-f8-gumbel model trained with the taming transformers repo can also be used.
For kl-f8, the stable diffusion VAE ckpt can be used; it gives 8x downsampling.
- https://huggingface.co/stabilityai/sd-vae-ft-ema-original/tree/main
- https://huggingface.co/stabilityai/sd-vae-ft-mse-original/tree/main
For kl-f4 config,
For vq-f4 config,
The following may provide better compression rates, but there may be noticeable degradation in reconstructed images.
For vq-f8 config,
For vq-f8-n256 config,
For kl-f16 config,
For kl-f32 config,
For vq-f8-gumbel config,
For vq-f8-rudalle config,