Photorealistic image creation from semantic segmentation masks, which are labeled sketches that depict the layout of a scene.
DibPhot (Web Application) Repo | Report (Spanish)
The aim is to give realism to the semantic sketch, known as a segmentation map, by automatically adding colours, textures, shadows and reflections, among other details. For this purpose, techniques based on artificial neural networks are used, specifically generative models that allow images to be synthesised without the need to specify a symbolic model in detail. The synthesised image can also be controlled by a style image that allows the scene's setting to be changed. For example, a daytime scene can be turned into a sunset. To achieve the style transfer, a variational autoencoder is used and connected to the generative adversarial network in charge of image synthesis.
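As a rough illustration of how a variational autoencoder can feed a style code to a generator, the sketch below shows the usual reparameterization step and the KL divergence term that a weight such as --lambda_kld would scale. It assumes a diagonal Gaussian posterior and a standard normal prior, which is the common VAE setup; the function names are hypothetical and the repository's own TensorFlow code may differ.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element by element.

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension (standard VAE convention, not confirmed from the repository).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_divergence(mu, logvar):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))
```

The sampled vector plays the role of the latent z (its length corresponds to --z_dim), and the KL term keeps the encoder's output close to the prior so that a random z can also be used when no style image is given.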
Clone this repo.
git clone https://github.com/MarAl15/SemanticImageSynthesis.git
cd SemanticImageSynthesis/
Create a virtual environment (recommended).
virtualenv --system-site-packages -p python3.8 ./env
source ./env/bin/activate
Install dependencies.
pip install -r requirements.txt
Please note that this code uses TensorFlow with GPU support, so CUDA and cuDNN must be installed beforehand.
This code has been tested on an NVIDIA GeForce RTX 2060 with CUDA 10.1 + cuDNN 7.6.5 for TF 2.3.0 and CUDA 11.2 + cuDNN 8.10.0 for TF 2.10.0.
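Before training, it can be useful to confirm that TensorFlow actually sees the GPU. The following is a minimal sketch, not part of the repository; the helper name visible_gpus is hypothetical, and it only relies on tf.config.list_physical_devices, which is available in TensorFlow 2.1 and later.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if
    TensorFlow is not installed in the current environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("No GPU detected; check the CUDA and cuDNN installation.")
    else:
        print("GPUs detected:", gpus)
```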
You must download the datasets beforehand. You can download the ADE20K dataset or its subset ADE20K Outdoors, among others.
If the error "Input 'filename' of 'ReadFile' Op has type float32 that does not match expected type of string." is thrown, create a new subdirectory inside each dataset directory and store the images in it. For instance,
(Directory structure)
img_train_path/
...train/
......train_image_001.jpg
......train_image_002.jpg
...... ...
segmask_train_path/
...train/
......label_image_001.jpg
......label_image_002.jpg
...... ...
=>
--image_dir img_train_path
--label_dir segmask_train_path
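The layout above can be produced with a few lines of Python. This is a minimal sketch under assumptions: the helper name nest_images, the default subdirectory name "train" and the list of image suffixes are illustrative choices, not part of the repository.

```python
import shutil
from pathlib import Path

# Assumption: these are the image formats present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def nest_images(main_dir, subdir_name="train"):
    """Move image files lying directly in main_dir into main_dir/subdir_name.

    Returns the sorted list of file names that were moved.
    """
    main_dir = Path(main_dir)
    target = main_dir / subdir_name
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(main_dir.iterdir()):
        # Skip subdirectories and non-image files.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

Calling nest_images("img_train_path") and nest_images("segmask_train_path") once each would yield the structure shown, after which --image_dir and --label_dir point at the two main directories.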
python train.py
- Load data
  --image_dir: Main directory name where the pictures are located.
  --label_dir: Main directory name where the semantic segmentation masks are located.
  --semantic_label_path: Filename containing the semantic labels.
  --img_height: Image height.
  --img_width: Image width.
  --crop_size: Desired size of the square crop.
  --batch_size: Mini-batch size.
- Image Encoder
  --use_vae: If specified, enable training with an image encoder.
  --lambda_kld: Weight for KL Divergence loss.
- Generator
  --z_dim: Dimension of the latent z vector.
  --lambda_features: Weight for feature matching loss.
  --lambda_vgg: Weight for VGG loss.
- Adam Optimizer
  --lr: Initial learning rate.
  --beta1: Hyperparameter to control the 1st moment decay.
  --beta2: Hyperparameter to control the 2nd moment decay.
- Training
  --total_epochs: Total number of epochs.
  --decay_epoch: Epoch from which the learning rate begins to decay linearly to zero.
  --prob_dataset: Percentage of the dataset's elements that will be used for training; they are shuffled on each epoch.
  --print_info_freq: Frequency to print information.
  --log_dir: Directory name to log losses.
  --save_img_freq: Frequency to autosave the fake image, associated segmentation map and real image.
  --results_dir: Directory name to save the images.
  --save_model_freq: Frequency to save the checkpoints.
  --checkpoint_dir: Directory name to save the checkpoints.
Please use python train.py --help or python train.py -h to see all the options.
$ python train.py --use_vae --image_dir "./datasets/ADE5K/images/" --label_dir "./datasets/ADE5K/annotations/" \
--semantic_label_path './datasets/ADE5K/semantic_labels.txt' \
--checkpoint_dir './checkpoints/ADE5K_VAE/' --results_dir './results/ADE5K_VAE/train' \
--decay_epoch 400 --total_epochs 800 --batch_size 2 --print_info_freq 20
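To make the effect of --decay_epoch and --total_epochs concrete, the sketch below computes a learning rate that stays constant until the decay epoch and then falls linearly to zero at the last epoch. It is an assumption about how such a schedule is typically written, not a copy of the repository's code; the function name and the initial learning rate of 0.0002 used in the comments are hypothetical.

```python
def linear_decay_lr(epoch, initial_lr, decay_epoch, total_epochs):
    """Learning rate at a given epoch.

    Constant at initial_lr before decay_epoch, then linearly decreasing
    so that it reaches zero at total_epochs.
    """
    span = total_epochs - decay_epoch
    if span <= 0 or epoch < decay_epoch:
        # No decay phase configured, or decay has not started yet.
        return initial_lr
    remaining = max(total_epochs - epoch, 0)
    return initial_lr * remaining / span


# With --decay_epoch 400 --total_epochs 800 and a hypothetical lr of 0.0002:
# epochs 0..399 use the full rate, epoch 600 uses half of it,
# and the rate reaches zero at epoch 800.
```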
python test_one.py
  --label_filename: Semantic segmentation mask filename.
  --style_filename: Style image filename, if the VAE is used.
  --use_vae: If specified, enable training with an image encoder.
  --semantic_label_path: Filename containing the semantic labels.
  --checkpoint_dir: Directory name to restore the latest checkpoint.
  --results_dir: Directory name to save the images.
Please use python test_one.py --help or python test_one.py -h to see all the options.
- (Park et al.) T. Park, M. Liu, T. Wang, J. Zhu. "Semantic Image Synthesis with Spatially-Adaptive Normalization"
- TensorFlow Documentation
Mar Alguacil




