18786 Project: Enhancing Text-to-Image Generation with Fine-Grained Semantic Control

This project extends the AttnGAN model with pretrained BERT and CLIP encoders, aiming for richer text interpretation and more detailed image outputs.

Branch Info

master: The original AttnGAN branch, updated to run on recent PyTorch versions, with improved-gan from OpenAI for Inception Score calculation. Serves as our baseline: DAMSM with an RNN text encoder and a CNN image encoder.
bert: DAMSM with a BERT-based text encoder and a CNN image encoder.
clip: DAMSM with an RNN-based text encoder and a CLIP image encoder.
clip-text-image: DAMSM with a CLIP text encoder and a CLIP image encoder.
bert-clip: DAMSM with a BERT text encoder and a CLIP image encoder.
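
The branch list can also be read as a small lookup table. The sketch below only transcribes that list into Python; `EncoderPair`, `BRANCH_ENCODERS`, and `branches_using` are hypothetical names used for illustration and are not part of the repository code.

```python
from typing import NamedTuple, Optional


class EncoderPair(NamedTuple):
    """Text and image encoder used inside DAMSM on a given branch."""
    text_encoder: str
    image_encoder: str


# Branch name -> encoder pair, transcribed from the branch list in this README.
BRANCH_ENCODERS = {
    "master": EncoderPair("RNN", "CNN"),
    "bert": EncoderPair("BERT", "CNN"),
    "clip": EncoderPair("RNN", "CLIP"),
    "clip-text-image": EncoderPair("CLIP", "CLIP"),
    "bert-clip": EncoderPair("BERT", "CLIP"),
}


def branches_using(text_encoder: Optional[str] = None,
                   image_encoder: Optional[str] = None) -> list:
    """Return branch names whose encoders match the given filters.

    A filter left as None matches every branch.
    """
    return [
        name
        for name, pair in BRANCH_ENCODERS.items()
        if (text_encoder is None or pair.text_encoder == text_encoder)
        and (image_encoder is None or pair.image_encoder == image_encoder)
    ]
```

For example, `branches_using(image_encoder="CLIP")` lists the three branches that replace the CNN image encoder with CLIP.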

Data

Download the preprocessed metadata for COCO and save it to data/
Download the COCO dataset and extract the images to data/coco/
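
Before training, it may help to confirm that the two download steps left the expected directories in place. This is a minimal sketch that assumes only the paths named above (data/ and data/coco/). It does not know the names of the metadata or image files, so it checks only that the directories exist and that data/coco/ is not empty. `check_data_layout` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path


def check_data_layout(root="data"):
    """Return a list of problems found under `root`; an empty list means none."""
    root = Path(root)
    problems = []
    if not root.is_dir():
        # Nothing else can be checked without the top-level directory.
        problems.append(f"missing directory: {root}")
        return problems
    coco_dir = root / "coco"
    if not coco_dir.is_dir():
        problems.append(f"missing directory: {coco_dir}")
    elif not any(coco_dir.iterdir()):
        problems.append(f"directory is empty: {coco_dir}")
    return problems
```

Calling `check_data_layout()` from the repository root returns an empty list when both directories are present and data/coco/ has content.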

Dependencies

pip install the following packages:

python-dateutil
easydict
pandas
torchfile
nltk
scikit-image==0.19.0
torch
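
After installing, a quick check can show which of the listed packages Python can actually find. The sketch below is an illustration rather than part of the repository. Note that two packages are imported under a different name than the one passed to pip (python-dateutil as `dateutil`, scikit-image as `skimage`). It does not verify versions, so the scikit-image==0.19.0 pin still has to be checked separately. `REQUIRED` and `missing_packages` are hypothetical names.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import` statements.
REQUIRED = {
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
    "nltk": "nltk",
    "scikit-image": "skimage",
    "torch": "torch",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # None means the module was not found
    ]
```

Running `print(missing_packages())` in the target environment prints an empty list when every listed package is importable.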

Repository: angadbajwa23/IDL_Project
