AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
Offered By: University of Central Florida via YouTube
Course Description
Overview
Explore the innovative AttnGAN model for fine-grained text-to-image generation in this 46-minute lecture from the University of Central Florida. Delve into the architecture's key components, including the text encoder, conditioning augmentation, generator, attention network, and image encoder. Examine the DAMSM loss and its role in improving image quality. Learn about experimental results on various datasets, evaluation metrics like Inception score, and component analysis. Discover the model's capabilities in generating novel scenarios and understand its limitations in capturing global coherent structure. Gain insights into the challenges and advancements in text-to-image synthesis using attentional generative adversarial networks.
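The attention network described above lets each image sub-region attend to the most relevant words in the caption. As a rough illustration of that idea (a minimal NumPy sketch with assumed shapes and names, not the authors' implementation), each region's features are compared against projected word embeddings and a softmax over words yields per-region word-context vectors:

```python
import numpy as np

def word_attention(word_feats, region_feats, proj):
    """Word-level attention sketch (AttnGAN-style).

    word_feats:   (T, D)  word embeddings from the text encoder
    region_feats: (N, Dh) hidden features for N image sub-regions
    proj:         (D, Dh) learned map from word space to image feature space
    Returns (N, Dh) word-context vectors, one per sub-region.
    """
    words = word_feats @ proj                  # (T, Dh) projected words
    scores = region_feats @ words.T            # (N, T) region-word similarity
    # Softmax over words: each region weights the words most relevant to it.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ words                     # (N, Dh) context vectors

# Toy usage with random features (7 words, 64 sub-regions).
rng = np.random.default_rng(0)
ctx = word_attention(rng.normal(size=(7, 16)),
                     rng.normal(size=(64, 32)),
                     rng.normal(size=(16, 32)))
print(ctx.shape)  # (64, 32)
```

In the full model these context vectors are combined with the region features to condition the next-stage generator, which is what enables the fine-grained, word-driven refinement the lecture discusses.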
Syllabus
Intro
Problem: Text-to-image
Related work
Architecture - Motivation
Architecture - Text Encoder
Architecture - Conditioning Augmentation
Architecture - Generator F_i
Architecture - Attention network F^attn
Architecture - Image Encoder
Architecture - DAMSM loss
Experiments - Datasets
Experiments - Evaluation: Inception score
Experiments - Component Analysis
Experiments - Qualitative (CUB)
Experiments - Novel scenarios
Experiments - Failure cases: did not capture global coherent structure
Taught by
UCF CRCV
Related Courses
Editorial Illustration (Domestika)
Introduction to Illustrated Engraving with Procreate (Domestika)
Diffusion Models Beat GANs on Image Synthesis - Machine Learning Research Paper Explained (Yannic Kilcher via YouTube)
DALL-E Mini Explained - ML Coding Series (Aleksa Gordić - The AI Epiphany via YouTube)
Diffusion Models Beat GANs on Image Synthesis - ML Coding Series - Part 2 (Aleksa Gordić - The AI Epiphany via YouTube)