August 08, 2022

Ahmet Canberk Baykal, M.S. 2022

Current position: PhD Student / AI Researcher at University of Cambridge (Homepage, LinkedIn, Email)
MS Thesis: GAN Inversion Based Image Manipulation with Text-Guided Encoders. August 2022. (PDF, Presentation).

Thesis Abstract: Style-based generative adversarial networks (StyleGAN) enable very high-quality image synthesis while learning disentangled latent spaces. Hence, much recent work focuses on semantic image editing through latent space manipulation. One emerging area in particular is editing images based on target textual descriptions. Existing approaches tackle this problem either by performing instance-level latent code optimization, which is not very efficient, or by mapping predefined text prompts to editing directions in the latent space. In contrast, in this thesis, we present two novel approaches that enable image editing guided by textual descriptions. Our idea is to use either a text-conditioned encoder network or a text-conditioned adapter network that predicts a residual latent code in a feed-forward manner. Both quantitative and qualitative results demonstrate that our methods outperform competing approaches in terms of manipulation accuracy, i.e., how well the synthesized images match the textual descriptions, while ensuring highly realistic results and preserving features of the original image. We also demonstrate that our methods can generalize to various domains, including human faces, cats, and birds.
