August 30, 2022

KUIS AI success in the 1st Shared Task on Multilingual Clause-Level Morphology

Congratulations to the KUIS AI Team for their success in MRL 2022: Emre Can Açıkgöz, Müge Kural, Tilek Chubakov, Gözde Gül Şahin, Deniz Yuret.

August 12, 2022

Serdar Özsoy, M.S. 2022

Current position: Senior Specialist - Data Science in Arçelik Global (LinkedIn)
MS Thesis: Self-Supervised Learning with an Information Maximization Criterion. August 2022. (PDF, Presentation)

Thesis Abstract:

Self-supervised learning provides a solution to learn effective representations from large amounts of data without performing data labeling, which is often expensive in terms of time, effort, and cost. The main problem with the self-supervised learning approach, in general, is collapse, i.e., obtaining identical representations for all inputs while matching different representations generated from the same input. In this thesis, we argue that information maximization among latent representations of different versions of the same input naturally prevents collapse. To this end, we propose a novel self-supervised learning method, CorInfoMax, based on maximizing the second-order statistics-based measure of mutual information that reflects the degree of correlation between the latent representation arguments. Maximizing this correlative information measure between alternative latent representations of the same input serves two main purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it increases the linear dependence between alternative representations, ensuring that they are related to each other. The proposed information maximization objective is simplified to an objective function based on Euclidean distance regularized by the log-determinant of the feature covariance matrix. Because the regularization term acts as a natural barrier against feature space degeneracy, CorInfoMax also prevents dimensional collapse by forcing representations to span the entire feature space. Experiments show that CorInfoMax achieves performance better than or competitive with state-of-the-art self-supervised learning methods across different tasks and datasets.
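
As a rough illustration of the simplified objective described in the abstract (a Euclidean distance term regularized by the log-determinant of the feature covariance), the following is a minimal NumPy sketch. It is not the thesis implementation: the function name `corinfomax_style_loss`, the weighting `alpha`, and the diagonal offset `eps` are assumptions made for this example only.

```python
import numpy as np

def corinfomax_style_loss(z1, z2, eps=1e-3, alpha=1.0):
    """Illustrative loss for two views z1, z2 of shape (batch, dim).

    Hypothetical sketch: alpha and eps are assumed values, not thesis settings.
    """
    n, d = z1.shape

    # Invariance term: mean squared Euclidean distance between paired views.
    distance = np.mean(np.sum((z1 - z2) ** 2, axis=1))

    def logdet_cov(z):
        # Sample covariance of the features, offset by eps * I so the
        # log-determinant stays finite even for degenerate features.
        zc = z - z.mean(axis=0, keepdims=True)
        cov = zc.T @ zc / (n - 1)
        _, logdet = np.linalg.slogdet(cov + eps * np.eye(d))
        return logdet

    # Minimizing this pulls the views together while rewarding
    # covariances that are far from degenerate.
    return alpha * distance - (logdet_cov(z1) + logdet_cov(z2))
```

In this sketch, fully collapsed features (every row identical) have zero covariance, so the log-determinant term drops to `dim * log(eps)` for each view and the loss becomes large. That is the barrier effect the abstract attributes to the regularizer.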

August 09, 2022

Barış Batuhan Topal, M.S. 2022

Current position: ML Research Engineer at PixerLabs (LinkedIn)
MS Thesis: Domain-adaptive Self-supervised Pre-training for Face and Body Detection in Drawings. August 2022. (PDF, Presentation, Code).

Thesis Abstract:

Drawings are a powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital arts, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although there are large amounts of digitized drawings from comic books and cartoons, they contain vast stylistic variations, which necessitate expensive manual labeling for training domain-specific recognizers.

In this work, I show how self-supervised learning, based on a teacher-student network with a modified student network update design, can be used to build face and body detectors. My setup allows exploiting large amounts of unlabeled data from the target domain when labels are provided for only a small subset of it. I further demonstrate that style transfer can be incorporated into my learning pipeline to bootstrap detectors using a vast amount of out-of-domain labeled natural images (i.e., images from the real world). My combined architecture yields detectors with state-of-the-art (SOTA) and near-SOTA performance using minimal annotation effort.

Using this detector architecture, I accomplish a set of additional tasks. First, I extract a large set of facial drawing images (∼1.2 million instances) from unlabeled data and train SOTA generative adversarial network (GAN) models to generate faces and a SOTA GAN inversion model to reconstruct them. When the detector-aided data is leveraged, these generative models successfully learn diverse stylistic features. Second, I implement an annotation tool to enlarge the existing set of annotated data. This tool lets users annotate bounding boxes of panels, speech bubbles, narrations, faces, and bodies; associate text boxes with faces and bodies; transcribe the text; and match the same characters in the image.

August 08, 2022

Ahmet Canberk Baykal, M.S. 2022

Current position: PhD Student / AI Researcher at University of Cambridge (Homepage, LinkedIn, Email)
MS Thesis: GAN Inversion Based Image Manipulation with Text-Guided Encoders. August 2022. (PDF, Presentation).

Thesis Abstract:

Style-based generative adversarial networks (StyleGAN) enable very high-quality image synthesis while learning disentangled latent spaces. Hence, there is a lot of recent work focusing on semantic image editing by latent space manipulation. A particularly active emerging field is editing images based on target textual descriptions. Existing approaches tackle this problem either by performing instance-level latent code optimization, which is not very efficient, or by mapping predefined text prompts to editing directions in the latent space. In contrast, in this thesis work, we present two novel approaches that enable image editing guided by textual descriptions. Our idea is to use either a text-conditioned encoder network or a text-conditioned adapter network that predicts a residual latent code in a feed-forward manner. Both quantitative and qualitative results demonstrate that our methods outperform competing approaches in terms of manipulation accuracy, i.e., how well the synthesized images match the textual descriptions, while ensuring highly realistic results and preserving features of the original image. We also demonstrate that our methods can generalize to various domains, including human faces, cats, and birds.
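
To make the idea of predicting a residual latent code in a single forward pass concrete, here is a small NumPy sketch. It is only a toy stand-in, not the networks from the thesis: the function `edit_latent`, the one-layer `adapter_weights`, the `strength` parameter, and the vector sizes are all hypothetical.

```python
import numpy as np

def edit_latent(w, text_emb, adapter_weights, strength=1.0):
    """Return an edited latent code w + residual (hypothetical sketch).

    w:               (latent_dim,) latent code from GAN inversion
    text_emb:        (text_dim,) embedding of the target description
    adapter_weights: (latent_dim + text_dim, latent_dim) assumed toy adapter
    """
    # Condition on both the original latent code and the text embedding.
    features = np.concatenate([w, text_emb], axis=-1)

    # One forward pass predicts the residual; no per-image optimization loop.
    residual = np.tanh(features @ adapter_weights)

    # The edit is applied as an offset, so the original code is the baseline.
    return w + strength * residual

# Usage with made-up sizes and random values.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
text_emb = rng.normal(size=4)
adapter_weights = 0.1 * rng.normal(size=(12, 8))
w_edited = edit_latent(w, text_emb, adapter_weights)
```

Because the output is the original code plus an offset, a zero residual leaves the image unchanged. This is one plausible reading of how a residual formulation helps preserve features of the original image.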
