Deniz Yuret's Homepage (feed updated 2022-09-29) <a href="http://ai.ku.edu.tr">AI.KU</a>
<a href="http://goo.gl/Hq0sZ3">BibTeX</a>
<a href="http://goo.gl/5yFtRO">Courses</a>
<a href="http://goo.gl/wfy0NS">CV</a>
<a href="http://goo.gl/YdylL9">Downloads</a>
<a href="http://goo.gl/ELujdC">GitHub</a>
<a href="http://goo.gl/Us4SF6">Publications</a>
<a href="http://goo.gl/vBk9jm">Projects</a>
<a href="http://goo.gl/LmgQRC">Scholar</a>
<a href="http://goo.gl/vNpEjP">Students</a>
<br/>Author: Deniz Yuret (<a href="http://www.blogger.com/profile/00578023665603100985">Blogger profile</a>) <br/>Posted 2022-09-20: Teke Tek Bilim Program. On Habertürk TV's Teke Tek Bilim program, we talked about artificial intelligence with Fatih Altaylı, Cem Say from Boğaziçi, and Şeyda Ertekin from ODTÜ. Link to the full program: <a href="https://youtu.be/1R2XHcOXq9o" target="_blank">https://youtu.be/1R2XHcOXq9o</a>. <iframe width="400" height="222" src="https://www.youtube.com/embed/QPVpptA94IE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> <span class="fullpost"></span><br/>Posted 2022-09-19: Self-Supervised Learning with an Information Maximization Criterion. Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan. To appear in <a href="https://nips.cc/Conferences/2022" target="_blank">NeurIPS, Nov 2022</a>. 
(<a href="https://arxiv.org/pdf/2209.07999" target="_blank">PDF</a>, <a href="https://arxiv.org/abs/2209.07999" target="_blank">arXiv:2209.07999</a>) <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRy4_4q7Q0Fz0sKH0LWs4WBRkXm9dX8s9UfleLnnYnr9q9KWth1d5nFwTy95yVbg9bdSMkps-KgDpKYqlrg0R2Seo2vpdQN1pSTtxlbblOP4E1vlvXwzntGPGsxNBpRRwZvhXvxhr_zGIyu4lBNYuuqP_TbTOsFblbo6veDq10HEfnSiadRw/s1600/Screenshot%202022-09-19%208.11.03%20AM.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="163" data-original-width="388" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRy4_4q7Q0Fz0sKH0LWs4WBRkXm9dX8s9UfleLnnYnr9q9KWth1d5nFwTy95yVbg9bdSMkps-KgDpKYqlrg0R2Seo2vpdQN1pSTtxlbblOP4E1vlvXwzntGPGsxNBpRRwZvhXvxhr_zGIyu4lBNYuuqP_TbTOsFblbo6veDq10HEfnSiadRw/s1600/Screenshot%202022-09-19%208.11.03%20AM.png"/></a></div> <span class="fullpost"><p><b>Abstract:</b>Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results. We propose a self-supervised learning method, CorInfoMax, that uses a second-order statistics-based mutual information measure that reflects the level of correlation among its arguments. 
Maximizing this correlative information measure between alternative representations of the same input serves two purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it establishes relevance among alternative representations by increasing the linear dependence among them. An approximation of the proposed information maximization objective simplifies to a Euclidean distance-based objective function regularized by the log-determinant of the feature covariance matrix. The regularization term acts as a natural barrier against feature space degeneracy. Consequently, beyond avoiding complete output collapse to a single point, the proposed approach also prevents dimensional collapse by encouraging the spread of information across the whole feature space. Numerical experiments demonstrate that CorInfoMax achieves better or competitive performance results relative to the state-of-the-art SSL approaches. </p></span><br/>Posted 2022-09-09: Müge Kural, M.S. 
2022<table><tr valign="top"><td><div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMUmxaMA2fFU_BCBVS53hcssV6QqlCKib3_rJSDx7jP5V2JFVPVAjb8Gxg8_jsC0x0pEk5Gb5liNQRHBhFTawcUrP2s9jDzET-8hxwOkQBKE3GGsz2vGcJYnk96eI830gE-Ux1mUaQb5BmArGu2HK-HzzN_wSujbdVr89gEHeZ4909G6-RPg/s200/MugeKural_KocUniversity2016.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="120" data-original-height="200" data-original-width="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMUmxaMA2fFU_BCBVS53hcssV6QqlCKib3_rJSDx7jP5V2JFVPVAjb8Gxg8_jsC0x0pEk5Gb5liNQRHBhFTawcUrP2s9jDzET-8hxwOkQBKE3GGsz2vGcJYnk96eI830gE-Ux1mUaQb5BmArGu2HK-HzzN_wSujbdVr89gEHeZ4909G6-RPg/s200/MugeKural_KocUniversity2016.jpg"/></a></div> </td><td><br/><b>Current position</b>: PhD Student, Koç University (<a href="https://www.linkedin.com/in/muge-kural-1038b841">LinkedIn</a>, <a href="mailto:mugekural@gmail.com">Email</a>) <!--, <a href="">LinkedIn</a>, <a href="">Email</a>) --> <br/><b>MS Thesis</b>: Unsupervised learning of morphology. September 2022. (<a href="https://drive.google.com/file/d/1k3xi4NW3Ey7kAqd60mB8rtxsicT-V1_C/view?usp=sharing">PDF</a>, <a href="https://docs.google.com/presentation/d/17lI4pxgFa4BF6zhyQVdR5eqYOjB23phXv6fPc_U6gCc/edit?usp=sharing">Presentation</a>) <!-- , <a href="https://github.com//">Code</a> --> </td></tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/viewer?srcid=17lI4pxgFa4BF6zhyQVdR5eqYOjB23phXv6fPc_U6gCc&pid=explorer&efh=false&a=v&chrome=false&embedded=true" frameborder="0" width="420" height="245" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <b>Thesis Abstract:</b><p> Unsupervised learning of morphological rules is one of the expected abilities of natural language processing (NLP) models since children learn these rules during their native language acquisition without supervision. 
Based on this expectation, we present a comprehensive experimental setup for evaluating the morphological learning of several unsupervised models such as Autoencoders (AE), Variational Autoencoders (VAE), Character-level Language Models (CharLM) and Vector Quantized Variational Autoencoders (VQVAE) on the following tasks: probing for morphological features, morphological segmentation and morphological reinflection. In our study, we show that for probing, all models outperform baselines with an indication of encoding morphological knowledge; for morphological segmentation, VAE and CharLMs have performance comparable to unsupervised SOTA models; for morphological reinflection, VQVAE with multiple codebooks has the ability to identify the lemma and suffixes of a word and turns out to be a good candidate to perform inflectional tasks. </p> </span><br/>Posted 2022-08-30: KUIS AI success in the 1st Shared Task on Multilingual Clause-Level Morphology. Congratulations to the KUIS AI Team for their success in <a href="https://sigtyp.github.io/st2022-mrl.html" target="_blank">MRL 2022</a>: Emre Can Açıkgöz, Müge Kural, Tilek Chubakov, Gözde Gül Şahin, Deniz Yuret. 
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhC7W5U66XmLVGpr82XXEhuL7yQYNBpg5Pgze01KbwpIXQhdTmKofOUEAgVMLx5h73fwdkp1gUent7ZoY1GIM_hAojho4_Dxf3m4Ha0Go4K85haL1HaSXWTrpG_ZtqV2JGAzCTvX69cDXh5RwhnyDLUuGshz7iGQqeKDqTHODX_xkwpQUfdpw/s1600/IMG-20220830-WA0000.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" data-original-height="1236" data-original-width="827" width="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhC7W5U66XmLVGpr82XXEhuL7yQYNBpg5Pgze01KbwpIXQhdTmKofOUEAgVMLx5h73fwdkp1gUent7ZoY1GIM_hAojho4_Dxf3m4Ha0Go4K85haL1HaSXWTrpG_ZtqV2JGAzCTvX69cDXh5RwhnyDLUuGshz7iGQqeKDqTHODX_xkwpQUfdpw/s1600/IMG-20220830-WA0000.jpg"/></a></div><span class="fullpost"></span><br/>Posted 2022-08-12: Serdar Özsoy, M.S. 
2022<table><tr valign="top"><td> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQEWkEy8rAO3FoysbRwSYu6bc1_eReh4Dr4Pw807XrHx2JU6VeSsT50f0F6cFmoOfKFNjjinfrRgoqjbwAskzlWlqdS80BmcKKA99vlGXsd6G92SfO6yoIRJXnfE68W4G_accwEPHU0J1OuDf1G9dbZftj80nsr9afOqaoOuaViQdD6xGJZQ/s374/1516641965734.jpeg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="120" data-original-height="374" data-original-width="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQEWkEy8rAO3FoysbRwSYu6bc1_eReh4Dr4Pw807XrHx2JU6VeSsT50f0F6cFmoOfKFNjjinfrRgoqjbwAskzlWlqdS80BmcKKA99vlGXsd6G92SfO6yoIRJXnfE68W4G_accwEPHU0J1OuDf1G9dbZftj80nsr9afOqaoOuaViQdD6xGJZQ/s320/1516641965734.jpeg"/></a></div> </td><td><br/><b>Current position</b>: Senior Specialist - Data Science in Arçelik Global (<a href="https://www.linkedin.com/in/serdar-ozsoy/">LinkedIn</a>) <!--, <a href="">LinkedIn</a>, <a href="">Email</a>) --> <br/><b>MS Thesis</b>: Self-Supervised Learning with an Information Maximization Criterion. August 2022. 
(<a href="https://drive.google.com/file/d/1a2WCYitV7-39n8vxYHpHzxQNNWRtqdtN/view?usp=sharing">PDF</a>, <a href="https://drive.google.com/file/d/1KQMOyzbCrQaMgjmB1qZzuTQLZEGrv3q_/view?usp=sharing">Presentation</a>) <!-- , <a href="https://github.com//">Code</a> --> </td></tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/viewer?srcid=1KQMOyzbCrQaMgjmB1qZzuTQLZEGrv3q_&pid=explorer&efh=false&a=v&chrome=false&embedded=true" frameborder="0" width="420" height="245" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <b>Thesis Abstract:</b><p> Self-supervised learning provides a solution to learn effective representations from large amounts of data without performing data labeling, which is often expensive in terms of time, effort, and cost. The main problem with the self-supervised learning approach, in general, is collapse, i.e., obtaining identical representations for all inputs while matching different representations generated from the same input. In this thesis, we argue that information maximization among latent representations of different versions of the same input naturally prevents collapse. To this end, we propose a novel self-supervised learning method, CorInfoMax, based on maximizing the second-order statistics-based measure of mutual information that reflects the degree of correlation between the latent representation arguments. Maximizing this correlative information measure between alternative latent representations of the same input serves two main purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it increases the linear dependence between alternative representations, ensuring that they are related to each other. The proposed information maximization objective is simplified to an objective function based on Euclidean distance regularized by the log-determinant of the feature covariance matrix. 
Because the regularization term acts as a natural barrier against feature space degeneracy, CorInfoMax also prevents dimensional collapse by encouraging representations to span the entire feature space. Empirical experiments show that CorInfoMax achieves better or competitive performance results over state-of-the-art self-supervised learning methods across different tasks and datasets. </p> </span><br/>Posted 2022-08-09: Barış Batuhan Topal, M.S. 2022<table><tr valign="top"><td> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiR9eaJjnPEKSg2Qc1ZBG_qSUJ2bcLyBq0vEG2yw4bk9c0OyQKEk1ACEptyF_ji3hr18R0nziBNkgNRMnUZIOqEfcHjbzbMA9uUJUqgxy_UljABh4gtlcXkaNwrEFUmzOGSICdKQTe3Yc9Sh-2DiYmO8Dxqc192PvlN8lkSULESyVdM54l6rg/s300/Bar%C4%B1%C5%9FBatuhanTopal_SabanciUniversity2020_2-213x300.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="120" data-original-height="300" data-original-width="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiR9eaJjnPEKSg2Qc1ZBG_qSUJ2bcLyBq0vEG2yw4bk9c0OyQKEk1ACEptyF_ji3hr18R0nziBNkgNRMnUZIOqEfcHjbzbMA9uUJUqgxy_UljABh4gtlcXkaNwrEFUmzOGSICdKQTe3Yc9Sh-2DiYmO8Dxqc192PvlN8lkSULESyVdM54l6rg/s200/Bar%C4%B1%C5%9FBatuhanTopal_SabanciUniversity2020_2-213x300.jpg"/></a></div> </td><td><br/><b>Current position</b>: PhD Student, Koç University (<a href="https://www.linkedin.com/in/barisbatuhan/">LinkedIn</a>) <br/><b>MS Thesis</b>: Domain-adaptive Self-supervised Pre-training for Face and Body Detection in Drawings. August 2022. 
(<a href="https://drive.google.com/file/d/1gyIDamWdzrzAu7vaokQ-QiXsm91ElfDs/view?usp=sharing">PDF</a>, <a href="https://drive.google.com/file/d/1v6iXGvo-zoXrxssIozNpeNTkE8w4DISX/view?usp=sharing">Presentation</a>, <a href="https://github.com/barisbatuhan">Code</a>). </td></tr></table><span class="fullpost"> <p><iframe src="https://docs.google.com/viewer?srcid=1v6iXGvo-zoXrxssIozNpeNTkE8w4DISX&pid=explorer&efh=false&a=v&chrome=false&embedded=true" width="410px" height="242px"></iframe></p> <b>Thesis Abstract:</b><p> Drawings are powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital arts, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although there are large amounts of digitized drawings from comic books and cartoons, they contain vast stylistic variations, which necessitate expensive manual labeling for training domain-specific recognizers. </p><p> In this work, I show how self-supervised learning, based on a teacher-student network with a modified student network update design, can be used to build face and body detectors. My setup allows exploiting large amounts of unlabeled data from the target domain when labels are provided for only a small subset of it. I further demonstrate that style transfer can be incorporated into my learning pipeline to bootstrap detectors using a vast amount of out-of-domain labeled images from natural images (i.e., images from the real world). My combined architecture yields detectors with state-of-the-art (SOTA) and near-SOTA performance using minimal annotation effort. </p><p>Through the utilization of this detector architecture, I accomplish a set of additional tasks. First, I extract a large set of facial drawing images (∼1.2 million instances) from unlabeled data and train SOTA generative adversarial network (GAN) models to generate and a SOTA GAN inversion model to reconstruct faces. 
When the detector-aided data is leveraged, these generative models successfully learn diverse stylistic features. Secondly, I implement an annotation tool to enlarge the existing set of annotated data. This tool lets users annotate bounding boxes of panels, speech bubbles, narrations, faces, and bodies; associate text boxes with faces and bodies; transcribe the text; and match the same characters in the image. </p></span><br/>Posted 2022-08-08: Ahmet Canberk Baykal, M.S. 2022<table> <tr><td valign="top"><div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgckb1J3nMO2T0GtloTeEFvKhDif0OFbme2PIsIWARKvJmJLd0uwwkRW0LWJ04wom2WuHUFUYa9Nf0TmDustXcRV0p9pxJ1ym7AmGf84QAH5xKcrfL8Nj8pZw1ytWt25RzYPGXVZ-HNd2AulMIbDXHNtXfDVVvS4tLYyWnvlIpKRmqXbsGE8Q/s300/AhmetCanberkBaykal_BilkentUniversity2020-e1599730760283-210x300.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="120" data-original-height="300" data-original-width="210" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgckb1J3nMO2T0GtloTeEFvKhDif0OFbme2PIsIWARKvJmJLd0uwwkRW0LWJ04wom2WuHUFUYa9Nf0TmDustXcRV0p9pxJ1ym7AmGf84QAH5xKcrfL8Nj8pZw1ytWt25RzYPGXVZ-HNd2AulMIbDXHNtXfDVVvS4tLYyWnvlIpKRmqXbsGE8Q/s200/AhmetCanberkBaykal_BilkentUniversity2020-e1599730760283-210x300.jpg"/></a></div></td> <td valign="top"><br/><b>Current position</b>: PhD Student, Koç University (<a href="https://johnberg1.github.io/">Homepage</a>, <a href="https://www.linkedin.com/in/canberkbaykal/">LinkedIn</a>, <a href="mailto:canberk.baykal1@gmail.com">Email</a>) <br/><b>MS Thesis</b>: GAN Inversion Based Image Manipulation with Text-Guided Encoders. August 2022. 
(<a href="https://drive.google.com/file/d/1IDV2jGJtwpXQ9ZoF5A81s-aVm7REighA/view?usp=sharing">PDF</a>, <a href="https://docs.google.com/presentation/d/153OzY01CkTGenvP06t7UBHVk8XTfvdNRm4miolzbtTw/edit?usp=sharing">Presentation</a> <!-- , <a href="https://github.com//">Code</a>-->). </td></tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/presentation/d/e/2PACX-1vTvJod8kc2nyIyCW3x1SPZLV_uF37WoI9_qHQxWlGZIiSm6DAhqynjdF-K1KkHIazkIcaRmqpWCanRW/embed?start=false&loop=false&delayms=3000" frameborder="0" width="420" height="265" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <b>Thesis Abstract:</b> Style-based generative adversarial networks (StyleGAN) enable very high quality image synthesis while learning disentangled latent spaces. Hence, there is a lot of recent work focusing on semantic image editing by latent space manipulation. A particularly emerging field is editing images based on target textual descriptions. Existing approaches tackle this problem either by performing instance-level latent code optimization, which is not very efficient, or by mapping predefined text prompts to editing directions in the latent space. In contrast, in this thesis work, we present two novel approaches that enable image editing guided by textual descriptions. Our idea is to use either a text-conditioned encoder network or a text-conditioned adapter network that predicts a residual latent code in a feed-forward manner. Both quantitative and qualitative results demonstrate that our methods outperform competing approaches in terms of manipulation accuracy, i.e., how well the synthesized images match the textual descriptions, while ensuring highly realistic results and preserving features of the original image. We also demonstrate that our method can generalize to various domains including human faces, cats, and birds. 
 </span><br/>Posted 2022-06-20: Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters. İlker Kesen, Ozan Arkan Can, Erkut Erdem, Aykut Erdem, Deniz Yuret. June 20, 2022. Best paper at the 5th Multimodal Learning and Applications Workshop (<a href="https://mula-workshop.github.io/" target="_blank">MULA 2022</a>) in conjunction with <a href="https://cvpr2022.thecvf.com/" target="_blank">CVPR 2022</a>. (<a href="https://openaccess.thecvf.com/content/CVPR2022W/MULA/html/Kesen_Modulating_Bottom-Up_and_Top-Down_Visual_Processing_via_Language-Conditional_Filters_CVPRW_2022_paper.html" target="_blank">PDF</a>, <a href="https://arxiv.org/abs/2003.12739">arXiv:2003.12739</a>, <a href="https://drive.google.com/file/d/1O50xQ91hyRNB5vrxcf0A7-PQLOBOaag_/view" target="_blank">presentation video</a>). <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyBOSSFbvD0dQh-Inc_iYVicFBkaMLt96Q7jONNnAxjm2oocnrKWWKhlC4s4mKt3OF9MO0euO6-O2mWHVsMW-fEKlEvKqRwSnd73M4a7JkfE_x0ZTM-uxrL3Mu-mNrRb87tD6IF-_U-ETeguVXFVht2wYll6YJYRlfxjpcdtTqyNLlpRKhlQ/s2434/PastedGraphic-4.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="400" data-original-height="1440" data-original-width="2434" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyBOSSFbvD0dQh-Inc_iYVicFBkaMLt96Q7jONNnAxjm2oocnrKWWKhlC4s4mKt3OF9MO0euO6-O2mWHVsMW-fEKlEvKqRwSnd73M4a7JkfE_x0ZTM-uxrL3Mu-mNrRb87tD6IF-_U-ETeguVXFVht2wYll6YJYRlfxjpcdtTqyNLlpRKhlQ/s320/PastedGraphic-4.png"/></a></div> <span class="fullpost"><p><b> Abstract:</b> How to best integrate linguistic and perceptual processing in multi-modal tasks that involve language and vision is an important open problem. 
In this work, we argue that the common practice of using language in a top-down manner, to direct visual attention over high-level visual features, may not be optimal. We hypothesize that the use of language to also condition the bottom-up processing from pixels to high-level features can provide benefits to the overall performance. To support our claim, we propose a model for language-vision problems involving dense prediction, and perform experiments on two different multi-modal tasks: image segmentation from referring expressions and language-guided image colorization. We compare results where either one or both of the top-down and bottom-up visual branches are conditioned on language. Our experiments reveal that using language to control the filters for bottom-up visual processing in addition to top-down attention leads to better results on both tasks and achieves state-of-the-art performance. Our analysis of different word types in input expressions suggests that the bottom-up conditioning is especially helpful in the presence of low-level visual concepts like color.</p></span><br/>Posted 2022-06-09: Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models (BIG-bench). Srivastava et al. (442 authors). June 2022. <a href="https://arxiv.org/abs/2206.04615" target="_blank">arXiv:2206.04615</a> [cs.CL]. (<a href="https://github.com/google/BIG-bench" target="_blank">github</a>). <p><b>Abstract:</b> Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. 
In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.</p><span class="fullpost"></span><br/>Posted 2022-05-25: CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions. Tayfun Ates, M. 
Ateşoğlu, Çağatay Yiğit, Ilker Kesen, Mert Kobas, Erkut Erdem, Aykut Erdem, Tilbe Goksun, Deniz Yuret. May 2022. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2602–2627, Dublin, Ireland. Association for Computational Linguistics. (<a href="https://aclanthology.org/2022.findings-acl.205/" target="_blank">PDF</a>, <a href="https://openreview.net/forum?id=WOF1QKCD_2D" target="_blank">openreview</a>, <a href="https://arxiv.org/abs/2012.04293">arXiv:2012.04293</a>, <a href="https://docs.google.com/presentation/d/1iRoka0CIgH1l1rVSpkylNhzKqDsH9ht8SoDs60_HYqo/edit?usp=sharing" target="_blank">poster</a>). <p><iframe src="https://docs.google.com/presentation/d/e/2PACX-1vROiNQm-liu1JmB1weQlTCu9q1ACen0GuLavgAYuHW0XmJSSMGWG9rjwvRnoPFJeg82tgUy6tXJ8DKW/embed?start=false&loop=false&delayms=3000" frameborder="0" width="400" height="594" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <span class="fullpost"><p><b> Abstract:</b> Humans are able to perceive, understand and reason about causal events. Developing models with similar physical and causal understanding capabilities is a long-standing goal of artificial intelligence. As a step towards this direction, we introduce CRAFT, a new video question answering dataset that requires causal reasoning about physical forces and object interactions. It contains 58K video and question pairs that are generated from 10K videos from 20 different virtual environments, containing various objects in motion that interact with each other and the scene. Two question categories in CRAFT include previously studied descriptive and counterfactual questions. Additionally, inspired by the Force Dynamics Theory in cognitive linguistics, we introduce a new causal question category that involves understanding the causal interactions between objects through notions like cause, enable, and prevent. 
Our results show that even though the questions in CRAFT are easy for humans, the tested baseline models, including existing state-of-the-art methods, do not yet deal with the challenges posed in our benchmark.</p></span><br/>Posted 2022-05-25: Mukayese: Turkish NLP Strikes Back. Ali Safaya, Emirhan Kurtuluş, Arda Göktoğan, Deniz Yuret. May 2022. In Findings of the Association for Computational Linguistics: ACL 2022, pages 846–863, Dublin, Ireland. Association for Computational Linguistics. (<a href="https://aclanthology.org/2022.findings-acl.69/" target="_blank">PDF</a>, <a href="https://openreview.net/forum?id=l9H3-sPAnY" target="_blank">openreview</a>, <a href="https://arxiv.org/abs/2203.01215">arXiv:2203.01215</a>, <a href="https://docs.google.com/presentation/d/1ybCBn5hdM1Zgk1QI3pawuArZ1EiM4KDFzfNc2N-SMZE/edit?usp=sharing" target="_blank">poster</a>). <p><iframe src="https://docs.google.com/presentation/d/e/2PACX-1vTZyODOE9HeFrqLKlUA5UWm7Zia-zmrbAsNwMlugWXt3pqk5hE8ZDdS27hPHEhRWFMOHw2P9NPEjUEX/embed?start=false&loop=false&delayms=3000" frameborder="0" width="400" height="594" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <span class="fullpost"><p><b> Abstract:</b> Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present Mukayese, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. 
Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking. All datasets and baselines are available under: https://github.com/alisafaya/mukayese.</p></span><br/>Posted 2022-04-21: Türkçe Dil Deposu (TDD) Launch Video. <iframe width="400" height="222" src="https://www.youtube.com/embed/Z2ZCFH558NM" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><p>The launch video for TDD (<a href="https://tdd.ai">Turkish Data Depository</a>, <a href="https://twitter.com/turkcedildeposu">@turkcedildeposu</a>) is now online:</p><ul><li> panel: <a href="https://youtu.be/Z2ZCFH558NM?t=1130">https://youtu.be/Z2ZCFH558NM?t=1130</a><li> content: <a href="https://youtu.be/Z2ZCFH558NM?t=4779">https://youtu.be/Z2ZCFH558NM?t=4779</a><li> applications: <a href="https://youtu.be/Z2ZCFH558NM?t=7438">https://youtu.be/Z2ZCFH558NM?t=7438</a><li> medium post: <a href="https://medium.com/@tropensourceplatform/t%C3%BCrk%C3%A7e-do%C4%9Fal-dil-i%CC%87%C5%9Fleme-i%C3%A7in-b%C3%BCy%C3%BCk-at%C4%B1l%C4%B1m-c8af18d8f05c">https://medium.com</a><li> website: <a href="https://tdd.ai">https://tdd.ai</a><li> announcements: <a href="https://groups.google.com/g/tdd-group">https://groups.google.com/g/tdd-group</a></ul><span class="fullpost"> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-65ZW6bvGcFYOgeWPZKi0UBBY5fHLz5tbHpGYfPoieq-m0rcAV_cr1iMgPTOoKf115klcYBorp-tU51Zl5bmzMjGqNUDt-qn3ylQBDl4Wy-lr5CzLa6JitvPYGK1QjpUvj0x81922Q9gi3spNEspiPbo4oL89LK6hSXBOhvW2RkFjQQIi7g/s680/FQtsVd0XEAo68Ry.jpeg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="540" 
data-original-height="680" data-original-width="481" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-65ZW6bvGcFYOgeWPZKi0UBBY5fHLz5tbHpGYfPoieq-m0rcAV_cr1iMgPTOoKf115klcYBorp-tU51Zl5bmzMjGqNUDt-qn3ylQBDl4Wy-lr5CzLa6JitvPYGK1QjpUvj0x81922Q9gi3spNEspiPbo4oL89LK6hSXBOhvW2RkFjQQIi7g/s320/FQtsVd0XEAo68Ry.jpeg"/></a></div></span><br/>Posted 2022-04-16: Deniz Yuret's Math Problems. This is a list of elementary problems I like for one reason or another from various branches of mathematics. I cite the people I heard the problems from; they are not necessarily the originators. The unsolved flag just means that I have not yet solved the problem. Send me a solution if you find one. You can send me any interesting problems you find by adding a comment to this post. Also check out my other <a href="/search/label/Math">math posts</a>, especially <a href="/2005/06/probability-twisters.html">probability twisters</a>, <a href="/2005/06/unsolved-problems.html">unsolved elementary problems</a>, and <a href="/2005/06/math-links.html">links to other math sites</a>. <i>Last update: April 16, 2022.</i><span class="fullpost"> <ol reversed><li> (Alkan) There are four cities at the corners of a unit square. You are tasked with connecting them to each other using roads so that one can get from any of the four cities to any other. What is the shortest total length of road that achieves this? (Hint: the answer is less than 2 sqrt(2)). <li> (Kleinberg and Tardos) Alice and Bob have n numbers each for a total of 2n distinct numbers. Each can tell you the k'th smallest number they have but cannot see each other's numbers. What is the minimum number of queries you can ask to find the median of these 2n numbers? 
<li> (<a href="https://www.quantamagazine.org/20160313-mathematicians-discover-prime-conspiracy">Quanta Magazine</a>) Show that if Alice tosses a fair coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses. <li> (Cihan Baran) If we sample k times with replacement from the set {1, 2, ..., n} (all members picked with equal probability) what is the probability that at least one member will not be picked? <li> (Bertsekas and Tsitsiklis) Suppose that n people throw their hats in a box and then each picks one hat at random. (Each hat can be picked by only one person, and each assignment of hats to persons is equally likely.) What is the expected value of X, the number of people that get back their own hat? <li> (Paul Lockhart) Must there always be two people at a party who have the same number of friends there? <li> (Volkan Cirik) Consider a game played with an array of 2n random numbers. The first player can pick the number at either end of the array, the second player can pick from either end of the remaining numbers etc. The players take turns picking numbers from either end until no more numbers remain. Whoever has the highest total wins. Show that the first player can always win or draw. <li> (Ahmed Roman) Nathan and Abi are playing a game. Abi always goes first. The players take turns changing a positive integer to a smaller one and then passing the smaller number back to their opponent. On each move, a player may either subtract one from the integer or halve it, rounding down if necessary. Thus, from 28 the legal moves are to 27 or to 14; from 27, the legal moves are to 26 or to 13. The game ends when the integer reaches 0. The player who makes the last move wins. For example, if the starting integer is 15, Abi might move to 7, Nathan to 6, Abi to 3, Nathan to 2, Abi to 1, and now Nathan moves to 0 and wins. 
(However, in this sample game Abi could have played better!) <ol><li>Assuming both Nathan and Abi play according to the best possible strategy, who will win if the starting integer is 1000? 2000? Prove your answer. <li>As you might expect, for some starting integers Abi will win and for others Nathan will win. If we pick a starting integer at random from all the integers from 1 to n inclusive, we can consider the probability of Nathan winning. This probability will fluctuate as n increases, but what is its limit as n tends to infinity? Prove your answer. </ol><li> (Serdar Tasiran) We have n boxes; k of them contain a ball (k<=n) and the others are empty. We start opening the boxes in a random order and stop when we find a ball. What is the expected number of boxes we will open? <li> (Ertem Esiner) Two mathematicians meet an old friend who has two kids and ask her what their ages are. The mother writes the sum of the two ages on a piece of paper and gives it to the first mathematician, and gives the product of the two ages to the second mathematician. The mathematicians think for a minute and claim the information is not sufficient. The woman says "think again". At that moment the mathematician with the product figures out the answer. How old are the two kids? <li> (Ertem Esiner) Five pirates are trying to divide up 1000 gold pieces among themselves. They decide to take turns making offers. An offer by a pirate is accepted if at least half of the other pirates agree to it. Otherwise the pirate making the offer is killed, and the next one makes an offer. Each pirate is greedy, but none wants to die. If you are the first pirate, what offer do you make? <li> (Drake) A professor announces that there will be a surprise exam the following week, specifically that the students will not know what day the exam is going to take place. 
The students reason that the exam cannot take place on Friday, because by Thursday night they would know what day the exam is going to take place. If the exam cannot be on Friday it cannot be on Thursday either, because they would know by Wednesday night and so on. They finally decide the exam cannot take place and forget to study. The professor gives the exam on Wednesday to everybody's surprise. What went wrong? </li> <li> (Mackay) Fred rolls an unbiased six-sided die once per second, noting the occasions when the outcome is a six. <ol><li> What is the mean number of rolls from one six to the next six? </li><li> Between two rolls, the clock strikes one. What is the mean number of rolls until the next six? </li><li> Now think back before the clock struck. What is the mean number of rolls, going back in time, until the most recent six? </li><li> What is the mean number of rolls from the six before the clock struck to the next six? </li><li> Is your first answer different from your last answer? Explain. </li></ol></li> <li> (Deniz) You have two independent random variables between 0 and 1. How do you decide which one is more likely to be larger than the other? </li> <li> (Deniz) You have two arbitrary random variables between 0 and 1. How do you decide whether they are independent by looking at a plot of their joint density? </li> <li> (Dennis Eriksson) Find all solutions to the Diophantine equation: 1+2+3+...+n=m^2, where n and m are positive integers. </li> <li> (Feyz) You have a deck of n cards numbered from 1 to n, shuffled randomly. What is the probability that no card i ends up in the i-th position? </li> <li> (Sonny) Prove that there is a natural number n, for which 2^n starts with the digits 3141592, i.e., show that there is a number of the form 3141592.... which is a power of 2 (in base 10 representation). </li> <li> (Alkan) Let n, k be integers greater than 1. <ol><li> Show that 1/1 + 1/2 + 1/3 +...+ 1/n cannot be an integer. 
</li><li> Show that 1/k + 1/(k+1) + ... + 1/(k+n) cannot be an integer. </li></ol></li> <li> (Will, Minsky) These problems have something in common: <ol><li> A monk leaves to ascend to the temple on top of a mountain at 9am and arrives at 5pm. The next day he leaves the temple at 9am and arrives back at the foot of the mountain at 5pm. Prove that there is a time of day at which he was at the same point on the path on both days. </li><li> Prove that on a 2D earth, there exists a diameter such that the temperature at the endpoints is equal. </li><li> Prove that on a 3D earth, there exists a diameter such that the temperature and humidity of the endpoints are equal. </li><li> Does every convex closed curve in the plane contain all four vertices of some square? </li></ol></li> <li> (Ben) Two nice algorithm questions: Given a shuffled array of numbers from 1 to 10,000, find the three that are missing in one pass. Given an array of positive and negative integers, find the subarray with the highest sum in linear time. </li> <li> (Winston) Four people want to cross a bridge on a dark night. They can cross the bridge in 1, 2, 9, and 10 minutes respectively. The bridge can carry at most two people at a time. There is a single flashlight, and they need the flashlight to walk on the bridge. What is the shortest time for all four to get across? (This was apparently a popular Microsoft interview question). </li> <li> (Beril) You have a glass of tea and a glass of milk. You take a spoonful of milk and mix it with the tea. Then you take a spoonful of this mixture and mix it with the milk. Is there more milk in the tea or more tea in the milk at the end? </li> <li> (Mine) Construct a square from: (a) Three identical squares, (b) Any two squares, (c) A rectangle. (d) Divide a square into any given two squares with the same total area. (e) Divide a circle into 6, 7, 8, and 10 equal pie-slices. </li> <li> (IMO practice) m+n people are standing in a movie line. 
m people have 5 dollar bills, n people have 10 dollar bills. The movie is 5 dollars. The cashier opens with no money. It will close if it does not have enough change to give one person. How many possible lines are there that will get through without closing the cashier? <b>Note: </b>(Deniz, Mar 10, 1998) I just discovered that this problem is equivalent to finding the number of full binary trees with m leaves when n=m-1. A full binary tree is a tree where each node has 0 or 2 children. The number of binary trees is equivalent to the number of shift-reduce sequences that parse them. For such a sequence to be valid the number of shifts needs to always be ahead of the number of reduces, which turns this into our movie problem. The binary tree problem can also be solved using a generating function and the relation b[n] = sum[k=1..n-1](b[k] b[n-k]). The movie problem can be solved by using random walks and the reflection principle. The two solutions seem to give different answers but they turn out to be equivalent. This constitutes an indirect proof of the following combinatorial identity: (2n-1)!! = (2n)!/(n! 2^n). Everything is related to everything else in math :-) </li> <li> (Oguz) Find a function f on real numbers such that f(f(x)) = -x. </li> <li> (Boris) You meet many women in your life. After meeting each one, you decide how good she is and whether you want to marry her. If you decide to marry her, you lose your chance with future candidates. If you decide to move on, you lose your chance with her. Assuming you will meet at most n women, find the optimum strategy for marrying the best bride. <font size=-1> (The Azeri mathematician Gussein-Zade is apparently the first one to solve this problem.) </font> </li> <li> (Alkan) A small rectangle is cut out of a large rectangle. Find a line that divides the remaining figure into two equal areas using an unmarked ruler. </li> <li> (IMO) Let A be a set of ten two-digit integers. 
Prove that one can always find two subsets of A with the same sum. </li> <li> (IMO) 17 people correspond with each other. Each pair discusses one of three possible topics. Prove that there are three people that discuss the same topic with each other. </li> <li> (Alkan) Five couples meet at a party. Everyone starts shaking hands with everyone else except their partners. At some point the host stops them and asks how many handshakes each had. Everyone gives a different number. How many hands did the host's wife shake? </li> <li> (IMO-75/4) When 4444<sup>4444</sup> is written in decimal notation, the sum of its digits is A. Let B be the sum of the digits of A. Find the sum of the digits of B. (A and B are written in decimal notation.) </li> <li> (Murat Fadiloglu) What is the probability of two randomly selected integers being mutually prime? </li> <li> (Alkan) An old lady buys a plane ticket to visit her son. She goes to the airport and people let her board first. Since she can't read her seat number, she sits in a random seat. The rest of the passengers sit in their own seats, unless a seat is occupied, in which case they randomly choose one of the empty seats. What is the probability that the last passenger will sit in his own seat? </li> <li> (Alkan) sqrt(1 + 2*sqrt(1 + 3*sqrt(1 + 4*sqrt(1 + 5*sqrt(1 + ... ))...) = ? </li> <li> (Rota) Given a sequence of (n<sup>2</sup> + 1) distinct integers, show that it is possible to find a subsequence of (n+1) entries which is increasing or decreasing. </li> <li> (Minkowski) Consider a two dimensional lattice with grid points unit distance apart. Show that a convex shape that is symmetric around a grid point has to include at least three grid points if its area is 4. </li> <li> (Science Museum, unsolved) Which rectangles can be divided into unequal squares? </li> <li> (Alkan) Consider permutations of an array which contains n copies of each integer from 1 to n. 
Two permutations are defined as orthogonal if their corresponding elements form distinct pairs. What is the maximum number of permutations such that any two are orthogonal? For example, here is a possible set of mutually orthogonal permutations for n=3: {111222333, 123123123, 123231312, 123312231}. </li> <li> (Ian) My new favorite algorithm: Find an algorithm that discovers if a linked list has a cycle in it using bounded memory. </li> <li> (Uttrash) You are in prison and they give you n red balls, n green balls and two boxes. You are to place the balls in the two boxes in any way you like. The next day they will pick a ball from one of the boxes, and if it is green you will be set free. How do you arrange the balls? </li> <li> (Will) You randomly throw k balls into n bins. What is the expected number of occupied bins? </li> <li> (Will) You randomly throw k points on the unit interval. What is the expected length of the longest segment? </li> <li> (Michael, unsolved) You distribute 100 balls to 10 buckets. What is the expected value of the number of balls in the bucket with the most balls? </li> <li> (Lon) Draw 2 circles, 1 completely inside the other (but not necessarily concentric). What is the probability that a line intersecting the outer circle also intersects the inner circle? Now, do the same with rectangles. </li> <li> (Thurston and Conway) An angel is stuck on an infinite sheet of graph paper; he can hop from square to adjacent square. Every time the angel hops, the devil can knock out any square, so the angel can't ever go there. Can the devil trap the angel? What if the graph paper is a positive quadrant (i.e. is bounded on two sides)? </li> <li> This is not really math, but here are my two favorite algorithms: (1) Find an algorithm for perfect shuffling of an array. (2) Find an algorithm that will pick a perfectly random element from a list in one pass without knowing the size of the list beforehand. 
</li> <li> (Alkan) Given two points find the point midway between them using only a compass (no ruler). </li> <li> (Alkan) You are sitting at point [0,0] and looking to the right into a tunnel bounded by the curves y=1/x and y=-1/x. The walls of the tunnel are reflecting. Prove that if you send a light beam into the tunnel in any direction other than straight to the right, the beam will be reflected back towards the left. </li> <li> (Deniz) Let x be a random variable which can take positive integer values. P(x)=1/2<sup>x</sup>. We draw n random elements from this distribution. What is the probability that the (n+1)st element will be different from the first n? </li> <li> (Alkan) Let A be the set of all rectangles that have one integer side. Prove that any rectangle constructed by concatenating rectangles from A will also be a member of A. </li> <li> (Neal) Take a randomly shuffled deck. Open the cards one by one. At one point stop and predict that the next card is red. Is there a strategy that has more than 1/2 chance? </li> <li> Pick two random points in the unit line segment. What is the expected distance between them? </li> <li> Pick two random points in the unit circle. What is the expected distance between them? </li> <li> (Umit) Suspend a rope from two points on the ceiling. What shape does it take? </li> <li> (Bernoulli brothers) A ball is rolling from point A to a lower point B. What is the ideal curve for the path between A and B that minimizes the travel time? </li> <li> (Alkan) There are 100 light poles with switches on a street. In the beginning all lights are off. One person goes through pole number 1, 2, 3, ... and flips the switches. Then he goes back and goes through 2, 4, 6, ... and flips the switches. Then he goes back and goes through 3, 6, 9, ... and flips the switches. So in the n'th round he flips the multiples of n. Which lights are on after 100 rounds? </li> <li> (Deniz) There is a set A of n0 elements, and we randomly pick a subset B of n1 elements. 
We know that r0 of the elements in A were red. We are interested in finding out the number of red elements in B, r1. To find out we start picking random elements from B. We pick n2 elements, and r2 of them turn out to be red. Now what is the best estimate for r1? </li> <li> (Minsky) I bring you three flipped cups and tell you there is gold under one of them. Furthermore, each cup has a number giving the probability that the gold is under that one. You immediately go to the one with highest probability. I tell you that you have amnesia and I may have tried this on you a million times. What is your best strategy? </li> <li> (Minsky) An ant leaves a repeated binary pattern behind as it walks through the desert. What is the length of the shortest pattern that would let you distinguish which way the ant was going? </li> <li> (Feyzu) Two points are chosen at random on a line AB, each point being chosen according to the uniform distribution on AB, and the choices being made independently of each other. The line AB may now be regarded as divided into three parts. What is the probability that they may be made into a triangle? </li> <li> (IMO practice) The entries for a competition are locked in a safe. There are 11 judges. We would like them to be able to open the safe when more than half get together. How many locks / keys do we need? </li> <li> (IMO practice) Given three parallel lines, show that an equilateral triangle can always be constructed with a vertex on each line. </li> <li> (IMO-72/6) Given four distinct parallel planes, prove that there exists a regular tetrahedron with a vertex on each plane. </li> <li> (Umit) A method for two people to share a pie fairly is to let one cut and the other pick. Generalize this method to n people. </li> <li> You are making a random walk on an n-dimensional grid. What is the probability that you will ever return to the origin? (Hint: It is 1 for 1-D and 2-D! It is 0.3405 for 3-D). 
</li> <li> (Ivanie) A rabbit hopping up the stairs can hop either one or two steps at a time. How many different ways can it climb n steps? </li> <li> Show that if you cut off two opposite corner squares of a chess board, you cannot cover the rest with dominoes. </li> <li> Show that n squares with total area less than 1/2 can always be fit into a unit square (non-overlapping). </li> <li> Show that n squares with total area greater than 3 can always cover the surface of the unit square (non-overlapping). </li> <li> You color all points of a plane with three colors. Show that I can always find two points of the same color that are a given distance apart. </li> <li> You color all points on an equilateral triangle with two colors. I try to find a right triangle with its vertices on the edges of your triangle and all vertices having the same color. Can you find a coloring that prevents this? </li> <li> How many 1's are there in the digits of numbers from 1 to 1 million? (one minute time limit). </li> <li> (Michael) There is a piece of candy on every node of a binary tree. Find the shortest path through the binary tree that lets you collect all of the candies. </li> <li> (Ihsan) Two men, x distance apart, start walking toward each other with speed v. At that instant a fly starts flying from one man's nose to the other's with speed 2v. The fly keeps flying back and forth between the two noses until the men meet. How much distance has the fly flown when they meet? (There is an easy way and a hard way to solve this). </li> <li> (Ihsan) A coin is flipped until the first head appears. If you get a head in n flips you win $2<sup>n</sup>. How much are you willing to pay to play this game? </li> <li> (Bilim ve Teknik) You need to paint the area under the curve 1/x. How can you do it with a finite amount of paint? 
</li> </ol></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com1tag:blogger.com,1999:blog-8540876.post-7419462242836700372022-03-22T11:16:00.005+03:002022-04-14T10:26:01.832+03:00Turkish Data Depository (TDD)<iframe width="400" height="222" src="https://www.youtube.com/embed/o6-nK5Bg6F8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>Talk by Ali Safaya and Taner Sezer on the Turkish Data Depository (TDD) project, which aims to collect Turkish NLP resources such as corpora, labeled data, model weights under <a href="https://tdd.ai" target="_blank">https://tdd.ai</a>. You can register on the website and download resources, sign up for the <a href="https://groups.google.com/g/tdd-group" target="_blank">mailing list</a> to get updates. <span class="fullpost"></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-4955180797715852332022-03-14T17:55:00.001+03:002022-06-21T17:59:01.030+03:00Machine learning in and out of equilibriumMichael Hinczewski, Shishir Adhikari, Alkan Kabakcioglu, Alexander Strang, and Deniz Yuret. March 2022. <a href="https://meetings.aps.org/Meeting/MAR22/Session/F09.6" target="_blank">Bulletin of the American Physical Society</a>. <p><b>Abstract:</b>The algorithms used to train neural networks, like stochastic gradient descent (SGD), have close parallels to natural processes that navigate a high-dimensional parameter space—for example protein folding or evolution. Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels in a single, unified framework. We focus in particular on the stationary state of the system in the long-time limit. 
In contrast to its biophysical analogues, conventional SGD leads to a nonequilibrium stationary state exhibiting persistent currents in the space of network parameters. The effective loss landscape that determines the shape of this stationary distribution depends sensitively on training details, i.e. the choice to minibatch with or without replacement. We also demonstrate that the state satisfies the integral fluctuation theorem, a nonequilibrium generalization of the second law of thermodynamics. Finally, we introduce an alternative "thermalized" SGD procedure, designed to achieve an equilibrium stationary state. Deployed as a secondary training step, after conventional SGD has converged, thermalization is an efficient method to implement Bayesian machine learning, allowing us to estimate the posterior distribution of network predictions.</p><span class="fullpost"></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-7584328529670195692022-02-11T13:01:00.031+03:002022-08-18T10:27:23.728+03:00Osman Mutlu, M.S. 
2022<table><tr valign="top"> <td> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhisSBOe_Hvc-rzsge9slswmQeshGg-cYzXcM2ySQD-tWmxxVf3tQUC9UpAJy8Gxiqz6AtWgT03xl9dj_Rvu-0pcPTiNfHB6WzGTD9sR0yoIIodOUrYy8kaSWeEu59JObjEOGBlNqvqJBwBQ65tEmCDEL1A0hlag6rTHq1la1irPfsnn7nIBA/s326/osman-2-e1522243109846.jpg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" height="120" data-original-height="326" data-original-width="273" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhisSBOe_Hvc-rzsge9slswmQeshGg-cYzXcM2ySQD-tWmxxVf3tQUC9UpAJy8Gxiqz6AtWgT03xl9dj_Rvu-0pcPTiNfHB6WzGTD9sR0yoIIodOUrYy8kaSWeEu59JObjEOGBlNqvqJBwBQ65tEmCDEL1A0hlag6rTHq1la1irPfsnn7nIBA/s200/osman-2-e1522243109846.jpg"/></a></div> </td> <td><br/> <b>Current position</b>: Project engineer, Koç University (<a href="https://www.linkedin.com/in/osman-mutlu-82944a92/">LinkedIn</a>, <a href="mailto:osmanmutlu92@gmail.com">Email</a>) <br/><b>MS Thesis</b>: Utilizing coarse-grained data in low-data settings for event extraction. February 2022. (<a href="https://drive.google.com/file/d/1AXXcjN69fM5P_PcwU5HAJ2syCilHtLyv/view?usp=sharing">PDF</a>, <a href="https://docs.google.com/presentation/d/1IYuZs6PWfKs3HXfdDVOQM96H23OSron3cK08IDsOe84/edit?usp=sharing">Presentation</a>, <a href="https://scholar.google.com.tr/citations?user=ZSut9GUAAAAJ&hl=en">Publications</a>, <a href="https://github.com/OsmanMutlu/">Code</a>). </td> </tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/presentation/d/1IYuZs6PWfKs3HXfdDVOQM96H23OSron3cK08IDsOe84/embed?start=false&loop=false&delayms=3000" frameborder="0" width="420" height="275" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <b>Thesis Abstract:</b>Annotating text data for event information extraction systems is hard, expensive, and error-prone. 
We investigate the feasibility of integrating coarse-grained data (document or sentence labels), which is far more feasible to obtain, instead of annotating more documents. We utilize a multi-task model with two auxiliary tasks, document and sentence binary classification, in addition to the main task of token classification. We perform a series of experiments with varying data regimes for the aforementioned integration. Results show that while introducing extra coarse-grained data offers greater improvement and robustness, a gain is still possible with only the addition of negative documents that have no information on any event. </span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-87457883982498112182022-01-27T14:34:00.005+03:002022-02-08T14:38:44.660+03:00Aynı dilin yolcusu: insan ve yapay zekâ<iframe width="400" height="222" src="https://www.youtube.com/embed/0TjkiWNgxCk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><a href="https://www.iksv.org/" target="_blank">İstanbul Kültür ve Sanat Vakfı (İKSV)</a>, <a href="https://www.iksv.org/tr/haber/iksv-den-yeni-bir-podcast-serisi-dile-kolay" target="_blank">Dile Kolay Podcast Serisi</a><span class="fullpost"></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-11383600732266789552021-10-22T09:25:00.011+03:002021-11-07T09:31:15.342+03:00KUIS AI Tanıtımı, TR AI Week 2021<iframe width="400" height="222" src="https://www.youtube.com/embed/0JQ6n0VdjkE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><a href="https://turkiye.ai/events/tr-ai-week-2021/" target="_blank">TR AI Week 2021</a>, <a 
href="https://docs.google.com/presentation/d/1Z5FCF0b_xJnCWKOBkTbWZDlvrND5jfLYgxjs4z1DXuU/edit?usp=sharing">Presentation link</a>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-82455721472131799052021-06-29T11:51:00.007+03:002022-03-23T15:06:59.841+03:00Üçüncü Yapay Zeka Devrimi (MSTAS 2021)<iframe width="400" height="222" src="https://www.youtube.com/embed/bxuajW_QhKE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><a href="https://mstas2021.itu.edu.tr/" target="_blank">Mimarlıkta Sayısal Tasarım XV. Ulusal Sempozyumu</a>, <a href="https://docs.google.com/presentation/d/1cc4oKb3n5dskwLHYO1fnr9XvyPg3fuc6xV4X_U-K-3Q/edit?usp=sharing">Presentation link</a>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-44097793691351862012021-04-29T11:35:00.093+03:002022-08-18T10:32:49.472+03:00Ozan Arkan Can, Ph.D. 
2021<table><tr valign="top"><td> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZPEGUMIH2OwVlLoq1znxR6bbE1RRhM7_H9CZqo7kEB4cJ1EHeFjUMFIu704FMtAGLwZpq9Nz6z4ZkMJfRJqeQ59ljGdKQjjb9CWgokIcmZMTwPSeEzerYD9d-9zsq2p4cMD7mISy0AZZWixEuPI1X2AYcKhLq9ufMCa_z-l4M5A4ZDbpC9w/s500/ozanarkancan.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="120" data-original-height="500" data-original-width="500" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZPEGUMIH2OwVlLoq1znxR6bbE1RRhM7_H9CZqo7kEB4cJ1EHeFjUMFIu704FMtAGLwZpq9Nz6z4ZkMJfRJqeQ59ljGdKQjjb9CWgokIcmZMTwPSeEzerYD9d-9zsq2p4cMD7mISy0AZZWixEuPI1X2AYcKhLq9ufMCa_z-l4M5A4ZDbpC9w/s200/ozanarkancan.png"/></a></div> </td><td><br/><b>Current position</b>: Applied Scientist - Amazon Search - Berlin (<a href="https://ozanarkancan.github.io">Homepage</a>, <a href="https://www.linkedin.com/in/ozan-arkan-can-69aba876">LinkedIn</a>, <a href="mailto:can.ozanarkan@gmail.com">Email</a>) <br/><b>PhD Thesis</b>: Cognitively-Inspired Deep Learning Approaches for Grounded Language Learning. April 2021. (<a href="https://drive.google.com/file/d/1Eit-9LRLhAzVr6a1hcWbwtOHVY4Ptw6_/view?usp=sharing">PDF</a>, <a href="https://docs.google.com/presentation/d/1T3aJAA0GBpcnRNCfQRiYCXVFTWHHDJgISox2nfJSAlY/edit?usp=sharing">Presentation</a>, <a href="https://scholar.google.com/citations?user=IN-CnBUAAAAJ&hl=en&oi=ao">Publications</a>, <a href="https://github.com/ozanarkancan?tab=repositories">Code</a>). 
</td></tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/presentation/d/1T3aJAA0GBpcnRNCfQRiYCXVFTWHHDJgISox2nfJSAlY/embed?start=false&loop=false&delayms=3000" frameborder="0" width="420" height="275" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe></p> <b>Thesis Abstract:</b><p>Designing machines that can perceive the surrounding world and interact with us using human language is one of the long-standing goals of artificial intelligence. Although tremendous progress has been made in modeling linguistic meaning computationally, how to best integrate linguistic and perceptual processing in multi-modal tasks is a significant open problem. This thesis explores several cognitively-inspired neural architectures that consider the different aspects of the language’s role in cognition, visual perception, and task execution. Proposed models incorporate design choices motivated by cognitive science studies and are based on the common patterns in vision-language tasks.</p> <p>We begin by presenting an encoder-decoder network with a novel channel-based perceptual attention mechanism and its application to the navigational instruction following task. The perceptual processing component of this architecture is designed to focus on individual objects and properties within the environment using the language priors while preserving the spatial relations. To benefit from the designed component, we also propose an improved agent-centric world representation to allow the model to reason over the perception spatially.</p> <p>Next, we explore the usage of the Neural Module Networks approach in a real robotic system for the first time. Since collecting large-scale real world data is labor-intensive and expensive, the system learns the language grounding on simulated data and the perceptual representation separately to overcome the scarce data problem. 
However, because of the separate learning processes, inconsistencies arise between the user’s and robot’s world models. To overcome this, we propose a Bayesian learning approach that uses the implicit information in the instruction to update the perceptual belief to align what the user sees and what the robot perceives.</p> <p>In both parts, we demonstrate systems that use the high-level effect of language on visual processing, which operates on high-level representations. In addition to this, in the last part, we investigate the effect of language on low-level visual processing. To this end, we condition one or both low-level and high-level visual processing branches of a backbone architecture on language using language filters and apply these models to the image segmentation from referring expression task. Experiments show that modulating both low-level and high-level visual processing with language significantly improves the language grounding performance.</p> </span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-59304495205432646762021-02-20T06:35:00.001+03:002021-02-20T06:59:39.290+03:00Yapay Zekâ Geleceğimizi Nasıl Şekillendirecek: "Bana Yarından Bahseder misin?" 
Podcast<div class="separator" style="clear: both;"><a href="http://spoti.fi/37u8AAI" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="320" data-original-height="1080" data-original-width="1080" src="https://1.bp.blogspot.com/-KFhX2smpHwQ/YDCCOJ16lpI/AAAAAAAAuM4/olUFjSdPIhE98wqXu67Rjr4fAiq-EN-dACLcBGAsYHQ/s320/image00006-min.png"/></a></div><a href="http://spoti.fi/37u8AAI">spoti.fi/37u8AAI</a><span class="fullpost"></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-27980021195452714882021-02-01T07:02:00.003+03:002021-02-20T07:08:34.554+03:00Programming Is Learned by Practice: Kumbara Dergisi Interview İş Bankası, in collaboration with Koç University, has established the “Artificial Intelligence Application and Research Center” to contribute to our country's scientific and academic activities. The center aims to carry out advanced work in the field of artificial intelligence. As Kumbara Dergisi, we asked Prof. Dr. Deniz Yuret, Director of the Koç University İş Bankası Artificial Intelligence Application and Research Center, about the field of artificial intelligence and about the center. <p><a href="https://kumbaradergisi.com/icerikler/yapay-zeka-merkezi-roportaj">Full Interview</a></p>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-22889726860279965892020-12-31T14:41:00.003+03:002021-01-03T14:43:06.684+03:00Q&AI Podcast: KUIS AI, A Researcher's Sanctum<iframe src="https://anchor.fm/q-ai/embed/episodes/KUIS-AI-A-Researchers-Sanctum-eodojs/a-a48b79d" height="102px" width="400px" frameborder="0" scrolling="no"></iframe><span class="fullpost"></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-17263039347317072492020-12-29T13:03:00.003+03:002022-08-18T10:37:13.065+03:00Ulaş Sert, M.S. 
2020<table><tr valign="top"><td> <div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQS4dvIyc5k70aOD_81GO86YsEYCsuG744D89OtyS19gB2fwgMxYT2RY_6kdP5KXHfaXoUUrk8Xs11Cqkx0MG9fEa4IsemSyLJZLVkRosIQKWjAFMnIiOKObvOmSXerSK5fcp8PVH9l3NlTPB3kpHe3NJmCGCGkiu-vgsjGtK4Ss6M3RKaoQ/s200/ulassert.jpeg" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="120" data-original-height="200" data-original-width="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQS4dvIyc5k70aOD_81GO86YsEYCsuG744D89OtyS19gB2fwgMxYT2RY_6kdP5KXHfaXoUUrk8Xs11Cqkx0MG9fEa4IsemSyLJZLVkRosIQKWjAFMnIiOKObvOmSXerSK5fcp8PVH9l3NlTPB3kpHe3NJmCGCGkiu-vgsjGtK4Ss6M3RKaoQ/s200/ulassert.jpeg"/></a></div> </td><td><br/><b>Contact info</b>: <a href="https://www.linkedin.com/in/ulas-sert/?originalSubdomain=tr">LinkedIn</a><br/><b>M.S. Thesis</b>: Training a Bridge Bidding Agent using Minimal Feature Engineering and Deep Reinforcement Learning, Koç University, Department of Computer Engineering. December 2020. (<a href="https://drive.google.com/file/d/1fwSS97X4K4WK19Mp-otSXkvuBMuTLIGW/view?usp=sharing">PDF</a>, <a href="https://drive.google.com/file/d/1_K-GEGxQ7h6tqQoFr_HApSvc0nGQujB-/view?usp=sharing">Presentation</a>, <a href="https://github.com/Sophylax/BridgeBidding.jl">Code</a>). </td></tr></table><span class="fullpost"><p><iframe src="https://docs.google.com/viewer?srcid=1_K-GEGxQ7h6tqQoFr_HApSvc0nGQujB-&pid=explorer&efh=false&a=v&chrome=false&embedded=true" width="410px" height="242px"></iframe></p><p><b>Thesis Abstract:</b><br/>The game of contract bridge, or just bridge, is a four-player imperfect information card game where two partnerships of two players compete against each other. It has two main phases: bidding and play. While computer players approached human-level performance in the playing phase two decades ago, bidding is still a very challenging problem. 
This makes bridge one of the last popular games where computers still lag behind expert human performance. During bidding, players only know their own cards while participating in a public auction. Performing well in this phase requires the players to figure out how to communicate with their partners using the limited vocabulary of bids to decide on a joint contract. This communication is restricted by the strict ordering of legal bids and can be disrupted by bids made by the opposing partnership. In this thesis, we experiment with several novel architectures with minimal feature engineering and evaluate them using supervised training over a data set of expert-level human games. After that, we further study different forms of deep reinforcement learning to refine the resulting model through simulated gameplay. Lastly, we propose an oracle evaluation metric that can measure the quality of any bidding sequence with respect to the game-theoretical optimum. </p></span>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0tag:blogger.com,1999:blog-8540876.post-29482062077574692562020-12-24T10:02:00.022+03:002021-01-18T12:30:29.496+03:00The Third Deep Learning Revolution: A brief history of the last 50 years of AI research<iframe width="400" height="222" src="https://www.youtube.com/embed/7osulihDNrA" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><a href="https://glodem.ku.edu.tr/events/third-deep-learning-revolution" target="_blank">Glodem AI & CSS Seminar Series</a>, <a href="https://docs.google.com/presentation/d/17VV4aM_jFXe3J5_nK6F2aExKeFgw-geN-8rDf1t37M8/edit?usp=sharing">Link to slides</a>Deniz Yurethttp://www.blogger.com/profile/00578023665603100985noreply@blogger.com0