My main research area is natural language processing, and my current focus is on grounded natural language learning systems based on neural network models trained end-to-end. To accelerate my research, I develop and maintain Knet, the Koç University deep learning framework, which has become the tool of choice for hundreds of researchers and students across the globe (937 GitHub stars as of July 2019).
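To give a flavor of Knet's design, here is a minimal linear-regression training loop in the style of the Knet tutorials; models and losses are ordinary Julia functions, and grad (from the AutoGrad package that Knet builds on) turns a loss function into a gradient function. The data and dimensions below are made up for illustration.

```julia
using Knet   # exports grad() from AutoGrad

predict(w, x) = w[1] * x .+ w[2]                    # the model is a plain function
loss(w, x, y) = sum(abs2, y .- predict(w, x)) / size(x, 2)
lossgrad = grad(loss)                               # returns dloss/dw

# Synthetic data: y = 3x + 1 plus noise.
x = randn(1, 100)
y = 3 .* x .+ 1 .+ 0.1 .* randn(1, 100)

w = Any[0.1 * randn(1, 1), 0.0]                     # weight matrix and bias
for epoch in 1:100
    dw = lossgrad(w, x, y)
    for i in 1:length(w)
        w[i] -= 0.1 * dw[i]                         # vanilla SGD update
    end
end
```

The same pattern scales to deep networks by making the model function more elaborate and moving the arrays to the GPU.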
General AI Research

In addition to natural language processing, other areas of artificial intelligence I have studied include genetic algorithms and optimization [1, 2, 3, 4], game search [5, 6], computational economics and finance [7, 8, 9, 10, 11, 12, 13], computational biology [14, 15, 16, 17, 18, 19], multimedia processing [20, 21], and machine learning algorithms and frameworks [22, 23, 24, 25, 26, 27].
Rule-based NLP

My natural language research has spanned the three eras of rule-based, statistical, and neural systems. I started with Boris Katz's rule-based natural language question answering system START (the longest-running NLP system on the Internet) and developed its Omnibase component, which allowed access to structured websites like IMDb and the World Factbook through a uniform interface [28, 29, 30]. I later co-founded a company, InQuira Inc., which commercialized question answering technology for customer self-service applications at large companies such as Apple and eBay [31, 32].
Statistical NLP

Natural languages are suffused with ambiguities and exceptions, which make the development of robust rule-based systems difficult. Statistical models gradually replaced rule-based systems in the 1990s and 2000s as a more robust alternative. During this period I developed statistical models for supervised and unsupervised dependency parsing [33, 34], word sense disambiguation and induction [35, 36, 37, 38, 39], child word category acquisition [40, 41, 42, 43], morphological disambiguation [44, 45, 46], semantic role labeling [47], statistical language modeling [48, 49], and machine translation [50, 51, 52, 53].
A major portion of this work focused on unsupervised models, because the large amounts of labeled data required for supervised models are expensive to collect, difficult to get agreement on, and not required by infants learning language. Nevertheless, supervised models play an important role in today's NLP applications, so to help create labeled datasets and perform evaluations that push the state of the art forward, I co-organized the CoNLL-2007 Shared Task on Dependency Parsing [54], the SemEval-2007 Shared Task on Classification of Semantic Relations between Nominals [55, 56], the SemEval-2010 Shared Task on Parser Evaluation Using Textual Entailments [57, 58], and the SemEval-2012 and SemEval-2013 Semantic Evaluation Exercises [59, 60].
NLP and Neural Networks

In the 2010s, neural-network-based natural language processing systems started catching up in performance with their statistical counterparts. More importantly, the layers of (morphological, syntactic, semantic) representations designed by linguists and used to train statistical models have been gradually replaced with features automatically learned by deep models from data (a similar transition took place in computer vision, where hand-designed HOG/SIFT-type features have been replaced with convolutional layers trained from data). Feature engineering no longer plays the central role it did with statistical models: discrete features are replaced by continuous embedding vectors, and hand-designed feature combinations or kernels are replaced by adaptive basis functions automatically learned by neural networks. For the first time, it became possible to train models end-to-end: neural machine translation and image captioning systems, for example, are trained on nothing but example input-output pairs.
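To make the contrast concrete, here is a toy sketch (dimensions made up) of the shift from discrete features to learned embeddings: the word id that once selected a sparse one-hot indicator now simply indexes a column of a dense matrix that is trained along with the rest of the model.

```julia
V, D = 10_000, 300    # vocabulary size and embedding dimension (illustrative)

# Statistical era: a word is a sparse one-hot indicator feature.
onehot(i) = (v = zeros(V); v[i] = 1.0; v)

# Neural era: a word is a column of a trainable embedding matrix.
E = 0.01 * randn(D, V)
embed(i) = E[:, i]    # equivalent to E * onehot(i), but O(D) instead of O(DV)

x = embed(42)         # a 300-dimensional continuous vector for word id 42
```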
During this period, I developed a novel continuous representation of word context based on the distribution of possible substitutes for a word rather than its neighbors. My students and I showed that this “paradigmatic” word context representation generalized better and improved the state of the art on problems such as unsupervised part-of-speech induction [61, 62, 63], unsupervised word sense induction [64], and semantic word similarity [65]. Using neural models with little feature engineering, we also developed a named entity recognizer that achieves state-of-the-art results in 7 languages [66] and a dependency parser that parsed 81 treebanks in 49 languages and ranked 7th out of the 33 systems participating in the CoNLL-2017 UD Shared Task [67]. Neural models generally require more data than statistical models, which poses a problem in low-resource settings. We showed one way to mitigate this problem using transfer learning: a low-resource Turkish-English machine translation model performs 50% better when initialized with weights from a high-resource French-English model than with random initialization [68]. Another disadvantage of neural models is their lack of interpretability, which we tried to address in [69] by discovering hidden units that count various features in a sequence-to-sequence RNN model.
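As a toy illustration of the substitute-distribution idea (using a bigram model for readability; the papers compute substitutes from a 4-gram language model with the FASTSUBS algorithm [49]), the context of a position is represented by the probability of each vocabulary word filling that slot given its neighbors:

```julia
# Toy bigram model: p[(a, b)] stands for P(b | a); the numbers are made up.
vocab = ["the", "dog", "cat", "barked", "meowed"]
p = Dict(("the", "dog") => 0.5, ("the", "cat") => 0.5,
         ("dog", "barked") => 0.9, ("dog", "meowed") => 0.1,
         ("cat", "barked") => 0.1, ("cat", "meowed") => 0.9)
pb(a, b) = get(p, (a, b), 1e-6)   # smoothed lookup

# Substitute distribution for the slot between `left` and `right`:
# P(sub | left, right) ∝ P(sub | left) * P(right | sub).
function substitutes(left, right)
    scores = [pb(left, s) * pb(s, right) for s in vocab]
    return scores ./ sum(scores)
end

substitutes("the", "barked")   # puts almost all of its mass on "dog"
```

Two positions with similar substitute distributions behave similarly regardless of which words actually occur there, which is what makes the representation paradigmatic rather than syntagmatic.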
I am most excited about the potential of deep neural network models for grounded language learning, i.e. learning the meanings of words, phrases, and sentences by observing natural interactions. In a preliminary study, we showed that a neural model can learn to follow instructions for arranging blocks on a table-top by observing humans giving and following instructions [70]. I am currently running two funded projects to explore this topic further [71, 72], and our ongoing studies are promising [73]. I suspect the robust natural language understanding systems of the future will be trained end-to-end rather than hand-engineered.
Sample Papers

The following papers are a representative sample of my work:
- In “The Noisy Channel Model for Unsupervised Word Sense Disambiguation” [39], we use plain text and WordNet sense frequencies to build a generative probabilistic model for WSD without sense-tagged data.
- In “Learning Syntactic Categories Using Paradigmatic Representations of Word Context” [62], we show how to construct a context vector based on the substitute distribution of a word (the toy sketch above illustrates the idea) and obtain state-of-the-art results on part-of-speech induction.
- In “Transfer Learning for Low-Resource Neural Machine Translation” [68], we show how a neural machine translation model for a low-resource language pair can be significantly improved by borrowing parameters from a model trained on a high-resource language pair (see the schematic sketch after this list).
- In “Natural Language Communication with Robots” [70], we propose a grounded language learning task of arranging blocks on a table-top in response to natural language instructions and train a baseline model end-to-end using data collected from Amazon Mechanical Turk.
- In “Knet: beginning deep learning with 100 lines of Julia” [24], I show how a high-level programming language can be used as a deep learning framework when supported with automatic differentiation and GPU kernels.
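The transfer recipe of [68], sketched schematically below (parameter names and sizes are hypothetical, not taken from the paper's code): the child model starts from the trained parent parameters, and only the source-side embeddings, which have no meaningful correspondence between the two source languages, are re-initialized before training continues on the low-resource pair.

```julia
# Hypothetical parameter layout for an encoder-decoder translation model;
# the names and (small) sizes are illustrative only.
parent = Dict(
    :src_embed => randn(64, 10_000),   # French source embeddings
    :encoder   => randn(128, 128),
    :decoder   => randn(128, 128),
    :tgt_embed => randn(64, 10_000),   # English target embeddings
)

# Child (Turkish-English) model: inherit everything learned on French-English...
child = deepcopy(parent)

# ...except the source embeddings, which are re-initialized for the Turkish
# vocabulary before fine-tuning on the low-resource parallel data.
child[:src_embed] = 0.1 * randn(64, 8_000)
```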
References (my BibTeX page has PDFs for most of these)
[1] Deniz Yuret and Michael de la Maza. Dynamic hill climbing: Overcoming the limitations of optimization techniques. In The Second Turkish Symposium on Artificial Intelligence and Neural Networks, 1993.
[2] Michael de la Maza and Deniz Yuret. Dynamic hill climbing. AI Expert, 1994.
[3] Deniz Yuret. From genetic algorithms to efficient optimization. Technical Report 1569, MIT AI Laboratory, 1994.
[4] Michael de la Maza and Deniz Yuret. Seeing clearly: Medical imaging now and tomorrow. In Clifford A. Pickover, editor, Future Health: Computers and Medicine in the 21st Century. St. Martin’s Press, 1995.
[5] Deniz Yuret. The principle of pressure in chess. In The Third Turkish Symposium on Artificial Intelligence and Neural Networks (TAINN ’94), 1994.
[6] David Allen McAllester and Deniz Yuret. Alpha-beta-conspiracy search. ICGA Journal, 25(1):16–35, 2002.
[7] Michael de la Maza and Deniz Yuret. A futures market simulation with non-rational participants. In Rodney Allen Brooks and Pattie Maes, editors, Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems, 1994.
[8] Deniz Yuret and Michael de la Maza. A genetic algorithm system for predicting the OEX. Technical Analysis of Stocks and Commodities, 1994.
[9] Michael de la Maza and Deniz Yuret. Experimenting with a market simulation. The Magazine of Artificial Intelligence in Finance, 1(3), 1994.
[10] Michael de la Maza and Deniz Yuret. A model of stock market participants. In Jörg Biethahn and Volker Nissen, editors, Evolutionary Algorithms in Management Applications. Springer, 1995.
[11] Michael de la Maza and Deniz Yuret. Neural network applications: A critique. The Magazine of Artificial Intelligence in Finance, 2(1), 1995.
[12] Michael de la Maza, Ayla Oğuş, and Deniz Yuret. How do firms transition between monopoly and competitive behavior? An agent-based economic model. In Proceedings of the Sixth International Conference on Artificial Life, 1998.
[13] Ayla Oğuş, Michael de la Maza, and Deniz Yuret. Modeling the economics of internet companies. In Computing in Economics and Finance, Proceedings of the Fifth International Conference of the Society for Computational Economics, 1999.
[14] Özlem Keskin, Deniz Yuret, Attila Gürsoy, Metin Türkay, and Burak Erman. Relationships between amino acid sequence and backbone torsion angle preferences. Proteins: Structure, Function, and Bioinformatics, 55(4):992–998, June 2004.
[15] Ersin Yurtsever, Deniz Yuret, and Burak Erman. Quantum mechanical calculations of tryptophan and comparison with conformations in native proteins. J. Phys. Chem. A, 110(51):13933–13938, December 2006.
[16] Alkan Kabakçıoğlu, Deniz Yuret, Mert Gür, and Burak Erman. Anharmonicity, mode-coupling and entropy in a fluctuating native protein. Physical Biology, 7:046005, October 2010.
[17] Onur Varol, Deniz Yuret, Burak Erman, and Alkan Kabakçıoğlu. Mode coupling points to functionally important residues in myosin II. Proteins: Structure, Function, and Bioinformatics, 82(9):1777–1786, September 2014.
[18] Alkan Kabakçıoğlu, Onur Varol, Deniz Yuret, and Burak Erman. Functionally important residues from mode coupling during short-time protein dynamics. In APS Meeting Abstracts, volume 1, page 48009, 2015.
[19] Onur Varol, Deniz Yuret, Burak Erman, and Alkan Kabakçıoğlu. Functionally important residues from mode coupling during short-time protein dynamics. Biophysical Journal, 108(2):377a, 2015.
[20] Barış Bozkurt, Ozan Baysal, and Deniz Yuret. A dataset and baseline system for singing voice assessment. In The 13th International Symposium on Computer Music Multidisciplinary Research (CMMR), September 2017.
[21] Saman Zia, Yücel Yemez, and Deniz Yuret. RGB-D object recognition using deep convolutional neural networks. In The IEEE International Conference on Computer Vision (ICCV), pages 896–903, October 2017.
[22] Deniz Yuret and Michael de la Maza. The greedy prepend algorithm for decision list induction. In A. Levi et al., editors, ISCIS 2006, LNCS 4263, pages 37–46, Berlin Heidelberg, November 2006. Springer-Verlag.
[23] Ergun Biçici and Deniz Yuret. Locally scaled density based clustering. In B. Beliczynski et al., editors, ICANNGA 2007, Part I, LNCS 4431, pages 739–748, Berlin Heidelberg, April 2007. Springer-Verlag.
[24] Deniz Yuret. Knet: beginning deep learning with 100 lines of Julia. In Machine Learning Systems Workshop at NIPS 2016, December 2016.
[25] Enis Berk Çoban, Deniz Yuret, and Didem Unat. Multidimensional broadcast operation on the GPU. In 5. Ulusal Yüksek Başarımlı Hesaplama Konferansı (5th National High Performance Computing Conference), İstanbul, September 2017.
[26] Doğa Dikbayır, Enis Berk Çoban, İlker Kesen, Deniz Yuret, and Didem Unat. Fast multidimensional reduction and broadcast operations on GPU for machine learning. Concurrency and Computation: Practice and Experience, May 2018.
[27] Mike Innes, Deniz Yuret, et al. On machine learning and programming languages. In SysML Conference, Stanford, CA, Feb 2018.
[28] Boris Katz, Deniz Yuret, et al. Blitz: a preprocessor for detecting context-independent linguistic structures. In Proceedings of the 5th Pacific Rim International Conference on Artificial Intelligence (PRICAI ’98), 1998.
[29] Boris Katz, Deniz Yuret, et al. Integrating web resources and lexicons into a natural language query system. In Proceedings of the 6th IEEE International Conference on Multimedia Computing and Systems (IEEE ICMCS’99), 1999.
[30] Boris Katz, Sue Felshin, Deniz Yuret, et al. Omnibase: Uniform access to heterogeneous data for question answering. In NLDB 2002, LNCS 2553, pages 230–234. Springer-Verlag, 2002.
[31] Deniz Yuret. Method of utilizing implicit references to answer a query. US Patent Number 6957213, Oct 2005.
[32] Edwin Riley Cooper, Gann Bierner, Laurel Kathleen Graham, Deniz Yuret, James Charles Williams, and Filippo Beghelli. Ontology for use with a system, method, and computer readable medium for retrieving information and response to a query. US Patent Number 8612208, 9747390, Dec 2013.
[33] Deniz Yuret. Discovery of linguistic relations using lexical attraction. PhD thesis, MIT, 1998.
[34] Deniz Yuret. Dependency parsing as a classification problem. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), June 2006.
[35] Özlem Uzuner, Boris Katz, and Deniz Yuret. Word sense disambiguation for information retrieval. In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), 1999.
[36] Deniz Yuret. Some experiments with a Naive Bayes WSD system. In Rada Mihalcea and Phil Edmonds, editors, Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, pages 265–268, Barcelona, Spain, July 2004. Association for Computational Linguistics.
[37] Ergun Biçici and Deniz Yuret. Clustering word pairs to answer analogy questions. In Proceedings of the Fifteenth Turkish Symposium on Artificial Intelligence and Neural Networks (TAINN 2006), June 2006.
[38] Deniz Yuret. KU: Word sense disambiguation by substitution. In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007), pages 207–214, Prague, Czech Republic, June 2007. Association for Computational Linguistics.
[39] Deniz Yuret and Mehmet Ali Yatbaz. The noisy channel model for unsupervised word sense disambiguation. Computational Linguistics, 36(1):111–127, March 2010.
[40] Deniz Yuret, A. Engin Ural, F. Nihan Ketrez, Dilara Kocbas, and Aylin C. Kuntay. Morphological cues vs. number of nominals in learning verb types from child-directed speech. In Boston University Conference on Language Development (BUCLD33), October 2008.
[41] A. Engin Ural, Deniz Yuret, Nihan Ketrez, Dilara Kocbas, and Aylin Kuntay. Morphological cues vs. number of nominals in learning verb types in Turkish: Syntactic bootstrapping mechanism revisited. Language and Cognitive Processes, 24(10):1393–1405, December 2009.
[42] Mehmet Ali Yatbaz, Volkan Cirik, Aylin Küntay, and Deniz Yuret. Paradigmatic representations outperform syntagmatic representations in distributional learning of grammatical categories. In BUCLD, November 2014.
[43] Mehmet Ali Yatbaz, Volkan Cirik, Aylin Küntay, and Deniz Yuret. Learning grammatical categories using paradigmatic representation: Substitute words for language acquisition. In COLING, December 2016.
[44] Deniz Yuret and Ferhan Türe. Learning morphological disambiguation rules for Turkish. In HLT-NAACL 06, June 2006.
[45] Mehmet Ali Yatbaz and Deniz Yuret. Unsupervised morphological disambiguation using statistical language models. In NIPS 2009 Workshop on Grammar Induction, Representation of Language and Language Learning, Vancouver, Canada, December 2009.
[46] Deniz Yuret and Ergun Biçici. Modeling morphologically rich languages using split words and unstructured dependencies. In ACL-IJCNLP, Singapore, August 2009.
[47] Deniz Yuret, Mehmet Ali Yatbaz, and Ahmet Engin Ural. Discriminative vs. generative approaches in semantic role labeling. In Conference on Computational Natural Language Learning (CoNLL), Manchester, UK, August 2008.
[48] Deniz Yuret. Smoothing a tera-word language model. In Proceedings of ACL-08: HLT, Short Papers, pages 141–144, Columbus, Ohio, June 2008. Association for Computational Linguistics.
[49] Deniz Yuret. FASTSUBS: An efficient and exact procedure for finding the most likely lexical substitutes based on an n-gram language model. IEEE Signal Processing Letters, 19(11):725–728, November 2012.
[50] Ergun Biçici and Deniz Yuret. L1 regularized regression for reranking and system combination in machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 282–289. Association for Computational Linguistics, July 2010.
[51] Ergun Biçici and Deniz Yuret. Instance selection for machine translation using feature decay algorithms. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 272–283, Edinburgh, Scotland, July 2011. Association for Computational Linguistics.
[52] Ergun Biçici and Deniz Yuret. RegMT system for machine translation, system combination, and evaluation. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 323–329, Edinburgh, Scotland, July 2011. Association for Computational Linguistics.
[53] Ergun Biçici and Deniz Yuret. Optimizing instance selection for statistical machine translation with feature decay algorithms. IEEE Transactions on Audio, Speech and Language Processing, 23(2):339–350, February 2015.
[54] Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret, editors. The CoNLL 2007 Shared Task on Dependency Parsing, Prague, Czech Republic, June 2007.
[55] Roxana Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney, and Deniz Yuret. SemEval-2007 Task 04: Classification of semantic relations between nominals. In SemEval-2007: 4th International Workshop on Semantic Evaluations, June 2007.
[56] Roxana Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney, and Deniz Yuret. Classification of semantic relations between nominals. Language Resources and Evaluation, 43(2):105–121, June 2009.
[57] Deniz Yuret, Aydin Han, and Z. Turgut. SemEval-2010 Task 12: Parser evaluation using textual entailments. In Proceedings of the 5th International Workshop on Semantic Evaluation, pages 51–56. Association for Computational Linguistics, July 2010.
[58] Deniz Yuret, Laura Rimell, and Aydin Han. Parser evaluation using textual entailments. Language Resources and Evaluation, 47(3):639–659, September 2012.
[59] Deniz Yuret and Suresh Manandhar, editors. Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), 2012.
[60] Deniz Yuret and Suresh Manandhar, editors. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), 2013.
[61] Mehmet Ali Yatbaz and Deniz Yuret. Unsupervised part of speech tagging using unambiguous substitutes from a statistical language model. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pages 1391–1398. Association for Computational Linguistics, August 2010.
[62] Mehmet Ali Yatbaz, Enis Sert, and Deniz Yuret. Learning syntactic categories using paradigmatic representations of word context. In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP-CONLL 2012), Jeju, Korea, July 2012. Association for Computational Linguistics.
[63] Deniz Yuret, Mehmet Ali Yatbaz, and Enis Sert. Unsupervised instance-based part of speech induction using probable substitutes. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 2303–2313, Dublin, Ireland, August 2014. Dublin City University and Association for Computational Linguistics.
[64] Osman Başkaya, Enis Sert, Volkan Cirik, and Deniz Yuret. AI-KU: Using substitute vectors and co-occurrence modeling for word sense induction and disambiguation. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 300–306, Atlanta, Georgia, USA, June 2013. Association for Computational Linguistics.
[65] Oren Melamud, Ido Dagan, Jacob Goldberger, Idan Szpektor, and Deniz Yuret. Probabilistic modeling of joint-context in distributional similarity. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pages 181–190, Ann Arbor, Michigan, June 2014. Association for Computational Linguistics.
[66] Onur Kuru, Ozan Arkan Can, and Deniz Yuret. CharNER: Character-level named entity recognition. In COLING, December 2016.
[67] Ömer Kırnap, Berkay Furkan Önder, and Deniz Yuret. Parsing with context embeddings. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 80–87, Vancouver, Canada, August 2017. Association for Computational Linguistics.
[68] Barret Zoph, Deniz Yuret, Jon May, and Kevin Knight. Transfer learning for low-resource neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1568–1575, Austin, Texas, November 2016. Association for Computational Linguistics.
[69] Xing Shi, Kevin Knight, and Deniz Yuret. Why neural translations are the right length. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2278–2282, Austin, Texas, November 2016. Association for Computational Linguistics.
[70] Yonatan Bisk, Deniz Yuret, and Daniel Marcu. Natural language communication with robots. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 751–761, San Diego, California, June 2016. Association for Computational Linguistics.
[71] Yücel Yemez and Deniz Yuret. Learning the language of navigation using linguistic and visual cues together (in Turkish: Dilbilimsel ve görsel ipuçlarını birlikte kullanarak gezinim dilinin öğrenilmesi). TÜBİTAK 1001 Project, 2015–2018.
[72] Luc De Raedt, Deniz Yuret, and Alessandro Saffiotti. Relational symbol grounding through affordance learning (ReGROUND). CHIST-ERA Project on Human Language Understanding: Grounding Language Learning, 2015–2018.
[73] Laura Antanas, Ozan Arkan Can, Jesse Davis, Luc De Raedt, Amy Loutfi, Andreas Persson, Alessandro Saffiotti, Emre Ünal, Deniz Yuret, and Pedro Zuidberg dos Martires. Relational symbol grounding through affordance learning: An overview of the ReGround project. In Grounding Language Understanding (GLU 2017), ISCA Satellite Workshop of Interspeech 2017, Stockholm University, August 2017.