I am an associate professor of Computer Engineering at Koç University in Istanbul, working in the Artificial Intelligence Laboratory. Previously I was at the MIT AI Lab and later co-founded Inquira, Inc. My research is in natural language processing and machine learning. For prospective students, here are some research topics, papers, classes, blog posts, and past students.

March 19, 2016

How to write a technical paper

This is an evolving set of recommendations I share with my graduate students for technical writing:

  1. Empathy: This is the single most important principle of technical writing. Try reading what you write from the perspective of somebody who has not spent the last few months working on your problem. Better yet, find such a person and see if they understand everything you are talking about. Don't just take their word for it; ask them to tell you what they understand in their own words. See where they struggle and debug your paper: Do they get lost in too much detail and miss the main point? Do they get disoriented because you jump around too much? Are there terms they do not understand? Fix the paper using the following techniques until a dedicated freshman can understand all the important points.
  2. Winston's Onion Rule: The document should state the most important points first and expand on them gradually. It is a mistake to save any important point for the end of the paper; only details and supporting material belong there. If I stop reading the document at any point, everything I have not yet read should be less important than everything I have already read:
    1. The title should be descriptive of the main point.
    2. The first sentence should state the main point.
    3. The first paragraph should expand on the first sentence.
    4. The first section should expand on the first paragraph.
    5. The first chapter should expand on the first section.
    6. The whole paper/thesis should expand on the first chapter, etc.
  3. Yuret’s Fractal Rule: Parts at every level of your document, down to each paragraph, should have their own introduction / conclusion to keep the reader oriented (i.e. stop them from asking “What is this person talking about now, and why?”):
    1. The first chapter of a paper/thesis should state the topic of the paper/thesis and the last chapter should summarize its point.
    2. The first section of a chapter should state the topic of the chapter and the last section should summarize its point.
    3. The first paragraph of a section should state the topic of the section and the last paragraph should summarize its point.
    4. The first sentence of a paragraph should state the topic of the paragraph and the last sentence should summarize its point.
  4. No undefined terms: Any technical term your nine-year-old niece would not understand should be defined before first use. Every acronym should be given in parentheses next to its long form at first use. All variables in equations and all axes in graphs should be explained at the first opportunity. Tables and figures should have descriptive captions that can be understood on their own. Technical terms and mathematical notation should be used consistently, with no confusing variations (e.g. calling the same thing "context vector" in one place and "word context vector" in another will make the reader think they are two separate things).
  5. Replicability: Science is based on replicable results. Your paper should provide enough detail (possibly in the appendices), and links to its code and data, to allow each of its results to be replicated. In particular, for each set of experiments you should have:
    1. Data table: e.g. in a natural language processing experiment, the number of words and sentences in the train, dev, and test sets; vocabulary size, tagset size, tag frequencies, out-of-vocabulary rate, average sentence length; in short, any data statistic relevant to the task should go in a data table (a sketch of how such statistics might be computed follows this list).
    2. Parameter table: the model structure, the training algorithm, the hyperparameters, the number of training epochs, and any other details relevant to experimental replication should go in a parameter table.
    3. Result table: the results (as a table or plot) should clearly indicate the evaluation metric, sensible lower-bound baselines, upper bounds (e.g. inter-annotator agreement) if available, and the current state of the art in published work, to put your results in perspective.
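
As a concrete illustration of the data table item above, here is a minimal sketch of a script that collects such statistics. The file names and whitespace tokenization are hypothetical stand-ins for whatever your task actually uses.

    # Hypothetical sketch: collect basic corpus statistics for a data table.
    # The file names and whitespace tokenization are illustrative stand-ins.
    def corpus_stats(path):
        """Return word count, sentence count, and vocabulary of a corpus."""
        words, sents, vocab = 0, 0, set()
        with open(path) as f:
            for line in f:              # assumes one sentence per line
                tokens = line.split()
                sents += 1
                words += len(tokens)
                vocab.update(tokens)
        return words, sents, vocab

    train_w, train_s, train_v = corpus_stats("train.txt")
    test_w, test_s, test_v = corpus_stats("test.txt")
    print(f"train: {train_w} words, {train_s} sentences, |V| = {len(train_v)}")
    print(f"average sentence length: {train_w / train_s:.1f}")
    # Type-level OOV rate: fraction of test word types unseen in training.
    print(f"OOV rate: {len(test_v - train_v) / len(test_v):.2%}")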


February 01, 2016

Learning Navigational Language from Linguistic and Visual Cues (2016-2018)

TUBITAK 1001 Project 114E628. "Dilbilimsel ve Görsel İpuçlarını Birlikte Kullanarak Gezinim Dilinin Öğrenilmesi." (2016-02-01 -- 2018-08-01)

December 01, 2015

ReGROUND: Relational symbol grounding through affordance learning (2015-2018)

Project accepted under Call 2014 of the CHIST-ERA ERA-NET for the topic "Human Language Understanding: Grounding Language Learning". (2015-12-01 -- 2018-12-01). Partners: KU Leuven (Belgium), Koç University (Turkey), Örebro University (Sweden).

November 19, 2015

Osman Baskaya, MS 2015

Current position: Research Engineer, Huawei R&D. (Linkedin)
M.S. Thesis: Analysis of Context Embeddings in Word Sense Induction. Koç University, Department of Computer Engineering. November, 2015. (PDF, Presentation, Code)
Publications: bibtex.php

Abstract

Representing word senses with a fixed list of definitions from a manually constructed lexical database has several drawbacks. There is no guarantee that the definitions reflect the exact meaning of a target word in a given context, since they are usually too general. Moreover, lexical databases often include many rare senses while missing corpus- or domain-specific senses. Word Sense Induction (WSI) focuses on discriminating the usages of a polysemous word without using a fixed list of definitions or any other hand-crafted resources.

The most common approach in WSI is to apply clustering or graph partitioning to a representation of the first- or second-order co-occurrences of a word. In contrast, my method obtains from a statistical language model a probability distribution over substitute words for each context. These distributions are used to create context embeddings, low-dimensional dense vectors in Euclidean space, through a co-occurrence embedding framework. The context embeddings are then clustered with the k-means algorithm to discriminate the usages (senses) of a word. This method has previously proven useful in unsupervised part-of-speech induction and in supervised tasks such as multilingual dependency parsing. I evaluate it on the SemEval 2010 and SemEval 2013 Word Sense Induction lexical sample tasks, as well as on a dataset I created using OntoNotes 5.0. This new lexical sample dataset has high inter-annotator agreement (IAA, >90%) and more instances per word type (>500) than any previous lexical sample task.
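
To illustrate the clustering step, here is a minimal sketch that assumes the context embeddings have already been computed; the random stand-in data, the number of senses, and the use of scikit-learn are illustrative assumptions, not the thesis implementation (the actual code is linked at the end of this post).

    # Hypothetical sketch: induce word senses by clustering context embeddings.
    # Each row of `embeddings` stands for one occurrence (context) of the same
    # target word; real vectors would come from the substitute word
    # distributions described above.
    import numpy as np
    from sklearn.cluster import KMeans

    def induce_senses(embeddings, n_senses, seed=0):
        """Assign each context to one of n_senses induced sense clusters."""
        km = KMeans(n_clusters=n_senses, n_init=10, random_state=seed)
        return km.fit_predict(embeddings)

    embeddings = np.random.default_rng(0).normal(size=(500, 25))  # stand-in
    labels = induce_senses(embeddings, n_senses=4)
    print(np.bincount(labels))  # number of contexts assigned to each sense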

The contributions of this thesis are as follows: (1) I propose a method for the Word Sense Induction problem. (2) I provide a comprehensive analysis of (a) the embedding step, by transforming other popular word embeddings into context embeddings using the substitute word distributions for each context and comparing them, and (b) the clustering step, by comparing different clustering algorithms (k-means, spectral clustering, DBSCAN) and different clustering approaches (a local approach, where the instances of each word type are clustered separately, and a part-of-speech based approach, where all instances tagged with the same part of speech are clustered together).
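
For the clustering comparison in (2b), a sketch along these lines would exercise all three algorithms; the parameters and the stand-in data are illustrative guesses, not the settings used in the thesis.

    # Hypothetical sketch: compare the clustering algorithms named above on
    # the same (stand-in) context embeddings.
    import numpy as np
    from sklearn.cluster import DBSCAN, KMeans, SpectralClustering

    X = np.random.default_rng(1).normal(size=(200, 25))  # stand-in embeddings
    algorithms = {
        "k-means": KMeans(n_clusters=4, n_init=10, random_state=0),
        "spectral": SpectralClustering(n_clusters=4, random_state=0),
        "dbscan": DBSCAN(eps=5.0, min_samples=5),  # infers cluster count
    }
    for name, algorithm in algorithms.items():
        labels = algorithm.fit_predict(X)
        n_clusters = len(set(labels) - {-1})  # -1 marks DBSCAN noise points
        print(f"{name}: {n_clusters} clusters")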

The code to replicate the results in this thesis can be found at https://github.com/osmanbaskaya/wsid.

July 10, 2015

Parsing with word vectors

Slides for my talk at the ISI NL Seminar.