April 29, 2009

Natural Language Processing summer course at Sabanci University

This summer Kemal Oflazer, Dilek Hakkani-Tur and Gokhan Tur are offering a Statistical Natural Language Processing course at Sabanci University. A draft syllabus is included below.

STATISTICAL NLP CLASS:


Kemal Oflazer:
  • Overview of NLP (2 hours)
    • NLP Applications
    • Processing pipeline: Basic steps and how they feed into each other and how they are used by applications
  • Morphological Analysis (could be skipped or shortened) (2 hours)
  • Introduction to Statistical Models, n-gram language modeling (2 hours)
    • Applications to simple sequence problems (tagging English and/or deasciifier)
  • Morphological Disambiguation (applications to Turkish)
  • HMMs (formal treatment (forward-backward + Viterbi) + applications to tagging) (2-3 hours)
  • CFGs and Probabilistic CFGs (3-4 hours)
    • Inside-outside algorithm for training PCFGs
    • Parsing with PCFGs
  • Machine Translation (MT) (3-4 hours)
    • Brief overview of Classical Symbolic MT
    • Statistical Machine Translation
      • Word-based Models
      • Phrase-based Models
      • Syntax-based models
    • Dealing with Morphology in SMT


Dilek Hakkani-Tur:
  • Elements of Information Theory / Advanced Language Modeling and Applications
    • Entropy/Perplexity/Mutual Information
    • Noisy Channel Model
      • Sequence classification / HMM
      • Sample classification / Naive Bayes
    • Smoothing
    • Adaptation
  • Named Entity Extraction (NE)
    • Using HMM for NE
    • Using CRF for NE
    • Using Boosting/MaxEnt/SVM for NE
  • Spoken Language Understanding (SLU) as Template Filling
    • HMM approaches (AT&T vs BBN)
    • Hidden Vector State Models
    • Latent Semantic Analysis
    • Sample-classification based (Boosting/MaxEnt/Decision Trees)
  • Summarization
    • Greedy Algorithms, MMR
    • TextRank/LexRank
    • Classification based extractive summarization
    • Global Models for Summarization: Linear Programming approaches
  • Question Answering
  • Spoken Dialog Systems and Dialog Management (DM)
    • Dialog Systems
    • DM
      • Finite State Models
      • Agent Models
      • Reinforcement Learning


Gokhan Tur:
  • Topic Classification
    • Discriminative classification: SVM/Boosting
    • Generative classification: language model, document similarity, vector-space-model
    • Feature selection/transformation (LDA)
    • Latent semantic indexing
  • SLU as Intent Determination
    • Semantic Role Labeling
    • Robustness to ASR
  • Topic Clustering
    • K-Means
    • Top-Down vs. Bottom-Up
  • Topic Segmentation
    • HMM
    • TextTiling
    • Markov Chains
  • Sentence Segmentation
    • HMM
    • CRF
    • Hybrid
  • Active Learning/Semi-Supervised Learning/Unsupervised Learning/Model Adaptation/Robustness



April 04, 2009

Dennett in Istanbul

As part of Sabanci University's Darwin Year Celebration activities, Prof. Daniel C. Dennett is going to give a talk entitled "Darwin's strange inversion of reasoning" at the Sakip Sabanci Museum on April 10, 2009 at 16:00.

Dennett's 1995 book Darwin's Dangerous Idea argues that natural selection is a blind, algorithmic process powerful enough to account for the generation and evolution of life, minds, and societies. I am looking forward to his talk; I saw an earlier version of it at MIT when the book first came out.

These days he has taken on religious fundamentalism (see Breaking the Spell). One of his proposed solutions to fight ignorance and intolerance is to teach children about ALL of the world's religions instead of brainwashing them with a single system of thought, or leaving them vulnerable by not teaching them about religion at all.

If you have not had the pleasure of listening to Dennett before, I recommend his many recorded talks available at the following websites: TED talks, Wikipedia, Reitstoen.com, and his homepage.

Dennett is quite popular in the Artificial Intelligence / Cognitive Science community due to his refreshingly rational explanations of perplexing issues like consciousness and free will. You may not agree with the specifics of his theories, but at least he makes a convincing case that there is no need for "magic dust" to explain these natural phenomena. Discussing some surprising psychological results in Sweet Dreams, he says:

I often discover skeptics who are quite confident that I am simply making these facts up! But we must learn to treat such difficulties as measures of our frail powers of imagination, not insights into impossibility.

Yet some of his adversaries take their failure to imagine a physical model of the mind as evidence of its impossibility. They succumb to mysterianism, take comfort in the assumption that some questions will never be answered, or look for magic dust in the depths of quantum theory. Dennett thinks one day we will find a psychological explanation for their defect.

Many of Dennett's books that I have mentioned in this blog are on the mysteries of the mind. If you get sleepy when you read philosophy, then I especially recommend The Mind's I, which is one of my favorite collections of philosophical fiction. Here is a list of his books, which I hope to turn into an annotated bibliography at some point:

Content and Consciousness (1969)
Brainstorms: Philosophical Essays on Mind and Psychology (1978)
The Mind's I: Fantasies and Reflections on Self and Soul (1981)
Elbow Room: The Varieties of Free Will Worth Wanting (1984)
The Intentional Stance (1987)
Consciousness Explained (1991)
Darwin's Dangerous Idea: Evolution and the Meanings of Life (1995)
Kinds of Minds: Toward an Understanding of Consciousness (1996)
Brainchildren: Essays on Designing Minds (1998)
Freedom Evolves (2003)
Sweet Dreams: Philosophical Obstacles to a Science of Consciousness (2005)
Breaking the Spell: Religion as a Natural Phenomenon (2006)
