Deniz Yuret's Homepage: Natural Language Processing summer course at Sabanci University

April 29, 2009


This summer Kemal Oflazer, Dilek Hakkani-Tur, and Gokhan Tur are offering a Statistical Natural Language Processing course at Sabanci University. A draft syllabus is included below.

STATISTICAL NLP CLASS:

Kemal Oflazer:

Overview of NLP (2 hours)

NLP Applications
Processing pipeline: Basic steps and how they feed into each other and how they are used by applications

Morphological Analysis (could be skipped or shortened) (2 hours)

Introduction to Statistical Models, n-gram language modeling (2 hours)

Applications to simple sequence problems (tagging English and/or a deasciifier)

Morphological Disambiguation (applications to Turkish)

HMMs (formal treatment (forward-backward + Viterbi) + applications to tagging) (2-3 hours)

CFGs and Probabilistic CFGs (3-4 hours)

Inside-outside algorithm for training PCFGs
Parsing with PCFGs

Machine Translation (MT) (3-4 hours)

Brief overview of Classical Symbolic MT
Statistical Machine Translation

Word-based Models
Phrase-based Models
Syntax-based models

Dealing with Morphology in SMT

Dilek Hakkani-Tur:

Elements of Information Theory / Advanced Language Modeling and Applications

Entropy/Perplexity/Mutual Information
Noisy Channel Model

Sequence classification / HMM
Sample classification / Naive Bayes

Smoothing
Adaptation

Named Entity Extraction (NE)

Using HMM for NE

Using CRF for NE
Using Boosting/MaxEnt/SVM for NE

Spoken Language Understanding (SLU) as Template Filling

HMM approaches (AT&T vs BBN)
Hidden Vector State Models
Latent Semantic Analysis
Sample-classification based (Boosting/MaxEnt/Decision Trees)

Summarization

Greedy Algorithms, MMR
TextRank/LexRank
Classification based extractive summarization
Global Models for Summarization: Linear Programming approaches

Question Answering
Spoken Dialog Systems and Dialog Management (DM)

Dialog Systems
DM

Finite State Models
Agent Models
Reinforcement Learning

Gokhan Tur:

Topic Classification

Discriminative classification: SVM/Boosting
Generative classification: language model, document similarity, vector-space-model
Feature selection/transformation (LDA)
Latent semantic indexing

SLU as Intent Determination

Semantic Role Labeling
Robustness to ASR

Topic Clustering

K-Means
Top-down vs. bottom-up

Topic Segmentation

HMM
TextTiling
Markov Chains

Sentence Segmentation

HMM
CRF
Hybrid

Active Learning/Semi-Supervised Learning/Unsupervised Learning/Model Adaptation/Robustness
