Deniz Yuret's Homepage: Learning Syntactic Categories Using Paradigmatic Representations of Word Context

July 12, 2012

Learning Syntactic Categories Using Paradigmatic Representations of Word Context

Mehmet Ali Yatbaz, Enis Sert, Deniz Yuret. EMNLP 2012. (Download the paper, presentation, code, fastsubs paper, lm training data (250MB), wsj substitute data (1GB), scode output word vectors (5MB), scode visualization demo (may take a few minutes to load). More up-to-date versions of the code can be found on GitHub.)

Abstract: We investigate paradigmatic representations of word context in the domain of unsupervised syntactic category acquisition. Paradigmatic representations of word context are based on potential substitutes of a word, in contrast to syntagmatic representations, which are based on properties of neighboring words. We compare a bigram-based baseline model with several paradigmatic models and demonstrate significant gains in accuracy. Our best model, based on Euclidean co-occurrence embedding, combines the paradigmatic context representation with morphological and orthographic features and achieves 80% many-to-one accuracy on a 45-tag, 1M-word corpus.
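
For readers unfamiliar with the many-to-one accuracy metric quoted in the abstract, here is a minimal sketch of how it is usually computed: each induced cluster is mapped to the gold tag it co-occurs with most often, and the score is the fraction of tokens whose cluster maps to their gold tag. The function name `many_to_one_accuracy` and the toy data are illustrative assumptions, not taken from the paper's released code.

```python
from collections import Counter, defaultdict


def many_to_one_accuracy(clusters, gold_tags):
    """Fraction of tokens whose cluster's majority gold tag matches their own tag.

    Hypothetical helper for illustration; not the evaluation script from the paper.
    """
    if len(clusters) != len(gold_tags):
        raise ValueError("clusters and gold_tags must have the same length")
    if not clusters:
        return 0.0

    # Count how often each gold tag appears inside each induced cluster.
    tag_counts_per_cluster = defaultdict(Counter)
    for cluster_id, tag in zip(clusters, gold_tags):
        tag_counts_per_cluster[cluster_id][tag] += 1

    # Mapping every cluster to its most frequent tag makes exactly
    # max(count) tokens in that cluster correct.
    correct = sum(max(counts.values()) for counts in tag_counts_per_cluster.values())
    return correct / len(clusters)


# Toy example (made-up data): six tokens, three induced clusters.
toy_clusters = [0, 0, 0, 1, 1, 2]
toy_gold = ["NN", "NN", "VB", "VB", "VB", "DT"]
toy_score = many_to_one_accuracy(toy_clusters, toy_gold)
print(f"many-to-one accuracy: {toy_score:.3f}")
```

Because several clusters may map to the same gold tag, this metric tends to rise as the number of clusters grows, so it is most meaningful when the number of induced clusters is fixed, as with the 45 tags mentioned in the abstract.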
