Mehmet Ali Yatbaz and Deniz Yuret. NIPS 2009 Workshop on Grammar Induction, Representation of Language and Language Learning. December 2009. (PDF, Poster)
In this paper, we present a probabilistic model for unsupervised morphological disambiguation. Our model assigns morphological parses $T$ to contexts $C$ rather than to words $W$. The target word $w \in W$ determines the set of possible parses $T_w \subset T$ that can be used in $w$'s context $c_w \in C$. To assign the correct morphological parse to $w$, our model selects the parse $t \in T_w$ that maximizes $P(t|c_w)$. The probabilities $P(t|c_w)$ are estimated using a statistical language model and the vocabulary of the corpus. The system performs significantly better than an unsupervised baseline, and its performance is close to that of a supervised baseline.
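The decision rule described above, $\hat{t} = \arg\max_{t \in T_w} P(t|c_w)$, can be illustrated with a minimal Python sketch. It covers only the selection step. It does not reproduce how the paper estimates $P(t|c_w)$ from a language model. The function names, the parse labels, the context representation, and the probability values below are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Hashable, Iterable, TypeVar

Parse = TypeVar("Parse", bound=Hashable)
Context = TypeVar("Context", bound=Hashable)


def disambiguate(
    candidate_parses: Iterable[Parse],
    context: Context,
    score: Callable[[Parse, Context], float],
) -> Parse:
    """Return the parse t in T_w with the highest score(t, c_w).

    `score` stands in for an estimate of P(t | c_w); how that estimate
    is obtained is outside the scope of this sketch.
    """
    candidates = list(candidate_parses)
    if not candidates:
        raise ValueError("the target word has no candidate parses")
    return max(candidates, key=lambda t: score(t, context))


# Hypothetical toy estimates of P(t | c_w) for a single context.
# These numbers are made up; they are not results from the paper.
TOY_CONTEXT = ("left_word", "right_word")
TOY_SCORES = {
    ("Noun+Acc", TOY_CONTEXT): 0.7,
    ("Noun+Poss", TOY_CONTEXT): 0.3,
}


def toy_score(parse: str, context: tuple) -> float:
    # Unseen (parse, context) pairs get probability 0 in this toy table.
    return TOY_SCORES.get((parse, context), 0.0)


# T_w: the parses that the target word w allows (hypothetical labels).
T_w = ["Noun+Acc", "Noun+Poss"]
best = disambiguate(T_w, TOY_CONTEXT, toy_score)
print(best)
```

Because the candidate set $T_w$ is fixed by the word while the score depends only on the context, the word acts as a filter on parses and the context does the ranking, which mirrors the division of labor the abstract describes.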