May 28, 2013

AI-KU: Using Substitute Vectors and Co-Occurrence Modeling For Word Sense Induction and Disambiguation

Baskaya, Osman and Sert, Enis and Cirik, Volkan and Yuret, Deniz. Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). June, 2013. Atlanta, Georgia, USA. (Download PDF, see the proceedings).

Word sense induction aims to discover different senses of a word from a corpus by using unsupervised learning approaches. Once a sense inventory is obtained for an ambiguous word, word sense discrimination approaches choose the best-fitting single sense for a given context from the induced sense inventory. However, there may not be a clear distinction between one sense and another, although for a context, more than one induced sense can be suitable. Graded word sense method allows for labeling a word in more than one sense. In contrast to the most common approach which is to apply clustering or graph partitioning on a representation of first or second order co-occurrences of a word, we propose a system that creates a substitute vector for each target word from the most likely substitutes suggested by a statistical language model. Word samples are then taken according to probabilities of these substitutes and the results of the co-occurrence model are clustered. This approach outperforms the other systems on graded word sense induction task in SemEval-2013.

Full post...

May 19, 2013

Pitfalls of studying language in isolation

Studies of language acquisition and language understanding display a remarkable lack of attention to the subject matter of the utterances being studied.  This is probably because nobody knows how to represent and process meaning whereas the forms of utterances are readily available.  Thus "language acquisition" have come to mean the study of learning how to construct utterances "of the right form" and studies of language understanding focus on translating forms of utterances into other symbolic forms equally devoid of the richness and detail of the things the utterance is supposed to convey.
A real theory of language acquisition should study how babies learn to decode form-meaning mappings in an environment where lots of things are going on in addition to what is being said.  A real theory of language understanding should study what kinds of rich interconnected concepts and embodied simulations get triggered by words and constructions, how we decide what to simulate given the scant detail in descriptions, and what inferences are made possible beyond what is explicitly stated.  

All this is AI-complete you say?  Well by limiting ourselves to study language in isolation, we may have come to the end of the line where the ~80% accuracy limit of machine learning based computational linguistics (on almost any linguistic problem you can think of) is preventing us from building truly transformative applications.  Maybe we are shooting ourselves in the foot, and maybe, just maybe, some problems that look difficult right now are difficult not because we are missing the right machine learning algorithm or sufficient labeled data but because we are ignoring the constraints imposed by the meaning side of things.  We may have finally run out of options other than to try and crack the real problem, i.e. modeling what utterances are ABOUT.

Full post...