WSD TOOLS

 


 

 

 

PARSERS

MINIPAR is a broad-coverage parser for the English language. It represents its grammar as a network of nodes and links, where the nodes represent grammatical categories and the links represent types of dependency relationships.

The Collins-CFG parser provides an effective means of dealing with the sparse data problems inherent in the use of lexicalized context-free grammars. The parse tree is split into a series of lexicalized CFG rules, each of which is in turn split into a sequence of decisions that build up the rule as a combination of a pair of lexicalized non-terminals.

This is an extensible, parallel parsing engine that accommodates many different types of generative statistical parsing models (including an emulation of Mike Collins’ parsing model) and can easily be extended to new domains and new languages.

POS TAGGERS

Brill's part-of-speech tagger implements a simple rule-based tagger using transformation-based learning.
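As a rough illustration of how a rule-based tagger of this kind applies its rules once they have been learned, the sketch below first gives every word its most likely tag and then rewrites tags according to contextual rules. The lexicon, the single rule, and the names `Rule`, `initial_tags`, and `apply_rules` are made up for this example and are not taken from Brill's distribution. The learning step that discovers rules from a corpus is not shown.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule format: change from_tag to to_tag when the previous tag is prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(words, lexicon, default="NN"):
    """Assign each word its most likely tag; unknown words get the default tag."""
    return [lexicon.get(word.lower(), default) for word in words]


def apply_rules(tags, rules):
    """Apply each rule in order, scanning the sentence left to right."""
    tags = list(tags)  # work on a copy so the caller's list is untouched
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy lexicon and rule set, invented for illustration only.
LEXICON = {"he": "PRP", "is": "VBZ", "expected": "VBN",
           "to": "TO", "race": "NN", "the": "DT"}
RULES = [Rule(from_tag="NN", to_tag="VB", prev_tag="TO")]

words = "He is expected to race".split()
tags = apply_rules(initial_tags(words, LEXICON), RULES)
print(list(zip(words, tags)))
```

Here "race" starts out with its most frequent tag, NN, and the contextual rule retags it as VB because it follows TO. In "the race" the same word keeps NN, since the rule's context does not match.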

MXPOST is a Java (JDK 1.1) implementation of the part-of-speech tagger described in:

Adwait Ratnaparkhi. A Maximum Entropy Part-Of-Speech Tagger. In Proceedings of the Empirical Methods in Natural Language Processing Conference, May 17-18, 1996, University of Pennsylvania.

THESAURUS

This is an automatically constructed thesaurus by Dekang Lin. For each word, the thesaurus lists up to 200 of the most similar words together with their similarity scores. The similar words are also clustered automatically.

This thesaurus is similar to the one above, but similarity is computed from the linear proximity of words only, whereas the thesaurus above uses dependency relationships extracted from a parsed corpus.
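To make the idea of proximity-based similarity concrete, the sketch below counts which words occur within a fixed window of each word and compares the resulting count vectors. It is only an illustration: cosine similarity over raw counts is used here as a stand-in and is not the measure used to build the thesaurus, and the corpus, the window size, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=3):
    """Rank all other words by similarity to `word`, highest first."""
    scores = [(other, cosine(vectors[word], vectors[other]))
              for other in list(vectors) if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Tiny toy corpus, invented for illustration only.
CORPUS = ["drink hot tea", "drink hot coffee", "drive fast car"]
vectors = cooccurrence_vectors(CORPUS)
print(most_similar("tea", vectors))
```

In this toy corpus "tea" and "coffee" share the same neighbour ("hot") and so come out as highly similar, while "tea" and "car" share no neighbours at all. A dependency-based approach would replace the window contexts with (relation, word) pairs taken from a parsed corpus.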

COMPUTATIONAL LEXICON

WordNet® is an online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets.
 

SIMILARITY PACKAGE

This is a CPAN module that implements a variety of semantic similarity measures that can be used in conjunction with WordNet. In particular, it supports the measures of Resnik, Lin, Jiang-Conrath, Leacock-Chodorow, Hirst-St-Onge, Wu-Palmer, Banerjee-Pedersen, and Patwardhan-Pedersen.
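As an example of what one of these measures computes, the Wu-Palmer measure is commonly stated as 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the most specific concept that subsumes both a and b. The sketch below applies that formula to a tiny hand-made hierarchy. It does not use the CPAN module or WordNet itself: the taxonomy, the function names, and the choice of counting the root as depth 1 are all assumptions made for illustration.

```python
# Toy is-a hierarchy (child -> parent), invented for illustration only.
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumption of this sketch)."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Most specific node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is the deepest
        if node in ancestors_a:
            return node
    raise ValueError(f"{a!r} and {b!r} share no common ancestor")


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer("dog", "cat"))
print(wu_palmer("dog", "car"))
```

In this hierarchy "dog" and "cat" meet at "animal" and so score higher than "dog" and "car", which only meet at the root. On real WordNet data a word may have several senses and a concept several parents, which this single-parent sketch ignores.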