August 14, 2009
M.S. Thesis: Parser Evaluation Using Textual Entailments. Boğaziçi Üniversitesi Department of Computer Engineering, August 2009. (PDF).
Syntactic parsing is a basic problem in natural language processing. It can be defined as assigning a structure to a sentence. Two prevalent approaches to parsing are phrase-structure parsing and dependency parsing. A related problem is parser evaluation. PETE is a dependency-based evaluation where the parse is represented as a list of simple sentences, similar to the Recognizing Textual Entailment (RTE) task. Each entailment focuses on one relation. A priori training of annotators is not required. A program generates entailments from a dependency parse. Phrase-structure parses are converted to dependency parses to generate entailments. Additional entailments are generated for phrase-structure coordinations. Experiments are carried out with a function-tagger. Parsers are evaluated on the set of entailments generated from the Penn Treebank WSJ and Brown test sections. A phrase-structure parser obtained the highest score.
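The entailment-generation step can be sketched as follows; the arc format, relation labels, and templates here are my own illustration, not the thesis's actual rules:

```python
# Minimal sketch of dependency-based entailment generation, in the spirit
# of PETE: each entailment tests exactly one head-dependent relation.
# The relation labels and sentence templates are illustrative only.

def generate_entailments(tokens, arcs):
    """tokens: list of words; arcs: list of (head_idx, dep_idx, label)."""
    entailments = []
    for head, dep, label in arcs:
        if label == "nsubj":                      # subject of a verb
            entailments.append(f"{tokens[dep]} {tokens[head]} something.")
        elif label == "dobj":                     # direct object of a verb
            entailments.append(f"Someone {tokens[head]} {tokens[dep]}.")
    return entailments

# "John saw Mary" with arcs saw->John (nsubj) and saw->Mary (dobj)
tokens = ["John", "saw", "Mary"]
arcs = [(1, 0, "nsubj"), (1, 2, "dobj")]
print(generate_entailments(tokens, arcs))
# -> ['John saw something.', 'Someone saw Mary.']
```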
August 07, 2009
Tutorial: Kevin Knight, Philipp Koehn.
Topics in Statistical Machine Translation
MT: Phrase-based, hierarchical, and syntax-based approaches. Hiero is equivalent? to a syntax-based approach with a single nonterminal. Minimum Bayes Risk (MBR) chooses not the single best option but the one with maximum expected BLEU. Approaches that work with lattices and forests. System combination provides significant gains. Integrating the LM into decoding improves results. Cube pruning makes hiero and syntax-based systems more efficient. Throwing out 99% of the phrase table gives no loss. Factored models help when factors are used as back-off. Reordering has been tried before. The source and target can be string, tree, or forest. Arabic and Chinese seem most popular. A good test: can you put a jumbled-up sentence in the right order? If we could output only grammatical sentences (simplified English?). Dependency LM for the output language. Lattice translation. Giza alignments are not very accurate, guess = 80%. Bleu between human translators is at the level of the best systems, i.e. it cannot be used as an upper bound.
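The MBR idea above can be sketched over an n-best list; a simple unigram-overlap score stands in for sentence-level BLEU, and the hypotheses and probabilities are invented:

```python
# Minimal sketch of Minimum Bayes Risk (MBR) decoding over an n-best list:
# instead of the single highest-probability hypothesis, pick the one with
# maximum expected similarity to all hypotheses under the model's
# distribution. A toy unigram-overlap score stands in for BLEU here.

def overlap(a, b):
    a, b = a.split(), b.split()
    common = sum(min(a.count(w), b.count(w)) for w in set(a))
    return common / max(len(a), len(b))

def mbr_decode(nbest):
    """nbest: list of (hypothesis, probability) pairs."""
    def expected_gain(h):
        return sum(p * overlap(h, h2) for h2, p in nbest)
    return max(nbest, key=lambda hp: expected_gain(hp[0]))[0]

nbest = [("the cat sat", 0.4), ("a cat sat", 0.35), ("cat sat down", 0.25)]
print(mbr_decode(nbest))   # "the cat sat" agrees most with the rest
```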
Tutorial: Simone Paolo Ponzetto and Massimo Poesio.
State-of-the-art NLP Approaches to Coreference Resolution: Theory and Practical Recipes
Coref: ACE is the current standard dataset. Also MUC and other new ones. Anaphora: approx 50% proper nouns, 40% noun phrases, 10% pronouns. NPs are most difficult. Tough to know when a mention is discourse-new. The field has evaluation problems like other fields. One would think deciding on anaphora would be easier for annotators, but there are issues like whether to consider "China" and "its population" anaphoric.
P09-1001 [bib]: Qiang Yang; Yuqiang Chen; Gui-Rong Xue; Wenyuan Dai; Yong Yu.
Heterogeneous Transfer Learning for Image Clustering via the SocialWeb
ML: Qiang Yang gave the first invited talk. Transfer learning: when the training and test sets have different distributions or different representations. He did not talk much about the case when train and test have different labels. Link to causality. Unsupervised pre-learning boosting the supervised learning curve.
P09-1002 [bib]: Katrin Erk; Diana McCarthy; Nicholas Gaylord.
Investigations on Word Senses and Word Usages
WSD: Annotators provide scores of 1-5 for two tasks: how good a fit there is between a usage and a sense, and how close two usages of the same word are. They claim that forcing annotators to a single decision is detrimental, and that coarse senses are insufficient to explain the results.
P09-1010 [bib]: S.R.K. Branavan; Harr Chen; Luke Zettlemoyer; Regina Barzilay.
Reinforcement Learning for Mapping Instructions to Actions
Situated language: Best paper award. Good work goes beyond studying language in isolation. Reinforcement results sound incredibly good, number of features pretty small, how much prior info did they exactly use?
P09-1011 [bib]: Percy Liang; Michael Jordan; Dan Klein
Learning Semantic Correspondences with Less Supervision
Semantic representations: Learn semantic mappings in the domains of weather, robocup sportscasting, and NFL recaps when it is not clear what record and what field the text is referring to.
P09-1009 [bib]: Benjamin Snyder; Tahira Naseem; Regina Barzilay
Unsupervised Multilingual Grammar Induction
Syntax: A candidate constituent in one language may be split in another, preventing wrong rules from being learnt.
P09-1024 [bib]: Christina Sauper; Regina Barzilay
Automatically Generating Wikipedia Articles: A Structure-Aware Approach
Summarization: I did not know summarization consists of cutting and pasting existing text.
P09-1025 [bib]: Neil McIntyre; Mirella Lapata
Learning to Tell Tales: A Data-driven Approach to Story Generation
Schemas: Learning a model of fairy tales to generate new ones. Nice idea but resulting stories not so good. Better models possible.
P09-1034 [bib]: Sebastian Pado; Michel Galley; Dan Jurafsky; Christopher D. Manning
Robust Machine Translation Evaluation with Entailment Features
MT: Compared to human judgements, Meteor does best (significantly better than Bleu) among shallow evaluation metrics. Using RTE to check whether the produced translation is an entailment or paraphrase of the reference does better still.
P09-1040 [bib]: Joakim Nivre
Non-Projective Dependency Parsing in Expected Linear Time
Syntax: By adding one more operation that swaps tokens to the shift-reduce parser, generation of non-projective parses becomes possible.
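The swap idea can be illustrated with a toy transition simulator; the transition names follow the arc-standard convention, and the example tree and transition sequence are my own:

```python
# Sketch of arc-standard shift-reduce parsing extended with a SWAP action
# (in the spirit of Nivre 2009). SWAP moves the second-topmost stack word
# back to the buffer, reordering words so that crossing arcs become
# derivable. Word indices are 1-based; arcs are (head, dependent) pairs.

def run(n, transitions):
    stack, buffer, arcs = [], list(range(1, n + 1)), set()
    for t in transitions:
        if t == "SHIFT":                       # move next word onto stack
            stack.append(buffer.pop(0))
        elif t == "SWAP":                      # move s1 back to the buffer
            buffer.insert(0, stack.pop(-2))
        elif t == "LEFT-ARC":                  # s0 is head of s1, pop s1
            arcs.add((stack[-1], stack.pop(-2)))
        elif t == "RIGHT-ARC":                 # s1 is head of s0, pop s0
            arcs.add((stack[-2], stack.pop()))
    return arcs

# Non-projective tree over words 1..4: arc 3->1 crosses arc 4->2.
seq = ["SHIFT", "SHIFT", "SHIFT", "SWAP", "LEFT-ARC",
       "SHIFT", "SHIFT", "LEFT-ARC", "RIGHT-ARC"]
print(sorted(run(4, seq)))   # [(3, 1), (3, 4), (4, 2)]
```

Without the SWAP step, no ordering of SHIFT and arc actions could produce both crossing arcs.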
P09-1041 [bib]: Gregory Druck; Gideon Mann; Andrew McCallum
Semi-supervised Learning of Dependency Parsers using Generalized Expectation Criteria
Syntax: Instead of labeled data, use expectation constraints in training parser.
P09-1068 [bib]: Nathanael Chambers; Dan Jurafsky
Unsupervised Learning of Narrative Schemas and their Participants
Schemas: very nice work modeling structure of NYT stories. Could be improved by focusing on a particular genre and introducing narrative ordering to model (apparently time ordering is really difficult).
P09-1072 [bib]: Kai-min K. Chang; Vladimir L. Cherkassky; Tom M. Mitchell; Marcel Adam Just
Quantitative modeling of the neural representation of adjective-noun phrases to account for fMRI activation
Brain: continuing the brain-imaging work. Some success in guessing which adjective-noun pair is being thought of. Better questions can be asked.
P09-2062 [bib]: Chris Biemann; Monojit Choudhury; Animesh Mukherjee
Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks
WSD: Qualitative differences between distributional similarity networks for semantics and syntax. Does it say anything about word meaning representation?
P09-2059 [bib]: Gumwon Hong; Seung-Wook Lee; Hae-Chang Rim
Bridging Morpho-Syntactic Gap between Source and Target Sentences for English-Korean Statistical Machine Translation
MT: Problems similar to Turkish. Collins '05 proposed reordering. Lee '06 removed useless function words. Hong inserts pseudo-words to translate to Korean morphemes.
P09-1087 [bib]: Michel Galley; Christopher D. Manning
Quadratic-Time Dependency Parsing for Machine Translation
Syntax: non-projective parsing by tying each word to its most likely head. Why did this not work when I tried it at CoNLL? Gives O(n²). Could you adapt Nivre for linear time? Unsupervised parsing? Using a dependency LM as a feature.
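A minimal sketch of the per-word argmax idea (the scores are invented, and this is the naive version, not the paper's actual algorithm); it also illustrates one way naive argmax can fail:

```python
# Naive sketch of attaching each word to its most likely head: an
# independent argmax per word, O(n^2) over a score matrix. The scores
# below are made up. Note the result need not be a tree: independent
# choices can form a cycle, one reason the naive version can fail
# without extra repair steps.

def greedy_heads(score):
    """score[d][h] = score of head h (0 = root) for dependent d."""
    return {d: max(row, key=row.get) for d, row in score.items()}

score = {1: {0: 0.1, 2: 0.8},   # word 1 prefers word 2 as head
         2: {0: 0.3, 1: 0.6}}   # word 2 prefers word 1: a cycle
print(greedy_heads(score))      # {1: 2, 2: 1} -- not a valid tree
```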
P09-1088 [bib]: Phil Blunsom; Trevor Cohn; Chris Dyer; Miles Osborne
A Gibbs Sampler for Phrasal Synchronous Grammar Induction
MT: Bayesian magic. Look into SCFGs. Generates its own word alignment. Works better on non-monotonic language pairs, monotonic ones difficult to improve on.
P09-1089 [bib]: Shachar Mirkin; Lucia Specia; Nicola Cancedda; Ido Dagan; Marc Dymetman; Idan Szpektor
Source-Language Entailment Modeling for Translating Unknown Terms
MT: Generate paraphrases or entailments for unknown words using RTE.
P09-1090 [bib]: Ananthakrishnan Ramanathan; Hansraj Choudhary; Avishek Ghosh; Pushpak Bhattacharyya
Case markers and Morphology: Addressing the crux of the fluency problem in English-Hindi SMT
MT: Reordering and factored model. Fluency and adequacy manually evaluated in addition to BLEU.
P09-1116 [bib]: Dekang Lin; Xiaoyun Wu
Phrase Clustering for Discriminative Learning
WSD: cluster phrases instead of words. Phrases are much less ambiguous, so the clusters reflect pure context. Use different-size clusters together and let the learning algorithm pick, similar to hierarchical clustering. Improves NER and query classification. Useful for any application where clustering words helps with sparsity. Clusters derived from 700B web data. Are the clusters available?
P09-1117 [bib]: Katrin Tomanek; Udo Hahn
Semi-Supervised Active Learning for Sequence Labeling
ML: Self-training does not work because the instances with most confidence are not the useful ones. Active learning asks for labels of the instances with least confidence. Boosting effect?
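The selection rule described above, uncertainty sampling, can be sketched; the instances and probabilities are invented:

```python
# Sketch of uncertainty sampling for active learning: query labels for
# the instances the current model is least confident about (lowest
# maximum class probability) -- the opposite of self-training, which
# keeps the most confident, least informative instances.

def least_confident(probs, k):
    """probs: {instance_id: [class probabilities]}; return k ids to query."""
    confidence = {i: max(p) for i, p in probs.items()}
    return sorted(confidence, key=confidence.get)[:k]

probs = {"a": [0.95, 0.05], "b": [0.55, 0.45], "c": [0.70, 0.30]}
print(least_confident(probs, 2))   # ['b', 'c'] -- the least confident two
```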
D09-1030 [bib]: Chris Callison-Burch
Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk
MT: This article has one answer to the BLEU upper-bound question, among other things. A graph in the post shows that professional humans still get higher Bleu than SMT systems (although this uses 10 reference translations). They mention that Google MT got a higher Bleu, but the test set was probably used in training. Still gives relative performances. Also, amazing things can apparently be done with Amazon Mechanical Turk. Should use them to judge Turkish alignment quality.
W09-2504 [bib]: Idan Szpektor; Ido Dagan
Augmenting WordNet-based Inference with Argument Mapping
RTE: Some lexical substitutions require other words to be shuffled. Automatic learning of shuffling rules using DIRT.
W09-2506 [bib]: Stefan Thater; Georgiana Dinu; Manfred Pinkal
Ranking Paraphrases in Context
WSD: Using lexsub dataset. No dictionary (I think). VSM semantic representation. Check Mitchell&Lapata, Erk&Pado for prior work.
W09-2510 [bib]: David Clausen; Christopher D. Manning
Presupposed Content and Entailments in Natural Language Inference
RTE: Example: "Mary lied about buying a car" -> Mary did not buy a car. "Mary regretted buying a car" -> Mary bought a car. "Mary thought about buying a car" -> Uncertain. Karttunen 1975: presupposition projection. Check out the NatLog system (natural logic).
D09-1058 [bib]: Jun Suzuki; Hideki Isozaki; Xavier Carreras; Michael Collins
An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing
Syntax: Take a look at earlier model in Suzuki, ACL'08. What is with the q function? Other work building on McDonald: Carreras '07, Koo '08. MIRA training.
D09-1060 [bib]: Wenliang Chen; Jun’ichi Kazama; Kiyotaka Uchimoto; Kentaro Torisawa
Improving Dependency Parsing with Subtrees from Auto-Parsed Data
Syntax: Self-training, SSL for a parser. Improvement, even though confidence in the unlabeled text is not well represented. The best system gets 46% of the sentences completely correct (unlabeled).
D09-1065 [bib]: Brian Murphy; Marco Baroni; Massimo Poesio
EEG responds to conceptual stimuli and corpus semantics
Brain: Using EEG instead of fMRI in Mitchell style work. Why doesn't anybody try: (1) verbs, (2) grammaticality, (3) lie/truth, (4) agree/disagree, (5) complex grammatical constructs.
D09-1070 [bib]: Taesun Moon; Katrin Erk; Jason Baldridge
Unsupervised morphological segmentation and clustering with document boundaries
Mor: help unsupervised morphology by assuming the same stem is more likely to appear in the same document.
D09-1071 [bib]: Jurgen Van Gael; Andreas Vlachos; Zoubin Ghahramani
The infinite HMM for unsupervised PoS tagging
Syntax: Use npbayes to pick the number of HMM states. Directly use learnt HMM states rather than trying to map them to existing tagset.
D09-1085 [bib]: Laura Rimell; Stephen Clark; Mark Steedman
Unbounded Dependency Recovery for Parser Evaluation
Syntax: same motivation as Onder's work. Focuses on a particular construct difficult for parsers (accuracy < 50%) and builds a test set. Same problem in many fields (infrequent senses ignored in WSD, rare issues ignored in RTE/Semantics, rare constructs ignored in syntax, etc. etc.)
D09-1086 [bib]: David A. Smith; Jason Eisner
Parser Adaptation and Projection with Quasi-Synchronous Grammar Features
Syntax: learn mapping between parsers with different output styles (e.g. how they connect auxiliary verbs).
D09-1088 [bib]: Reut Tsarfaty; Khalil Sima’an; Remko Scha
An Alternative to Head-Driven Approaches for Parsing a (Relatively) Free Word-Order Language
Syntax: Separate ordering information to get better coefficient stats in parser learning. Many issues same as Turkish.
D09-1105 [bib]: Roy Tromble; Jason Eisner
Learning Linear Ordering Problems for Better Translation
MT: Approximate solution to reordering problem for MT. Shows improvement. Does not make use of parse tree.
D09-1106 [bib]: Yang Liu; Tian Xia; Xinyan Xiao; Qun Liu
Weighted Alignment Matrices for Statistical Machine Translation
MT: Compact representation for an alignment distribution. Similar to forest for trees or lattice for segmentations.
D09-1107 [bib]: Matti Kääriäinen
Sinuhe – Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model
MT: New MT engine based on structured learning. Faster than Moses with better TM scores, but overall lower BLEU.
D09-1108 [bib]: Hui Zhang; Min Zhang; Haizhou Li; Chew Lim Tan
Fast Translation Rule Matching for Syntax-based Statistical Machine Translation
MT: Compact representation with fast search for packed forests.
August 04, 2009
Abstract: We experiment with splitting words into their stem and suffix components for modeling morphologically rich languages. We show that using a morphological analyzer and disambiguator results in a significant perplexity reduction in Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens that determine the probability of a given token can be chosen anywhere in the sentence rather than the preceding n-1 positions. Our final model achieves a 27% perplexity reduction compared to the standard n-gram model.
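The perplexity comparison can be made concrete with the standard computation; the per-token probabilities below are invented, and only the perplexity formula itself is real, not the Flex-Gram model:

```python
# Perplexity is the exponential of the average negative log probability
# per token: a lower value means the model finds the text less surprising.
# The toy probabilities below are made up for illustration.
import math

def perplexity(token_probs):
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

standard = [0.1, 0.2, 0.05, 0.1]     # per-token probs under a baseline n-gram
flexgram = [0.15, 0.25, 0.08, 0.12]  # same tokens under a better model
print(perplexity(standard) > perplexity(flexgram))   # True: baseline is worse
```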