Parser Evaluation using Textual Entailments (PETE)
The purpose of this post is to encourage participation in the task "Parser Evaluation using Textual Entailments" at the 5th International Workshop on Semantic Evaluation, SemEval-2010 (http://semeval2.fbk.eu/semeval2.php), co-located with ACL-2010 in July 2010.
This shared task should be of interest to researchers working on
* semantic role labeling
* recognizing textual entailments
Parser Evaluation using Textual Entailments (PETE) is a shared task in the SemEval-2010 Evaluation Exercises on Semantic Evaluation. The task involves recognizing textual entailment (RTE) based on syntactic information. Given two text fragments called 'Text' and 'Hypothesis', Textual Entailment Recognition is the task of determining whether the meaning of the Hypothesis can be inferred from the Text. The PETE task focuses on entailments that can be inferred using syntactic information alone.
- Text: The man with the hat was tired.
- Hypothesis-1: The man was tired. (YES)
- Hypothesis-2: The hat was tired. (NO)
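The example above can be sketched in code. The following toy Python fragment (not part of the task, and simpler than any real system) assumes dependency parses are represented as sets of hypothetical (head, relation, dependent) triples, and accepts a hypothesis only when all of its dependencies also appear in the text's parse:

```python
# Toy illustration of deciding a PETE-style entailment from dependency
# parses. The triple representation and relation labels are assumptions
# for this sketch, not the official task format.

def entails(text_deps, hyp_deps):
    """Accept the hypothesis if every one of its dependency
    triples also appears in the text's parse."""
    return hyp_deps <= text_deps

# "The man with the hat was tired."
text = {
    ("tired", "nsubj", "man"),
    ("man", "prep_with", "hat"),
}

# Hypothesis-1: "The man was tired."
h1 = {("tired", "nsubj", "man")}
# Hypothesis-2: "The hat was tired."
h2 = {("tired", "nsubj", "hat")}

print(entails(text, h1))  # True  -> YES
print(entails(text, h2))  # False -> NO
```

The key point is that the decision hinges on a single syntactic attachment (whether "man" or "hat" is the subject of "tired"), which is exactly the kind of distinction PETE uses to evaluate parsers.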
Our goals in introducing this task are:
- To focus parser evaluation on semantically relevant phenomena.
- To introduce a parser evaluation scheme that is formalism independent.
- To introduce a targeted textual entailment task focused on a single linguistic competence.
- To be able to collect high quality evaluation data from untrained annotators.
The following criteria were used when constructing the entailments:
- They should be decidable using only syntactic inference.
- They should be easy to decide by untrained annotators.
- They should be challenging for state-of-the-art parsers.
You can find more details about our entailment generation process in the PETE Guide. You can download the development and test datasets including gold answers and system scores here: PETE_gold.zip. There is no training data. The evaluation is similar to other RTE tasks. There is a Google group semeval-pete for task related messages.
Here are some links to publicly available parsers that can be used in this task. You do not have to use any of these parsers; in fact, you do not have to use a conventional parsing algorithm at all -- outside-the-box approaches are highly encouraged. However, building a quick baseline system with an existing parser may be a good way to start.
- Berkeley Parser
- Bikel Parser
- C&C CCG Parser
- Collins Parser
- Charniak Parser
- CMU Link Parser
- DeSR Parser
- Enju Parser
- RASP Parser
- Stanford Parser
- conll-entailments.pl: This is not a parser but a simple script to illustrate how short entailments may be generated from a dependency parse. Incomplete and buggy; use at your own risk.
- PETE Guide: A description of the entailment generation process (February, 2010).
- D09-1085.pdf: Rimell, L., S. Clark, and M. Steedman. Unbounded Dependency Recovery for Parser Evaluation. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (August, 2009).
- thesis.pdf: Onder Eker's MS thesis (August, 2009).
- semeval-abstract.pdf: The PETE task abstract (December, 2008).
- pete.pdf: The initial PETE task proposal (September, 2008).
- Workshop on Cross-Framework and Cross-Domain Parser Evaluation (August, 2008)
- natlog-wtep07-final.pdf: Bill MacCartney and Christopher D. Manning. 2007. Natural logic for textual inference. ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp. 193-200. (June, 2007).
- targeted textual entailments: On targeted textual entailments in general (June 2007).
- a blog post: On the consistency of Penn Treebank annotation (October, 2006).
- lre98.pdf: Carroll, J., E. Briscoe and A. Sanfilippo (1998). Parser evaluation: a survey and a new proposal. In Proceedings of the 1st International Conference on Language Resources and Evaluation, Granada, Spain, pp. 447-454.
- Deniz Yuret firstname.lastname@example.org