M.S. Thesis: Improving Generalization in Natural Language Inference by Joint Training with Semantic Role Labeling, Koç University, Department of Computer Engineering. June 2020. (PDF, Presentation).
Publications: BibTeX
Thesis Abstract:
Recently, end-to-end models have achieved near-human performance on natural language inference (NLI) datasets. However, they generalize poorly to out-of-distribution evaluation sets, since biases in the training data lead them to learn shallow heuristics. Their performance drops dramatically on diagnostic sets that measure compositionality or robustness against such heuristics. Existing solutions to this problem rely on dataset augmentation, extending the training data with examples from the evaluated adversarial categories. However, that approach has the drawbacks of being applicable only to a limited set of adversaries and, at worst, hurting model performance on adversaries not included in the augmentation set. Instead, our proposed solution is to improve sentence understanding, and hence out-of-distribution generalization, through joint learning of explicit semantics. In this thesis, we show that a BERT-based model trained jointly on English semantic role labeling (SRL) and NLI achieves significantly higher performance on external evaluation sets measuring generalization.
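The core idea, training one shared encoder on both tasks, can be sketched as a weighted multi-task objective. This is only an illustrative outline of such joint training in general; the loss-mixing scheme, the `srl_weight` value, and the function names here are assumptions for exposition, not the exact setup used in the thesis.

```python
def joint_loss(nli_loss: float, srl_loss: float, srl_weight: float = 0.5) -> float:
    """Combine the main NLI loss with the auxiliary SRL loss.

    The 0.5 default weight is a hypothetical choice for illustration;
    in practice it would be tuned on a development set.
    """
    return nli_loss + srl_weight * srl_loss


def joint_training_pass(nli_losses, srl_losses, srl_weight=0.5):
    """Accumulate the joint objective over paired NLI/SRL batches.

    In a real system each per-batch loss would come from task-specific
    heads on top of a shared BERT encoder; here the losses are given
    directly so the sketch stays self-contained.
    """
    return sum(
        joint_loss(n, s, srl_weight)
        for n, s in zip(nli_losses, srl_losses)
    )
```

For example, with per-batch NLI losses `[1.0, 0.8]` and SRL losses `[2.0, 1.2]`, the pass returns the NLI total plus half the SRL total.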
Full post...