February 11, 2022

Osman Mutlu, M.S. 2022

Current position: Project engineer, Koç University (LinkedIn, Email)
MS Thesis: Utilizing coarse-grained data in low-data settings for event extraction. February 2022. (PDF, Presentation, Publications, Code).

Thesis Abstract: Annotating text data for event information extraction systems is hard, expensive, and error-prone. We investigate whether coarse-grained data (document or sentence labels), which is far easier to obtain, can be integrated instead of annotating more documents. We use a multi-task model with two auxiliary tasks, binary document classification and binary sentence classification, in addition to the main task of token classification. We run a series of experiments under varying data regimes for this integration. The results show that while adding extra coarse-grained data yields greater improvement and robustness, a gain is still possible by adding only negative documents, which contain no information on any event.
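
The multi-task setup described in the abstract can be illustrated with a minimal sketch of how the three losses might be combined for one document. This is not the thesis implementation: the function names, the `aux_weight` parameter, and the simple weighted sum are all assumptions made for illustration. The sketch assumes that every document has a document label, while sentence and token labels may be missing, which is the case for coarse-grained data.

```python
import math
from typing import Optional, Sequence

EPS = 1e-12  # guards against log(0)


def binary_cross_entropy(prob: float, label: int) -> float:
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, EPS), 1.0 - EPS)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def token_cross_entropy(probs: Sequence[Sequence[float]],
                        labels: Sequence[int]) -> float:
    """Mean cross-entropy over tokens; probs[i] is a distribution over tags."""
    losses = [-math.log(max(p[y], EPS)) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def multitask_loss(doc_prob: float,
                   doc_label: int,
                   sent_probs: Optional[Sequence[float]] = None,
                   sent_labels: Optional[Sequence[int]] = None,
                   token_probs: Optional[Sequence[Sequence[float]]] = None,
                   token_labels: Optional[Sequence[int]] = None,
                   aux_weight: float = 0.5) -> float:
    """Hypothetical combined loss for a single document.

    The main token-classification loss is added only when token labels
    exist; the two auxiliary binary losses are scaled by aux_weight
    (an assumed hyperparameter, not taken from the thesis).
    """
    # Auxiliary task 1: is the document about an event at all?
    aux = binary_cross_entropy(doc_prob, doc_label)

    # Auxiliary task 2: per-sentence binary labels, when available.
    if sent_probs and sent_labels:
        sent_losses = [binary_cross_entropy(p, y)
                       for p, y in zip(sent_probs, sent_labels)]
        aux += sum(sent_losses) / len(sent_losses)

    total = aux_weight * aux

    # Main task: token classification, only for fully annotated documents.
    if token_probs and token_labels:
        total += token_cross_entropy(token_probs, token_labels)

    return total


# A document with only a document-level label contributes just the
# auxiliary document loss.
coarse_only = multitask_loss(doc_prob=0.1, doc_label=0)

# A fully annotated document contributes all three losses.
full = multitask_loss(
    doc_prob=0.9, doc_label=1,
    sent_probs=[0.8, 0.2], sent_labels=[1, 0],
    token_probs=[[0.7, 0.3], [0.1, 0.9]], token_labels=[0, 1],
)
```

Under these assumptions, documents with only coarse-grained labels still produce a training signal through the auxiliary tasks, even though they carry no token-level annotation.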
