Ali Hürriyetoğlu,
Hristo Tanev,
Vanni Zavarella,
Jakub Piskorski,
Reyyan Yeniterzi,
Deniz Yuret and
Aline Villavicencio.
2021.
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021): Workshop and Shared Task Report. In
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021), pp
1--9,
Online,
Aug.
Association for Computational Linguistics. [
ai.ku]
url abstract google scholar
This workshop is the fourth issue of a series of workshops on automatic extraction of socio-political events from news, organized by the Emerging Market Welfare Project, with the support of the Joint Research Centre of the European Commission and with contributions from many other prominent scholars in this field. The purpose of this series of workshops is to foster research and development of reliable, valid, robust, and practical solutions for automatically detecting descriptions of socio-political events, such as protests, riots, wars and armed conflicts, in text streams. This year workshop contributors make use of the state-of-the-art NLP technologies, such as Deep Learning, Word Embeddings and Transformers and cover a wide range of topics from text classification to news bias detection. Around 40 teams have registered and 15 teams contributed to three tasks that are i) multilingual protest news detection detection, ii) fine-grained classification of socio-political events, and iii) discovering Black Lives Matter protest events. The workshop also highlights two keynote and four invited talks about various aspects of creating event data sets and multi- and cross-lingual machine learning in few- and zero-shot settings.
Cemil Cengiz,
Ulaş Sert and
Deniz Yuret.
2019.
KU_ai at MEDIQA 2019: Domain-specific Pre-training and Transfer Learning for Medical NLI. In
Proceedings of the 18th BioNLP Workshop and Shared Task, pp
427--436,
Florence, Italy,
Aug.
Association for Computational Linguistics. [
ai.ku]
url abstract google scholar
In this paper, we describe our system and results submitted for the Natural Language Inference (NLI) track of the MEDIQA 2019 Shared Task. As KU{\_}ai team, we used BERT as our baseline model and pre-processed the MedNLI dataset to mitigate the negative impact of de-identification artifacts. Moreover, we investigated different pre-training and transfer learning approaches to improve the performance. We show that pre-training the language model on rich biomedical corpora has a significant effect in teaching the model domain-specific language. In addition, training the model on large NLI datasets such as MultiNLI and SNLI helps in learning task-specific reasoning. Finally, we ensembled our highest-performing models, and achieved 84.7{\%} accuracy on the unseen test dataset and ranked 10th out of 17 teams in the official results.