December 03, 2018

Building a Language and Compiler for Machine Learning

Post at Julia Blog by Mike Innes et al.
Mention at TechRepublic.

Since we originally proposed the need for a first-class language, compiler and ecosystem for machine learning (ML), there have been plenty of interesting developments in the field. Not only have the tradeoffs in existing systems, such as TensorFlow and PyTorch, not been resolved, but they are clearer than ever now that both frameworks contain distinct “static graph” and “eager execution” interfaces. Meanwhile, the idea of ML models fundamentally being differentiable algorithms – often called differentiable programming – has caught on...



November 27, 2018

Lost in Math by Sabine Hossenfelder

The book makes great points, especially regarding the faulty incentive structure of scientists and the charade of “research proposals” for funding. My only qualm is that the reader may come away thinking there are no objective measures of goodness for models, which is not true. Statistical learning theory, Bayesian Occam factors, algorithmic complexity, etc. each grapple with this problem, yet little of this is mentioned.

November 06, 2018

Towards Generalizable Place Name Recognition Systems: Analysis and Enhancement of NER Systems on English News from India

Arda Akdemir, Ali Hürriyetoglu, Erdem Yörük, Burak Gürel, Çagri Yoltar and Deniz Yuret. 2018. In Proceedings of the 12th Workshop on Geographic Information Retrieval (GIR'18). ACM. (pdf, url, proceedings)

Abstract: Place name recognition is one of the key tasks in Information Extraction. In this paper, we tackle this task in English News from India. We first analyze the results obtained by using available tools and corpora and then train our own models to obtain better results. Most of the previous work done on entity recognition for English makes use of similar corpora for both training and testing. Yet we observe that the performance drops significantly when we test the models on different datasets. For this reason, we have trained various models using combinations of several corpora. Our results show that training models using combinations of several corpora improves the relative performance of these models, but more research in this area is still necessary to obtain place name recognizers that generalize to any given dataset.



October 31, 2018

Tree-stack LSTM in Transition Based Dependency Parsing

Ömer Kırnap, Erenay Dayanık and Deniz Yuret. 2018. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. (paper, code, proceedings, 2017 version, Ömer's MS thesis)

Abstract: We introduce tree-stack LSTM to model the state of a transition-based parser with recurrent neural networks. Tree-stack LSTM does not use any parse-tree-based or hand-crafted features, yet performs better than models with these features. We also develop a new set of embeddings from raw features to enhance the performance. There are 4 main components of this model: stack’s σ-LSTM, buffer’s β-LSTM, actions’ LSTM and tree-RNN. All LSTMs use continuous dense feature vectors (embeddings) as input. Tree-RNN updates these embeddings based on transitions. We show that our model improves performance on low-resource languages compared with its predecessors. We participated in the CoNLL 2018 UD Shared Task as the “KParse” team and ranked 16th in LAS, 15th in BLAS and BLEX metrics, out of 27 participants parsing 82 test sets from 57 languages.



SParse: Koç University Graph-Based Parsing System for the CoNLL 2018 Shared Task

Berkay Furkan Önder, Can Gümeli and Deniz Yuret. 2018. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. (paper, code, proceedings)

Abstract: We present SParse, our Graph-Based Parsing model submitted for the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (Zeman et al., 2018). Our model extends the state-of-the-art biaffine parser (Dozat and Manning, 2016) with a structural meta-learning module, SMeta, that combines local and global label predictions. Our parser has been trained and run on Universal Dependencies datasets (Nivre et al., 2016, 2018) and has 87.48% LAS, 78.63% MLAS, 78.69% BLEX and 81.76% CLAS (Nivre and Fang, 2017) scores on the Italian-ISDT dataset and 72.78% LAS, 59.10% MLAS, 61.38% BLEX and 61.72% CLAS scores on the Japanese-GSD dataset in our official submission. All other corpora were evaluated after the submission deadline; for these we present our unofficial test results.

