November 27, 2018

Lost in Math by Sabine Hossenfelder

The book makes great points, especially about the faulty incentive structure scientists face and the charade of “research proposals” for funding. My only qualm is that the reader may come away thinking there are no objective measures of goodness for models, which is not true. Statistical learning theory, Bayesian Occam factors, algorithmic complexity, and related ideas each grapple with this problem, yet the book mentions little of this.

November 06, 2018

Towards Generalizable Place Name Recognition Systems: Analysis and Enhancement of NER Systems on English News from India

Arda Akdemir, Ali Hürriyetoglu, Erdem Yörük, Burak Gürel, Çagri Yoltar and Deniz Yuret. 2018. In Proceedings of the 12th Workshop on Geographic Information Retrieval (GIR'18). ACM. (pdf, url, proceedings)

Abstract: Place name recognition is one of the key tasks in Information Extraction. In this paper, we tackle this task in English News from India. We first analyze the results obtained by using available tools and corpora and then train our own models to obtain better results. Most of the previous work done on entity recognition for English makes use of similar corpora for both training and testing. Yet we observe that the performance drops significantly when we test the models on different datasets. For this reason, we have trained various models using combinations of several corpora. Our results show that training models using combinations of several corpora improves the relative performance of these models, but more research in this area is still necessary to obtain place name recognizers that generalize to any given dataset.
