November 05, 2012

Parser Evaluation Using Textual Entailments

Deniz Yuret, Laura Rimell, Aydin Han. Language Resources and Evaluation. 2012. (PDF, URL, Task website, Related posts)

Abstract
Parser Evaluation using Textual Entailments (PETE) is a shared task in
the SemEval-2010 Evaluation Exercises on Semantic Evaluation. The
task involves recognizing textual entailments based on syntactic
information alone. PETE introduces a new parser evaluation scheme
that is formalism independent, less prone to annotation error, and
focused on semantically relevant distinctions. This paper describes
the PETE task, gives an error analysis of the top-performing Cambridge
system, and introduces a standard entailment module that can be used
with any parser that outputs Stanford typed dependencies.
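To make the flavor of the task concrete, here is a toy sketch in Python of a dependency-based entailment decision. This is not the paper's entailment module; the hand-written Stanford-style dependency triples and the simple subset rule are illustrative assumptions only.

# Toy sketch: "John, who loved Mary, slept." should entail "John slept."
# The (relation, head, dependent) triples are hand-written here; a real
# system would obtain them from a parser that outputs Stanford typed
# dependencies.
text_deps = {("nsubj", "slept", "John"),
             ("rcmod", "John", "loved"),
             ("nsubj", "loved", "who"),
             ("dobj", "loved", "Mary")}
hypothesis_deps = {("nsubj", "slept", "John")}

print("YES" if hypothesis_deps <= text_deps else "NO")   # prints YES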

Full post...

October 18, 2012

My comments on Norvig's comments on Chomsky's comments on Statistical Learning

Noam Chomsky made a few negative comments on statistical learning last year during the MIT150 Symposium on Brains, Minds, and Machines. Peter Norvig, who was also at the symposium, later published a provocative essay on Chomsky's comments.

I particularly enjoyed the statistical analysis of Chomsky's famous example: "Pereira (2001) showed that such a (statistical, finite-state) model, augmented with word categories and trained by expectation maximization on newspaper text, computes that (a) 'colorless green ideas sleep furiously' is 200,000 times more probable than (b) 'furiously sleep ideas green colorless'."

If you are interested in a colorful follow-up to this discussion, see Straw men and Bee Science by Mark Liberman on the Language Log and the related comments.

Full post...

Writing a paper with a 1st year PhD student



For more see http://researchinprogress.tumblr.com.
Full post...

August 14, 2012

Onur Varol, M.S. 2012

Current position: Assistant Professor, Sabancı University, Istanbul. (website, email).
M.S. Thesis: Modal Analysis of Myosin II and Identification of Functionally Important Sites. Koç University, Department of Computer Engineering. June, 2012. (PDF, Presentation).

Abstract:
The analysis of protein dynamics uses structural and fluctuation-based methods, and fluctuation analysis in particular has proven to be a rewarding avenue of research. Previous work commonly relies on mass-and-spring models; however, the fluctuations of these models are purely harmonic, which leaves a significant gap with experimental results. Deviations from harmonicity are observed mostly in the slow, collective modes. Corrections such as anharmonic modal decomposition are a first step toward closing this gap, but the contribution of these higher-order corrections is limited because the modes interact. Mode-coupling corrections capture this interaction and yield valuable information on the means of energy transfer and allostery.

In this work, molecular dynamics (MD) results for the Dictyostelium discoideum myosin II motor domain are used as a test ground. Mode fluctuation distributions are produced using the MD results, fully harmonic models, and a model with anharmonic corrections. Tensorial Hermite polynomials are used to obtain the distributions of modal fluctuations. Fluctuations in modal space are transformed back into real space, and the distributions of residual fluctuations are compared using KL divergence. Analysis results for ligand-bound and free myosin dynamics demonstrate that the mode-coupling contributions alone highlight functionally important sites.
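As a rough illustration of the final comparison step (this is not the thesis code; the heavy-tailed stand-in for MD fluctuations and the histogram-based KL estimate are assumptions made for the example), a Python sketch:

import numpy as np

def kl_divergence(p_samples, q_samples, bins=50, eps=1e-10):
    # Estimate KL(P || Q) from two sets of scalar samples, e.g. the residual
    # fluctuations of one residue under two different models.
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    p_hist, edges = np.histogram(p_samples, bins=bins, range=(lo, hi), density=True)
    q_hist, _ = np.histogram(q_samples, bins=bins, range=(lo, hi), density=True)
    width = edges[1] - edges[0]
    p = p_hist * width + eps
    q = q_hist * width + eps
    return float(np.sum(p * np.log(p / q)))

# Toy comparison: a heavy-tailed "anharmonic" sample against a Gaussian,
# since a purely harmonic model predicts Gaussian fluctuations.
rng = np.random.default_rng(0)
print(kl_divergence(rng.standard_t(df=5, size=100000), rng.standard_normal(100000)))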


Full post...

July 15, 2012

ACL-EMNLP 2012 Highlights

We demonstrated 80% unsupervised part-of-speech induction accuracy in our EMNLP-2012 paper using paradigmatic representations of word context and co-occurrence modeling. Here are some interesting talks I attended at ACL-EMNLP this year.

Check out this tutorial:

Inderjeet Mani; James Pustejovsky
Qualitative Modeling of Spatial Prepositions and Motion Expressions
http://aclweb.org/supplementals/P/P12/P12-4001.Presentation.pdf

And this paper from today on learning language from navigation instructions and user behavior was interesting:

David Chen
Fast Online Lexicon Learning for Grounded Language Acquisition
http://aclweb.org/anthology-new/P/P12/P12-1045.pdf

All papers can be found on the ACL Anthology page:
http://aclweb.org/anthology-new/P/P12/

Here is a work that builds on CCM for unsupervised parsing and works better on longer sentences:

P12-2004 [bib]: Dave Golland; John DeNero; Jakob Uszkoreit
A Feature-Rich Constituent Context Model for Grammar Induction
http://aclweb.org/anthology-new/P/P12/P12-2004.pdf

Here is another interesting talk about grounded language acquisition, where a computer learns to follow instructions in a virtual world. They have collected nice corpora of instructions and of the behaviors of people following those instructions.

P12-2036 [bib]: Luciana Benotti; Martin Villalba; Tessa Lau; Julian Cerruti
Corpus-based Interpretation of Instructions in Virtual Environments
http://aclweb.org/anthology-new/P/P12/P12-2036.pdf

I nominate this as the best paper on computer humor generation at ACL 2012. Note the resources ConceptNet and SentiNet mentioned in this work, which may be independently useful.

P12-2030 [bib]: Igor Labutov; Hod Lipson
Humor as Circuits in Semantic Networks
http://aclweb.org/anthology-new/P/P12/P12-2030.pdf

We have been working on reordering for SMT. Here is a paper that instead modifies distortion matrices to allow more flexible reorderings.

P12-1050 [bib]: Arianna Bisazza; Marcello Federico
Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
http://aclweb.org/anthology-new/P12-1050.pdf

Interesting tree transformation operations.

D12-1079 [bib]: David Burkett; Dan Klein
Transforming Trees to Improve Syntactic Convergence
http://aclweb.org/anthology-new/D/D12/D12-1079.pdf

The following paper utilizes n-gram language models in unsupervised dependency parsing:

D12-1028 [bib]: David Mareček; Zdeněk Žabokrtský
Exploiting Reducibility in Unsupervised Dependency Parsing
http://aclweb.org/anthology-new/D/D12/D12-1028.pdf

Another reordering paper from EMNLP. It mentions a string-to-tree version of Moses that is publicly available.

D12-1077 [bib]: Graham Neubig; Taro Watanabe; Shinsuke Mori
Inducing a Discriminative Parser to Optimize Machine Translation Reordering
http://aclweb.org/anthology-new/D/D12/D12-1077.pdf

A must-read paper from EMNLP. Very likely our paradigmatic representation would do better here.

D12-1130 [bib]: Carina Silberer; Mirella Lapata
Grounded Models of Semantic Representation
http://aclweb.org/anthology-new/D/D12/D12-1130.pdf

Full post...

July 12, 2012

Learning Syntactic Categories Using Paradigmatic Representations of Word Context

Mehmet Ali Yatbaz, Enis Sert, Deniz Yuret. EMNLP 2012. (Download the paper, presentation, code, fastsubs paper, lm training data (250MB), wsj substitute data (1GB), scode output word vectors (5MB), scode visualization demo (may take a few minutes to load). More up to date versions of the code can be found at github.)



Abstract: We investigate paradigmatic representations of word context in the domain of unsupervised syntactic category acquisition. Paradigmatic representations of word context are based on potential substitutes of a word in contrast to syntagmatic representations based on properties of neighboring words. We compare a bigram based baseline model with several paradigmatic models and demonstrate significant gains in accuracy. Our best model based on Euclidean co-occurrence embedding combines the paradigmatic context representation with morphological and orthographic features and achieves 80% many-to-one accuracy on a 45-tag 1M word corpus.
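As a toy illustration of the syntagmatic/paradigmatic distinction (this is not the paper's pipeline, which samples likely substitutes from an n-gram language model with fastsubs and embeds their co-occurrences with scode; the three-sentence corpus and the crude bigram model below are made up for the example):

from collections import Counter, defaultdict

# Made-up corpus; the paper uses a 45-tag, 1M-word WSJ corpus.
sentences = [
    "the dog barked loudly".split(),
    "the cat slept quietly".split(),
    "a dog slept quietly".split(),
]
vocab = sorted({w for s in sentences for w in s})

def syntagmatic(word):
    # Syntagmatic view: represent a word by counts of its left/right neighbors.
    feats = Counter()
    for sent in sentences:
        for i, w in enumerate(sent):
            if w == word:
                if i > 0:
                    feats["L=" + sent[i - 1]] += 1
                if i + 1 < len(sent):
                    feats["R=" + sent[i + 1]] += 1
    return feats

# A crude bigram model stands in for a proper smoothed n-gram language model.
bigram = defaultdict(Counter)
for sent in sentences:
    padded = ["<s>"] + sent + ["</s>"]
    for a, b in zip(padded, padded[1:]):
        bigram[a][b] += 1

def prob(a, b, alpha=0.1):
    return (bigram[a][b] + alpha) / (sum(bigram[a].values()) + alpha * (len(vocab) + 1))

def paradigmatic(sent, i):
    # Paradigmatic view: represent a position by the distribution of its
    # potential substitutes, P(w | left neighbor, right neighbor), up to
    # normalization.
    padded = ["<s>"] + sent + ["</s>"]
    left, right = padded[i], padded[i + 2]
    scores = {w: prob(left, w) * prob(w, right) for w in vocab}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}

print(syntagmatic("dog"))
print(paradigmatic("the cat slept quietly".split(), 1))   # substitutes for "cat"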

Full post...

June 04, 2012

Language visualization

This is a presentation on our ongoing language visualization project by Emre Unal. We thank the Alice project at CMU for giving us the platform for 3D visualization. This work is inspired by the work of Patrick Winston's Genesis Group at MIT and Bob Coyne's WordsEye project. We are also working on going from vision to language as demonstrated in this video.
Full post...

April 09, 2012

Probabilistic Programming

The probabilistic programming language Church brings together two of my favorite subjects: Scheme and Probability. I highly recommend this tutorial to graduate students interested in machine learning and statistical inference. The tutorial explains probabilistic inference through programming, starting from simple generative models with biased coins and dice and leading up to hierarchical, non-parametric, recursive, and nested models. Even at the undergraduate level, I have long thought probability and statistics should be taught in an integrated manner instead of their current almost independent treatment. One roadblock is that even the simplest statistical inference (e.g. three tosses of a coin with an unknown, uniformly distributed weight come up H, H, T; what is the probability that the fourth toss is heads?) requires some calculus at the undergraduate level. Using a programming language like Church may allow an instructor to introduce basic concepts without students getting confused about the details of integration.
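Here is a minimal sketch of that coin example in Python, using rejection sampling in the spirit of Church's query rather than Church itself:

import random

def posterior_predictive(n_accepted=20000):
    # Rejection sampling: draw a weight from the uniform prior, keep it only
    # if it reproduces the observed H, H, T, then toss the coin a fourth time.
    fourth = []
    while len(fourth) < n_accepted:
        w = random.random()
        if [random.random() < w for _ in range(3)] == [True, True, False]:
            fourth.append(random.random() < w)
    return sum(fourth) / len(fourth)

print(posterior_predictive())   # about 0.6

The sampler converges to 3/5: the uniform prior is Beta(1,1), observing H, H, T gives a Beta(3,2) posterior, and the predictive probability of heads is its mean, 3/5. That integral is exactly the kind of calculus the sampling view lets students postpone.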
Full post...

April 01, 2012

The wonderful xargs command

I finally found a way I like to run a whole bunch of commands N at a time on an N core machine (well maybe use N-1 to be polite):

1. Say you have a command rprun.pl that takes 4 arguments that you want to run with 1000 different argument combinations.

2. You write a script rprun-args.pl that generates all combinations you need.  Say its output looks like:

10      185364  25      0.166
12      92682   25      0.166
18      65536   32      0.166
12      65536   25      0.7071
14      16384   25      0.166
...

3. Now you can use xargs to run these 24 at a time as follows:

rprun-args.pl | xargs -n4 -P24 rprun.pl > rprun.out

-n4 is to feed the arguments 4 at a time.  So a typical command line will look like:

rprun.pl 14 16384 25 0.166

-P24 tells xargs to run through the list 24 at a time.  If you run ps you will see 24 copies of rprun running together.  As soon as the number drops to 23 another child is spawned.

Note that the command above combines the outputs of all runs (in the order they finish) in the same file, so make sure rprun.pl prints out its arguments as well as its result on its output.
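For reference, here is a minimal sketch of what a generator like rprun-args.pl might look like, written in Python with made-up parameter names and values (your rprun.pl will expect whatever arguments your experiment actually needs):

#!/usr/bin/env python
# Hypothetical stand-in for rprun-args.pl: print every combination of four
# parameters, one combination per line, so that xargs can hand them to
# rprun.pl four arguments at a time.
from itertools import product

dims    = [10, 12, 14, 18]
sizes   = [16384, 65536, 92682, 185364]
counts  = [25, 32]
weights = [0.166, 0.7071]

for d, s, c, w in product(dims, sizes, counts, weights):
    print(d, s, c, w)

Saved as, say, rprun-args.py, it is used the same way: python rprun-args.py | xargs -n4 -P24 rprun.pl > rprun.out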

Full post...

March 10, 2012

On skill acquisition

A couple of months ago, I ran into this video by the Japanese coin magician Ponta the Smith.  Its elegance awoke my long dormant interest in close-up sleight-of-hand magic which had started when I was a kid and had peaked in LA taking classes at the Magic Castle.  I am especially fond of coin magic because its effects are so simple and direct.  I started watching the masters and practicing again.  My hands started being able to do things that they were not able to do a few days ago.  It surprised me to remember how much fun it was to acquire a new physical skill, and that I had not done so in more than a decade!

Then my friend Alkan showed me a video of Terry Laughlin, a swim coach with a unique training style.  He compares dolphins at 80% efficiency with the best Olympic athletes at 8% and claims there is a lot to gain from reducing drag compared to adding power to the strokes.  Ernest Maglischo's standard reference also has consistent advice on correct body alignment.  While scanning Maglischo's book I was shocked to discover that it was not clear whether Newtonian or Bernoulli forces dominate the analysis of the swim stroke!  (Hey physicists, when you take a break from looking for the Higgs boson maybe you can help out with this?)  I have been swimming all my life and no matter how hard I tried I could not break my efficiency barrier at 17 strokes for a 25m pool.  After watching a couple of Laughlin's videos I was able to do it in 13!

Continuing on a chain of skill-acquisition serendipities, I came across Moonwalking with Einstein by Joshua Foer.  I should cover it more fully in a separate blog post.  In addition to giving an excellent synopsis of our current understanding of memory, it introduced me to the work of Anders Ericsson on skill acquisition. Ericsson has achieved some recent fame thanks to his research showing that experts tend to require about ten thousand hours of training to achieve their world-class status.  However what got my attention was the finding that when ordinary skill acquisition hits a plateau and improvement stops, that is rarely the sign of an innate limit, but rather the result of the skill becoming compiled and autonomous.  The trick to going past your plateaus and improving further is to bring the activity back to consciousness in sessions of "deliberate practice" where you pay attention to your technique and get constant and immediate feedback on your performance.  This is consistent with my swimming experience: Laughlin's videos made me pay attention to every stroke, in effect made me re-learn how to swim, and the 25m stroke count feedback pointed me in the right direction.

I am currently debating whether I should continue my self experimentation in the domain of Go, using techniques championed for chess by my friend Michael de la Maza, or improve my Bridge game by deliberate practice on card memory.  This is just too much fun.

On a more serious note, all this shows how little we know about skill acquisition and education in general and how much room there might be for improvement.  It seems to me the only way out of this conundrum is to allow experimentation in the educational domain with proper feedback and reward for innovative educators.

(*) Some of my favorite coin masters: David Roth, Michael Rubinstein, Jay Sankey, Gregory Wilson, David Stone, Giacomo Bertini, Kainoa Harbottle, Curtis Kam, Homer Liwag, Apollo Robbins, Shoot Ogawa.

Full post...