January 03, 2011

Plausibility vs. Inference

Here are two examples of common sense judgement:

(1) I saw the Statue of Liberty flying over New York. => I was flying over New York.
(2) I gave the book to Mary. => Mary has the book.

The first one involves syntactic disambiguation. Either the subject or the object could be doing the flying (consider "I saw the airplane flying over New York"). Our "common sense" tells us that the Statue is too big to fly, so I am more likely to be the one flying.
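Here is a minimal sketch of how such a plausibility judgement might be operationalized with a bit of world knowledge. The size table, the flyability rule, and the thresholds are all hypothetical, invented purely for illustration:

```python
# Toy plausibility check for "X flying over New York": who is more likely to fly?
# The sizes and the flyability rule below are made up, for illustration only.

TYPICAL_SIZE_M = {"I": 1.7, "the Statue of Liberty": 93.0, "the airplane": 40.0}
CAN_SELF_PROPEL_IN_AIR = {"the airplane"}

def plausibility_of_flying(entity: str) -> float:
    """Return a crude score for how plausible it is that `entity` is flying."""
    if entity in CAN_SELF_PROPEL_IN_AIR:
        return 1.0
    # People can fly (as airplane passengers); a giant statue cannot.
    return 0.5 if TYPICAL_SIZE_M.get(entity, 2.0) < 3.0 else 0.01

for subject, obj in [("I", "the Statue of Liberty"), ("I", "the airplane")]:
    flier = max([subject, obj], key=plausibility_of_flying)
    print(f'"{subject} saw {obj} flying over New York" -> {flier} is flying')
```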

The second one involves a semantic inference. The meaning of "give" involves a physical transfer or a transfer of possession; as a consequence, the item given ends up with the recipient. Our "common sense" is full of such little factoids (here is another: "Oswald killed Kennedy." => "Kennedy is dead.") which let us see beyond what is explicitly stated in the text.
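A sketch of how such exact inferences could be written down as deterministic rules; the predicate names and the tiny rule set are hypothetical, not a real knowledge base:

```python
# Toy exact-inference rules: each verb deterministically entails a consequence.
# Predicate names and the rule set are invented for illustration.

def infer(event: dict) -> list:
    """Map an event frame to the facts it entails."""
    verb = event["verb"]
    if verb == "give":
        # Giving transfers possession: the recipient ends up with the item.
        return [{"pred": "has", "args": (event["recipient"], event["theme"])}]
    if verb == "kill":
        # Killing entails that the patient is dead.
        return [{"pred": "dead", "args": (event["patient"],)}]
    return []

print(infer({"verb": "give", "agent": "I", "theme": "the book", "recipient": "Mary"}))
# -> [{'pred': 'has', 'args': ('Mary', 'the book')}]
print(infer({"verb": "kill", "agent": "Oswald", "patient": "Kennedy"}))
# -> [{'pred': 'dead', 'args': ('Kennedy',)}]
```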

I want to emphasize that these examples are qualitatively different, and calling them both "common sense judgements" may be confusing. The first one is a plausibility judgement (which is more likely to fly: me or the statue?). The second one is an exact inference: "give" definitely causes transfer and "kill" definitely causes death. To solve the first one we need a model of what is more likely to be happening in the world. To solve the second one we need more traditional inference of what entails what.

Disambiguation problems in computational linguistics (word sense disambiguation, resolving syntactic ambiguities, etc.) rely on plausibility judgements, not exact inference. A lot of work in AI "common sense reasoning" will not help there, because reasoning and inference work has traditionally focused on exact judgements.

As far as I can see, nobody is working on plausibility judgements explicitly. Researchers use corpus statistics as a proxy to solve disambiguation problems. This may be obscuring the real issue: I think the right way to do linguistic disambiguation is to have a model of what is plausible in the world.
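For contrast, here is a sketch of the corpus-statistics proxy described above: estimate which subject is more plausible for a verb from co-occurrence counts rather than from a model of the world. The counts are invented; a real system would harvest them from a parsed corpus:

```python
from collections import Counter

# Hypothetical subject-verb co-occurrence counts from a parsed corpus.
# The numbers are made up; a real system would count over millions of sentences.
subj_verb_counts = Counter({
    ("person", "fly"): 120,   # "I/he/she flew ..."
    ("airplane", "fly"): 950,
    ("statue", "fly"): 1,     # noise or metaphor
})

def plausibility(subject: str, verb: str) -> float:
    """Relative frequency of `subject` as the subject of `verb` (add-one smoothed)."""
    total = sum(c for (s, v), c in subj_verb_counts.items() if v == verb)
    return (subj_verb_counts[(subject, verb)] + 1) / (total + 1)

print(plausibility("person", "fly") > plausibility("statue", "fly"))  # True
```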
