May 23, 2011

Semantic Structures by Ray Jackendoff

Jackendoff divides the study of the language faculty into three components: phonological, syntactic, and conceptual. In speech recognition, a model that tells us which word sequences are likely is essential to recognition accuracy, because the acoustic information is often ambiguous (e.g. "She kissed this guy." vs. "She kissed the sky."). Stated in Jackendoff's terms, a probabilistic model of the syntactic component helps disambiguate what is going on in the phonological component. Similarly, I think a model of what is likely in the conceptual component is essential to resolving ambiguities in the syntactic component. If "what is likely to be said" is essential in interpreting "what is heard", then "what is likely to be meant" is similarly essential in interpreting "what is said".
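To make the "what is likely to be said" idea concrete, here is a minimal sketch of how a prior over word sequences can break a tie that the acoustics alone cannot. All the numbers are invented for illustration, and the names `acoustic`, `language`, `posterior`, and `best_hypothesis` are hypothetical; a real recognizer is far more involved.

```python
# Hypothetical scores, invented purely for illustration.
# acoustic[w] stands in for P(audio | w): how well each word sequence
# matches the sound. Here the two candidates are acoustically tied.
acoustic = {
    "she kissed this guy": 0.50,
    "she kissed the sky": 0.50,
}

# language[w] stands in for P(w): how likely the word sequence is a priori.
# These values are assumptions, not measurements.
language = {
    "she kissed this guy": 0.009,
    "she kissed the sky": 0.001,
}


def posterior(acoustic, language):
    """Combine the two models with Bayes' rule: P(w | audio) is
    proportional to P(audio | w) * P(w), normalized over the candidates."""
    joint = {w: acoustic[w] * language[w] for w in acoustic}
    total = sum(joint.values())
    return {w: p / total for w, p in joint.items()}


def best_hypothesis(acoustic, language):
    """Return the candidate with the highest posterior probability."""
    post = posterior(acoustic, language)
    return max(post, key=post.get)


if __name__ == "__main__":
    for sentence, p in posterior(acoustic, language).items():
        print(f"{sentence}: {p:.2f}")
    print("best:", best_hypothesis(acoustic, language))
```

The same arithmetic would apply one level up: replace the acoustic score with how well a meaning explains the sentence, and the word-sequence prior with a prior over meanings. That prior over meanings is exactly the conceptual model we do not yet have.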

Unfortunately, we do not have good conceptual models yet, so computational linguists still try to make do with error-prone hand tagging and shallow machine learning to disambiguate senses, references, and relations.

On a side note, each component in Jackendoff's work is modeled after the generative paradigm which, for the syntactic component, is described as follows:
  1. Speakers can understand and create an indefinite number of sentences they have never heard before.
  2. Therefore the repertoire of syntactic structures cannot be characterized as a finite list of sentences.
  3. Nor can it be characterized as an infinite list of possible sentences because we have finite brains.
  4. Thus it MUST be mentally encoded in terms of a finite set of primitives and a finite set of principles of combination that collectively generate the class of possible sentences.
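For readers who have not seen the generative picture spelled out, the following is a toy sketch of what step 4 has in mind: a finite set of primitives and rules whose output keeps growing because one rule is recursive. The grammar, the vocabulary, and the function names are all made up for this illustration and are not taken from Jackendoff.

```python
from itertools import product

# Hypothetical toy grammar: a finite rule set. The VP rule that mentions S
# is recursive, which is what lets the set of sentences grow without bound.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Mary"], ["John"]],
    "VP": [["sleeps"], ["thinks", "that", "S"]],
}


def expand(symbol, depth):
    """Return every word list derivable from `symbol` using at most
    `depth` nested rule applications."""
    if symbol not in RULES:  # a terminal word expands to itself
        return [[symbol]]
    if depth == 0:  # out of budget for further rule applications
        return []
    results = []
    for rhs in RULES[symbol]:
        # Expand each symbol on the right-hand side, then combine the
        # alternatives in every possible way.
        parts = [expand(s, depth - 1) for s in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


def sentences(depth):
    """All sentences derivable from S within the given depth budget."""
    return [" ".join(words) for words in expand("S", depth)]


if __name__ == "__main__":
    for depth in (2, 4, 6):
        print(depth, len(sentences(depth)))
```

Raising the depth budget never exhausts the grammar: each extra level of embedding ("John thinks that Mary thinks that ...") adds sentences that were not there before, even though the rule table itself stays the same size.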

Am I the only one befuddled by this argument? Primitives plus means of combination is certainly one way to create infinity using finite means, but why assume it is the only way? Dynamical systems, random processes, and who knows what else can lead to infinitely many possible outcomes from a finite initial endowment. Why present just two strawmen, finite and infinite lists, as the only alternatives to discrete primitives and combination? Why, a couple of paragraphs later, further narrow the description to "the argument from creativity to the NECESSITY for principles or rules in syntactic knowledge"? Discrete primitives with finite and definite constraints and rules of combination are one way to build a representational system. They are unlikely to be the correct way for all three components of language, and they are certainly not the only way.

See also: Plausibility vs. Inference.
