next up previous
Next: Lecture 12 - Word Up: CS3421: Natural Language Engineering Previous: ``Lecture'' 10 = Practical

Lecture 11 - Efficient Parsing

Reading: SLP Ch. 10, Parsing with CFGs, sections 1 & 4;
Ch. 12, Lexicalised and Probabilistic Parsing, sections 1, 3, 4;
Allen Ch. 7 section 5

Parsing with Phrase Structure Grammars

Parsing: determining the structure of a piece of language, usually a sentence, according to a grammar or a language model.

Most commonly a phrase structure grammar, made up of rewrite rules which map to trees.

These rules used to be written by hand (remember the examples in the first lecture). Syntactic parsing was more or less solved in the classical period, and the grammars written then form the basis of many which are still used. These days they can also be learned from tagged corpora.

See SLP 10.1, especially Figs 10.2, 10.3, 10.4 on basic top-down and bottom-up parsing using PS rules and parse trees.

``Earley's Algorithm'' or ``chart parsing''

Naive top-down or bottom-up parsing with exhaustive backtracking takes exponential time in the length of the string. A dynamic programming approach (remember the Viterbi algorithm) to efficient parallel top-down search reduces this to polynomial time, by eliminating the repetitive solution of sub-problems when backtracking - $O(N^3)$ where $N$ is the number of words in the input.

A data structure called a chart holds successfully parsed constituents within the string. The input is represented as a graph, where the edges are labelled with the phrase structure rule which describes the sub-string they span. The right-hand sides of the rules include a dot showing how many of their constituents are included in the sub-string.

$S \longrightarrow . VP, [0,0]$ predicts that a VP will start after 0, but no input has been recognised yet.

$NP \longrightarrow Det . Nominal, [1,2]$ is trying to build an NP from a Det and a Noun: it has found the Det from 1 to 2 and predicts that the next word will be a Noun.

$VP \longrightarrow V$ $NP ., [0,3]$ has found a complete VP, made of Verb and a NP, from 0 to 3.

The first two types are active edges: they are looking for further constituents to complete what they are trying to build. The third is an inactive edge: it is complete.

The fundamental rule of chart-parsing says, informally,

When an active edge meets an inactive edge of the desired category, add a new edge to the chart spanning both, and label it with the rule on the active edge, with its dot moved along to indicate the constituent which has been incorporated.

The best account of chart parsing is in Gazdar & Mellish, Natural Language Processing in LISP, Ch. 6.1-6.3.

Probabilistic Parsing

Phrase structure rules can be associated with probabilities just as POS tags can be. This can help resolve ambiguities in sentence structure, and is an important part of language modelling for tasks which involve prediction, such as speech recognition. See the example in SLP 12.1.

The overall probability of a tree for a string can be computed by multiplying the probabilities of the rules used, exactly as the probability of a string of categories is computed by multiplying the bigram probabilities.

Lexicalised Parsing

We saw that bare bigram probabilities gave better predictive results if combined with lexical probabilities. Here too, information about individual words complements information about bare rules. (See SLP 12.3).

``Knowledge about a language is knowledge about the words of that language and how they behave''

This principle is taken to its logical conclusion in Categorial Grammars, which encode the syntactic behaviour of a word directly in its lexical category, and thus do not need grammar rules. They are mentioned briefly in SLP 12.4.


next up previous
Next: Lecture 12 - Word Up: CS3421: Natural Language Engineering Previous: ``Lecture'' 10 = Practical
Mary McGee Wood 2002-12-10