Recognising the input line:

2 * 3 - 4 * 1 ;

First, LEX partially recognises the input, recognising numbers, passing through operators etc., and discarding spaces, tabs and newlines.

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number |

Then, the various YACC grammar rules are used:

factor : number

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number | | | | | | | | | factor | factor | factor | factor |

factors and *s are combined in terms:

term : term '*' factor

term | factor | term | factor | \ | / | \ | / | term | term |

but somehow, we have to transform the left-hand factors to terms.
We can do this using:

term : factor

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number | | | | | | | | | factor | factor | factor | factor | | | | | | | | | term | / | term | / | \ | / | \ | / | term | term |

Similarly, terms and -s are combined in expressions:

exp : exp '-' term

and the first term can be converted to an exp by:

exp : term

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number | | | | | | | | | factor | factor | factor | factor | | | | | | | | | term | / | term | / | \ | / | \ | / | term | term | | | | | exp | / | \ | / | exp |

and finally, the expression and the ; can be combined to make a line:

line : exp ';'

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number | | | | | | | | | factor | factor | factor | factor | | | | | | | | | term | / | term | / | \ | / | \ | / | term | term | | | | | exp | / | \ | / | exp / \ / line

We can also show the associated values calculated for each node of the tree in the LEX and YACC actions:

2 * 3 - 4 * 1 ; | | | | | | | | number | number | number | number | (2) (3) (4) (1) | | | | | | | | factor | factor | factor | factor | (2) (3) (4) (1) | | | | | | | | term | / | term | / | (2) (4) \ | / | \ | / | term | term | (6=2*3) (4=4*1) | | | | exp | / | (6) \ | / | exp / (2=6-4) \ / line

So the output (from the action for line) is:

result is 2

Note that there is no value created for the line, as the corresponding action does not assign to $$.

BYACC will output a version of this if you turn on the full (yydebug=5) tracing:

.... look ahead at number `2' number <-- `2' factor term | .... look ahead at mul_div `*' | mul_div <-- `*' | | .... look ahead at number `3' | | number <-- `3' | | factor +-------+-------+ | term | .... look ahead at add_sub `-' exp | add_sub <-- `-' | | .... look ahead at number `4' | | number <-- `4' | | factor | | term | | | .... look ahead at mul_div `*' | | | mul_div <-- `*' | | | | .... look ahead at number `1' | | | | number <-- `1' | | | | factor | | +-------+-------+ | | | | | term | | | .... look ahead at ';' `;' +-------+-------+ | exp | ';' <-- `;' +-------+ | line .... look ahead at end-of-file `'

The tree above (without the associated values) is a
**concrete parse tree**, showing the details
of the grammar we have chosen to recognise expressions.

Normally, we use an **abstract parse tree**
to show the essentials of an expression:

2 3 4 1 ; \ / \ / | * * | \ / | - / \ / line

and this (but without the ";" and "line", drawn on its side and, depending on how you look at it, upside-down) is what is output by the program given in 7.2:

1 * 4 - 3 * 2

If we insert brackets around each operator node and its operands, we get the equivalent fully-bracketed expression:

((2*3)-(4*1))

This is the most compact representation of the essence of the expression, but we can get lost if there are many nested brackets, and so it is usually more awkward to extend this style to represent the structure of complete programs. However, some programming languages, like LISP, do exactly that!

2004-10-26