We looked at this grammar when I talked about
making precedence and associativity explicit
using %left
, %right
and %nonassoc
:
in 4.4
exp : exp '+' exp | exp '-' exp | exp '*' exp | exp '/' exp | number | '(' exp ')' | exp '^' exp | exp '=' exp | '!' exp ;
Without the extra information about precedence and associativity provided by
%left
etc., there is no single parse for even something as simple as
1 + 2 * 3
- this could be recognised either as
( 1 + 2 ) * 3
or as
1 + ( 2 * 3 )
.
A : B c d | E c f ; B : x y ; E : x y ;
For this grammar, an example input that starts x y c
is enough to
confuse an LR(1) parser, as it has to decide whether x y
matches
B
or E
after only seeing 1 symbol further (i.e. c
).
An LL(1) parser would also be confused, but at the x
- should it
expand A
to B c d
or to E c f
, as both can start with
x
. An LL(2) or LL(3) parser would have similar problems at the
y
or c
respectively.
An LR(2) parser would be able to also see the d
or f
that
followed the c
and so make the correct choice between B
and
E
.
An LL(4) parser would also be able to look far enough ahead to see the
d
or f
that followed the c
and so make the correct choice
between expanding A
to B c d
or to E c f
.
However, the following grammar would confuse any LR(n) or LL(n) parser with a fixed amount of look-ahead:
A : B C d | E C f ; B : x y ; E : x y ; C : c | C c ;
As usual, the simplest solution is to rewrite the grammar to remove the confusion e.g.:
A : BorE c d | BorE c f ; BorE : x y ;
A : BorE c tail ; tail : d | f ; BorE : x y ;