next up previous
Next: About this document ... Up: CS2121: The Implementation and Previous: More examples of Ambiguous

Subsections

More examples of shift/reduce and reduce/reduce conflicts with YACC

Conflicts from YACC

An example of a reduce/reduce conflict

The grammar that beats an LR(1) parser from $\S $H.2 causes YACC to report:

5: reduce/reduce conflict (reduce 3, reduce 4) on c
state 5
        B : x y .  (3)
        E : x y .  (4)

As we expected, when the parser sees an input of x y c it doesn't have enough information to be able to decide between reducing the x y to B or to E.

An example of multiple shift/reduce conflicts

The grammar from $\S $H.1 causes YACC to report 42 shift/reduce conflicts! (Roughly speaking, this is because each of the 6 operators causes problems with each of the 7 grammar rules that includes exp.)

Here is one of the groups of conflicts:

6: shift/reduce conflict (shift 7, reduce 9) on '+'
6: shift/reduce conflict (shift 8, reduce 9) on '-'
6: shift/reduce conflict (shift 9, reduce 9) on '*'
6: shift/reduce conflict (shift 10, reduce 9) on '/'
6: shift/reduce conflict (shift 11, reduce 9) on '^'
6: shift/reduce conflict (shift 12, reduce 9) on '='
state 6
        exp : exp . '+' exp  (1)
        exp : exp . '-' exp  (2)
        exp : exp . '*' exp  (3)
        exp : exp . '/' exp  (4)
        exp : exp . '^' exp  (7)
        exp : exp . '=' exp  (8)
        exp : '!' exp .  (9)

What this is trying to tell us is that when we have an input of the form exp op exp op exp (or ! exp op exp), when we see the second operator, we don't know if the first op should be reduced (because it has higher precedence i.e. (exp op exp) op exp) or if the second op should be shifted (because it has the higher precedence i.e. exp op (exp op exp)).

A longer example, with shift/reduce and reduce/reduce conflicts

Suppose we tried to modify the left recursive grammar from $\S $5 to detect various errors:

$CS2121/e*/idlist/idlist1_y.y

list    : id
        | list optcomma id
        | list ','               {yyerror ("extra comma ignored");}
        ;
optcomma: ','
        |                        {yyerror ("missing comma inserted");}
        ;

the y.output file from YACC includes the following:

4: reduce/reduce conflict (red'ns 3 and 4) on id
state 4 list : list ,_   (3)
        optcomma : ,_   (4)

        .=reduce 3

(_ indicates where we are up to in the input - BYACC uses .)

i.e. having read a ',' to get to state 4, if we then read an id we do not know which of the two rules to recognise the comma with:


\begin{picture}(32,13)
\put(10,12){list}
\put(15,12){*optcomma}
\put(10.8,11.8)...
...10,4){list}
\put(10.8,3.8){\line(1,-1){2.2}} %
\put(12,0){*list}
\end{picture}
$[$ $]$ = state number at this point in the input, * = the alternative reductions

We can sometimes get more information by finding the state(s) that precede(s) state 4 by looking for shift 4 or goto 4 in y.output (and repeating if necessary):

state 1 $accept : list_$end
        list : list_optcomma id
        list : list_,
        optcomma : _   (5)

        $end accept       , shift 4       id reduce 5       optcomma goto 3


\begin{picture}(32,17)
\put(15,16){list}
\put(14.8,15.8){\line(-1,-1){2.2}} % /...
...10,4){list}
\put(10.8,3.8){\line(1,-1){2.2}} %
\put(12,0){*list}
\end{picture}

Finally, if we reexamine the grammar, we can see that the only way the second parse can recognise a list followed by an id is if we can insert an optcomma between them, which we can onll do by recognising an empty optcomma:


\begin{picture}(32,20)
\put(15,20){list}
\put(14.8,19.8){\line(-1,-1){2.2}} % /...
...12,4){*list}
\put(14.8,3.8){\line(1,-1){2.2}} %
\put(18,0){list}
\end{picture}

so to get to state 4 we have read a list followed by a ',', and we can not tell if this is an optcomma or not just by looking at the next symbol (id) because the id can either belong to this list (i.e. reduce rule 4) or it can belong to a surrounding list optcomma id with an empty optcomma (i.e. reduce rule 3 then rule 5) - in either case, we next shift the id and reduce rule 2.

i.e. the parse tree for an input of id , id can be:

\begin{picture}(16,9)
\put(7,8){list}
\put(6.8,7.8){\line(-2,-1){4.4}} % /
\pu...
...put(7.8,3.8){\line(0,-1){2.2}} % \vert
\put(0,0){id}\put(7,0){','}
\end{picture}

i.e. ((id) (,) id)
or:

\begin{picture}(20,13)
\put(11,12){list}
\put(10.8,11.8){\line(-2,-1){4.4}} % /...
...t(7,4){','}
\put(0.8,3.8){\line(0,-1){2.2}} % \vert
\put(0,0){id}
\end{picture}

i.e. (((id) ,) () id) with two error messages:
   extra comma ignored
   missing comma inserted

The problem may be that we are forcing the parser to decide what to do with the ',' too early on. We can try to delay the decision by getting rid of the optcomma grammar rule and putting extra alternatives into the rule for list:

$CS2121/e*/idlist/idlist2_y.y

list    : id
        | list ',' id
        | list id                {yyerror ("missing comma inserted");}
        | list ','               {yyerror ("extra comma ignored");}
        ;

Unfortunately, looking at the y.output file from YACC for this grammar, we have simply replaced the reduce-reduce conflict by a shift-reduce conflict:

3: shift/reduce conflict (shift 5, reduce 4) on id
state 3 list : list ,_id
        list : list ,_   (4)

        id shift 5        . reduce 4
i.e. after inputting and reducing a list, and shifting a ',' if the parser then encounters another id, it does not know whether to include this id in the current list (shift) or leave this id to become part of a later list (reduce). i.e. the parse tree for an input of id , id can be:

\begin{picture}(16,9)
\put(7,8){list}
\put(6.8,7.8){\line(-2,-1){4.4}} % /
\pu...
...t(14,4){id}
\put(0.8,3.8){\line(0,-1){2.2}} % \vert
\put(0,0){id}
\end{picture}

i.e. ((id) , id)
or:

\begin{picture}(20,13)
\put(11,12){list}
\put(10.8,11.8){\line(-2,-1){4.4}} % /...
...t(7,4){','}
\put(0.8,3.8){\line(0,-1){2.2}} % \vert
\put(0,0){id}
\end{picture}

i.e. (((id) ,) id) with the same two error messages as before

Looking at the possible parses, this is still pretty much the same problem as before, so unfortunately the change we made to the grammar has not made any real improvement.

Although we can't tell just from the error messages reporting the conflicts, because we can create two alternative parse trees that both completely match a possible legal input (i.e. id , id), this grammar and the previous version are both ambiguous, and can never be recognised correctly by a parser tool.

The only thing to do is to take a step back and describe afresh what we are trying to recognise. I will concentrate on the identifiers, as they are what we are really interested in. We want to recognise a list of identifiers, each pair separated by zero or more commas (although we expect exactly one), and the whole list perhaps preceded and trailed by some commas (although we expect none):

$CS2121/e*/idlist/idlist3_y.y

list     : no_commas idlist no_commas
         ;
idlist   : id
         | idlist one_comma id
         ;
one_comma: commas
         |                 {yyerror ("missing comma inserted");}
         ;
no_commas: commas          {yyerror ("extra comma ignored");}
         |
         ;
commas   : commas ','      {yyerror ("extra comma ignored");}
         | ','
         ;


next up previous
Next: About this document ... Up: CS2121: The Implementation and Previous: More examples of Ambiguous
Pete Jinks
2004-10-26