CS2112: Exercises + Answers from $7 Debugging Yacc grammars

* You will need your version of the calculator for the exercise of $5. Remove the "%left"s and see what error messages you get from Yacc. Do you understand how they arise? (Use the "-v" flag with Yacc to obtain the file y.output.)

The grammar no longer specifies the precedence or associativity of the operators
so there is no unique parse-tree (i.e. meaning) equivalent to any expression with more than one operator
so the grammar is ambiguous
so Yacc will report shift-reduce and/or reduce-reduce conflicts.
e.g. "a + b * c" could be "(a + b) * c" or "a + (b * c)"

If you only had prefix operators, or only postfix operators, or insisted on brackets around each expression, the grammar would be unambiguous, as it is the position of the implicit brackets that Yacc is unable to calculate.

* Check that I am outputting the correct number and kind of error messages in the final version of the identifier-list grammar above. Try running it with some correct and incorrect inputs.

I hope you agree that I got it right - if not, let me know!

* Modify the calculator in $4.1 to make the input format more flexible: create a grammar rule that recognises none, one or more spaces, and insert it where appropriate in the original grammar. You are likely to get lots of unexpected shift-reduce and reduce-reduce conflicts, so run your answer through Yacc to check it.

spaces : | spaces ' ' ;

 . . .

+ The grammar: s = 'a' 'b' | 'a' s 'b' ;
will recognise a string of 'a's followed by an equal number of 'b's. This grammar has been extended to:
s = 'a' 'b' | 'a' s 'b' | 'a' s {error} | s 'b' {error} ;
to detect errors but this gives shift-reduce and reduce-reduce conflicts (try it for yourself). Write a conflict-free grammar to detect errors.

This is similar to the example in the handout - we have to rethink the grammar. How about:

 s : anbn
   | anbn bs {yyerror("too many b's");}
   | as anbn {yyerror("too many a's");}
   ;
 anbn : 'a' 'b' | 'a' anbn 'b' ;
 as : 'a' | as 'a' ;
 bs : 'b' | bs 'b' ;

How would you cope with b's and a's in the wrong order? Think about it for a moment - how badly disarranged could the a's and b's become? How could you uniquely distinguish the correct part, surrounded by extra junk? What if you had an input of "aabbabababaaabbb"?

Depending on how much mixing you allowed, I suspect the only safe answer is to input any number of a's and b's in any order and then use an action to check whether they are correct or not.

+ Can you think of a simpler grammar for the identifier-list problem in $7?

The alternatives suggested so far turned out to be essentially the same. Let me know if you think you have a different solution.

+ The syntax for ANSI C given by Kernighan & Ritchie is in $CS2112/e*/c_grammar/* Run it through Yacc to discover the conflicts, and try to work out why they are there and how you could improve the syntax to remove them.

One shift-reduce error comes from the dangling else. The other errors come from:

  type_identifier : identifier ;

where this is used in declarations. The problem is that, in C, it is hard to distinguish a use of a typename in a variable declaration and a new declaration - see $9.

+ How would you use Lex and Yacc to recognise a language like SML, where the precedence, associativity and fix of operators can be varied.

The only thing I can think of is to define a series of tokens like:
left_assoc_infix_precedence_1
right_assoc_prefix_precedence_2
no_assoc_postfix_precedence_3
etc. (actually, we would need all possible combinations of associativity, fix & precedence), build a very general yacc grammar using them, and then map the various operators to these tokens, using a table of names that can be modified as we see the precedence etc. change. It would actually be very difficult to allow for operators of different associativity at the same precedence level, but it is doable if we ignore any error messages about shift-reduce or reduce-reduce conflicts, and not allow the user to have more than one sort of associativity at a single precedence level at a time.

To be honest, if I was really going to do this, I would probably not use yacc, but use some more general shift-reduce algorithm instead.