CS2112: Exercises + Answers from $5 Using Yacc 2

* Modify the calculator in $4.1 to accept one or more expressions.

 . . .
 %start lines
 . . .
 lines	: line
 	| lines line
 	;
 . . .

* Design a grammar for boolean expressions, involving operators AND, OR and NOT, brackets ( and ), and operands TRUE, FALSE and C-like identifiers. An example of the sort of expression allowed is:

	a AND (B OR Abcd_99 AND NOT (FRED OR TRUE))
You can use any grammar notation you like initially.

There is no need to distinguish TRUE and FALSE from the other identifiers, although if we used a dictionary to contain identifiers we ought to preload TRUE and FALSE.

 exp : term | exp 'OR' term ;
 term : factor | term 'AND' factor ;
 factor : 'NOT' factor | '(' exp ')' | identifier ;
 identifier : [a-zA-Z_][0-9a-zA-z_]*

Split your grammar into the portions that should be dealt with by Lex and by Yacc, and write the corresponding Lex and Yacc programs.

I am omitting the obvious code that you should include at the beginning and end. The Yacc grammar uses explicit precedence & associativity rules, so it is not the same grammar as above, but it does illustrate another possibility.

Lex:

 . . .
 %%
 OR				{return or;}
 AND				{return and;}
 NOT				{return not;}
 [a-zA-Z_][0-9a-zA-z_]*	{return id;}
 [()]				{return yytext[0];}
 . . .

Yacc:

 . . .
 %token or and not id
 %left or
 %left and
 %right not
 %%
 exp : exp and exp
      | exp or exp
      | not exp
      | '(' exp ')'
      | id
     ;
 . . .

What is the parse tree you expect for the example expression above?

	 AND
	 /     \
	a     OR
	     /    \
	   B   AND
	       /       \
	 Abcd_99    NOT
	                   |
	                 OR
	                 /    \
	          FRED    TRUE

(confession: I gave the OR & AND operators surrounding Abcd_99 the wrong precedence the first time I did the tree.)

* Modify the Yacc part of the calculator in $4.6 to use "%left" etc. You will need to merge and modify the rules for "exp", "term" and "factor".

 %type <a_number> exp
 %left '+' '-'
 %left '*' '/'
 %%

 line	: exp '='		{printf ("result is %d\n", $1);}
 	;
 exp	: exp '+' exp	  	{$$ = $1 + $3;}
 	| exp '-' exp	 	{$$ = $1 - $3;}
 	| exp '*' exp	 	{$$ = $1 * $3;}
 	| exp '/' exp 		{$$ = $1 / $3;}
 	| number		{$$ = $1;}
 	| '(' exp ')'		{$$ = $2;}
 	;

See what happens when you alter the precedence and/or associativity of the operators in various ways.

e.g. try swapping the "%left" lines, or giving '+' and '-' different priorities, etc.

* Check your favourite compilers to see how they cope with simple syntax errors like missing or extra "{", "}", "," and ";" symbols. Is the error reported correctly, and is it located accurately? Are there any subsequent spurious error messages? How complicated a syntax error can it cope with sensibly?

e.g. deleting the ';' after the declaration of 'stack' in $3.1

gcc:

 a.c4: parse error before `static'
 a.c11: parse error before `1'
 a.c11: warning: data definition has no type or storage class

cc: (after modifying the function declarations to old C)

 "a.c", line 4: syntax error at or near word "static"
 "a.c", line 4: unknown size
 "a.c", line 11: syntax error at or near constant 1

I find the "syntax error" messages from the older compiler particularly irritating. Line 4 is the next line of code after the missing ';', so an error message there is understandable, but I don't know why the old compiler gives a second error there, or what the message means. Line 11 is "exit(1);" and I have no idea why the compilers think there is an error there, or what the messages mean.

e.g. changing the declaration of "sp" to "s" in the same code

gcc:

 a.c: In function `push':
 a.c:6: `sp' undeclared (first use this function)
 a.c:6: (Each undeclared identifier is reported only once
 a.c:6: for each function it appears in.)
 a.c: In function `pop':
 a.c:17: `sp' undeclared (first use this function)

cc:

 "a.c", line 6: sp undefined
 "a.c", line 17: sp undefined

So at least both compilers are trying to suppress multiple error messages from undeclared variables (there are 2 uses of "sp" in each routine).

+ Write Lex and Yacc code to recognise grammars written in BNF and/or EBNF, like the examples in $5.1. (i.e. write a grammar that describes grammars!)

There is an example Yacc program to recognise Yacc programs in appendix B (pp.25-26) of Johnson's paper. I have been playing off and on with programs to convert between BNF and EBNF. Feel free to chat to me about this.