# CS2112: Exercises + Answers from \$5 Using Yacc 2

* Modify the calculator in \$4.1 to accept one or more expressions.

``` . . .
%start lines
. . .
lines	: line
| lines line
;
. . .
```

* Design a grammar for boolean expressions, involving operators AND, OR and NOT, brackets ( and ), and operands TRUE, FALSE and C-like identifiers. An example of the sort of expression allowed is:

```	a AND (B OR Abcd_99 AND NOT (FRED OR TRUE))
```
You can use any grammar notation you like initially.

There is no need to distinguish TRUE and FALSE from the other identifiers, although if we used a dictionary to contain identifiers we ought to preload TRUE and FALSE.

``` exp : term | exp 'OR' term ;
term : factor | term 'AND' factor ;
factor : 'NOT' factor | '(' exp ')' | identifier ;
identifier : [a-zA-Z_][0-9a-zA-z_]*
```

Split your grammar into the portions that should be dealt with by Lex and by Yacc, and write the corresponding Lex and Yacc programs.

I am omitting the obvious code that you should include at the beginning and end. The Yacc grammar uses explicit precedence & associativity rules, so it is not the same grammar as above, but it does illustrate another possibility.

Lex:

``` . . .
%%
OR				{return or;}
AND				{return and;}
NOT				{return not;}
[a-zA-Z_][0-9a-zA-z_]*	{return id;}
[()]				{return yytext[0];}
. . .
```

Yacc:

``` . . .
%token or and not id
%left or
%left and
%right not
%%
exp : exp and exp
| exp or exp
| not exp
| '(' exp ')'
| id
;
. . .
```

What is the parse tree you expect for the example expression above?

```	 AND
/     \
a     OR
/    \
B   AND
/       \
Abcd_99    NOT
|
OR
/    \
FRED    TRUE
```

(confession: I gave the OR & AND operators surrounding Abcd_99 the wrong precedence the first time I did the tree.)

* Modify the Yacc part of the calculator in \$4.6 to use "%left" etc. You will need to merge and modify the rules for "exp", "term" and "factor".

``` %type <a_number> exp
%left '+' '-'
%left '*' '/'
%%

line	: exp '='		{printf ("result is %d\n", \$1);}
;
exp	: exp '+' exp	  	{\$\$ = \$1 + \$3;}
| exp '-' exp	 	{\$\$ = \$1 - \$3;}
| exp '*' exp	 	{\$\$ = \$1 * \$3;}
| exp '/' exp 		{\$\$ = \$1 / \$3;}
| number		{\$\$ = \$1;}
| '(' exp ')'		{\$\$ = \$2;}
;
```

See what happens when you alter the precedence and/or associativity of the operators in various ways.

e.g. try swapping the "%left" lines, or giving '+' and '-' different priorities, etc.

* Check your favourite compilers to see how they cope with simple syntax errors like missing or extra "{", "}", "," and ";" symbols. Is the error reported correctly, and is it located accurately? Are there any subsequent spurious error messages? How complicated a syntax error can it cope with sensibly?

e.g. deleting the ';' after the declaration of 'stack' in \$3.1

gcc:

``` a.c4: parse error before `static'
a.c11: parse error before `1'
a.c11: warning: data definition has no type or storage class
```

cc: (after modifying the function declarations to old C)

``` "a.c", line 4: syntax error at or near word "static"
"a.c", line 4: unknown size
"a.c", line 11: syntax error at or near constant 1
```

I find the "syntax error" messages from the older compiler particularly irritating. Line 4 is the next line of code after the missing ';', so an error message there is understandable, but I don't know why the old compiler gives a second error there, or what the message means. Line 11 is "exit(1);" and I have no idea why the compilers think there is an error there, or what the messages mean.

e.g. changing the declaration of "sp" to "s" in the same code

gcc:

``` a.c: In function `push':
a.c:6: `sp' undeclared (first use this function)
a.c:6: (Each undeclared identifier is reported only once
a.c:6: for each function it appears in.)
a.c: In function `pop':
a.c:17: `sp' undeclared (first use this function)
```

cc:

``` "a.c", line 6: sp undefined
"a.c", line 17: sp undefined
```

So at least both compilers are trying to suppress multiple error messages from undeclared variables (there are 2 uses of "sp" in each routine).

+ Write Lex and Yacc code to recognise grammars written in BNF and/or EBNF, like the examples in \$5.1. (i.e. write a grammar that describes grammars!)

There is an example Yacc program to recognise Yacc programs in appendix B (pp.25-26) of Johnson's paper. I have been playing off and on with programs to convert between BNF and EBNF. Feel free to chat to me about this.