.NH S 3 Yacc

An infix calculator written using Lex and Yacc


($CS5031/e*/infix2/*)

We normally use Lex and Yacc together; Lex for the simple parts (e.g. numbers, white space, comments) and Yacc for more complex parts (e.g. expressions). For this example, we will keep the same operators as the postfix calculator, but we will give them their usual associativity (left), precedence (* and / before + and -) and we will also need brackets.

Yacc part

 %{
 #include <stdio.h>	C declarations used in actions
 int yylex (void);
 void yyerror (char *s);
 %}

 %union {int a_number;} 	Yacc definitions
 %start line
 %token <a_number> number
 %type <a_number> exp term factor

 %%

 descriptions of expected inputs	corresponding actions (in C)

 line	: exp ';'		{printf ("result is %d\n", $1);}
	;
 exp	: term          	{$$ = $1;}
	| exp '+' term  	{$$ = $1 + $3;}
	| exp '-' term 	{$$ = $1 - $3;}
	;
 term	: factor        	{$$ = $1;}
	| term '*' factor 	{$$ = $1 * $3;}
	| term '/' factor 	{$$ = $1 / $3;}
	;
 factor	: number		{$$ = $1;}
	| '(' exp ')'		{$$ = $2;}
	;

 %%			C code

 int main (void) {return yyparse ( );}

 void yyerror (char *s) {fprintf (stderr, "%s\n", s);}

Lex part

 %{
 #include <stdlib.h>
 #include "y.tab.h"
 void yyerror (char *s);
 %}

 %%

 [0-9]+		{yylval.a_number = atoi(yytext); return number;}
 [ \t\n]			;
 [-+*/();]		{return yytext[0];}
 .			{ECHO; yyerror ("unexpected character");}

 %%

 int yywrap (void) {return 1;}

Input descriptions

The format of the grammar rules for Yacc is:

name	: names and 'single character's
	| alternatives
	;

Yacc definitions

"%start line" means the whole input should match "line".

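%union: declares all the possible types for values associated with parts of the grammar (here just int, through the member a_number)

%type: gives the type (a member of the %union) of the value of each non-terminal, e.g. exp, term and factor

%token: declares each terminal symbol (token) that is used in the Yacc grammar but recognised by Lex, and gives the type of its value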

Actions, C declarations & code:

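$$: the resulting value of the rule, i.e. the value of the name on its left-hand side

$1, $2, etc.: values of the first, second, etc. items on the right-hand side of the rule (e.g. in '(' exp ')' the value of exp is $2)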
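One way to picture $$ and $1, $2, etc. is as entries on a stack of values that the parser keeps alongside its input. The C fragment below is only a simplified sketch of that idea, not the code Yacc generates: the names vstack, push_value, top_value and reduce_add are all hypothetical. It shows what happens for the rule exp : exp '+' term, where the three right-hand-side values are replaced by the single value $$.

```c
/* Same shape as the %union in the calculator. */
typedef union { int a_number; } YYSTYPE;

static YYSTYPE vstack[16];   /* hypothetical value stack (no overflow check) */
static int vtop = 0;         /* number of values currently on the stack */

/* Put the value of one recognised item on the stack. */
void push_value(int n) { vstack[vtop++].a_number = n; }

/* Look at the most recent value. */
int top_value(void) { return vstack[vtop - 1].a_number; }

/* Reduce by   exp : exp '+' term   {$$ = $1 + $3;}
   The top three entries are $1, $2 and $3; the value of '+' ($2) is unused. */
void reduce_add(void)
{
    YYSTYPE *rhs = &vstack[vtop - 3];   /* rhs[0] is $1, rhs[2] is $3 */
    YYSTYPE result;

    result.a_number = rhs[0].a_number + rhs[2].a_number;   /* $$ = $1 + $3 */
    vtop -= 3;                   /* remove the right-hand side ... */
    vstack[vtop++] = result;     /* ... and leave $$ in its place */
}
```

Because exp and term were given the type <a_number>, Yacc reads and writes the a_number member of the union for them, which is what the explicit .a_number in the sketch imitates.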

yyparse: routine created by Yacc from (expected input, action) lists. It returns 0 if it recognised the whole input, and a non-zero value if it failed.

yyerror: routine called by yyparse whenever it detects an error in its input. We have to write it ourselves; yyparse passes it a message to report.

More Lex declarations and actions

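y.tab.h: header written by Yacc (when run with the -d option) that gives Lex the token names & type declarations etc. from the Yacc part

yylval: variable, of the %union type, in which Lex passes the value of a token to Yacc
e.g. yylval.a_number = atoi (yytext)
e.g. yylval.a_name = findname (yytext)	(if the %union also had a member a_name; findname is a lookup routine not shown here)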
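To make the division of labour concrete, here is a hand-written C routine that does roughly what the Lex rules of the calculator do: it returns a token code and, for numbers, leaves the value in yylval. It is a sketch under assumptions, not the scanner that Lex generates: next_token is a hypothetical name, the token code 257 merely stands in for whatever y.tab.h would define, and any unrecognised character is simply returned as itself rather than reported.

```c
#include <ctype.h>
#include <stdlib.h>

/* Stand-ins for what y.tab.h would declare; the value 257 is illustrative. */
#define NUMBER 257
typedef union { int a_number; } YYSTYPE;
YYSTYPE yylval;

/* Scan one token starting at *p, advance *p past it, and return its code
   (0 at the end of the input). */
int next_token(const char **p)
{
    while (**p == ' ' || **p == '\t' || **p == '\n')
        (*p)++;                              /* like  [ \t\n]  ;  */

    if (**p == '\0')
        return 0;                            /* end of input */

    if (isdigit((unsigned char)**p)) {       /* like  [0-9]+  */
        char *end;
        yylval.a_number = (int)strtol(*p, &end, 10);
        *p = end;
        return NUMBER;
    }

    return *(*p)++;      /* single-character token such as '+' or ';' */
}
```

Called repeatedly on the string "12 + 3;", this would hand back NUMBER (with yylval.a_number set to 12), then '+', NUMBER (3), ';' and finally 0.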

How Lex and Yacc are used together:


flex : calcl.l -> calcl.c

gcc : calcl.c -> calcl.o

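byacc -d : calcy.y -> calcy.c and y.tab.h

gcc : calcy.c -> calcy.o

ld : calcl.o calcy.o -> calc

(Yacc only writes y.tab.h when given the -d option, and the header must exist before calcl.c is compiled. By default flex and byacc call their outputs lex.yy.c and y.tab.c; the names calcl.c and calcy.c assume these have been renamed.)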


calc : expression -> result

Precedence and associativity

The example calculator above gives different precedence to the operators +, -, *, /, ( and ) by using a separate grammar rule for each precedence level. The operator associativities are also defined by the grammar rules, as you will investigate in the examples sheet.
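The effect of having one rule per precedence level can be seen in a hand-written evaluator with one C function per rule. Note that this is recursive descent, a different technique from the table-driven parser that Yacc builds, and every name in it (cur, skip_space, parse_factor, parse_term, parse_exp, eval) is hypothetical. The left-recursive alternatives such as exp '+' term become loops, which is what makes the operators left associative. There is no error checking, and no check for division by zero.

```c
#include <ctype.h>

static const char *cur;          /* hypothetical cursor into the input */

static void skip_space(void)
{
    while (*cur == ' ' || *cur == '\t' || *cur == '\n')
        cur++;
}

static int parse_exp(void);

/* factor : number | '(' exp ')' */
static int parse_factor(void)
{
    int v = 0;

    skip_space();
    if (*cur == '(') {
        cur++;
        v = parse_exp();
        skip_space();
        if (*cur == ')')
            cur++;               /* a missing ')' is silently accepted */
        return v;
    }
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* term : factor | term '*' factor | term '/' factor */
static int parse_term(void)
{
    int v = parse_factor();

    for (;;) {
        skip_space();
        if (*cur == '*')      { cur++; v = v * parse_factor(); }
        else if (*cur == '/') { cur++; v = v / parse_factor(); }
        else return v;
    }
}

/* exp : term | exp '+' term | exp '-' term */
static int parse_exp(void)
{
    int v = parse_term();

    for (;;) {
        skip_space();
        if (*cur == '+')      { cur++; v = v + parse_term(); }
        else if (*cur == '-') { cur++; v = v - parse_term(); }
        else return v;
    }
}

/* Evaluate one expression held in a string. */
int eval(const char *s)
{
    cur = s;
    return parse_exp();
}
```

Because parse_exp can only reach * and / by going through parse_term, eval ("1+2*3") multiplies first and gives 7, while the loops make eval ("8-3-2") group as (8-3)-2 and give 3.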

The problem with these methods is that they tend to complicate the grammar and constrain language design. Yacc has an alternative mechanism that decouples the declarations of precedence and associativity from the grammar, using "%left", "%right" and "%nonassoc". e.g.:

  %nonassoc '=' 		 v
  %left '+' '-'			 v increasing
  %left '*' '/'			 v
  %right not			 v precedence
  %right '^'			 v		(i.e. exponentiation)
  %%
  exp	: exp '+' exp | exp '-' exp | exp '*' exp | exp '/' exp
	| number | '(' exp ')' | exp '^' exp  | exp '=' exp | not exp
	;

Yacc can also give different precedences to overloaded operators; e.g. if - is used both for negation and for subtraction and we want to give them different precedences (and negation the same precedence as for not):

	| '-' exp %prec not
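The idea of keeping precedence and associativity in a table, separate from a single flat rule for exp, can also be sketched by hand. The C code below uses precedence climbing, which is not how Yacc works internally (Yacc uses the declarations to choose between shifting and reducing), and all of its names are hypothetical. The levels follow the declarations above for + - * / and ^, with unary minus given the level of not; the '=' and not operators themselves are left out to keep the sketch short, and there is no error checking.

```c
static const char *pos;          /* hypothetical cursor into the input */

#define PREC_NEG 3               /* unary minus: same level as "not" */

/* Precedence table: 0 means "not a binary operator". */
static int prec_of(char op)
{
    switch (op) {
    case '+': case '-': return 1;    /* %left '+' '-' */
    case '*': case '/': return 2;    /* %left '*' '/' */
    case '^':           return 4;    /* %right '^'    */
    default:            return 0;
    }
}

static int is_right_assoc(char op) { return op == '^'; }

/* Integer power, for non-negative exponents only. */
static int ipow(int b, int e)
{
    int r = 1;
    while (e-- > 0)
        r *= b;
    return r;
}

static int apply(char op, int a, int b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return a / b;          /* no check for division by zero */
    default:  return ipow(a, b);     /* '^' */
    }
}

static void skip_blanks(void) { while (*pos == ' ') pos++; }

static int parse_expr(int min_prec);

/* number | '(' exp ')' | '-' exp   (the last at level PREC_NEG) */
static int parse_primary(void)
{
    int v = 0;

    skip_blanks();
    if (*pos == '-') { pos++; return -parse_expr(PREC_NEG); }
    if (*pos == '(') {
        pos++;
        v = parse_expr(1);
        skip_blanks();
        if (*pos == ')') pos++;
        return v;
    }
    while (*pos >= '0' && *pos <= '9')
        v = v * 10 + (*pos++ - '0');
    return v;
}

/* Parse operators whose precedence is at least min_prec. */
static int parse_expr(int min_prec)
{
    int lhs = parse_primary();

    for (;;) {
        skip_blanks();
        char op = *pos;
        int p = prec_of(op);
        if (p == 0 || p < min_prec)
            return lhs;
        pos++;
        /* left associative: the right operand must bind more tightly */
        int rhs = parse_expr(is_right_assoc(op) ? p : p + 1);
        lhs = apply(op, lhs, rhs);
    }
}

int eval_table(const char *s)
{
    pos = s;
    return parse_expr(1);
}
```

With this table, 2^3^2 groups to the right as 2^(3^2), 8-3-2 groups to the left, and -2^2 is read as -(2^2) because the level of unary minus lies below that of ^ but above that of * and /.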

Yacc has other facilities, but those described above are among the most important. You can refer to the Yacc manual in the departmental library or at URL
http://www.cs.man.ac.uk/~pjj/cs2111/yacc/yacc.html
if necessary.