We normally use Lex and Yacc together: Lex for the simple parts (e.g. numbers, white space, comments) and Yacc for the more complex parts (e.g. expressions). For this example, we will keep the same operators as the postfix calculator, but we will give them their usual associativity (left) and precedence (* and / before + and -), and we will also need brackets to override these.
%{
/* calcy.y: C declarations used in actions */
#include <stdio.h>
int yylex (void);
void yyerror (char *s);
%}

/* Yacc definitions */
%union {int a_number;}
%start line
%token <a_number> number
%type <a_number> exp term factor

%%
/* descriptions of expected inputs    corresponding actions (in C) */

line    : exp ';'               {printf ("result is %d\n", $1);}
        ;

exp     : term                  {$$ = $1;}
        | exp '+' term          {$$ = $1 + $3;}
        | exp '-' term          {$$ = $1 - $3;}
        ;

term    : factor                {$$ = $1;}
        | term '*' factor       {$$ = $1 * $3;}
        | term '/' factor       {$$ = $1 / $3;}
        ;

factor  : number                {$$ = $1;}
        | '(' exp ')'           {$$ = $2;}
        ;

%%
/* C code */
int main (void) {return yyparse ( );}
void yyerror (char *s) {fprintf (stderr, "%s\n", s);}
%{
/* calcl.l */
#include <stdlib.h>             /* for atoi */
#include "y.tab.h"
void yyerror (char *s);
%}
%%
[0-9]+          {yylval.a_number = atoi(yytext); return number;}
[ \t\n]         ;
[-+*/();]       {return yytext[0];}
.               {ECHO; yyerror ("unexpected character");}
%%
int yywrap (void) {return 1;}
The format of the grammar rules for Yacc is:
name    : names and 'single character's
        | alternatives
        ;
"%start line" means the whole input should match "line".
%union: all possible types for values associated with parts of the grammar
%type: individual type for each part of the grammar
%token: declare each symbol used in the Yacc grammar that is recognised by Lex (i.e. each token) & give the type of its value
$$: resulting value for any part of the grammar
$1, $2, etc.: values from sub-parts of the grammar
yyparse: routine created by Yacc from (expected input, action) lists. (It returns 0 if it recognised the whole input, and a non-zero value if it failed.)
yyerror: routine called by yyparse whenever it detects an error in its input.
y.tab.h: file written by Yacc (when run with the -d option) that gives Lex the token names & type declarations etc. from Yacc
yylval: name used for values set in Lex
e.g. yylval.a_number = atoi (yytext)
e.g. yylval.a_name = findname (yytext)
flex : calcl.l -> calcl.c
gcc : calcl.c -> calcl.o
byacc : calcy.y -> calcy.c
gcc : calcy.c -> calcy.o
ld : calcl.o calcy.o -> calc
calc : expression -> result
The example calculator above gives different precedence to the operators +, -, *, /, ( and ) by using a separate grammar rule for each precedence level. The operator associativities are also defined by the grammar rules, as you will investigate in the examples sheet.
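One way to see why the layering of exp, term and factor produces these precedences is to write the same three levels by hand. The following C sketch is a recursive-descent evaluator, not what Yacc generates (Yacc builds a table-driven parser), and the names eval_line, eval_exp, eval_term, eval_factor and pos are hypothetical. Each left-recursive rule becomes a loop, which is what makes the operators group to the left. There is no error checking (e.g. for division by zero or a missing bracket).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *pos;         /* next unread character of the input */

static void skip_space (void)
{
    while (*pos == ' ' || *pos == '\t' || *pos == '\n')
        pos++;
}

static int eval_exp (void);

/* factor : number | '(' exp ')' */
static int eval_factor (void)
{
    int value = 0;

    skip_space ();
    if (*pos == '(') {
        pos++;
        value = eval_exp ();
        skip_space ();
        if (*pos == ')')
            pos++;
    } else {
        while (isdigit ((unsigned char) *pos))
            value = value * 10 + (*pos++ - '0');
    }
    return value;
}

/* term : factor | term '*' factor | term '/' factor */
static int eval_term (void)
{
    int value = eval_factor ();

    for (;;) {
        skip_space ();
        if (*pos == '*')      { pos++; value = value * eval_factor (); }
        else if (*pos == '/') { pos++; value = value / eval_factor (); }
        else return value;
    }
}

/* exp : term | exp '+' term | exp '-' term */
static int eval_exp (void)
{
    int value = eval_term ();

    for (;;) {
        skip_space ();
        if (*pos == '+')      { pos++; value = value + eval_term (); }
        else if (*pos == '-') { pos++; value = value - eval_term (); }
        else return value;
    }
}

/* Evaluate the expression at the start of text; evaluation stops at the
   first character that cannot continue the expression, such as ';'. */
int eval_line (const char *text)
{
    assert (text != NULL);
    pos = text;
    return eval_exp ();
}
```

With this sketch, eval_line ("2+3*4;") gives 14 because eval_exp can only reach * through eval_term, and eval_line ("10-4-3;") gives 3 because the loop in eval_exp subtracts from left to right.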
The problem with encoding precedence and associativity in the grammar rules is that it tends to complicate the grammar and constrain language design. Yacc has an alternative mechanism that decouples the declarations of precedence and associativity from the grammar, using "%left", "%right" and "%nonassoc". e.g.:
%nonassoc '='           /* lowest precedence                */
%left '+' '-'           /*   |                              */
%left '*' '/'           /*   |  increasing precedence       */
%right not              /*   v                              */
%right '^'              /* highest (i.e. exponentiation)    */
%%
exp     : exp '+' exp
        | exp '-' exp
        | exp '*' exp
        | exp '/' exp
        | number
        | '(' exp ')'
        | exp '^' exp
        | exp '=' exp
        | not exp
        ;
Yacc can also give different precedences to overloaded operators; e.g. if - is used both for negation and for subtraction and we want to give them different precedences (and negation the same precedence as for not):
| '-' exp %prec not
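The effect of these declarations can be imitated by hand with a table of precedences and associativities, which may help to show what Yacc is being told. The C sketch below uses the "precedence climbing" technique; it is not how Yacc itself resolves the conflicts. All the names in it (eval, binary_ops, UNARY_PREC and so on) are hypothetical, and it makes three assumptions not in the grammar above: '=' is treated as an equality test giving 1 or 0, "not" is written as '!', and ^ with a negative exponent gives 1.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum assoc { LEFT, RIGHT, NONASSOC };
struct op_info { char symbol; int prec; enum assoc assoc; };

/* One entry per binary operator, in the same order as the %nonassoc,
   %left and %right lines: a larger prec binds more tightly. */
static const struct op_info binary_ops[] = {
    { '=', 1, NONASSOC },
    { '+', 2, LEFT },  { '-', 2, LEFT },
    { '*', 3, LEFT },  { '/', 3, LEFT },
    { '^', 5, RIGHT }
};
enum { UNARY_PREC = 4 };  /* level of "not"; unary '-' shares it (%prec not) */

static const char *pos;   /* next unread character */
static int failed;        /* set to 1 on any error */

static const struct op_info *find_op (char c)
{
    size_t i;
    for (i = 0; i < sizeof binary_ops / sizeof binary_ops[0]; i++)
        if (binary_ops[i].symbol == c)
            return &binary_ops[i];
    return NULL;
}

static char peek (void)   /* skip white space, return next character */
{
    while (*pos == ' ' || *pos == '\t' || *pos == '\n')
        pos++;
    return *pos;
}

static int apply (char op, int a, int b)
{
    int r = 1;
    switch (op) {
    case '=': return a == b;
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': if (b == 0) { failed = 1; return 0; }
              return a / b;
    default:  while (b-- > 0) r *= a;      /* '^' */
              return r;
    }
}

static int parse_expr (int min_prec);

static int parse_operand (void)   /* number, '(' exp ')', or unary op */
{
    int value = 0;
    char c = peek ();
    if (c == '-' || c == '!') {
        pos++;
        value = parse_expr (UNARY_PREC);
        return c == '-' ? -value : !value;
    }
    if (c == '(') {
        pos++;
        value = parse_expr (1);
        if (peek () == ')') pos++; else failed = 1;
        return value;
    }
    if (!isdigit ((unsigned char) c)) { failed = 1; return 0; }
    while (isdigit ((unsigned char) *pos))
        value = value * 10 + (*pos++ - '0');
    return value;
}

/* Parse operators whose precedence is at least min_prec. */
static int parse_expr (int min_prec)
{
    int lhs = parse_operand ();
    const struct op_info *op;
    while (!failed && (op = find_op (peek ())) != NULL && op->prec >= min_prec) {
        int rhs;
        pos++;
        /* right-associative: the right operand may reuse the same level */
        rhs = parse_expr (op->assoc == RIGHT ? op->prec : op->prec + 1);
        lhs = apply (op->symbol, lhs, rhs);
        if (op->assoc == NONASSOC) {       /* a = b = c is an error */
            const struct op_info *next = find_op (peek ());
            if (next != NULL && next->prec == op->prec) failed = 1;
        }
    }
    return lhs;
}

/* Returns 1 and stores the value in *result if all of text is a valid
   expression, otherwise returns 0. */
int eval (const char *text, int *result)
{
    assert (text != NULL && result != NULL);
    pos = text;
    failed = 0;
    *result = parse_expr (1);
    return !failed && peek () == '\0';
}
```

In this sketch "2^3^2" evaluates to 512 (^ groups to the right), "-2^2" to -4 and "-2*3" to -6 (negation sits between * and ^, as "%prec not" requests), while "1=1=1" is rejected because '=' is non-associative. Changing the language only means editing the table, which is the point of decoupling the declarations from the grammar.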
Yacc has other facilities, but those described above are among the most important. You can refer to the Yacc manual in the departmental library or at URL http://www.cs.man.ac.uk/~pjj/cs2111/yacc/yacc.html if necessary.