next up previous contents
Next: YACC: Further usage Up: CS2111: Design and Implementation Previous: LEX   Contents

Subsections


YACC


YACC program for an infix calculator

We will keep the same operators as the postfix calculator, but we will give them their usual associativity (left), precedence (* and / before + and -) and we will also need brackets. ($CS2111/e*/infix1/*)

 %{ /* C declarations used in actions */
 #include <stdio.h>
 %}

 /* yacc definitions */

 %union {int a_number;}
 %start line
 %type <a_number> exp term factor number digit

 %%
 /*descriptions of expected inputs       corresponding actions (in C)*/

 line   : exp ';' '\n'                   {printf ("result is %d\n", $1);}
        ;
 exp    : term                           {$$ = $1;}
        | exp '+' term                   {$$ = $1 + $3;}
        | exp '-' term                   {$$ = $1 - $3;}
        ;
 term   : factor                         {$$ = $1;}
        | term '*' factor                {$$ = $1 * $3;}
        | term '/' factor                {$$ = $1 / $3;}
        ;
 factor : number                         {$$ = $1;}
        | '(' exp ')'                    {$$ = $2;}
        ;
 number : digit                          {$$ = $1;}
        | number digit                   {$$ = $1*10 + $2;}
        ;
 digit  : '0'                            {$$ = 0;}
        | '1'                            {$$ = 1;}
        | '2'                            {$$ = 2;}
        | '3'                            {$$ = 3;}
        | '4'                            {$$ = 4;}
        | '5'                            {$$ = 5;}
        | '6'                            {$$ = 6;}
        | '7'                            {$$ = 7;}
        | '8'                            {$$ = 8;}
        | '9'                            {$$ = 9;}
        ;

 %%                        /* C code */

 int main (void) {return yyparse ( );}

 int yylex (void) {return getchar ( );}

 void yyerror (char *s) {fprintf (stderr, "%s\n", s);}

Format of the grammar rules for YACC


name    : names and 'single character's
        | alternatives
        ;

YACC definitions

%start line means the whole input should match line
%union lists all possible types for values associated with parts of the grammar and gives each a field-name
%type gives an individual type for the values associated with each part of the grammar,
  using the field-names from the %union declaration
<>

Actions, C declarations and code

$$ resulting value for any part of the grammar
$1, $2, etc. values from sub-parts of the grammar
yyparse routine created by YACC from (expected input, action) lists.
  (It actually returns a value indicating if it failed to recognise the input.)
yylex routine called by yyparse for all its input.
  We are using getchar, which just reads characters from the input.
yyerror routine called by yyparse whenever it detects an error in its input.
<>

How YACC is used

BYACC : calc.y $\rightarrow$ calc.c
GCC : calc.c $\rightarrow$ calc
calc : expression $\rightarrow$ result

YACC has other facilities, some of which we will use elsewhere, but those described above are among the most important. Further details and examples can be found in the readings ($\S $4.10).


Using LEX and YACC together

Unfortunately, YACC cannot represent numbers as [0-9]+ nor easily obtain the corresponding value, nor can it easily be used to ignore white space and comments. Therefore, we need to use both LEX and YACC together; LEX for the simple parts (e.g. numbers, white space, comments) and YACC for more complex parts (e.g. expressions).

YACC code for postfix calculator using LEX and LYACC ($CS2111/e*/infix2/*):


 %{
 #include <stdio.h>
 %}

 %union {int a_number;}
 %start line
 %token <a_number> number
 %type <a_number> exp term factor

 %%

 line   : exp ';'          {printf ("result is %d\n", $1);}
        ;
 exp    : term             {$$ = $1;}
        | exp '+' term     {$$ = $1 + $3;}
        | exp '-' term     {$$ = $1 - $3;}
        ;
 term   : factor           {$$ = $1;}
        | term '*' factor  {$$ = $1 * $3;}
        | term '/' factor  {$$ = $1 / $3;}
        ;
 factor : number           {$$ = $1;}
        | '(' exp ')'      {$$ = $2;}
        ;
 %%

 int main (void) {return yyparse ( );}

 void yyerror (char *s) {fprintf (stderr, "%s\n", s);}
LEX code for postfix calculator using LEX and YACC ($CS2111/e*/infix2/*):

 %{
 #include "y.tab.h"
 %}

 %%

 [0-9]+                {yylval.a_number = atoi(yytext); return number;}
 [ \t\n]               ;
 [-+*/( );             {return yytext[0];}
 .                     {ECHO; yyerror ("unexpected character");}

 %%

 int yywrap (void) {return 1;}

YACC declarations

%token declare each grammar rule used by YACC that is recognised by LEX and give type of value
<>

LEX declarations and actions

y.tab.h gives LEX the names and type declarations etc. from YACC
yylval name used for values set in LEX e.g.
  yylval.a_number = atoi (yytext);
  yylval.a_name = findname (yytext);
<>

How LEX and YACC are used together:

FLEX : calcl.l $\rightarrow$ calcl.c
GCC : calcl.c $\rightarrow$ calcl.o
BYACC : calcy.y $\rightarrow$ calcy.c
GCC : calcy.c $\rightarrow$ calcy.o
LD : calcl.o calcy.o $\rightarrow$ calc
calc : expression $\rightarrow$ result

Language design and implementation

Using a tool like YACC, infix, postfix and prefix expressions are equally simple to implement - it automatically checks that we have the correct number and layout of operands. We will see in the next section that YACC can also cope with precedence and associativity.

However, we must be careful if we are evaluating expressions that contain lazy operators ($\S $2.6) as the example calculators in $\S $3 and $\S $4 above are eager. To deal with lazy operators, we have to wait until the expression has been completely recognised before we start to evaluate it. Typically, we generate parse trees ($\S $8) or code ($\S $10) that can be used later to evaluate the expressions.

Exercises

($\dag $ indicates harder problems)


Readings

Louden: chs. 4.2, 4.6
Johnson
Levine, Mason & Brown: chs. 1, 3, 7
Capon & Jinks: chs. 8.1, 8.4, 8.5
Aho, Sethi & Ullman: chs. 4.9, 2.1-2.5
next up previous contents
Next: YACC: Further usage Up: CS2111: Design and Implementation Previous: LEX   Contents
Pete Jinks
1999-09-30