DURATION 1 lab session


This exercise is designed to familiarise you with the use of flex and byacc together so that, when you come to use them later, you can concentrate on the important aspects of the problem.


On successful completion of this exercise a student will:
Be able to use flex and byacc together to recognise patterns in a piece of text, and in particular be able to use the communication between them correctly.
Be able to use flex and byacc together to write a simple compiler for expressions, that inputs a program written in a tiny computer language and generates equivalent code for an interpreter.
Be aware of infix and postfix notations for expressions, and how to convert an expression written in either to the other form.
Be able to make simple use of two calculator programs, bc and dc.


To create an infix-to-postfix filter using flex and byacc, which turns out to be a simple example of code generation.


dc is a calculator program available as part of the unix utilities. Unfortunately, it uses reverse-polish notation (postfix). bc is a front-end for dc that reads infix expressions from the user, translates them into postfix, and hands them on to dc, which performs the arithmetic. (Information about infix and postfix notations is available via the web pages.) You should read the man information for bc and dc and try running them both. You are going to create a simple version of bc, using flex and byacc.

Just like the previous exercise, get started using the "makefile" but this time from "$CS2121/p*/ex4" i.e. make sure you are in your directory CS2121/ex4, and then type:

        cp $CS2121/p*/ex4/makefile .
	make start
For this exercise, this copies byacc and flex programs, and test data for each part ( 1.data , 2.data , 3.data , 4.data ).

This is the flex and byacc calculator used in the lectures, slightly changed to be usable in a pipeline. You can compile and run it against the test data for part 1 by:

	make 1
but (once compilation is finished) you will only see:
	looking at generated code
	./bc <1.data
	result is 7
	syntax error
	make: *** [1] Error 1
until you have modified the grammar descriptions and actions in the byacc file as outlined below.

a) Simple compilation

Part 1 Start by modify the grammar rules to allow more than one calculation, each consisting of an expression followed by a semicolon.
Next, as you don't want to calculate directly in your bc but instead want to output commands to dc instead, you should get rid of the %type declaration and replace all the existing actions in the byacc file. To convert infix to postfix, your new actions must:
* output each number as it is recognised by byacc (i.e in the rule for a factor).
* output each operator after the pair of numbers (or terms or factors or whatever) that it operates on i.e. by the byacc rules for expression and term.
* output "p\n" (to print the answer) after recognising each ";" i.e. by the byacc rule for input.
* output "q\n" (to exit dc) at the end of input. To do this, you need an extra rule that matches all the input and then outputs "q\n" as its action. Set the "%start" rule to be this new rule.

For example, (infix) input of:

        1 * 2 + 3 / 4 ;
        1 * ( 2 + 3 ) / 4 ;
should generate (postfix) output of:
        1 2 * 3 4 / + p
        1 2 3 + * 4 / p

b) Improving the language

This section contains problems that are more difficult than those in the previous section, and where marks are harder to obtain. You should complete section (a) before attempting this section. If you are running out of time, you would do better to start exercise 5.

Save a copy of your answer to part 1 before you modify it for part 2.

Part 2 Modify the grammar, in bcy.y, to allow for monadic plus and minus operators e.g. inputs like:

1 ; + 2 ; - 3 ; - ( - 4 ) ;
(which should output: 1 2 -3 4)

You should not try to read "+ 2" or "- 3" as signed numbers; these are the "+" monadic operator (i.e. no action) acting on a sub-expression with the value "2", and the "-" monadic operator (i.e. negate) acting on a sub-expression with the value "3".

You will probably need to refer to "man dc" to find out about its "r" operation.

Save a copy of your answer to part 2 before you modify it for part 3.

Part 3 Modify the grammar, in bcl.l and bcy.y, to allow for the use of variables with names consisting of a single upper-case or lower-case letter. e.g. inputs like:

a = 1 + 2 ; a + 1 ;
b = a * a ; b - 1 ;
D = E = 1 + 1 ; D * E ;
(which should output: 3 4 9 8 2 4)

The variables "a" to "z" and "A" to "Z" are already built into dc. You will probably need to refer to "man dc" to find out about its "d", "s" and "l" operations.

Save a copy of your answer to part 3 before you modify it for part 4.

Part 4 Modify the grammar, in bcl.l, to allow for the use of variables with names consisting of any upper-case or lower-case letters. e.g. inputs like:

One = two = THREE = 4 ;

You should write a "lookup" routine in c, and include it in bcl.l. This routine should maintain a simple list of names (pointers to strings). You can implement the list by using a fixed-size array and a count of how many elements are currently in use. Each time "lookup" is given a name, it should search the list for it (using "strcmp"), append it to the list if it is not already present (using "strdup"), and then return a variable-name suitable for dc - e.g. return "A" for the first name, "B" for the second name, etc. You may assume that there will not be more than 26 different names in the list.


This exercise is worth 10 marks in total. It is necessary to produce working programs to demonstrate that you have achieved the Learning Outcomes of this exercise. However, it is also important that you are able to show that you understand the details of what you have written. You are therefore required to demonstrate your work to a member of the laboratory staff.


Demonstrate your working program to a demonstrator or lab supervisor by typing "make marka" for section (a) and "make markb" for section (b).

You should labmail the file "labmail_me", containing all your regular expressions and lex programs, which you should create by typing "make labmail".

Your program for part 1 will be marked as follows:
1 mark - modifying original types and actions
1 mark - modifying program to deal with multiple expressions
2 marks - generating numbers correctly
1 mark - generating operators correctly
1 mark - generating p command correctly
1 mark - generating q command correctly

Section (b) will be marked as follows:
1 mark - part 2 working correctly
1 mark - part 3 working correctly
1 mark - part 4 working correctly