


Building a Compiler: generating the output

Code generation

input:
enhanced parse trees or equivalent for executable code
output:
object code

Actions can be embedded directly in YACC, or we can make passes over parse trees created earlier.

code generator tools

There are no tools as ubiquitous and as useful as LEX and YACC, but there are various possibilities which may help:

executor program

Use an existing code generator, or create a simple code generator and write a simple executor program to obey its output (a sketch of such an executor follows this list).
run-time is slow
easy to write

back-end translator

Use a simple code generator in the compiler, or even just create parse trees, and run the output through a second translator that creates real machine code:
compilers are easier to write
need to create an extra (back-end) translator
can write one compiler per language and one back-end translator per computer, and pair them to obtain all possible combinations, i.e. O(n+m) translators instead of O(n*m).

automatic generation of back-end translator

Given the formal semantics of the input and output, create the back-end translator automatically: current research, not completely successful yet.
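
As an illustration of the executor-program option above, here is a minimal sketch of an interpreter for a tiny stack-machine code; the instruction set and names are invented for this example, not taken from the course. The compiler emits a sequence of simple instructions and the executor obeys them one at a time:

  #include <stdio.h>

  /* assumed instruction set for a tiny stack machine */
  enum { PUSH, LOAD, STORE, ADD, MUL, HALT };
  struct instr { int op; int arg; };   /* arg: a constant or a variable number */

  int memory[100];                     /* the translated program's variables */

  void execute(const struct instr *pc)
  {
      int stack[100], sp = 0;          /* evaluation stack */
      for (;; pc++)
          switch (pc->op) {
          case PUSH:  stack[sp++] = pc->arg;          break;
          case LOAD:  stack[sp++] = memory[pc->arg];  break;
          case STORE: memory[pc->arg] = stack[--sp];  break;
          case ADD:   sp--; stack[sp-1] += stack[sp]; break;
          case MUL:   sp--; stack[sp-1] *= stack[sp]; break;
          case HALT:  return;
          }
  }

  int main(void)
  {   /* code for: variable 0 = 2*3 + 4 */
      struct instr code[] = { {PUSH,2}, {PUSH,3}, {MUL,0},
                              {PUSH,4}, {ADD,0}, {STORE,0}, {HALT,0} };
      execute(code);
      printf("%d\n", memory[0]);       /* prints 10 */
      return 0;
  }

Such an executor is easy to write, but the interpretation overhead on every instruction is why this approach runs slowly.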

Code optimisation

There are various sorts of code optimisation. Some (global) normally take place before code generation, during repeated scans of the entire program or of large parts of it, e.g. using parse trees; others (peephole) take place during a single scan of the code, looking at short sequences of instructions while or just after the code is generated.

Both global and peephole optimisations can be further classified into machine dependent (e.g. looking at register usage) and machine independent (e.g. looking at the order of evaluation of expressions) optimisations.

machine dependent peephole
e.g. optimise register loading: keep track of what is in the registers and avoid reloading values that are already there. e.g. for A= B; C= B; D= A*E; (where acc= X loads X into the accumulator, acc=> X stores the accumulator into X, and the bracketed comments show which variables' values the accumulator is known to contain):

  acc= B         [acc = B
  acc=> A        [acc = B,A
  acc= B         unnecessary
  acc=> C        [acc = B,A,C
  acc= A         unnecessary
  acc* E         [acc = ?
  acc=> D        [acc = D
machine independent peephole
e.g. constant folding: expressions containing constants can be partially evaluated at compile time. This is especially true of address calculations e.g. A[B+1] - we can add the offset corresponding to the constant 1 to the address of A at compile time (see the sketch after this list).
machine dependent global
e.g. optimising variable storage: find the most frequently used variables (per program or routine or loop or whatever) and keep them in spare registers.
machine independent global
e.g. loop optimisations:
loop invariants
remove as much calculation as possible from the body of the loop, to be performed once and for all before the loop starts.
induction variables
e.g.

  for (i= 0; i<10; i++) a[i]= 0;
  becomes (assuming 2-byte elements, so i now counts in bytes, i.e. is already scaled):
  for (i= 0; i<20; i+=2) *(a+i) /* but i already scaled! */ = 0;
  or, replacing i by a hidden counter and stepping the pointer itself:
  for (?= 0; ?<10; ?++) *(a++)= 0;
loop unrolling
e.g.

  for (?= 0; ?<5; ?++) {*(a++)= 0; *(a++)= 0;}
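
As a rough sketch of constant folding (the node layout and routine name are assumptions for this example, not the course's actual tree routines), an optimisation pass can walk an expression tree bottom-up and replace any operator node whose operands are both constants by a single constant node:

  #include <stdlib.h>

  /* assumed parse-tree node: a constant, or a binary operator */
  struct node {
      char op;                     /* '+', '*', ... or 'k' for a constant */
      int value;                   /* used only when op == 'k' */
      struct node *left, *right;
  };

  struct node *fold(struct node *n)
  {
      if (n == NULL || n->op == 'k')
          return n;
      n->left = fold(n->left);     /* fold the operands first */
      n->right = fold(n->right);
      if (n->left->op == 'k' && n->right->op == 'k'
          && (n->op == '+' || n->op == '*')) {     /* other operators: similar */
          int l = n->left->value, r = n->right->value;
          int result = (n->op == '+') ? l + r : l * r;
          free(n->left); free(n->right);
          n->op = 'k'; n->value = result;          /* node becomes a constant */
          n->left = n->right = NULL;
      }
      return n;
  }

The same idea applies to the address calculation for A[B+1]: the constant part of address(A) + (B+1)*size, i.e. address(A) + size, can be computed once at compile time.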
Optimisation is expensive (in machine time) so make efficient use of it e.g.:

Making best use of registers

Although the simple methods described in §10 generate the best possible code for simple computers, they do not make the best use of sets of registers, as more complex access patterns are possible with a set of registers than with a stack. Also, if different registers are being used, modern computers can often execute more than one instruction at once.

e.g. suppose we are using
S+= U*V + X*Y;
to sum values in a loop, and Y is invariant but U, V, and X are changing. We would at least want to keep Y and S in registers (regY and regS below):


     reg1 <- U                    }          these pairs
     reg2 <- regY * X             } }        of instructions
     reg1 <- reg1 * V               } }      could be
     regS <- reg2 + regS              }      overlapped
     regS <- reg1 + regS
Outline algorithm:
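
One possible outline, given here only as an assumed sketch rather than the method intended above: count how often each variable is used inside the loop, allocate the spare registers greedily to the most frequently used variables, load them before the loop and store any that were assigned to after it.

  #define NVARS 64
  #define NREGS 4

  int usage[NVARS];          /* references to each variable inside the loop */
  int reg_of[NVARS];         /* register allocated to each variable, or -1 */

  void allocate_registers(void)
  {
      for (int v = 0; v < NVARS; v++)
          reg_of[v] = -1;
      for (int r = 0; r < NREGS; r++) {
          int best = -1;     /* unallocated variable with the highest usage */
          for (int v = 0; v < NVARS; v++)
              if (reg_of[v] == -1 && usage[v] > 0
                  && (best == -1 || usage[v] > usage[best]))
                  best = v;
          if (best == -1)
              break;         /* fewer busy variables than spare registers */
          reg_of[best] = r;
          /* emit "reg r <- variable" before the loop, and "variable <- reg r"
             after it if the variable is assigned to inside the loop */
      }
  }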

Constructing a compiler from tasks: Passes

It is normally best to go for the fewest number of passes, but extra passes may be forced on us: the analysis may be more complex (e.g. not everything is declared before it is used), the later phases, particularly code generation and optimisation, may require extra passes, or some tools we are using (e.g. back-end translators) may require them. We might even use a pipeline if we do not have enough memory to fit the whole compiler in at once (c.f. §8.1).

e.g. 1-pass compiler: use LEX and YACC, write a set of routines to create and use a dictionary, and to generate (and optimise) code, and call the routines from the YACC actions (as we have seen in examples).

e.g. multi-pass compiler: use LEX and YACC and write semantic and code-generation routines as above, but also write tree-creation and tree-walking routines. Call the tree-creation routines from YACC, then, after yyparse returns, call the tree-walking routines from the main program to perform semantic analysis and code generation in one or more traversals of the tree (sketched below).
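
A minimal sketch of such a tree-walking pass, assuming a simple node layout and a stack-machine target (both invented for this example): a post-order traversal emits the code for each operand before the code for its operator.

  #include <stdio.h>

  /* assumed tree node built by the YACC actions */
  struct tnode {
      char op;                    /* '+', '*', 'k' (constant) or 'v' (variable) */
      int value;                  /* constant value, or variable number */
      struct tnode *left, *right;
  };

  void gencode(const struct tnode *n)     /* called from main after yyparse */
  {
      switch (n->op) {
      case 'k': printf("PUSH %d\n", n->value); break;
      case 'v': printf("LOAD %d\n", n->value); break;
      default:                    /* binary operator: operands first */
          gencode(n->left);
          gencode(n->right);
          printf(n->op == '+' ? "ADD\n" : "MUL\n");
          break;
      }
  }

Further traversals of the same tree could perform semantic checks or the global optimisations described above before this final code-generating walk.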

e.g. multi-pass compiler: create a pipeline, with a separate program for each of the analysis and coding phases.

Exercises

Readings

Capon & Jinks: chs. 3.4-6, 12.1, 10, 14
Aho, Sethi & Ullman: chs. 2.8-9, 8, 9.9-10, 10.1-2, 11
Pete Jinks
1999-09-30