Next: Control structures
Up: CS2111: Design and Implementation
Previous: Language Design
  Contents
Expressions: Code Generation
Now that we have our parse trees, what are we going to do with them? If we are
processing expressions (and maybe also assignments) then often we want to
generate code from them: either actual machine code, or perhaps some form of
interpreted code.
Each different type of computer has a different type of assembly code, so
what general principles can we pick out? The following classification will
indicate a method of code generation for most, but by no means all,
computers. The method selected will usually need refining to produce code of
a reasonable quality.
- the communication between main memory and the arithmetic unit in the CPU:
  - some computers have to put values from main memory into a small, fast
    memory within the CPU before they can be used
  - most computers can normally use one operand in main memory per instruction
  - some computers can use more
- the fast memory within the CPU used for arithmetic:
  - a single accumulator register
  - several arithmetic registers
  - a stack
Only five of the nine possible combinations normally occur:

                          main memory (operands)
    arithmetic            none          one            many
    stack       (0)       RPN
    accumulator (1)                     MU0
    registers   (2&3)     Load-Store    680x0,80x86    CISC
We will ignore types, and any actions necessary to turn a variable name into
an address; we will examine these in later sections.
Also, we will not try to write actual code, such as ADD S, as that varies
greatly between computers; instead we show the underlying action:
    acc <- acc + S
The simplest style of code to generate uses a stack. This is straightforward
for computers with a stack in the CPU.
For computers with a single arithmetic register, we use this register as if
it is the top of the stack, and keep the rest of the stack in main memory.
If we have a stack-pointer register this is simple; if not (as with MU0) we
put the first value into location (e.g.) 1001, the second into 1002, and so
on.
For computers with a set of arithmetic registers, we can put the first value
into register one, the next into register two, and so on.
If we run out of registers we can overflow into main memory as above. We are
ignoring features found in many modern register-based computers that we
should exploit to produce better code:
- special instructions: move, increment, operate into memory etc.
- a set of registers can be accessed in more complex patterns than a
stack, particularly by 3-operand instructions.
- executing several instructions simultaneously: pipelining, multiple
arithmetic units, VLIW etc.
These problems are discussed in a later section about code optimisation.
Code sequence for S = U*V + X*Y on an RPN/postfix/stack computer.
No direct link between main memory and arithmetic.
Fast CPU memory is a stack.

    push U
    push V
    *          [pop twice, multiply, push result]
    push X
    push Y
    *
    +
    pop S
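To check this sequence, here is a minimal Python sketch of such a stack
machine; the values of U, V, X and Y are invented purely for illustration.

```python
# A tiny stack machine: push/pop move values between "memory" and the stack,
# binop pops two values, operates, and pushes the result.
memory = {"U": 2, "V": 3, "X": 4, "Y": 5}   # made-up values
stack = []

def push(name):
    stack.append(memory[name])

def pop(name):
    memory[name] = stack.pop()

def binop(fn):
    b = stack.pop()          # pop twice ...
    a = stack.pop()
    stack.append(fn(a, b))   # ... operate, push result

# the sequence from the box above
push("U"); push("V"); binop(lambda a, b: a * b)
push("X"); push("Y"); binop(lambda a, b: a * b)
binop(lambda a, b: a + b)
pop("S")
print(memory["S"])  # 2*3 + 4*5 = 26
```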
Code sequence for S = U*V + X*Y on a Load-Store (RISC, CDC 6600/7600 etc.)
computer.
No direct link between main memory and arithmetic.
Fast CPU memory is a set of registers.

    reg1 <- U
    reg2 <- V
    reg1 <- reg1 * reg2
    reg2 <- X
    reg3 <- Y
    reg2 <- reg2 * reg3
    reg1 <- reg1 + reg2
    reg1 -> S
Code sequence for S = U*V + X*Y on a 680x0 (or 80x86, some RISC chips etc.)
computer.
Direct link between main memory and arithmetic, that can deliver one value per
instruction.
Fast CPU memory is registers.

    reg1 <- U
    reg1 <- reg1 * V
    reg2 <- X
    reg2 <- reg2 * Y
    reg1 <- reg1 + reg2
    reg1 -> S
Code sequence for S = U*V + X*Y on MU0.
Direct link between main memory and arithmetic, that can deliver one value per
instruction.
Fast CPU memory is only a single register, the accumulator.

    acc <- U
    acc <- acc * V
    acc -> stack     [i.e. push]
    acc <- X
    acc <- acc * Y
    acc <- acc + unstack     [i.e. pop]
    acc -> S
Code sequence for S = U*V + X*Y on a CISC (e.g. VAX) computer.
Direct link between main memory and arithmetic, that can deliver several values
per instruction.
Fast CPU memory is registers.

    reg1 <- U * V
    reg2 <- X * Y
    S <- reg1 + reg2
..
($CS2111/e*/codegen/*
)
The simplest algorithms do not even use parse trees: as each operand or
operator in an expression is analysed, generate the corresponding push/load or
arithmetic instruction. If the expression is in an assignment statement,
generate a pop/store at the end. (c.f. lab exercise 4)
This can be suitable for generating code for later interpretation. However,
if we do not use parse trees, it is hard to generate anything but the most
naive code, suitable only for simple machines. In general, we walk the parse
tree bottom-up, left-to-right to generate the code. The example assignment
statement used above gives rise to a parse tree with the assign node at the
root, the operand S as its left child, and the + operator (whose children are
the two * subtrees) as its right child.
    walk (tree) is switch (tree->nodekind)
      case assign:
        walk (tree->rhs)
        plant (pop tree->lhs.operand)
      case operator:
        walk (tree->lhs)
        walk (tree->rhs)
        plant (tree->operator)
      case operand:
        plant (push tree->operand)

Code generation algorithm for RPN computers (similar to 1-pass algorithm)
(each case should be terminated by a break)
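This walk transcribes almost directly into a runnable language. Here is a
Python sketch in which tree nodes are tuples and plant appends instruction
strings to a list; the node representation is our illustrative choice, not
part of the notes.

```python
# RPN tree walk: nodes are
#   ("assign", lhs_name, rhs_tree), ("op", symbol, lhs, rhs), ("operand", name)
code = []

def plant(instr):
    code.append(instr)

def walk(tree):
    kind = tree[0]
    if kind == "assign":
        walk(tree[2])                 # generate code for the rhs first
        plant("pop " + tree[1])       # then store into the lhs
    elif kind == "op":
        walk(tree[2])
        walk(tree[3])
        plant(tree[1])                # operator works on top two stack items
    elif kind == "operand":
        plant("push " + tree[1])

# parse tree for S = U*V + X*Y
tree = ("assign", "S",
        ("op", "+",
         ("op", "*", ("operand", "U"), ("operand", "V")),
         ("op", "*", ("operand", "X"), ("operand", "Y"))))
walk(tree)
print("\n".join(code))
```

The output reproduces the RPN code sequence shown earlier for this statement.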
    walk (tree) is switch (tree->nodekind)
      case assign:
        initialise /*global variable*/ n to e.g. 0
        walk (tree->rhs)
        plant (reg n -> tree->lhs.operand)
      case operator:
        walk (tree->lhs)
        walk (tree->rhs)
        plant (reg n-1 <- reg n-1 tree->operator reg n); n--
      case operand:
        n++; plant (reg n <- tree->operand)

Code generation algorithm for Load-Store computers (similar to RPN)
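The same tuple representation gives a Python sketch of the Load-Store walk,
with the register counter n kept as a global variable as in the pseudocode.

```python
# Load-Store tree walk: registers are used as a stack, counted by global n.
code = []
n = 0

def plant(instr):
    code.append(instr)

def walk(tree):
    global n
    kind = tree[0]
    if kind == "assign":
        n = 0                          # initialise the register counter
        walk(tree[2])
        plant(f"reg{n} -> " + tree[1])
    elif kind == "op":
        walk(tree[2])
        walk(tree[3])
        plant(f"reg{n-1} <- reg{n-1} {tree[1]} reg{n}")
        n -= 1                         # result now occupies one register
    elif kind == "operand":
        n += 1                         # claim the next register
        plant(f"reg{n} <- " + tree[1])

tree = ("assign", "S",
        ("op", "+",
         ("op", "*", ("operand", "U"), ("operand", "V")),
         ("op", "*", ("operand", "X"), ("operand", "Y"))))
walk(tree)
print("\n".join(code))
```

The output matches the Load-Store code sequence in the earlier box.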
    . . .
      case operator:
        walk (tree->lhs)
        if (tree->rhs.nodekind == operand)
          plant (reg n <- reg n tree->operator tree->rhs.operand)
        else
          walk (tree->rhs)
          plant (reg n-1 <- reg n-1 tree->operator reg n); n--
    . . .

Code generation algorithm for 680x0 computers
(optimised version of the Load-Store algorithm)
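Here is the same Python sketch with the operator case modified as above: when
the right-hand subtree is a plain operand, the memory operand is used directly
instead of being loaded into a register first.

```python
# 680x0-style tree walk: one memory operand allowed per instruction.
code = []
n = 0

def plant(instr):
    code.append(instr)

def walk(tree):
    global n
    kind = tree[0]
    if kind == "assign":
        n = 0
        walk(tree[2])
        plant(f"reg{n} -> " + tree[1])
    elif kind == "op":
        walk(tree[2])
        if tree[3][0] == "operand":
            # use the memory operand directly, saving a load and a register
            plant(f"reg{n} <- reg{n} {tree[1]} {tree[3][1]}")
        else:
            walk(tree[3])
            plant(f"reg{n-1} <- reg{n-1} {tree[1]} reg{n}")
            n -= 1
    elif kind == "operand":
        n += 1
        plant(f"reg{n} <- " + tree[1])

tree = ("assign", "S",
        ("op", "+",
         ("op", "*", ("operand", "U"), ("operand", "V")),
         ("op", "*", ("operand", "X"), ("operand", "Y"))))
walk(tree)
```

For S = U*V + X*Y this needs only two registers and six instructions, matching
the 680x0 box above.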
    walk (tree) is switch (tree->nodekind)
      case assign:
        acc= unused
        walk (tree->rhs)
        plant (acc -> tree->lhs.operand)
      case operator:
        walk (tree->lhs)
        if (tree->rhs.nodekind == operand)
          plant (acc <- acc tree->operator tree->rhs.operand)
        else
          walk (tree->rhs)
          plant (acc <- unstack tree->operator acc) /*see note below*/
      case operand:
        if (acc==used)
          plant (acc -> stack)
        plant (acc <- tree->operand)
        acc= used

Code generation algorithm for MU0 (similar to 680x0)
For symmetric operations we can plant e.g.:
    acc <- acc + unstack
but for asymmetric operations we need to plant e.g.:
    acc <- unstack - acc
which some, but not all, computers can do. Alternatively, we can plant e.g.:
    acc -> temp
    acc <- unstack
    acc <- acc - temp
(We might even be able to look ahead and decide to traverse the left and right
hand sides of the tree in the opposite order.)
    walk (tree, dest) is switch (tree->nodekind)
      case assign:
        initialise n
        walk (tree->rhs, tree->lhs.operand)
      case operator:
        save n
        if (tree->lhs.nodekind == operand)
          lh= tree->lhs.operand
        else
          /* ? if (dest == register) lh= dest else*/
          n++; lh= reg n
          walk (tree->lhs, lh)
        if (tree->rhs.nodekind == operand)
          rh= tree->rhs.operand
        else
          /* ? if (dest == register && dest != lh) rh= dest else*/
          n++; rh= reg n
          walk (tree->rhs, rh)
        plant (dest <- lh tree->operator rh)
        reset n
      case operand:
        plant (dest <- tree->operand) /*only used for e.g. S= T */
Code generation algorithm for CISC computers
(apply 680x0 to MU0 optimisation to both subtrees)
Although Load-Store computers can usually have 3 different operands in an
instruction, we have not exploited this. However, we have to for CISC, so we
use a destination parameter for walk, which indicates where the value of the
(sub)tree should be loaded.
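A Python sketch of the CISC walk with the destination parameter; operand
subtrees are used directly as memory operands, so the operand case only fires
for whole-operand right-hand sides such as S = T.

```python
# CISC-style tree walk: walk(tree, dest) plants 3-operand instructions,
# putting the (sub)tree's value wherever dest says.
code = []
n = 0

def plant(instr):
    code.append(instr)

def walk(tree, dest):
    global n
    kind = tree[0]
    if kind == "assign":
        n = 0
        walk(tree[2], tree[1])         # destination is the lhs variable itself
    elif kind == "op":
        saved = n                      # save n
        if tree[2][0] == "operand":
            lh = tree[2][1]            # memory operand used directly
        else:
            n += 1
            lh = f"reg{n}"
            walk(tree[2], lh)
        if tree[3][0] == "operand":
            rh = tree[3][1]
        else:
            n += 1
            rh = f"reg{n}"
            walk(tree[3], rh)
        plant(f"{dest} <- {lh} {tree[1]} {rh}")
        n = saved                      # reset n: those registers are free again
    elif kind == "operand":
        plant(f"{dest} <- {tree[1]}")  # only used for e.g. S = T

tree = ("assign", "S",
        ("op", "+",
         ("op", "*", ("operand", "U"), ("operand", "V")),
         ("op", "*", ("operand", "X"), ("operand", "Y"))))
walk(tree, None)
```

For S = U*V + X*Y this plants just the three instructions shown in the CISC
box above.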
Capon & Jinks: chs. 3.4, 12.1, 10
Aho, Sethi & Ullman: chs. 2.8, 8, 9.10
Pete Jinks
1999-09-30