CS2112: Exercises + Answers from $10 Expressions: Code Generation

* For each of the code generation schemes and machine types, what would be the code generated for each of the following:

x= a + b * c

push a	reg1 <- a         	reg1 <- a         	acc <- a
	                	                    	acc -> stack
push b	reg2 <- b         	reg2 <- b         	acc <- b
push c	reg3 <- c
*	reg2 <- reg2 * reg3	reg2 <- reg2 * c  	acc <- acc * c     	reg1 <- b * c
+	reg1 <- reg1 + reg2	reg1 <- reg1 + reg2	acc <- acc + unstack
pop x	reg1 -> x         	reg1 -> x         	acc -> x           	x <- a + reg1


x= a + b * (c + d)

push a	reg1 <- a         	reg1 <- a         	acc <- a
	                	                    	acc -> stack
push b	reg2 <- b         	reg2 <- b         	acc <- b
	                	                    	acc -> stack
push c	reg3 <- c         	reg3 <- c         	acc <- c
push d	reg4 <- d
+	reg3 <- reg3 + reg4	reg3 <- reg3 + d   	acc <- acc + d     	reg1 <- c + d
*	reg2 <- reg2 * reg3	reg2 <- reg2 * reg3	acc <- acc * unstack	reg1 <- b * reg1
+	reg1 <- reg1 + reg2	reg1 <- reg1 + reg2	acc <- acc + unstack
pop x	reg1 -> x         	reg1 -> x          	acc -> x           	x <- a + reg1


x= a * b + c

push a	reg1 <- a         	reg1 <- a       	acc <- a
push b	reg2 <- b
*	reg1 <- reg1 * reg2	reg1 <- reg1 * b	acc <- acc * b	reg1 <- a * b
push c	reg2 <- c
+	reg1 <- reg1 + reg2	reg1 <- reg1 + c	acc <- acc + c
pop x	reg1 -> x         	reg1 -> x       	acc -> x      	x <- reg1 + c


x= a + (b + (c + (d + e)))

push a	reg1 <- a         	reg1 <- a         	acc <- a
	                	                    	acc -> stack
push b	reg2 <- b         	reg2 <- b         	acc <- b
	                	                    	acc -> stack
push c	reg3 <- c         	reg3 <- c         	acc <- c
	                	                    	acc -> stack
push d	reg4 <- d         	reg4 <- d         	acc <- d
push e	reg5 <- e
+	reg4 <- reg4 + reg5	reg4 <- reg4 + e  	acc <- acc + e     	reg1 <- d + e
+	reg3 <- reg3 + reg4	reg3 <- reg3 + reg4	acc <- acc + unstack	reg1 <- c + reg1
+	reg2 <- reg2 + reg3	reg2 <- reg2 + reg3	acc <- acc + unstack	reg1 <- b + reg1
+	reg1 <- reg1 + reg2	reg1 <- reg1 + reg2	acc <- acc + unstack
pop x	reg1 -> x         	reg1 -> x         	acc -> x           	x <- a + reg1

+ The algorithms above assume that we are dealing with expressions in assignment statements. How can they be modified to cope with C, where assignment is just another operator, or with operators like += ?

e.g. x= a + (y= b * c)

push a	reg1 <- a         	reg1 <- a         	acc <- a
	                	                    	acc -> stack
push b	reg2 <- b         	reg2 <- b         	acc <- b
push c	reg3 <- c
*	reg2 <- reg2 * reg3	reg2 <- reg2 * c  	acc <- acc * c     	y <- b * c
pop y	reg2 -> y         	reg2 -> y         	acc -> y
push y	
+	reg1 <- reg1 + reg2	reg1 <- reg1 + reg2	acc <- acc + unstack
pop x	reg1 -> x         	reg1 -> x         	acc -> x           	x <- a + y
e.g. x+= a * b
push a	reg1 <- a         	reg1 <- a       	acc <- a
push b	reg2 <- b
*	reg1 <- reg1 * reg2	reg1 <- reg1 * b	acc <- acc * b	reg1 <- a * b
push x	reg2 <- x
+	reg1 <- reg1 + reg2	reg1 <- reg1 + x	acc <- acc + x
pop x	reg1 -> x         	reg1 -> x       	acc -> x      	x <- x + reg1
We might expect some optimisation, such as replacing:
 	reg1 <- reg1 + x
 	reg1 <- x
by:
 	x <- x + reg1
as in the 68000 instruction: ADD D1, x
and maybe replacing:
 	y <- b * c
 	x <- a + y
by:
 	reg1 <- b * y
 	reg1 -> y
 	x <- a + reg1
This doesn't cause any extra worries about the quality of the generated code, as we must optimise it anyway; the user is perfectly entitled to write "x= x + a * b" and the compiler should produce equally good code.

We can always expand

 e.g.	frotz[- -j + i++] += - -y;
 to	y = y - 1;
 	j = j - 1;
 	frotz[j + i] = frotz[j + i] + y;
 	i= i + 1;
so the real question is, can we avoid rewriting the parse tree? If we remember to perform the initialisation, currently in the "assign" case, at the start of the expression, then there is no reason why we can't deal with each "=" or "+=" operator, just as we deal with operators now, but generating code like that above.

What about expressions without assignment e.g. in control structures or function calls (or in SML)?

The existing algorithms should work, as long as we remember to perform any initialisation currently done at the "assign" node.

* For any computer you are familiar with, what special cases should we take into account for code optimisation?

e.g. 68000:
ADDQ, SUBQ, MOVEQ etc.
CLR x for x = 0 (similarly TST x etc.)
MOVE y,x for x = y
ADD Dn,x or ADDQ #1,x for x += expression or x++
DBF Dn,label
using ASL/R for */ a power of 2. (carefully!)
etc.

+ Do you know of any computer that would not fit reasonably well into the classification used above? How would you generate code for it?

I don't know of any such computers, but I am sure they exist.