Translating Computer Languages


Reading: Brookshear ch. 5.2 or Greenfield ch. 8.3

Assemblers & compilers translate programs for later execution by real hardware or by software interpreters. They are application-specific programs just like any other, best written in HLLs, especially those specific to the application area.

Translation

Assembler

simplifies the task of writing machine code programs

LLL program -> [ assembler ] -> binary program
(also produces: error messages, listings etc.)

1) build words from characters, discard spaces & comments

2) check legal statement

3) check user-defined names (e.g. labels),
keep list of names & addresses
4) translate [one-to-one]

Formally:
1) Lexical (word) analysis
2) Syntactic (sentence structure) analysis
3) Semantic (meaning) analysis
4) Code generation
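
The four steps can be sketched as a small two-pass assembler. This is only an illustration, not a definitive implementation: the opcode numbers, the 4-bit opcode / 12-bit address word layout, the `#` comment character and the function names are all assumptions, chosen to resemble the sample listing in the Compiler section of these notes.

```python
# Hypothetical opcode table: the numeric values are assumptions for illustration.
OPCODES = {"STO": 0x0, "LDA": 0x1, "ADD": 0x2, "SUB": 0x3, "JGE": 0x5, "STP": 0x7}

def words_of(line):
    """Step 1: build words from characters, discard spaces and '#' comments."""
    return line.split("#", 1)[0].split()

def translate(words, symbols):
    """Steps 2-4 for one statement: check it, look up names, emit one word."""
    op, operands = words[0], words[1:]
    if op not in OPCODES:
        if operands:
            raise ValueError(f"illegal statement: {' '.join(words)}")
        return int(op) & 0xFFFF              # data word, 16-bit two's complement
    if len(operands) > 1:
        raise ValueError(f"illegal statement: {' '.join(words)}")
    if not operands:
        return OPCODES[op] << 12             # instruction without an address
    if operands[0] not in symbols:
        raise ValueError(f"undefined name: {operands[0]}")
    return (OPCODES[op] << 12) | symbols[operands[0]]

def assemble(source):
    # Pass 1: keep a list of names and the addresses they stand for.
    symbols, statements = {}, []
    for line in source.splitlines():
        words = words_of(line)
        if words and words[0].endswith(":"):
            name = words.pop(0)[:-1]
            if name in symbols:
                raise ValueError(f"duplicate name: {name}")
            symbols[name] = len(statements)
        if words:
            statements.append(words)
    # Pass 2: translate one-to-one, one statement per machine word.
    return [translate(words, symbols) for words in statements]

PROGRAM = """
again: LDA ans     # ans = ans + 1
       ADD one
       STO ans
       LDA a       # a = a - b
       SUB b
       STO a
       LDA a
       JGE again   # repeat while a >= 0
       STP
a:     99
b:     6
ans:   -1
one:   1
"""

for address, word in enumerate(assemble(PROGRAM)):
    print(f"{address:2d}  {word:04X}")
```

Two passes are used because a name such as `ans` is referenced before the line that defines it, so its address is only known once the whole program has been read.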

Language definitions


Syntax + Lex: Representation

Semantics: meaning
In natural languages, sentences/statements only have meaning in a context.
Computers have no common sense or understanding as to what is going on, so context has to be carefully defined:
* definitions for single words (identifiers/names)
dictionary: the words of a language alphabetically arranged, with their meanings (type, location etc.)
* meanings for statements & structures
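
As a rough sketch of such a dictionary, the following keeps one entry per identifier and rejects names that are undeclared or used in the wrong way. The class and method names (`Symbol`, `Dictionary`, `declare`, `lookup`) and the example addresses are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Symbol:
    kind: str              # "var" or "label"
    type: Optional[str]    # e.g. "int"; labels have no type here
    location: int          # address the translator assigned to the name

class Dictionary:
    """Maps each user-defined name to its meaning (kind, type, location)."""

    def __init__(self):
        self._entries = {}

    def declare(self, name, kind, type_, location):
        if name in self._entries:
            raise NameError(f"{name} declared twice")
        self._entries[name] = Symbol(kind, type_, location)

    def lookup(self, name, expected_kind):
        if name not in self._entries:
            raise NameError(f"{name} not declared")
        symbol = self._entries[name]
        if symbol.kind != expected_kind:
            raise TypeError(f"{name} is a {symbol.kind}, not a {expected_kind}")
        return symbol

table = Dictionary()
table.declare("again", "label", None, 0)
table.declare("a", "var", "int", 9)
print(table.lookup("a", "var"))
```

Looking up `a` as a label, or any name that was never declared, raises an error, which is the kind of context check a translator has to make explicit.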

Compiler

Makes it seem as if the high-level language is the machine language.

HLL program -> [ compiler ] -> LLL or binary program
(also produces: error messages, listings etc.)

e.g.

/* division by repeated subtraction */
/* declarations */
 Label again;
 Var ans=-1, a=99, b=6;
/* commands */
 again:
 ans=ans+1; a=a-b;
 If a>=0 JumpTo again;
 Stop;
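
Lexical analysis of the program above could be sketched with a regular expression, as below. The token classes and the name `tokenize` are assumptions made for this example, not part of the language definition.

```python
import re

# One alternative per token class; the first group only matches text to discard.
TOKEN = re.compile(r"""
    (?P<skip>\s+|/\*.*?\*/)     # spaces and /* comments */ are discarded
  | (?P<word>[A-Za-z_]\w*)      # keywords and user-defined names
  | (?P<number>\d+)             # unsigned integer constants
  | (?P<symbol>>=|[=+\-,;:])    # operators and punctuation
""", re.VERBOSE | re.DOTALL)

def tokenize(text):
    """Build words from characters, discarding spaces and comments."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append(match.group())
        pos = match.end()
    return tokens

SOURCE = """
/* division by repeated subtraction */
/* declarations */
 Label again;
 Var ans=-1, a=99, b=6;
/* commands */
 again:
 ans=ans+1; a=a-b;
 If a>=0 JumpTo again;
 Stop;
"""

print(" ".join(f"[ {token} ]" for token in tokenize(SOURCE)))
```

Note that `-1` comes out as two tokens, `-` and `1`: deciding that the minus sign is a negation rather than a subtraction is left to the later stages.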
Same first 4 steps as for the assembler:
1) identify words etc. e.g.
[ Label ] [ again ] [ ; ] [ Var ] [ ans ] [ = ] [ - ] [ 1 ] [ , ] [ a ] [ = ] [ 99 ]
[ , ] [ b ] [ = ] [ 6 ] [ ; ] [ again ] [ : ] [ ans ] [ = ] [ ans ] [ + ] [ 1 ] [ ; ]
[ a ] [ = ] [ a ] [ - ] [ b ] [ ; ] [ If ] [ a ] [ >= ] [ 0 ] [ JumpTo ] [ again ]
[ ; ] [ Stop ] [ ; ]
2) check against grammar
3) check user-defined names
(declared, var/label, int/real etc.)
keep list of names, addresses, types etc.
4) translate [one-to-many] e.g.
 0	again:	LDA ans		100B
 1		ADD one		200C
 2		STO ans		000B
 3		LDA a		1009
 4		SUB b		300A
 5		STO a		0009
 6		LDA a		1009
 7		JGE again	5000
 8		STP		7000
 9	a:	99		0063
10	b:	6		0006
11	ans:	-1		FFFF
12	one:	1		0001
and:
5) Code optimisation
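
One simple optimisation that applies to the listing above is to drop a load that directly follows a store to the same name, such as the `LDA a` after `STO a`. The sketch below assumes that `STO` leaves the accumulator unchanged; the tuple representation and the function name are hypothetical.

```python
def remove_redundant_loads(code):
    """Drop 'LDA x' directly after 'STO x': the accumulator already holds x.

    Each instruction is a (label, operation, operand) tuple.
    """
    optimised = []
    for label, op, operand in code:
        previous = optimised[-1] if optimised else None
        redundant = (
            op == "LDA"
            and label is None                        # a labelled load may be a jump target
            and previous is not None
            and previous[1:] == ("STO", operand)
        )
        if not redundant:
            optimised.append((label, op, operand))
    return optimised

CODE = [
    ("again", "LDA", "ans"),
    (None, "ADD", "one"),
    (None, "STO", "ans"),
    (None, "LDA", "a"),
    (None, "SUB", "b"),
    (None, "STO", "a"),
    (None, "LDA", "a"),      # redundant: a is still in the accumulator
    (None, "JGE", "again"),
    (None, "STP", None),
]

for label, op, operand in remove_redundant_loads(CODE):
    print(f"{label + ':' if label else '':8}{op} {operand or ''}")
```

Because the optimisation works on symbolic names rather than numeric addresses, the data words that follow the code simply move up by one when the result is assembled.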

Libraries & Linker

Library: increases set of operations available to programmer.



Translate these operations once, separately from user programs.
Include list of operation names & addresses.

Linker searches list(s) of names & addresses to locate required operations and combines operations with user programs.


When a library is translated, its final location in memory is not yet known, so the linker also has to relocate the code by changing the addresses in it.

library =
* exported names & addresses
* imported names & where used
* relocation information
* code
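
A minimal sketch of what a linker does with these four parts follows. The `Module` layout, the 12-bit address field and the opcode values in the example words are assumptions made for illustration; real object file formats carry considerably more detail.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    code: list                                        # words, addresses relative to 0
    exports: dict = field(default_factory=dict)       # name -> offset in this module
    imports: dict = field(default_factory=dict)       # name -> offsets of words using it
    relocations: list = field(default_factory=list)   # offsets of words holding local addresses

def link(modules):
    # Decide where each module ends up and build the combined list of names.
    bases, addresses, next_free = [], {}, 0
    for module in modules:
        bases.append(next_free)
        for name, offset in module.exports.items():
            if name in addresses:
                raise ValueError(f"name exported twice: {name}")
            addresses[name] = next_free + offset
        next_free += len(module.code)
    # Copy the code, relocating local addresses and filling in imported ones.
    program = []
    for module, base in zip(modules, bases):
        words = list(module.code)
        for offset in module.relocations:
            words[offset] += base                     # address field assumed not to overflow
        for name, uses in module.imports.items():
            if name not in addresses:
                raise NameError(f"unresolved name: {name}")
            for offset in uses:
                words[offset] |= addresses[name]      # address field was left as 0
        program.extend(words)
    return program

# User program: LDA x; JGE show; x: 5   ('show' is imported from the library)
main = Module(code=[0x1002, 0x5000, 0x0005], imports={"show": [1]}, relocations=[0])
# Library: show: LDA k; STP; k: 10
library = Module(code=[0x1002, 0x7000, 0x000A], exports={"show": 0}, relocations=[0])

print([f"{word:04X}" for word in link([main, library])])
```

In this example the library is placed at address 3, so its internal reference to `k` is relocated from offset 2 to address 5, and the user program's jump is patched to point at `show`.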

Execution: Interpreter

Why interpret?
* limited resources (size, cost): normally no longer a problem
* programs that are only executed once (e.g. ksh scripts)
* very high level languages: but we are getting better at compiling & debugging them

interpreters are slow, simple & small
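
To illustrate how simple and small an interpreter can be, here is a sketch of a fetch-decode-execute loop for a hypothetical 16-bit accumulator machine. The opcode numbering and word layout are assumptions, and the memory image is a hand-assembled version of the division-by-repeated-subtraction example under those assumptions.

```python
def to_signed(word):
    """Read a 16-bit word as a two's-complement number."""
    return word - 0x10000 if word & 0x8000 else word

def run(image):
    memory, pc, acc = list(image), 0, 0
    while True:
        word = memory[pc]                              # fetch
        pc += 1
        opcode, address = word >> 12, word & 0x0FFF    # decode
        if opcode == 0x0:                              # STO: store accumulator
            memory[address] = acc & 0xFFFF
        elif opcode == 0x1:                            # LDA: load accumulator
            acc = to_signed(memory[address])
        elif opcode == 0x2:                            # ADD
            acc = to_signed((acc + memory[address]) & 0xFFFF)
        elif opcode == 0x3:                            # SUB
            acc = to_signed((acc - memory[address]) & 0xFFFF)
        elif opcode == 0x5:                            # JGE: jump if accumulator >= 0
            if acc >= 0:
                pc = address
        elif opcode == 0x7:                            # STP: stop
            return memory
        else:
            raise ValueError(f"illegal instruction {word:04X} at {pc - 1}")

IMAGE = [
    0x100B, 0x200C, 0x000B,   # again: ans = ans + 1
    0x1009, 0x300A, 0x0009,   #        a = a - b
    0x1009, 0x5000,           #        if a >= 0 jump to again
    0x7000,                   #        stop
    0x0063, 0x0006,           # a: 99, b: 6
    0xFFFF, 0x0001,           # ans: -1, one: 1
]

final = run(IMAGE)
print("ans =", to_signed(final[11]), " a =", to_signed(final[9]))
```

Every instruction costs several Python operations to fetch, decode and dispatch, which is where the slowness of interpretation comes from.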

Programming Environments