Next: Lexical analysis - Lex
Up: ho
Previous: Introduction
  Contents
Subsections
Assemblers and compilers translate for later execution by real hardware or by
software interpreters. They are application-specific programs just like any
other, best written in HLLs, especially those specific to the application
area
simplifies the task of writing machine code programs
LLL program |
error messages, listings etc.
binary program
|
|
- build words from characters, discard unimportant spaces & comments
- check legal statement
- check user-defined names (e.g. labels)
- need to keep list of names & addresses
- translate [one-to-one]
Formally:
- Lexical (word) analysis
- Syntactic (sentence structure) analysis
- Semantic (meaning) analysis
- Code generation
These steps (and some others mentioned below) are known as
phases - logical subdivisions of the overall translation task.
special notations e.g. Backus Normal Form (BNF)
In natural languages, sentences/statements only have meaning in a context.
Computers have no common sense or understanding as to what is going on,
so context has to be carefully defined:
- definitions for single words (identifiers/names)
dictionary: the words of a language
alphabetically arranged, with their meanings (type, location etc.)
- meanings for statements & structures
Makes it seem as if the high-level language is the machine language.
HLL program |
error messages, listings etc.
LLL or binary program
|
|
e.g.
/* division by repeated subtraction */
int main (void)
{
int ans=-1, a=99, b=6;
do
{ans=ans+1; a=a-b;}
while (a>=0);
return ans;
}
First 4 steps the same:
- identify words etc.
e.g
- check against grammar
- check user-defined names
(declared, var/label, int/real etc.)
- keep list of names, addresses, types etc.
- translate [many-to-many?] e.g.
main MVN a1, #0 ; a1 = ans
MOV a2, #&63 ; a2 = a
MOV a3, #6 ; a3 = b
L ADD a1, a1, #1
SUBS a2, a2, a3
BPL L
MOV pc, lr
- Code optimisation - before (Global), during, & after (Peephole) step 4.
Increases set of operations available to programmer, by giving expansions
for the more complicated operations.
Translate operations once, separately from user programs, to
create library.
Libraries include a list of operation names & addresses.
Searches libraries to locate required operations and
combines with user programs.
When creating library, don't know where it will end up, so linker also
has to relocate the code by changing
addresses in it.
- Execution
- obey real machine instructions directly on a real computer.
- Interpret
- obey instructions indirectly via another program.
Why interpret?
- resources e.g. cost,
size
- only execute once (e.g. ksh etc.)
- very high level languages
- but we get better at compiling & debugging them
interpreters are slow, simple & small
- The following is the syntax of a simple language, ECO:
name identifies an element of the language,
name : something ; defines the left hand name as equal to the right hand something,
| separates alternative definitions,
{ something } means that the enclosed something occurs none or more times,
"text" indicates characters present in an actual program in the language.
program : statement { statement } ;
statement : declarative ";" | executable ";" ;
declarative : type identifier { "," identifier } ;
type : "CARNIV" | "VEGET" | "MALE" | "FEMALE" | "VEGETABLE" ;
executable : "BEGIN" { statement } "END" | action { "AND" action } ;
action : identifier op rest ;
rest : identifier { "WITH" identifier } ;
identifier : letter { letter } ;
op : "EATS" | "IS EATEN BY" | "HATES" | "LOVES" ;
letter : "a" | "b" | . . . | "z" ;
(i.e. all lower case characters of the English alphabet)
New lines and spaces are ignored everywhere (e.g. do not separate words).
Find all the syntax errors in the following ECO programs:
(a) CARNIV lion, tiger, cat, man, dog;
VEGET minnow, man, dog;
lion EATS man WITH dog;
lion WITH tiger EATS man;
minnow IS EATEN BY cat
(b) VEGETABLE onions, leeks;
VEGET fred, albert;
fred HATES onions WITH leeks;
BEGIN fred EATS onions; albert EATS leeks; END
CARNIV lion; albert IS EATEN BY lion AND lion EATS onions; onions IS EATEN BY
fred WITH leeks;
(c) MALE jack; FEMALE joy; CARNIV joy;
jack LOVES joy; joy HATES jack;
jack IS EATEN BY joy; lion EATS jack WITH joy;
lion LOVES jack with joy;
jack AND joy HATES lion;
- The following are the static semantic rules for ECO:
- Each identifier must appear in (one or
more) declaratives before it appears in any
executables, and it must not appear in
declaratives after it has appeared in
executables.
- An identifier may appear in several
declaratives and then is of several types
simultaneously.
- identifier EATS (or HATES or LOVES)
rest means
identifier
EATS (or HATES or LOVES)
each of the identifiers
in rest
identifier IS EATEN BY
rest means each of the
identifiers in
rest EATS
identifier
- An identifier declared to be only CARNIV
may only EAT identifiers declared to be one or
more of CARNIV or MALE or FEMALE, but not those also declared to be one or
more of VEGET or VEGETABLE.
- An identifier declared to be VEGET may
only EAT identifiers declared to be VEGETABLE.
- An identifier declared to be both CARNIV
and VEGET may EAT any identifier.
- Any identifier can LOVE or HATE any
identifier. The
identifiers must have been declared.
- Anything not explicitly permitted above is illegal.
Find all the static semantic errors in the ECO programs above.
Next: Lexical analysis - Lex
Up: ho
Previous: Introduction
  Contents
Pete Jinks
2001-02-21