Lab Ex 5 Hint: Formalising the Grammar
Here is an example solution of a similar problem.
You must read the
description of the exercise
first.
Here is the informal description of the grammar (using state-numbers) given
in the lab manual, but reformatted to show the structure more clearly:
Faust program =
unordered [list of] one or more
state declarations
definition of the start state: '*' followed by a state_number
state declaration =
state-number
colon (ordinary state) or "@" (accept/terminal state)
list of one or more
pairs of
a symbol-string and
a state-number
separated by commas
terminated by a semicolon
symbol-string =
one or more characters between a pair of apostrophes (')
(only two characters are not permitted here)
states are refered to by unsigned numbers.
states are numbered consecutively, starting at 1, but can be declared in any
order, and refered to before and after they are declared.
Spaces, tabs and newlines
can appear between any components of the input,
but not inside a state-number.
A space or tab (or both)
can appear inside a symbol-string, in which case it is significant,
but a newline cannot.
You now need to formalise this, using BNF (or equivalently, the usual
notation for Yacc). You will probably want to describe some of the simpler
parts of the grammar, such as numbers and strings, using regular expressions
(e.g. the notation for Lex).
Here is the example Faust program given in the lab manual, reformatted
to show the structure more clearly, and with some
explanations:
list of state-declarations for states 1-7:
7: ' ' 7,
'+' 1,
'\-' 1,
'0123456789' 2;
1: '0123456789' 2;
2@ state 2 is an accept state.
'0123456789' 2,
'.' 3,
'E' 4;
When we run the DFSA described here, whenever we get to state 2:
if we see a digit, we stay in state 2,
if we see a dot, we go to state 3, and
if we see a capital E we go to state 4.
Because state 2 is an accept state,
if we see an end of line,
we have successfully recognised the input line, and
if there is any more input the program will restart the
DFSA from the start state to read the next input line.
If we see any other character, the input line is wrong.
3@ '0123456789' 3,
'E' 4;
4: '+\-' 5,
'0123456789' 6;
5: '0123456789' 6;
6@ '0123456789' 6;
*7 the start state is state 7.
You should draw the corresponding state diagram and identify the parts
dealing with integers, fractions and exponents. What kinds of numbers are
recognised at each of the accept states (2, 3 and 6)?