non_expression = sub_exp | sub_exp '=' sub_exp
left_expression = sub_exp | left_expression '+' sub_exp
right_expression = sub_exp | sub_exp '^' right_expression
->
non_expression = sub_exp ( '=' sub_exp )?
left_expression = sub_exp ( '+' sub_exp )*
right_expression = sub_exp ( '^' right_expression )?
EXP : EXP + TERM
| EXP - TERM
| TERM
;
TERM : TERM * FACTOR
| TERM / FACTOR
| FACTOR
;
FACTOR : ( EXP )
| identifier
;
-> [after left-factorisation and removing left-recursion]
EXP : TERM EXP1
;
EXP1: + TERM EXP1
| - TERM EXP1
|
;
TERM : FACTOR TERM1
;
TERM1: * FACTOR TERM1
| / FACTOR TERM1
|
;
FACTOR : ( EXP )
| identifier
;
stat : target '=' exp ';'
| target '(' explist ')' ';'
;
target : id
| target '.' id
;
-> [left factorisation]
stat : target assign_or_call ';'
;
assign_or_call : '=' exp
| '(' explist ')'
;
target : id
| target '.' id
;
lvalue : ident
| '{' channels '}'
| lvalue '.' ident
| lvalue '[' expr ']'
| lvalue '[' expr '..' expr ']'
| lvalue '[' 'over' type ']'
| lvalue '@' lvalue
;
('@' and '.' and '[' are all intended to be left associative, with '@' having the lowest precedence and '['
the highest precedence.)
becomes
lvalue : lvalue2 ( '@' lvalue2 )*
;
lvalue2 : ( ident
| '{' channels '}'
)
( '.' ident
| '[' ( expr ( ']'
| '..' expr ']'
)
| 'over' type ']'
)
)*
;
me : ue | be | ke ;
ue : uod us ;
uod: p | ue ;
be : bod bs uod ;
bod: uod | be ;
ke : bod ( k bod )+ ;
-> [removing left recursion,
using a: b | a c [or a: b | t; t: a c;] -> a: b c*
and also using c* c -> c+ ]
me : ue | be | ke ;
ue : p us+ ;
uod: p us* ;
be : uod ( bs uod )+ ;
bod: uod ( bs uod )* ;
ke : bod ( k bod )+ ;
-> [substituting, using a* -> a+? and left-factoring ]
me : p ( us+ | us+? bet | us+? bet? kp+ ) ;
bet: ( bs p us+? )+ ;
kp : k p us+? bet? ;
-> [more left-factoring ]
me : p ( us+ | us+? t ) ;
t : bet | bet? kp+
bet: ( bs p us+? )+ ;
kp : k p us+? bet? ;
-> [continuing to left-factor, using a? b -> a b | b ]
me : p ( us+ | us+ t | t ) ;
t : bet | bet kp+ | kp+
bet: ( bs p us+? )+ ;
kp : k p us+? bet? ;
-> [completing left-factoring, using ( | t ) -> t? ]
me : p ( us+ t? | t ) ;
t : bet kp* | kp+
bet: ( bs p us* )+ ;
kp : k p us* bet? ;
E : id | "&" id | "(" E ")" | E "[" int "]" | E "=" E
My first attempt was:
E : id | "&" id | "(" E ")" | E ( "[" int "]" | "=" E )
-> remove left recursion
E : ( id | "&" id | "(" E ")" ) ( "[" int "]" | "=" E )*
-> BNF
E : Ehead Etail
;
Ehead : id
| "&" id
| "(" E ")"
;
Etail :
| "[" int "]" Etail
| "=" E Etail
;
However, as a brighter person than me pointed out, this is not LL(1) as it is
ambiguous! Expressions like "id=id=id" or "id=id[int]" could be parsed in more
than one way, as there is nothing to say what the associativity of the "="
operator is, or the relative precedence of the "=" and "[ ]" operators (see
Parsing Questions and Answers).
Starting again:
E : id | "&" id | "(" E ")" | E "[" int "]" | E "=" E
-> disambiguate - I have made some random but reasonable choices
E : term
| atom "=" E
;
term : atom
| "(" E ")"
;
atom : id
| "&" id
| atom "[" int "]"
;
(this happens to be LR(1), so it really is unambiguous!)
E : atom ("=" E)?
| "(" E ")"
;
atom : id
| "&" id
| atom "[" int "]"
;
-> remove left-recursion
E : atom ("=" E)?
| "(" E ")"
;
atom : (id | "&" id) ("[" int "]")*
;
-> BNF
E : atom Etail
| "(" E ")"
;
Etail :
| "=" E
;
atom : id atomtail
| "&" id atomtail
;
atomtail:
| "[" int "]" atomtail
;