If valid strings are never neighbours then a single error is always
detected; if they are always at least 3 differences apart then a single
error can always be corrected, etc.
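As a rough illustration of "neighbours", the following sketch counts the positions at which two equal-length strings differ and finds the smallest such distance in a set of strings. The function names `differences` and `min_distance` are made up for this example, and the measure ignores insertions and deletions, which a real treatment of typing errors would also have to consider.

```c
#include <stddef.h>
#include <string.h>

/* Number of character positions at which two equal-length strings
   differ. Returns -1 if the lengths differ, since this simple measure
   is only defined for strings of equal length. */
int differences(const char *a, const char *b)
{
    size_t n = strlen(a);
    int count = 0;
    if (strlen(b) != n)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            count++;
    return count;
}

/* Smallest distance between any two of the n strings in words[].
   A result of 1 means that some single-character error turns one
   valid string into another. Assumes n >= 2 and that all the strings
   have the same length. */
int min_distance(const char *const words[], size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            int d = differences(words[i], words[j]);
            if (best < 0 || d < best)
                best = d;
        }
    return best;
}
```

For instance, `a+=b` and `a-=b` are both valid C and differ in one position, so they are neighbours in this sense and mistyping one as the other cannot be detected by the compiler.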
This can require extra effort, so some disagree.
Sometimes the user can make choices that enhance error detection and
correction.
Sometimes this is possible for the language designer: change the concrete
grammar, keeping the abstract grammar and meaning the same.
( ) and braces { }
, or simply overloading the user and implementor! In several places the
absence of keywords makes reading and/or writing C programs harder and more
error-prone.
end_if or fi). As well
as being a good solution to this problem (no special rule for users to learn
and compiler writers to implement), there are further benefits when we
generalise this to all control constructs (as we should, to avoid yet
another special rule):
endwhile, endif, endfor etc.) it is much easier to find where a missing
bracket should be within lots of nested blocks,
name:=, although the difference is rarely important. However, some people
disapprove of multiple returns, or a return from the middle of a function.
switch statement (often known as a case statement) insist that the cases are
separate and automatically terminate. C, along with many other old
languages, allows cases to fall into each other, but at least provides break
to take control to the end of the switch. (We will discuss other uses of
break and the related continue in a later section.)
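The fall-through behaviour described above can be seen in a small sketch. The two functions below are invented for illustration and differ only in one missing `break`.

```c
/* Each case ends with break, so exactly one assignment runs. */
int with_break(int n)
{
    int result = 0;
    switch (n) {
    case 1:  result = 10; break;
    case 2:  result = 20; break;
    default: result = -1; break;
    }
    return result;
}

/* The break after case 1 is missing, so control falls into case 2
   and the value 10 is overwritten by 20. */
int without_break(int n)
{
    int result = 0;
    switch (n) {
    case 1:  result = 10;   /* falls through */
    case 2:  result = 20; break;
    default: result = -1; break;
    }
    return result;
}
```

Calling both with 1 gives 10 from the first and 20 from the second, and the compiler is not obliged to say anything about the difference.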
C:
access: postfix, left-associative [ ] ( ) . -> of higher precedence than prefix, right-associative *
declarations: must use (* ). instead of -> and element typename comes first (like a cast - same precedence/associativity/fix as dereference)
Pascal:
access: postfix, left-associative ^ . [ ]
declarations: [...] becomes array [...] of, element typename comes last
e.g. an array of pointers to characters
in Pascal:
s: array [0..9] of ^char; . . . s[i]^ := 'a';
in C:
char *s[10]; . . . *s[i] = 'a'; /* i.e. *(s[i]) */
e.g. a pointer to an array of characters
in Pascal:
s: ^array [0..9] of char; . . . s^[i] := 'a';
in C:
char (*s)[10]; . . . (*s)[i] = 'a';
Why do new users find declarations in C harder to understand than in Pascal?
* is different from the rest.
This is an artefact of the design of C expressions, for which I have seen
only historical explanations. Once * has been overloaded as the dereference
operator it has to be prefix rather than postfix, or else it is hard to make
sense of expressions.
The declarator symbols (* [] ()) are attached to each individual identifier
declared. Thus, in Pascal, all the identifiers in a single declaration are of
the same type, but in C they can all be of different types.
This is partly historical accident and partly a consequence of minimising
the use of keywords. Every Pascal declaration must be preceded by a keyword,
such as var for variables, whereas C variable declarations make do with the
name of the element type - C declarations can be preceded by a storage class
specifier (extern, register, auto, static or typedef) but auto is usually
omitted.
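To make the points about C declarations concrete, here is a minimal sketch. The identifiers `c`, `p`, `a`, `f`, `ptrs` and `parr` are chosen only for this example. It shows one declaration introducing four different types, and the array-of-pointers and pointer-to-array forms side by side.

```c
/* One declaration, one element type (char), four different types:
   c is a char, p is a pointer to char, a is an array of 10 chars,
   and f is a function taking no parameters and returning char. */
char c, *p, a[10], f(void);

char f(void) { return 'f'; }

/* Array of 10 pointers to char: postfix [] binds more tightly
   than prefix *. */
char *ptrs[10];

/* Pointer to an array of 10 chars: the parentheses force the *
   to apply to parr before the []. */
char (*parr)[10];
```

With these declarations, `parr = &a; (*parr)[3] = 'x';` stores into `a[3]`, and `ptrs[0] = &c; *ptrs[0] = 'y';` stores into `c`, mirroring the access expressions shown earlier.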
{int bar;} {long int;} {long int bar;}
typedef int foo; {foo bar;} {long foo;} {long foo bar;}
#define SP3() if(b){int x; av=f(&x); bv+=x;}
. . .
if (x==3) SP3(); else bork();
If we omit the ; after the use of SP3, the else will silently become
associated with the if in the macro, but if we include it the else does not
belong to any if. In Pascal, a block end that replaces a statement must still
be followed by the usual ;, but C treats { }; as two separate statements. We
need something like:
#define STMT(stuff) do{stuff}while(0)
#define SP3() STMT(if(b){int x; av=f(&x); bv+=x;})
Related problems arise in:
#ifdef CIRCUIT
# define CLOSE(circno) {close(circno);}
#else
# define CLOSE(circno)
#endif
. . .
if (expr) statement; else CLOSE(x) ++i;
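The do{...}while(0) wrapper can be checked with a small sketch. `BUMP` and `bump_or_reset` are hypothetical names standing in for a macro like SP3 and its caller.

```c
/* Wrap a group of statements so that the macro call plus its
   trailing ; forms exactly one statement. */
#define STMT(stuff) do { stuff } while (0)

/* Hypothetical macro in the style of SP3: it contains its own if
   and its own local declaration. */
#define BUMP(counter) \
    STMT(if ((counter) >= 0) { int step = 1; (counter) += step; })

/* Because BUMP(...) expands to do { ... } while (0), the ; after it
   completes the statement and the else pairs with the outer if,
   not with the if hidden inside the macro. */
int bump_or_reset(int flag, int counter)
{
    if (flag)
        BUMP(counter);
    else
        counter = 0;
    return counter;
}
```

One limitation of this sketch is that the macro argument must not contain a top-level comma, since the preprocessor would then treat it as more than one argument.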
(int) *x= 0       | assigns 0 to the object pointed to by x
int *x= 0         | assigns the NULL pointer to x
(int) x= y= 0     | makes x and y both equal to 0
int x= y= 0       | makes x and y both equal to 0, but only declares x
(x= 1, y= x+1)    | the , indicates sequencing
f (x= 1, y= x+1)  | if f is a function the , does not indicate sequencing
                  | if f is a macro, any sequence depends on its definition
The uses of , are: macro definitions and calls, function definitions and
calls, operation, separator in initialisations, separator in declarations.
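A short sketch of three of these roles of `,` in one place. The function names are invented for the example.

```c
/* The comma operator evaluates x = 1 first, then y = x + 1, and the
   value of the whole expression is the value of the right operand. */
int comma_operator(void)
{
    int x = 0, y = 0;                 /* here , separates declarators */
    int result = (x = 1, y = x + 1);  /* here , is the operator */
    return result + y;                /* 2 + 2 */
}

/* In a call the , separates arguments, so add receives two values. */
static int add(int a, int b) { return a + b; }

int comma_in_call(void)
{
    int x = 1;
    /* The extra parentheses turn the first , back into the operator,
       so the first argument is the single value x + 1 with x == 2. */
    return add((x = 2, x + 1), 10);
}
```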
Arrays and functions sometimes behave like pointers (and vice-versa).
0 is an integer, is false, is a pointer, and is the string terminator
void is the result type of a function that is really a procedure,
the type of a pointer that is semi-valid (?), and
the type of a function that takes no parameters
int f(void)  | can only be called by f()
int f()      | can be called with any parameters!
12 == 014 but '\14' == '\014'
'A' and getchar() are both of type int, not of type char
MIN <= x <= MAX is a valid expression, but unlikely to mean what the user
intended.
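The octal and chained-comparison points can be checked directly. In this sketch the values given to MIN and MAX, and the function names, are assumptions made for the example. Many compilers will warn about the chained comparison, which is rather the point.

```c
#define MIN 1
#define MAX 5

/* Parsed as (MIN <= x) <= MAX: the inner comparison yields 0 or 1,
   and that 0 or 1 is then compared with MAX. */
int chained(int x) { return MIN <= x <= MAX; }

/* What was probably intended. */
int in_range(int x) { return MIN <= x && x <= MAX; }

/* A leading 0 makes an integer constant octal ... */
int octal_equals_decimal(void) { return 12 == 014; }

/* ... but a numeric character escape is octal with or without it. */
int escapes_equal(void) { return '\14' == '\014'; }
```

With these definitions `chained(100)` is true even though 100 is far outside the range, because the inner result can only be 0 or 1 and both are no greater than MAX.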
typedef enum {red, blue, green=0, yellow} colour;
gives the same value to red and to green, and to blue and to yellow, and they
are all of type int, so yellow - green == blue
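One practical consequence, sketched with a hypothetical `colour_name` function: code that maps a colour back to a name cannot tell the duplicated values apart, and a switch cannot even list all four names as cases, because two case labels with the same value are not allowed.

```c
typedef enum {red, blue, green = 0, yellow} colour;

/* Only two distinct values exist, so only two cases are possible. */
const char *colour_name(colour c)
{
    switch (c) {
    case red:  return "red";    /* also reached for green */
    case blue: return "blue";   /* also reached for yellow */
    }
    return "unknown";
}

/* Enumeration constants are plain ints, so arithmetic on them
   compiles without complaint. */
int yellow_minus_green(void) { return yellow - green; }
```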
x = 1 and x == 1 are both expressions, so both can appear inside the
condition of an if statement or a loop, and both can appear on their own as a
statement.
a <= b a <<= b
static, extern, void, *, &, <, ( )
p= N * sizeof * q; r= malloc (p);
sizeof (int) * x; sizeof ((int *) x); sizeof ((int) * x);
(b) + (c); (b) - (c); (b) * (c); (b) & (c); (b) (c);
a+++ ++b
"hello\
again"
char *s[]= {"A", "B", "C" "D", "E",};
*a/ *b
#define a (b) c
typedef int foo; struct foo bar; struct foo {int f;}; void fred (void) {foo: ;}
typedef int foo; struct foo {foo foo;}; void fred (int foo) {struct foo foo;}
typedef int foo; struct foo {int foo;}; void fred (void) {foo foo;}
struct foo {int foo;}; void foo (int foo) {struct foo bar;}
typedef struct list *list;
struct list {int field; list list;};
int length (list l)
{
  int i= 0;
  list:
  if (l) {
    i++; l= l->list; goto list;
  } else
    return i;
}
a=a+1 a+=1 a++ ++a
(*a).b a->b
a[i] *(a+i) *(i+a) i[a]
for(A;B;C)D; A;while(B){D;C;}
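Two of these equivalences, the indexing forms and the for/while rewriting, can be exercised with a minimal sketch. `sum_for` and `sum_while` are names made up for this example.

```c
/* Sum using a[i] inside a for loop: for (A; B; C) D; */
int sum_for(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* The same loop rewritten as A; while (B) { D; C; }, with a[i]
   spelled as i[a], which also means *(i + a). */
int sum_while(const int *a, int n)
{
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i[a];
        i++;
    }
    return total;
}
```

The for/while rewriting stops being equivalent if D contains a `continue`: in the for loop control still passes through C, but in the while form the `continue` skips it.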
$CS2111/e*/c_grammar/*
Run it through YACC to discover the
conflicts, and try to work out why they are there and how you could improve
the grammar to remove them. Try to generally improve the grammar.