next up previous contents
Next: Syntax-directed Translation Up: ho Previous: Syntactic analysis - Yacc   Contents

Subsections

Semantic Analysis - Dictionaries

input:

output:

The semantic analyser recognises & checks the declarations & use of identifiers, using the dictionary.

Unlike lexical and syntactic analysis, there are no widely available toolkits for semantic analysis, although there are packages that help with maintaining a dictionary.

Dictionaries

A dictionary contains a set of names used in a piece of code and, for each name, information describing their declarations. To look up an identifier in a dictionary:

We normally use a single dictionary, containing all the names in the source text, each linked to whichever declarations are currently valid, so we search the dictionary for a name, and then for the relevant declaration.

(It is possible to use several dictionaries e.g. 1 per scope and/or per kind of declaration, so we must search them dictionaries in turn until we find the declaration we need.)

Using a dictionary

Property entries

Some information will be used by most or all declaration kinds:
kind: variable/constant/function/type/etc.
scope: $\rightarrow$ property entry of the owning function or block, and/or the depth of nested scopes
type: (see below)
$\rightarrow$ name in name-list
$\rightarrow$ any previous declarations of the same name in the dictionary
$\rightarrow$ next declaration in the same scope (?)

Other information will be different for each declaration kind:

e.g.

char (*(*x())[])() { . . . }
is equivalent to:
  typedef char anon1();
  typedef anon1 *anon2;
  typedef anon2 anon3[];
  typedef anon3 *anon4;
  anon4 x() { . . . }
which can be represented by (many details omitted):
names properties
   
char \fbox { type $\vert$\ description of char }
   
anon1 \fbox { type $\vert$\ function $\vert$\ * $\vert$\ () }
   
anon2 \fbox { type $\vert$\ pointer $\vert$\ * }
   
anon3 \fbox { type $\vert$\ array $\vert$\ ? $\vert$\ * }
   
anon4 \fbox { type $\vert$\ pointer $\vert$\ * }
   
x \fbox { function $\vert$\ * $\vert$\ () }

Keywords

C includes words that could be mistaken for identifiers, but are actually defined as part of the language. These include words used to describe data and control structures, type-names, conversions, values, input/output, operators, and maths functions.

Keywords (reserved words) are built into the grammar, and so cannot have their meaning changed by the user.

Keywords are easy to recognise by using a different Lex grammar rule for each, but this tends to make the resulting analyser very large (and slow for Lex to generate). Therefore, some analysers initially do not distinguish between keywords and identifiers, but then search a special dictionary to recognise keywords before treating everything else as an identifier. This makes the analyser smaller but slower.

Exercises: Semantics

  1. a) Show which declaration is referenced each time an identifier is used.
    b) Construct a parse tree for each executable statement, and add the relevant type information to them, inserting extra nodes to represent automatic type conversions. Which statements are illegal?

    static int i, j, k, a;
    static float b, c;
    static struct {
      int i, j, k;
    } x;
    static void p()
    {
      int i, n, d;
      float e;
      for (n = 1; n <= 2; n++)
      {
        i= j; k++; d= a+b*c+e;
        if (e) e= d;
        if (n == 1)
        {
          int i, m, a;
          float f;
          i= j; m= n; f= a*b+c*d; p= x+f;
          if (k < 2)
            p();
        }
      }
    }
    int main(void)
    {
      k= 0; j= 0;
      x.k= 0; x.j= 0;
      p();
      return 0;
    }
    

  2. Show the dictionary when the compiler reaches the point indicated in the following ANSI C program:

    static int i, j, a[10];
    static void p(long *i, long *j)
    {
      typedef enum {red, blue} x;
      typedef char y;
      typedef x z[5];
      typedef long s;
      typedef struct r {
        long f1, f2;
        double f3, f4;
        char f5;
        union {
          char b;
          y f6;
          } UU;
        } r;
      x xx;
      z zz;
      s ss;
      r rr;
      /* show the dictionary here */
    }
    


next up previous contents
Next: Syntax-directed Translation Up: ho Previous: Syntactic analysis - Yacc   Contents
Pete Jinks 2001-02-21