PLE Lecture notes -- Linguistics

Most programming languages are partly a way of expressing things in terms of other things and partly a basic set of given things.
P.J. Landin
The Next 700 Programming Languages
CACM Vol 9 No 3 March 1966

Why study languages?

This activity attempts to give a broader and more hands-on experience of the programming language world than MIT courses do.

Why create languages?

Why are there so many languages? Aren't they all Turing-equivalent anyway? What makes languages different enough to warrant 2000? Don't these people have something better to do?

The intent of language research is not to make programming more complicated; just the opposite. In theory, any language is sufficient for a task, but some languages may result in complicated, unmaintainable, unreusable programs. Things which people have to get done of course do get done, even with primitive languages. Thus one purpose of research is allowing people to write programs which are simpler, or at least tractable.

And there are many things people want to do:


People want to express their tasks at a level of abstraction that encodes exactly what they care about, nor more or less. When we program, we want to think in terms of binary trees, threads, constraints, finite state machines, grammar productions, and windows, not moving bytes between registers. This is why many people use yacc for specifying a parser instead of C. Ideally, yacc allows us to concentrate on the task of constructing a grammar, not e.g. the details of I/O and storage management: we only talk to the computer about grammars, it restricts us to those things which make sense with grammars, and we only get error messages about grammars.

Unfortunately, "the right level of abstraction" is a moving target. Thus each problem seems to require its own programming language, just as it requires its own English vocabulary. C.A.R. Hoare observed that every large-scale project amounts to language design, which is done in terms of user-defined procedures and data types. Indeed, modern languages provide these mechanisms so that they can extend the scope of their abstractions to a wider variety of tasks. However, these "nano-languages" do not always go far enough. For example, we will have a hard time getting C to look like BNF, just with user-defined procedures and data types. Hoare gave a simpler example: a type system which makes sure that we don't confuse meters with yards. Languages should allow us to preserve our conceptual integrity with the problem we're solving.

In his Turing award lecture, Wilkes summarized:

...choosing a programming language is equivalent to choosing a data structure... It would ... be more logical to first choose a data structure appropriate to the problem and then look around for ... a language suitable for manipulating that data structure.


People also want languages to exploit parallel hardware. While theoretically possible with clever compilation of mainstream languages, it often turns out to practically impossible. Hence many new languages are devoted to parallelism. This also relates to the above point: finding the right abstractions for representing parallelism, when we need to, is a continuing research topic.

User interfaces

Sophisticated user interfaces and virtual reality also appear to demand special language support. Three basic operations in user interfaces are: Because of the need for real-time operation and synchronization, some interface designers have proposed including time as a first-class language entity (though I'm not sure what this would mean). At a more pedantic level, widgets, because of their voluminous number of parameters, put a great deal of pressure on parameter passing, e.g. keyword and optional arguments to functions, as well as object orientation (e.g. the Xt and Andrew toolkits both implement an object system on top of C).

Component software

This rising concern of the 90's emphasizes reuse as the primary feature of a language. Distributing software components, instead of entire applications, has the potential to benefit everyone:

Another cause for language development is ideology. Four major ideologies are:

However, while each of these camps seem to pursue orthogonal tracks, making successive refinements to their "own" languages, one wonders how different their goals really are, especially upon comparison of these languages (more below).

Finally, everyone is still looking for the silver bullet to end the so-called software crisis.

Fortunately for us, few of the 2000 languages are innovative.

How can we judge languages?

Three useful criteria, which are exceedingly difficult to describe rigorously, and seem to have many arguments for and against them, are:

Program conciseness

Fred Brooks pointed out that the number of bugs in a program is superlinear in its length. Indeed, many factors seem to promote small programs. However, one must exercise caution. Let's take the idea to a logical extreme: five billion people writing one program per second for fifty years produce less than 2^63 programs. Thus we can encode any program ever written, including revisions, in eight characters. Is this a good programming language?

Wirth, in his languages and writings, emphasized that conciseness by itself is not a reasonable goal for language design (as well as other things; read on), because excessive brevity makes programs difficult to understand, just as does excessive length. But again there is a difficulty, this time concerning the word "understand," which is a very subjective and time-dependent phenomenon. Many language constructs in regular use today were once considered too intricate or non-intuitive (and probably were at the time).

Language simplicity

Two popular ways to make languages easier to understand:


Language features are orthogonal when they can be understood in isolation and freely combined with no surprises. Orthogonality thus allows a simple definition to cover lots of possibilities. A example of non-orthogonality in Pascal:
  • Functions take values as input and produce a value on output.
  • Arrays may be treated as values, and they may be passed to functions.
  • However, arrays cannot be returned from functions.
A common kind of orthogonality is to allow any operator to be used with any data types, i.e. "operator overloading." The potential operator-type combinations form a grid, whose intersections may be defined as we please. Operator overloading was the concept in Algol 68 which was used to introduce the notion of language orthogonality. Today we'd call operator overloading "object-oriented design." (SICP section 2.3.3 calls it "data-directed programming".)

Another example of orthogonality is the separation of type from behavior in CLOS (via generic functions, which allow a particular behavior, e.g. "print," to be independently defined on any type) or the separation of type from code sharing in Sather. In contrast, Smalltalk and C++ couple typing, behavior, and code sharing.

Orthogonality seems to be in opposition with security, since it allows potential errors to go unnoticed. However, if a particular combination of features should not be used in a particular context, the language should allow the programmer to define this restriction.


All language features arise from a few basic concepts. In order words, each concept goes as far as possible and related concepts are unified. An example of non-generality is the distinction between procedure naming (defun) and value naming (lambda) in classical Lisp. Another example is Ada's distinction between procedures, records, package templates, and task templates, which are all specifications for naming environments. Self makes neither of these distinctions. In fact, the Self paper you will read equates uniformity/generality with "power" (whatever that means).

Wirth called an overly general language (such as his own Euler, a predecessor of Pascal) a "high-level Turing machine," since it permits opacity, through a lack of emphasis on what the program is really doing. In his view, language concepts should be highly specialized, so that it is clear when to use them and what they mean. The same phenomenon occurs with the overly general Lisp syntax, which obscures the difference between, say, a conditional and a procedure call. The rejoinder is that making such distinctions complicates the language while adding no new functionality. Furthermore, special cases are brittle in the face of new, unanticipated tasks. As Gelernter and Jagannathan quip: "A language design based on patchwork is more than likely to have some cracks."

Some argue that language simplicity, even when overall functionality is the same, improves the efficiency of executables. This is puzzling, since the special cases teased out by extra features would seem to offer better optimization. However, since compiler complexity is superlinear in language complexity, simpler languages often have significantly better compilers.

Code reusability

Code reusability is like language generality: a single piece of code goes as far as possible and related codes are unified. Thus a good language should support the construction of highly general code. Here are several ways to do this:

Code specifies the operations to perform, but not the exact data types on which they will be performed. The analogy is that code is like a pipe system through which many different fluids can flow.
Higher-order functions
The important places among which a piece of code can vary are identified, and replaced by calls to functions. These functions are then provided as arguments to the function. The analogy is that code is like swiss cheese. (Okay, no more analogies.)
Differential coding (inheritance)
Instead of writing a new module from scratch, only specify the differences between it and an old module.
This was an interesting feature of Algol 60, never to be used again. Function calls behave as though the actuals were textually substituted for the formals, i.e. functions are like preprocessor macros in C. If the caller specifies an expression as one of the actuals, it will be evaluated on every reference to the corresponding formal, in the context of the function. This means you can write a higher-order function without even knowing it.

In a classic example, suppose we define a function which computes k*n in a funny way:

PROCEDURE sum(i, k, n) INTEGER i, k, n;
  sum := 0;
  FOR i := 1 STEP 1 UNTIL n DO sum := sum + k
Now suppose we write:
sum(j, A[j], SizeOfA)
What does this compute? According to the textual substitution rule, this is the same as
sum := 0;
FOR j := 1 STEP 1 UNTIL SizeOfA DO sum := sum + A[j]
i.e. the sum of the elements of vector A! You can do something very similar in Self, a language with call-by-value parameter passing but extensive facilities for reflection (defined below). (BTW, call-by-name also allows delayed evaluation, such as what you would get with Scheme's force. Can you see how streams might be implemented in Algol 60?)

If call-by-name is so powerful, why don't people use it? Because it is at odds with modularity: to use a function in this way, the caller must know a great deal about its implementation, and if the implementation changes (e.g. replacing the sum procedure with simply k*n) callers may break. (Exercise: do the other reuse techniques suffer from this problem? Is a language better for code reuse when the implementor doesn't have to plan for reuse?)

Since code reusability is like language generality, the generality caveats also apply here.

These language criteria are very difficult to define rigorously, much less measure quantitatively. So we usually must resort to "just trying it." In fact, many often-cited properties of a good language are highly application dependent. Here is a list from Horowitz, along with successful language counterexamples:


Keep in mind that a language is not simply an ever-growing collection of features. Unfortunately, the computer press wrongly treats languages (and applications, too) as little more than a list of features. Worse, they only consider the features which get explicit syntactic support, regardless of whether the language supports them in other ways. This is what makes many people feel that new languages are just stockpiling features.

A language is at its best when it is simple enough for people to have a deep enough understanding of it to use it to its fullest potential. This means there should be few different concepts to master, even while improving on the functionality of earlier languages. One of the trickiest parts of language design is knowing what features you don't need. This is part of what makes language design challenging, artistic, and limitless. Languages are like axiomatic systems. Add one more feature and the ramifications can change everything.

What makes languages similar? What makes them different?

Here are four categories of questions that are helpful in understanding languages. Each question is largely independent of the others, and many languages can be described as simply one possible combination of answers. However, each answer may have significant ramifications.

What problem representations does the language support? What problem-specific restrictions can be imposed on the use of this representation?

Values are the data objects manipulated by the language. They go into and out of functions, and can be stored in data structures. How can values be structured? What is an allowable value? Can compound values (structures) be treated as values? What about pointers (analogous to soft links)? References (hard links)? Functions? Namespaces? Can static programs be treated as values (e.g. in Scheme)? Running programs (i.e. threads)? I/O devices? Types? Language operators? Many language advances, including reflection, concern extending the notion of a value, without a corresponding cognitive load.

Is typing supported? Are values typed, or are names? (This is related to, but not the same as, dynamic vs. static typing.) Does automatic coercion ever occur? Can operators be overloaded for different types? What about user-defined functions?

What values can be named? Which require names (e.g. functions)? How is a reference to a name resolved?

What role do names play in the language? Are names with different roles mixed together, or kept distinct? Here are three ways names are used in real life, which all show up in programming languages:

(Exercise: What other roles can names play?)

Locality of effects
What can change when? Can changing a name or value have non-local effect? Functional languages try to avoid this; constraint languages promote it.

What changes to the language can be made in the language? Can the language redefine the meaning of or create new data types? Operators? Syntax (e.g. precedence)? Can it redefine the semantics of procedure call (e.g. parameter passing)? Name binding? Control structures? Can this be done locally or globally? Can the reflection mechanism itself be modified?

Related to values: Can the language treat its own building blocks as data? For example, languages are often implemented with symbol tables, closures, and activation records. Are any of these constructs visible in the language? Scheme supports a kind of static reflection, by allowing program text to be treated as a data structure. Call-with-current-continuation also allows reflection by treating "the rest of the program execution" as a procedure. Symmetric Lisp allows a running program's namespace to be treated as a data structure. Self allows a message send to be treated as a data structure.

A note on ideology

Be wary of arbitrary vocabulary and ideological distinctions. While useful and inspiring, they can obscure similarities. For example, the Algol 68 report, in order to prevent confusion with other languages, adopted alternative words for nearly everything: "assignation" instead of "assignment", "modes" instead of "types", "multiple value" instead of "array", "closed clause" instead of "block", "unit" instead of "expression", and on and on. Tanenbaum explains:
This was done to force the reader to rely on the Report's definitions, rather than to rely on his previous experience with similar concepts that nonetheless may differ from the Report's definitions in subtle, but crucial ways.
(In all fairness, Algol 68 did introduce many new concepts.) Another example is BETA's use of "pattern," Ada's use of "package," and Life's use of "psi-term" for what Self calls an "object." Of course, objects may have different representations and restrictions in these languages, but the underlying concept of first-class namespace is the same.

As another example, since the late 60s there has been two apparently distinct language camps:

but of course both of these ideas are what we now call object-orientation. There really are only a few good ideas in language design.

An aside: is programming style (syntactical layout, use of specifications, etc) a language issue? Can it be relegated to a "style checker"? How does the scale tip between enforcing correct style vs. committing a language to the wrong style?


Programming methodology, like design ideology, also impacts languages, but not severely. Just as processors utilize a few basic tools to support a wide range of languages, languages utilize a few basic tools to support a wide range of methodologies. The aim of linguistics is to isolate and explain these tools, by way of case study. This can be done without a deep understanding of methodology.

A note on history

The languages we will study are all very polished, but of course emerged from a long history of trial and error. In fact, the history of languages looks like a pendulum swinging between
whenever a new breakthrough or need is discovered, whenever a language tries to subsume others (e.g. PL/1, Perl), or whenever a language is designed by committee (Common Lisp, Ada)
when a small group works to generalize a few good ideas, usually over many iterations (Self had four iterations, and its precursor, Smalltalk, had five)
In comparison, old languages appear arbitrary, even irrational, in their limitations, but such is the benefit of 20/20 hindsight.

Since language implementations are programs, indeed some of the most complex programs, one wonders how much language design is influenced by advances in software engineering. For example, did structured programming promote the creation of modern object oriented languages? Will object-oriented programming promote even more powerful languages?


Abelson and Sussman. Structure and Interpretation of Computer Programs. MIT Press, 1985.

Myers, ed. Languages for Developing User Interfaces. Jones and Bartlett, 1992.

Hoare. Hints on programming language design. Sigact/Sigplan Symposium on Principles of Programming Languages, Oct 1973.

Wirth. On the Design on Programming Languages. Proc. IFIP Congress 74, 386--393. 1974?

Horowitz. Fundamentals of Programming Languages. Computer Science Press, 1984.

Tanenbaum. A Tutorial on Algol 68. Computing Surveys (8) 2, June 1976.

Gelernter and Jagannathan. Programming Linguistics. MIT Press, 1990.

PLE Home page,
Last modified: Tue Feb 20 11:58:34 EST 1996