PLE Lecture notes -- Linguistics
Most programming languages are partly a way of expressing things in
terms of other things and partly a basic set of given things.
The Next 700 Programming Languages
CACM Vol 9 No 3 March 1966
This activity attempts to give a broader and more hands-on experience of
the programming language world than MIT courses do.
- Languages and the paradigms they inspire are part of the programmer's
bag of tools. Hence every programmer benefits from acquiring more tools
and a deeper understanding of their tools.
- To make it easier to learn more (because there will be more).
After this exposure to a diverse set of state-of-the-art languages, you will
be able to rapidly understand new languages.
- To become a better language designer. Every programmer ends up doing
this, one way or another.
Why are there so many languages? Aren't they all Turing-equivalent
anyway? What makes languages different enough to warrant 2000? Don't
these people have something better to do?
The intent of language research is not to make programming more
complicated; just the opposite. In theory, any language is sufficient
for a task, but some languages may result in complicated,
unmaintainable, unreusable programs. Things which people have to get
done of course do get done, even with primitive languages. Thus one
purpose of research is allowing people to write programs which are
simpler, or at least tractable.
And there are many things people want to do:
People want to express their tasks at a level of abstraction that
encodes exactly what they care about, nor more or less. When we
program, we want to think in terms of binary trees, threads,
constraints, finite state machines, grammar productions, and windows,
not moving bytes between registers. This is why many people use yacc
for specifying a parser instead of C. Ideally, yacc allows us to
concentrate on the task of constructing a grammar, not e.g. the
details of I/O and storage management: we only talk to the computer
about grammars, it restricts us to those things which make sense with
grammars, and we only get error messages about grammars.
Unfortunately, "the right level of abstraction" is a moving target.
Thus each problem seems to require its own programming language, just
as it requires its own English vocabulary. C.A.R. Hoare observed that every large-scale project
procedures and data types. Indeed, modern languages provide these
mechanisms so that they can extend the scope of their abstractions to
a wider variety of tasks. However, these "nano-languages" do not
always go far enough. For example, we will have a hard time getting C
to look like BNF, just with user-defined procedures and data types.
Hoare gave a simpler example: a type system which makes sure that we
don't confuse meters with yards. Languages should allow us to
preserve our conceptual integrity with the problem we're solving.
In his Turing award lecture, Wilkes
...choosing a programming language is equivalent to choosing a data
structure... It would ... be more logical to first choose a data
structure appropriate to the problem and then look around for ... a
language suitable for manipulating that data structure.
People also want languages to exploit parallel hardware. While
theoretically possible with clever compilation of mainstream
languages, it often turns out to practically impossible. Hence many
new languages are devoted to parallelism. This also relates to the
above point: finding the right abstractions for representing
parallelism, when we need to, is a continuing research topic.
- User interfaces
Sophisticated user interfaces and virtual reality also
demand special language support. Three basic operations in user interfaces
Because of the need for real-time operation and synchronization, some
interface designers have proposed including time as a first-class
language entity (though I'm not sure what this would mean). At a more
pedantic level, widgets, because of their voluminous number of
parameters, put a great deal of pressure on parameter passing,
e.g. keyword and optional arguments to functions, as well as object
orientation (e.g. the Xt and Andrew toolkits both implement an object
system on top of C).
- Displaying objects. Since the goal is to project a physical
metaphor, e.g. layered windows or a 3D landscape, programmers want to
program in these terms. This means commands like "move this object
behind this other one," not "draw these lines." This seems to require
a highly declarative approach to graphics.
- Maintaining state and relations among objects. The visualization,
representation, and manipulation of an object need to be somehow
connected, but not drastically influence each other's implementation.
In addition, structured objects must adapt their form when their parts
change. Programmers have found that constraints, such as
those found in logic languages, are a convenient way to handle both of
- Reacting to input. A reactive or event-driven design is more flexible
than a polling design, since it can handle multiple kinds of input, is
less modal, and permits arbitrary interruptions. It's closer to
how we interact with dynamic objects in the real world. Implementing
such designs, while maintaining modularity and real-time behavior,
seems to require linguistic support for fine-grain concurrency
(e.g. Netscape threads).
- Component software
- This rising concern of the 90's emphasizes reuse as the primary
feature of a language. Distributing software components, instead of entire
applications, has the potential to benefit everyone:
- Users benefit from being able to assemble applications from
only the best parts, or the ones people are familar with. For example, using
the same "Emacs" component whenever an application needs to do text editing.
- Developers benefit from not having to develop monolithic programs.
This means a smaller entry cost and shorter release cycles.
Another cause for language development is ideology. Four major ideologies
However, while each of these camps seem to pursue orthogonal tracks,
making successive refinements to their "own" languages, one wonders
how different their goals really are, especially upon comparison of
these languages (more below).
- Creativity: making programs easier to write.
- Reuse: making existing programs easier to adapt to new goals.
- Security: making programs easier to debug and prove, e.g. by
forcing programmers to be more specific.
- Optimization:making programs run faster on known hardware.
Finally, everyone is still looking for the silver bullet to end the so-called
Fortunately for us, few of the 2000 languages are innovative.
Three useful criteria, which are exceedingly difficult to describe rigorously,
and seem to have many arguments for and against them, are:
These language criteria are very difficult to define rigorously, much
less measure quantitatively. So we usually must resort to "just
trying it." In fact, many often-cited properties of a good language
are highly application dependent. Here is a list from Horowitz, along with successful language
- Program conciseness
Fred Brooks pointed out that the number of bugs in a program is
superlinear in its length. Indeed, many factors seem to promote small
programs. However, one must exercise caution. Let's take the idea to
a logical extreme: five billion people writing one program per second
for fifty years produce less than 2^63 programs. Thus we can encode
any program ever written, including revisions, in eight characters.
Is this a good programming language?
Wirth, in his languages and writings, emphasized
that conciseness by itself is not a reasonable goal for language
design (as well as other things; read on), because excessive brevity
makes programs difficult to understand, just as does excessive length.
But again there is a difficulty, this time concerning the word
"understand," which is a very subjective and time-dependent
phenomenon. Many language constructs in regular use today were once
considered too intricate or non-intuitive (and probably were at the
- Language simplicity
Two popular ways to make languages easier to understand:
Some argue that language simplicity, even when overall
functionality is the same, improves the efficiency of executables.
This is puzzling, since the special cases teased out by extra features
would seem to offer better optimization. However, since compiler
complexity is superlinear in language complexity, simpler languages
often have significantly better compilers.
Language features are orthogonal when they can be understood in
isolation and freely combined with no surprises. Orthogonality thus
allows a simple definition to cover lots of possibilities.
A example of non-orthogonality in Pascal:
A common kind of orthogonality is to allow any operator to be used
with any data types, i.e. "operator overloading." The potential
operator-type combinations form a grid, whose intersections may be
defined as we please.
Operator overloading was the concept in Algol
68 which was used to introduce the notion of language
orthogonality. Today we'd call operator overloading "object-oriented
design." (SICP section 2.3.3 calls it
- Functions take values as input and produce a value on output.
- Arrays may be treated as values, and they may be passed to functions.
- However, arrays cannot be returned from functions.
Another example of orthogonality is the
separation of type from behavior in CLOS (via generic functions, which
allow a particular behavior, e.g. "print," to be independently defined
on any type) or the separation of type from code sharing in Sather.
In contrast, Smalltalk and C++ couple typing, behavior, and code
Orthogonality seems to be in opposition with security, since it allows
potential errors to go unnoticed. However, if a particular
combination of features should not be used in a particular context,
the language should allow the programmer to define this restriction.
All language features arise from a few basic concepts. In order
words, each concept goes as far as possible and related concepts are
unified. An example of non-generality is the distinction between
procedure naming (defun) and value naming (lambda) in classical Lisp.
Another example is Ada's distinction between procedures, records,
package templates, and task templates, which are all specifications
for naming environments. Self makes neither of these distinctions.
In fact, the Self paper you will read equates uniformity/generality
with "power" (whatever that means).
Wirth called an overly general language (such as
his own Euler, a predecessor of Pascal) a "high-level Turing machine,"
since it permits opacity, through a lack of emphasis on what the
program is really doing. In his view, language concepts should be
highly specialized, so that it is clear when to use them and what they
mean. The same phenomenon occurs with the overly general Lisp syntax,
which obscures the difference between, say, a conditional and a
procedure call. The rejoinder is that making such distinctions
complicates the language while adding no new functionality.
Furthermore, special cases are brittle in the face of new,
unanticipated tasks. As Gelernter and Jagannathan
quip: "A language design based on patchwork is more than likely to
have some cracks."
- Code reusability
Code reusability is like language generality: a single piece of code goes
as far as possible and related codes are unified. Thus a good language
should support the construction of highly general code. Here are several
ways to do this:
Since code reusability is like language generality, the generality
caveats also apply here.
- Code specifies the operations to perform, but not the exact data types
on which they will be performed. The analogy is that code is like a
pipe system through which many different fluids can flow.
- Higher-order functions
- The important places among which a piece of code can vary are
identified, and replaced by calls to functions. These functions are
then provided as arguments to the function. The analogy is that code
is like swiss cheese. (Okay, no more analogies.)
- Differential coding (inheritance)
- Instead of writing a new module from scratch, only specify the
differences between it and an old module.
- This was an interesting feature of Algol 60, never to be used again.
Function calls behave as though the actuals were textually substituted
for the formals, i.e. functions are like preprocessor macros in C.
If the caller specifies an expression as one of the actuals, it will
be evaluated on every reference to the corresponding formal, in
the context of the function. This means you can write a
higher-order function without even knowing it.
In a classic example, suppose we define a function which computes
k*n in a funny way:
PROCEDURE sum(i, k, n) INTEGER i, k, n;
sum := 0;
FOR i := 1 STEP 1 UNTIL n DO sum := sum + k
Now suppose we write:
sum(j, A[j], SizeOfA)
What does this compute? According to the textual substitution rule, this is
the same as
sum := 0;
FOR j := 1 STEP 1 UNTIL SizeOfA DO sum := sum + A[j]
i.e. the sum of the elements of vector A! You can do
something very similar in Self, a language with call-by-value parameter
passing but extensive facilities for reflection (defined below).
(BTW, call-by-name also allows delayed evaluation, such as what you would
get with Scheme's force. Can you see how streams might be
implemented in Algol 60?)
If call-by-name is so powerful, why don't people use it? Because it
is at odds with modularity: to use a function in this way, the caller
must know a great deal about its implementation, and if the
implementation changes (e.g. replacing the sum procedure with simply
k*n) callers may break. (Exercise: do the other reuse
techniques suffer from this problem? Is a language better for code
reuse when the implementor doesn't have to plan for reuse?)
- security to errors (C)
- efficient object code (Tcl)
- fast translation (C++)
- machine independence (C*)
- provability (C)
- consistency with mathematical notation (Forth)
Keep in mind that a language is not simply an ever-growing collection
of features. Unfortunately, the computer press wrongly treats
languages (and applications, too) as little more than a list of
features. Worse, they only consider the features which get
explicit syntactic support, regardless of whether the
language supports them in other ways. This is what makes many people
feel that new languages are just stockpiling features.
A language is at its best when it is simple enough for people to have
a deep enough understanding of it to use it to its fullest potential.
This means there should be few different concepts to master, even
while improving on the functionality of earlier languages. One of the
trickiest parts of language design is knowing what features you
don't need. This is part of what makes language design
challenging, artistic, and limitless. Languages are like axiomatic
systems. Add one more feature and the ramifications can change
Here are four categories of questions that are helpful in
understanding languages. Each question is largely independent of the
others, and many languages can be described as simply one possible
combination of answers. However, each answer may have significant
What problem representations does the language support?
What problem-specific restrictions can be imposed on the use of this
Values are the data objects manipulated by the language. They go into
and out of functions, and can be stored in data structures. How can
values be structured? What is an allowable value? Can compound
values (structures) be treated as values? What about pointers
(analogous to soft links)? References (hard links)? Functions?
Namespaces? Can static programs be treated as values (e.g. in
Scheme)? Running programs (i.e. threads)? I/O devices? Types?
Language operators? Many language advances, including reflection,
concern extending the notion of a value, without a corresponding
Is typing supported? Are values typed, or are names? (This is related
to, but not the same as, dynamic vs. static typing.) Does automatic
coercion ever occur? Can operators be overloaded for different types?
What about user-defined functions?
What values can be named? Which require names (e.g. functions)?
How is a reference to a name resolved?
What role do names play in the language? Are names with different
roles mixed together, or kept distinct? Here are three ways names are
used in real life, which all show up in programming languages:
Name something that I have, so that I can refer to it later. Names
are like post-its: I can stick and unstick them any way I like.
(Commonly found in imperative languages.)
Name something that I may not have, but have specified exactly how to
get. Names are like commands; they can only be assigned to the result
of that command. (Commonly found in functional languages, particularly
those with non-strict or write-once variables.)
Name something that I haven't specified how to get, i.e. the name is
a placeholder. "The smallest integer which is the sum of two cubes in
two different ways." (Commonly found in declarative, e.g. logic,
languages, but pointer variables also have this quality.)
(Exercise: What other roles can names play?)
- Locality of effects
What can change when? Can changing a name or value have non-local effect?
Functional languages try to avoid this; constraint languages promote
What changes to the language can be made in the language? Can the
language redefine the meaning of or create new data types? Operators?
Syntax (e.g. precedence)? Can it redefine the semantics of
procedure call (e.g. parameter passing)? Name binding? Control
structures? Can this be done locally or globally? Can the reflection
mechanism itself be modified?
Related to values: Can the language treat its own building blocks as
data? For example, languages are often implemented with symbol
tables, closures, and activation records. Are any of these constructs
visible in the language?
Scheme supports a kind of static reflection, by allowing program text
to be treated as a data structure. Call-with-current-continuation
also allows reflection by treating "the rest of the program execution"
as a procedure. Symmetric Lisp allows a running
program's namespace to be treated as a data structure. Self
allows a message send to be treated as a data structure.
A note on ideology
Be wary of arbitrary vocabulary and ideological distinctions. While
useful and inspiring, they can obscure similarities. For example, the
Algol 68 report, in order to prevent confusion
with other languages, adopted alternative words for nearly everything:
"assignation" instead of "assignment", "modes" instead of "types",
"multiple value" instead of "array", "closed clause" instead of
"block", "unit" instead of "expression", and on and on.
This was done to force the reader to rely on the Report's
definitions, rather than to rely on his previous experience with
similar concepts that nonetheless may differ from the Report's
definitions in subtle, but crucial ways.
(In all fairness, Algol 68 did introduce many new concepts.) Another
example is BETA's use of "pattern," Ada's use of "package," and Life's
use of "psi-term" for what Self calls an "object." Of course, objects
may have different representations and restrictions in these
languages, but the underlying concept of first-class namespace is the same.
As another example, since the late 60s there has been two apparently distinct
but of course both of these ideas are what we now call object-orientation.
There really are only a few good ideas in language design.
- The first emphasized generality, by treating program structures
(blocks) as data structures. This led to the development of Simula,
Smalltalk, and Self. (The wildly popular notion of inheritance, by
the way, emerged in Simula as a generalization of block nesting,
though it now has much deeper interpretations.)
- The second emphasized security, by hiding implementations under an
abstract data type. This led to CLU, Ada, and Eiffel.
An aside: is programming style (syntactical layout, use of
specifications, etc) a language issue? Can it be relegated to a
"style checker"? How does the scale tip between enforcing correct
style vs. committing a language to the wrong style?
Programming methodology, like design ideology, also impacts languages,
but not severely. Just as processors utilize a few basic tools to
support a wide range of languages, languages utilize a few basic tools
to support a wide range of methodologies. The aim of linguistics is
to isolate and explain these tools, by way of case study. This can be
done without a deep understanding of methodology.
A note on history
The languages we will study are all very polished, but of course
emerged from a long history of trial and error. In fact, the history
of languages looks like a pendulum swinging between
In comparison, old languages appear arbitrary, even irrational, in
their limitations, but such is the benefit of 20/20 hindsight.
- whenever a new breakthrough or need is discovered, whenever a language
tries to subsume others (e.g. PL/1, Perl), or whenever a language is
designed by committee (Common Lisp, Ada)
- when a small group works to generalize a few good ideas, usually over
many iterations (Self had four iterations, and its precursor, Smalltalk,
Since language implementations are programs, indeed some of the most
complex programs, one wonders how much language design is influenced
by advances in software engineering. For example, did structured
programming promote the creation of modern object oriented languages?
Will object-oriented programming promote even more powerful languages?
Abelson and Sussman. Structure and Interpretation of Computer
Programs. MIT Press, 1985.
Myers, ed. Languages for Developing User Interfaces.
Jones and Bartlett, 1992.
Hoare. Hints on programming language design.
Sigact/Sigplan Symposium on Principles of Programming Languages, Oct 1973.
Wirth. On the Design on Programming Languages.
Proc. IFIP Congress 74, 386--393. 1974?
Horowitz. Fundamentals of Programming Languages.
Computer Science Press, 1984.
Tanenbaum. A Tutorial on Algol 68.
Computing Surveys (8) 2, June 1976.
Gelernter and Jagannathan. Programming Linguistics.
MIT Press, 1990.
PLE Home page, email@example.com
Last modified: Tue Feb 20 11:58:34 EST 1996