Next: Identifiers: Static and Dynamic
Up: CS2111: Design and Implementation
Previous: Control structures
  Contents
Subsections
Procedures and functions
There are various reasons for splitting a program into subprograms:
- if a particular operation or piece of code is used in several different
places, to reduce (source or object) code size etc.
- to use recursion
- as part of the divide-and-conquer (or similar) paradigm, to allow us to
control the complexity of a problem, or to partition work amongst several
programmers
- if a language has no other way of controlling scope and/or extent or of
making compound statements
There are several different kinds of subprograms, such as macros, functions
and procedures (or subroutines), that can:
- be expanded in-line
or shared
- be parameterised or not
- have side-effects or be pure
- have result
or not
- have an associated scope/extent or not
- have single
or multiple exits
- have single
or multiple entries
A subprogram can sometimes have a separate signature (prototype or header),
containing the parameters (or formal parameters),
and implementation (body).
which are separate from the call (or invocation or activation)
containing the arguments (or actual parameters).
Parameter passing
What happens when we substitute an actual-parameter for a formal-parameter
in a call? Most languages provide more than one mechanism to do this job,
and most of the mechanisms come in variant forms.
The actual-parameter is an expression, which is evaluated at the time of the
call. The result of this evaluation is provided to the procedure for use as
the corresponding formal-parameter. Inside the procedure it behaves either
as a constant
or as a variable.
Snag: passing an array as a value parameter is slow (because all the
elements have to be copied, unless the implementor can optimise the copy
away, or unless s/he cheats).
This example of passing array parameters by value in Pascal should output
1
, but if it outputs a -1
the implementor has not bothered to
copy the array parameter:
type arr = array [1..2] of integer;
var a: arr;
procedure nasty (c: arr);
begin
a[1]:= -1;
writeln (c[1]);
end;
begin
a[1]:= 1;
nasty (a);
end.
The actual-parameter is a variable-access, which accesses an internal
variable. This variable (in practice, its address) is passed to the
procedure for use as the corresponding formal-parameter. Therefore,
assignment to the formal-parameter from within the procedure implies
instant assignment to the variable accessed by the actual-parameter.
Many languages provide both call by reference and call by value.
If a language provides call by reference but not call by value, there is a
problem if we want to use an expression as an actual parameter. Some
languages just forbid this, others place the value of the expression into a
dummy argument, and the address of the dummy argument is passed for use as
the formal-parameter (thus assignments to the dummy argument work, but they
do not affect any variables outside).
Snag: call by reference usually has to be more strongly type-checked than
call by value.
The actual parameter is a variable-access (as with call by reference). It is
first passed by value. The formal-parameter may be assigned to from within
the procedure (but no variable outside gets changed). At the end of the
call, the value of the formal-parameter (which may have been changed by the
procedure) is copied back to the variable accessed by the actual-parameter.
This Fortran program distinguishes between call by reference and
call by copy/copy back:
PROGRAM TEST
COMMON I
I = 1
CALL P(I)
IF (I .EQ. 4) THEN
PRINT *, 'call by reference'
ELSE
IF (I .EQ. 3)
PRINT *, 'call by copy/copy back'
ENDIF
ENDIF
END
SUBROUTINE P(J)
COMMON I
J = 2
J = I + J
END
Snag: although fields of packed records can now be passed without
difficulty, passing arrays is as slow as in call by value.
Problem: suppose that an actual-parameter is a[i]
, and suppose that
i
=4 at the time of the call. Suppose that the procedure body assigns
a new value to the formal-parameter and also contrives to change the value
of i to 5. Is the new value written back to a[4]
or to a[5]
?
Most languages that provide call by copy/copy back also provide call by value.
As with call by reference, if a language provides call by copy/copy back but
not call by value, there is a problem if we want to use an expression as an
actual parameter. The simplest solution is just to omit the copy back.
Some languages provide call by value but not call by reference or by
copy/copy back. If the language provides pointers to named variables, then
call by reference can be simulated by passing pointers by value.
Thus languages that do not allow pointers to named variable, normally have to
provide call by reference or by copy/copy back.
Example of passing a function as a parameter in C:
double integrate (double start, double finish, double(*f)(double))
{
. . . f(x) . . .
}
. . .
area= integrate (0.0, 1.0, sqrt);
At every call, all occurrences of each formal-parameter within the procedure
body are replaced by the corresponding actual-parameter (i.e. the implementor
must make this appear to be the case). The actual-parameter must be
grammatically suitable in its new position e.g. if it is assigned to, it
must be a variable.
If an actual parameter is a variable, this is equivalent to call by
reference. However, if the actual parameter is an expression, the
consequences can be surprising e.g.
- if the actual parameter involves a function call, it is not just
evaluated once, as in call by value, but is evaluated each time the
formal-parameter is used within the routine - if it involves side effects,
this may have strange consequences
- if the actual-parameter involves a variable, which is passed as another
actual-parameter, then any changes in the latter affect the value of the
former. This is exploited in a technique known as Jensen's device.
Example of Jensen's device (call by name) in Algol60:
real procedure series (i, k, term)
value k; integer i, k; real term;
begin real sum;
sum := 0.0;
for i := 1 step 1 until k do
sum := sum + term;
series := sum;
end;
integer j;
print (series (j, 100, 1/j));
In this case the last parameter is implemented in a similar way to call by
procedure, by constructing a small function (known as a thunk). The
thunk is called at all the places in the body of series where that
actual-parameter was supposed to be replaced.
Polymorphism
Overloaded (ad-hoc polymorphic): same name or symbol used for different
operations (e.g. act on different types) - somehow we decide at compile time
(static binding) or at run time (dynamic binding) which actual operation to
use.
Generic (parameterised polymorphic): same operation can act on different types
The difference is only obvious to the implementor e.g. if you have to write
code for each of a limited number of different versions, rather than one (or
a few) piece(s) of code that can cope with all possible parameter types.
This facility is particularly associated with Abstract Data Types
(18), where both a type and its operations may be polymorphic.
Many programming languages do not treat subprograms as first class objects
i.e. they do not allow some of the following:
- subprograms as parameters
- subprograms as results
- subprogram variables
- subprogram literals
- operations on subprograms
Louden: ch. 7.4
Bal & Grune: ch. 2.5.2
Aho, Sethi & Ullman: ch. 7.5
Next: Identifiers: Static and Dynamic
Up: CS2111: Design and Implementation
Previous: Control structures
  Contents
Pete Jinks
1999-09-30