- Vector Declarations and element access
- Bound checking
- Multi-dimensional arrays
- Static and Dynamic arrays
- Operations on vectors, including slicing
- Exercises
- Readings

Vectors and Arrays - Language design and implementation

The same information may be expressed by the number of elements if the language fixes the lower bound.

We will model an array variable as a set of as many locations as the array has elements, each mapped to a value according to the state. Then the variable access

`array[index]`

just selects the correct location from the
array variable and proceeds as for a simple variable.
To simplify finding the location(s) for a particular element of an array, we
place the locations used for the array at a regular pattern of addresses, so
`array[index]`

is at address

origin + index * stride,

where the **origin** is the starting address for the array and the
**stride** is the constant offset between the addresses of the locations of
successive elements of the array. (The origin is the address of
`array[0]`

; if 0 is not a valid index then the origin is not an address
of an element of the array.)

For simple languages, a compiler can recalculate the stride from the element type whenever necessary, so the only information needed to access an array is its origin.

lower bound index upper bound

Many modern languages treat an array as a vector whose elements may be of (almost) any type, including arrays. Thus, a 2-dimensional array is just a vector of vectors. Some languages make this clear in declarations and/or accesses, e.g. using

`a[i][j]`

,
some languages hide this e.g. using `a[i,j]`

,
and some allow both forms,
although they may not treat them in exactly the same way.
The stride is calculated from the array declaration. Many languages leave
this to the implementor; the simplest method is to use the size of a vector
element, calculated from its type (which may itself be a vector). Some
languages insist that the strides are calculated in a specific way (usually
similar to this simple method) so that:

the user knows exactly how memory is used,
or

the bounds of one dimension can be dynamically varied at run-time, so this
dimension (either the first or the last for a multi-dimensional array) must
have the largest stride.

Algorithms for array accesses:

access | bchk | stride(s) | |

1-D | O+I*S | S=sizeof(element) | |

multi-D | O+* | = " | |

+* | = " *() | ||

. . . | . . . | . . . | |

+* | = " *()* . . . *( ) |

(note: and are not required to calculate S)

Some languages allow the size of arrays to be computed at run-time (

Some languages can query a vector (usually a parameter) to find its upper and/or lower bounds, or iterate over all elements of a vector.

So, in a simple language a vector parameter passed by reference is just the origin, but in a more complex language we need more information. This can be either auxiliary vectors of pointers (

Implementation of operations on arrays:

fixed | variable | ||

static | 1-D | S L&U | O |

multi-D | S L&U | O | |

dynamic | 1-D | S | O L&U |

multi-D | - | O S L&U | |

slice | - | O S L&U |

(note: if no bound-checking is required, simply ignore L&U.)

Static arrays passed as parameters have to be treated like dynamic arrays unless the bounds are fixed at compile time, e.g. by type-checking.

Some languages require implementations that calculate and keep this information in a dope vector at run-time, as it allows the most generality for e.g. array slicing or dynamic arrays.

Other languages, that constrain array declarations or operations, can have implementations that fix some or all of these at compile time, and so allow faster code.

- Some languages can copy or compare vectors.
- Some languages can only initialise vectors.
e.g.
`int a[] = {1, 2, 3, 4, 5};`

some languages allow e.g.`{6=>3, 7..15=>4, others=>-1}`

- Some languages support
**slicing**- selecting a subset of the array elements e.g.`b is a [15 : 17]`

or`a [ : 17]`

or`a [15 : ]`

` a: array [1..4, 1..7] of integer;`

means exactly the same as:

` a: array [1..4] of array [1..7] of integer;`

and thus either

`a[i, j]`

or `a[i][j]`

can be used interchangeably.
Therefore only the first dimension can be sliced, and only to select one
element (row or column): `a[i]`

.
- When multiplication took longer than a memory access, a common
implementation of multi-dimensional arrays was to use
**Iliffe vectors**i.e. an array is a vector of pointers to vectors of pointers to... to vectors holding the actual elements. e.g. a 2-D array of integers is a vector of pointers to vectors of integers. During an access, the first index selects a pointer from the initial array, and the second index selects an integer from the vector pointed at.

This technique makes array slicing asymmetrical, as we can slice via the one index but not via the others. As the vectors of values are independent, they can in principle, have different sizes and bounds from each other.

In many languages it is possible for the user to explicitly write code to use Iliffe vectors. Using C, declare and access a 2-dimensional array using this technique. You will also need to ensure that the vectors of values exist and initialise the vector of pointers. Try obtaining the space for the values in two ways: using malloc, and using another local variable of suitable size.

Bal & Grune: chs. 2.2.1, 2.3, 2.3.1

Aho, Sethi & Ullman: ch. 7.1