Can we determine, simply by looking at an algorithm, how long it will take to run? If so, we may be able to design algorithms with better performance. The answer is ``yes'', and the idea is to choose operations to count, operations which we expect to take a lot of processing time, and count them! That is, count how many times they are executed while the algorithm runs. In the lectures, we show how to count them without running the algorithm! Here we support this analysis by explicitly counting operations while the algorithm runs and comparing the counts to the CPU running times of the previous exercise. We use the three sorting algorithms of the previous exercise.
One way of counting operations in SML is to introduce so-called reference variables. These are none other than variables as used in C and other imperative languages. In imperative languages, variables denote locations at which data is stored; the stored data may be changed using assignment and accessed using dereferencing.
Here's an example to show the syntax:
val x = ref 0;
x := 4;
!x;
This declares x to be a reference variable initialised to 0, then updates x to 4 using an assignment, and finally returns the value in x by dereferencing.
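As a further illustration of how a reference variable can act as a counter (our own example; the names calls and tick are not part of the exercise):

```sml
val calls = ref 0;                    (* counter, initially 0 *)
fun tick () = calls := !calls + 1;    (* each call adds 1 to the counter *)
tick (); tick (); tick ();
!calls;                               (* dereference: evaluates to 3 *)
```

The function tick has no useful return value; its whole purpose is the side effect on calls. This is exactly the pattern used below to count comparisons.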
We count the number of operations in the sorting algorithms. Which operations? There are several possibilities (and it turns out it doesn't matter much). The tradition is to count the order operation by which we compare elements.
Here is the InsertSort algorithm modified to count the number of comparisons.
val count = ref 0;
fun insert(x:real,[]) = [x]
  | insert(x, y::ys) =
      ( count := !count + 1;
        if x<y then x::y::ys else y::insert(x,ys) );
fun insertsort([]) = []
  | insertsort(x::xs) = insert(x,insertsort(xs));
This declares a reference variable count, initialises it to zero, and increments it each time a comparison is performed (the brackets around the sequence are important!). To run:
insertsort [2.0, 5.0, 1.0, 5.0, 6.0, 2.0, 3.0, 7.0, 1.0]; !count;
(Note that the list elements must be reals, since insert is annotated with type real.)
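One practical point to bear in mind when running your experiments (our own observation, not part of the exercise text): count accumulates across calls, so reset it before each run if you want per-run figures:

```sml
count := 0;                                       (* reset before each experiment *)
insertsort [2.0, 5.0, 1.0, 5.0, 6.0, 2.0, 3.0, 7.0, 1.0];
!count;                                           (* comparisons for this run only *)
```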
Now modify your MergeSort and QuickSort to count the number of comparisons, and plot the results on the same selection of lists. Call your program file complexity.sml. Does counting comparisons give a useful guide to the rate of growth of CPU time? I.e. how do these curves relate to those of Exercise 2?
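As a starting point, here is one possible way to instrument the merge step of MergeSort. This is a sketch only; your MergeSort from the previous exercise may be structured differently, and the splitting and recursive sorting functions are left as in your own code:

```sml
val count = ref 0;                      (* same global counter as for InsertSort *)

fun merge([], ys) = ys
  | merge(xs, []) = xs
  | merge((x:real)::xs, y::ys) =
      ( count := !count + 1;            (* one comparison per merge step *)
        if x <= y then x :: merge(xs, y::ys)
                  else y :: merge(x::xs, ys) );
```

The increment sits inside the brackets before the comparison, so it is performed exactly once per call of the third clause, mirroring the InsertSort example above.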
Consider now not the rate of growth, but instead the behaviour of the algorithms on lists of a fixed length. Consider the lists of length 10:
[9.0, 2.0, 3.0, 12.0, 6.0, 2.0, 5.0, 15.0, 9.0, 2.0]
[1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0]   (* Ordered *)
[20.0, 18.0, 16.0, 14.0, 12.0, 10.0, 8.0, 6.0, 4.0, 2.0]  (* Reverse ordered *)
[1.0, 4.0, 3.0, 6.0, 8.0, 9.0, 8.0, 12.0, 9.0, 14.0]      (* Nearly ordered *)
Examine the behaviour, in terms of the number of comparison operations, of the three algorithms on these lists. Explain why some algorithms perform poorly on certain lists, in particular those which are in order, nearly in order, or in reverse order. How would you modify QuickSort to perform better on lists that are in order, or nearly so?
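To make these measurements easier to repeat, a small helper of the following shape may be useful (our own suggestion, not required by the exercise): it resets the counter, runs a given sorting function, and returns the number of comparisons performed.

```sml
(* assumes the global counter: val count = ref 0; *)
fun comparisons sort xs =
    ( count := 0;          (* reset so we measure this run only *)
      sort xs;             (* run the sort, discarding its result *)
      !count );            (* read off the number of comparisons *)
```

For example, comparisons insertsort applied to each of the four lists above gives directly comparable figures for InsertSort, and likewise for your instrumented MergeSort and QuickSort.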
For marking, you will need to show your modified algorithms, show them working, and present the three plots. You also need to explain to the demonstrator the shape of the curves, explain why the algorithms perform differently on the given lists of length 10, and answer the question about QuickSort.