
TECHNICAL OUTLINE

There has been much work on parallelising both query languages and FORTRAN. The database work provides precisely the kind of performance-enhancing migration route to HPC that is sought here. While much remains to be done to match the performance of hand-tuned implementations, auto-parallelising FORTRAN compilers can achieve significant speed-up without user intervention. FORTRAN auto-parallelisation has been the subject of research for over twenty years, following the trend in computer architecture from vector supercomputers to massively parallel distributed memory architectures to today's symmetric multiprocessors (Banerjee et al., 1993; Wolfe, 1996).

There has been comparatively little work investigating the auto-parallelisation of COBOL. The possibility of using SPMD parallelism has been discussed (Darema, 1996), and an approach has been proposed that uses COBOL-97 to develop independent objects by wrapping up existing code, with a client-server architecture for increased performance (Flint, 1996). Other work includes Richter (1993) and the precursor of this investigation (Sakellariou and O'Boyle, 1996).

Many COBOL legacy systems have unstructured control flow. Much effort has gone into re-structuring such applications to make them easier to maintain (Miller and Strauss, 1987; Sneed, 1991; Haimut et al., 1995). This re-structuring effort essentially provides rules for turning unstructured code into structured code. Such rules can also help to make the code more amenable to parallelism.

Many performance developments in computer architecture, e.g. pipelining and vector processors, have been driven by the engineering sector's need for higher computation speeds. As a result, the focus for FORTRAN compilers in recent years has been on ensuring that data is made available to exploit these features. Optimising compilers for scientific applications are predominantly concerned with utilising main memory and cache. A disk is just one more level in the memory hierarchy, which runs, in order of increasing latency, from cache through on-processor memory and off-processor memory to disk. A read from disk is conceptually no different from a read from remote memory, only slower. Indeed, parallel FORTRAN compilation is beginning to address disk locality as an important issue.

Most current research in automatic parallelism targets FORTRAN loop structures, where parallelism results from the repeated execution of the same instructions on different data. This is data parallelism. A different form, task parallelism, arises when different parts of a program can be executed concurrently.
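
The distinction between the two forms can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not taken from any of the systems cited above; the function names (scale, total, largest, data_parallel, task_parallel) are hypothetical. In the first case the same operation is applied to different elements of the data; in the second, two different and independent computations run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(record):
    """The single operation that is repeated on every data element."""
    return record * 2


def total(records):
    """One independent part of the program."""
    return sum(records)


def largest(records):
    """Another independent part of the program."""
    return max(records)


def data_parallel(records, workers=4):
    # Data parallelism: every worker executes the same instructions (scale),
    # each on a different element of the input. Results keep input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scale, records))


def task_parallel(records):
    # Task parallelism: two different computations that do not depend on
    # each other are submitted together and may run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        total_future = pool.submit(total, records)
        largest_future = pool.submit(largest, records)
        return total_future.result(), largest_future.result()
```

For example, data_parallel([1, 2, 3]) applies scale to each element, while task_parallel([1, 2, 3]) computes the sum and the maximum as two separate tasks. The sketch shows only the structure of the two forms; threads in standard CPython do not generally speed up CPU-bound code of this kind.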



Rizos Sakellariou 2000-07-31