Analysis suggests that the two areas of standard COBOL that take up the most run-time are input-output (access to disk) and loops (formed by PERFORM and GO TO statements). Input-output is the more crucial of the two; however, once it is addressed, the code itself must contain enough parallelism to exploit the I/O parallelism.
Most COBOL applications fit into one of two categories: online transaction processing (OLTP) and batch processing.
OLTP programs are typically executed in parallel by the Transaction Processing monitor (e.g. CICS, Tuxedo, TPMS) within which they are developed and executed. In these cases intra-transaction parallelism often comes as a by-product of the underlying RDBMS. As a result, large performance improvements are unlikely for such applications, although some potential for improvement remains.
For many organisations, the overall runtime of their batch programs is a major operational issue. It means they need to size their systems on the basis of single-stream performance rather than aggregate performance. To avoid this, they often split work manually into separate runs (e.g. by initial letter of customer surname) which can be executed concurrently. In many cases this "batch processing window" takes up all the time available between OLTP work, so even a relatively small improvement here would be significant.
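The manual splitting described above can be illustrated with a minimal sketch. It is written in Python rather than COBOL purely for brevity, and every name in it (`partition_by_initial`, `run_batch`, `run_partitioned`, the record fields `surname` and `balance`) is hypothetical; the body of `run_batch` is a stand-in for whatever a real batch run would do.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_initial(records, ranges):
    """Group records into one bucket per surname-initial range, e.g. ("A", "M").

    Records whose initial falls outside every range are skipped in this sketch.
    """
    buckets = defaultdict(list)
    for rec in records:
        initial = rec["surname"][:1].upper()
        for lo, hi in ranges:
            if lo <= initial <= hi:
                buckets[(lo, hi)].append(rec)
                break
    return buckets


def run_batch(records):
    """Stand-in for one batch run (assumption): totals the balances."""
    return sum(rec["balance"] for rec in records)


def run_partitioned(records, ranges, workers=4):
    """Run one batch per partition concurrently and collect results by range."""
    buckets = partition_by_initial(records, ranges)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {key: pool.submit(run_batch, recs) for key, recs in buckets.items()}
        return {key: fut.result() for key, fut in futures.items()}


if __name__ == "__main__":
    customers = [
        {"surname": "Adams", "balance": 10},
        {"surname": "Moore", "balance": 20},
        {"surname": "Young", "balance": 5},
    ]
    print(run_partitioned(customers, [("A", "M"), ("N", "Z")]))
```

The sketch only pays off when the runs are independent of one another, which is the same condition that the manual split by surname relies on. A thread pool is used here because the work in question is assumed to be dominated by I/O.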
The similarity between database processing and COBOL applications lies in their access to large datasets held on disk. Techniques used by parallel database implementations, such as caching to minimise disk access and distributed file-stores to increase concurrency, will be of use for a parallel COBOL implementation.