Excerpted from the Cocktail distribution "README" files:

What is Cocktail?
----------------

Cocktail is the Compiler-Compiler Toolbox developed at GMD Karlsruhe.
The Cocktail tools lay emphasis on practical usability and efficiency.
From specifications, the tools generate compiler parts that are as
efficient as hand-written ones. The use of tools significantly reduces
the construction effort and increases the reliability of the generated
compilers.

The toolbox supports almost all phases of compiler construction by more
or less independent tools that generate C, C++, Modula-2, or (in part)
Eiffel source code. Cocktail runs on all variants of Unix, as well as
under Linux, MS-DOS and OS/2 and contains the following main tools:

- Rex     scanner generator
- Lalr    LALR(1) parser generator
- Ell     LL(1) parser generator
- Ast     generator for abstract syntax trees
- Ag      generator for attribute evaluators
- Puma    transformation of attributed trees using pattern matching

Compiler Construction Tool Box
==============================

Rex (Regular EXpression tool) is a scanner generator whose
specifications are based on regular expressions and arbitrary semantic
actions written in one of the target languages C or Modula-2. As
scanners sometimes have to consider the context to unambiguously
recognize a token, the right context can be specified by an additional
regular expression and the left context can be handled by so-called
start states. The generated scanners automatically compute the line and
column position of the tokens and offer an efficient mechanism to
normalize identifiers and keywords to upper or lower case letters. The
scanners are table-driven and run at a speed of 180,000 to 195,000
lines per minute on an MC 68020 processor.

Lalr is a LALR(1) parser generator accepting grammars written in
extended BNF notation which may be augmented by semantic actions
expressed by statements of the target language.
The generator provides a mechanism for S-attribution, that is,
synthesized attributes can be computed during parsing. In case of
LR conflicts, unlike other tools, Lalr provides not only information
about an internal state consisting of a set of items but also prints a
derivation tree, which is much more useful for analyzing the problem.
Conflicts can be resolved by specifying precedence and associativity of
operators and productions. The generated parsers include automatic
error recovery, error messages, and error repair. The parsers are
table-driven and run at a speed of 560,000 lines per minute. Currently
parsers can be generated in the target languages C and Modula-2.

Ell is an LL(1) parser generator accepting the same specification
language as Lalr except that the grammars must obey the LL(1) property.
It is possible to evaluate an L-attribution during parsing. The
generated parsers include automatic error recovery, error messages, and
error repair, like those of Lalr. The parsers are implemented following
the recursive descent method and reach a speed of 810,000 lines per
minute. The possible target languages are again C and Modula-2.
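
The recursive descent method mentioned for Ell can be pictured with a
small hand-written sketch in C. This is not output of Ell and makes no
claim about the code Ell really generates; the grammar, the Parser
record, and all function names (parse_expr, parse_term, parse_factor,
evaluate) are assumptions chosen for illustration. Each nonterminal of
an LL(1) grammar becomes one procedure, one character of lookahead
selects the alternative, and the value returned by each procedure plays
the role of a synthesized attribute computed during parsing. Error
handling is reduced to counting errors, far short of the automatic
recovery and repair described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Illustrative grammar in extended BNF (LL(1)):
 *   expr   = term   { ('+' | '-') term   } .
 *   term   = factor { ('*' | '/') factor } .
 *   factor = NUMBER | '(' expr ')' .
 */

typedef struct {
    const char *pos;   /* current input position; *pos is the lookahead */
    int errors;        /* number of errors detected so far */
} Parser;

static void skip_blanks(Parser *p) {
    while (*p->pos == ' ' || *p->pos == '\t') p->pos++;
}

static long parse_expr(Parser *p);

/* factor = NUMBER | '(' expr ')' . */
static long parse_factor(Parser *p) {
    skip_blanks(p);
    if (isdigit((unsigned char)*p->pos)) {
        long value = 0;
        while (isdigit((unsigned char)*p->pos))
            value = value * 10 + (*p->pos++ - '0');
        return value;
    }
    if (*p->pos == '(') {
        p->pos++;
        long value = parse_expr(p);
        skip_blanks(p);
        if (*p->pos == ')') p->pos++;
        else p->errors++;          /* missing ')' is counted, not repaired */
        return value;
    }
    p->errors++;                   /* no alternative matches the lookahead */
    return 0;
}

/* term = factor { ('*' | '/') factor } . */
static long parse_term(Parser *p) {
    long value = parse_factor(p);
    for (;;) {
        skip_blanks(p);
        char op = *p->pos;
        if (op != '*' && op != '/') return value;
        p->pos++;
        long rhs = parse_factor(p);
        if (op == '*') value *= rhs;
        else if (rhs != 0) value /= rhs;
        else p->errors++;          /* division by zero counted as an error */
    }
}

/* expr = term { ('+' | '-') term } . */
static long parse_expr(Parser *p) {
    long value = parse_term(p);
    for (;;) {
        skip_blanks(p);
        char op = *p->pos;
        if (op != '+' && op != '-') return value;
        p->pos++;
        long rhs = parse_term(p);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Parses src and stores the computed value in *result.
 * Returns the number of errors (0 means the input was accepted). */
int evaluate(const char *src, long *result) {
    assert(src != NULL && result != NULL);
    Parser p = { src, 0 };
    *result = parse_expr(&p);
    skip_blanks(&p);
    if (*p.pos != '\0') p.errors++;   /* trailing input after the expression */
    return p.errors;
}
```

A caller passes a string and a result pointer, for example
evaluate("(1 + 2) * 3", &v), and checks the returned error count before
using v. A table-driven LALR(1) parser such as those produced by Lalr
would compute the same kind of synthesized value, but from parse tables
and an explicit stack rather than from nested procedure calls.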

Ast - A Generator for Abstract Syntax Trees

   - generates abstract data types (program modules) to handle trees
   - the trees may be attributed
   - besides trees, graphs are handled as well
   - nodes may be associated with arbitrarily many attributes of
     arbitrary type
   - specifications are based on extended context-free grammars
   - common notation for concrete and abstract syntax
   - as well as for attributed trees and graphs
   - an extension mechanism provides single inheritance
   - trees are stored as linked records
   - generates efficient program modules
   - generates modules in Modula-2 or C
   - provides many tree operations (procedures):
      - node constructors combine aggregate notation and storage
        management
      - ascii graph reader and writer
      - binary graph reader and writer
      - reversal of lists
      - top down and bottom up traversal
      - interactive graph browser

Ag - An Attribute Evaluator Generator

   - processes ordered attribute grammars (OAGs)
   - processes higher order attribute grammars (HAGs)
   - operates on abstract syntax
   - is based on tree modules generated by Ast
   - the tree structure is fully known
   - terminals and nonterminals may have arbitrarily many attributes
   - attributes can have any target language type
   - allows tree-valued attributes
   - differentiates input and output attributes
   - allows attributes local to rules
   - allows chain rules to be eliminated
   - offers an extension mechanism (single inheritance)
   - attributes are denoted by unique selector names instead of
     nonterminal names with subscripts
   - attribute computations are expressed in the target language
   - attribute computations are written in a functional style
   - attribute computations can call external functions
   - non-functional statements and side-effects are possible
   - allows compact, modular, and readable specifications to be written
   - AGs can consist of several modules
   - the context-free grammar is specified only once
   - checks an AG for completeness of the attribute computations
   - checks for unused attributes
   - checks an AG for the
     classes SNC, DNC, OAG, LAG, and SAG
   - the evaluators are directly coded using recursive procedures
   - generates efficient evaluators
   - generates evaluators in Modula-2 (or C)

Puma - Transformation Tool based on Pattern Matching

   - last but not least

A comparison of the above tools with the corresponding UNIX tools shows
that significant improvements in terms of error handling as well as
efficiency have been achieved: Rex generated scanners are 4 times
faster than those of LEX. Lalr generated parsers are 2-3 times faster
than those of YACC. Ell generated parsers are 4 times faster than those
of YACC. The input languages of the tools are improvements of the LEX
and YACC inputs. The tools also understand LEX and YACC syntax with the
help of the preprocessors l2r and y2l.

The tool box is publicly copyable. It has been under development since
1987. It has been tested by generating scanners and parsers for, e.g.,
Pascal, Modula, Oberon, and Ada, and found stable.

The tool box is implemented in Modula-2. It has been developed using
our own Modula-2 compiler called MOCKA on an MC 68020 based UNIX
workstation. It has been ported to the SUN workstation and been
compiled successfully using the SUN Modula-2 compiler. The tools also
run on VAX/BSD UNIX and VAX/ULTRIX machines. This should assure a
reasonable level of portability for the Modula-2 code. Meanwhile the
sources exist in C, too.
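
The Ast and Ag feature lists above mention trees stored as linked
records, node constructors that combine aggregate notation with storage
management, and evaluators coded directly as recursive procedures. The
following C sketch illustrates those three ideas by hand. It is not
code generated by Ast or Ag; the Node record and the names make_const,
make_add, make_mul, evaluate_tree, and free_tree are hypothetical and
chosen only for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_CONST, NODE_ADD, NODE_MUL } NodeKind;

/* One tree node as a linked record with two attributes. */
typedef struct Node {
    NodeKind kind;
    struct Node *left, *right;  /* children; NULL for constants */
    long literal;               /* input attribute, set by the constructor */
    long value;                 /* output attribute, set by evaluate_tree */
} Node;

/* Shared helper: allocates and fills a node in one step. */
static Node *new_node(NodeKind kind, Node *left, Node *right, long literal) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();     /* out of memory: give up in this sketch */
    n->kind = kind;
    n->left = left;
    n->right = right;
    n->literal = literal;
    n->value = 0;
    return n;
}

/* Node constructors: storage management plus an aggregate-like call. */
Node *make_const(long literal)    { return new_node(NODE_CONST, NULL, NULL, literal); }
Node *make_add(Node *l, Node *r)  { return new_node(NODE_ADD, l, r, 0); }
Node *make_mul(Node *l, Node *r)  { return new_node(NODE_MUL, l, r, 0); }

/* Recursive evaluator: a bottom-up traversal that computes the
 * synthesized attribute 'value' of every node from its children. */
long evaluate_tree(Node *n) {
    assert(n != NULL);
    switch (n->kind) {
    case NODE_CONST:
        n->value = n->literal;
        break;
    case NODE_ADD:
        n->value = evaluate_tree(n->left) + evaluate_tree(n->right);
        break;
    case NODE_MUL:
        n->value = evaluate_tree(n->left) * evaluate_tree(n->right);
        break;
    }
    return n->value;
}

/* Releases a whole tree; accepts NULL for convenience. */
void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}
```

Nested constructor calls such as
make_add(make_const(1), make_mul(make_const(2), make_const(3))) build a
tree in a single expression, and evaluate_tree then stores an attribute
value in every node. An evaluator generated by Ag would additionally
derive the evaluation order from the attribute dependencies, which this
sketch fixes by hand.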

directory       contents
------------------------------------------------------------------------
README          this file
Makefile        compilation, installation, and test of the tools (UNIX)
compile.bat     compilation of the tools (MSDOS)
install.bat     installation of the tools (MSDOS)
test.bat        test of the tools (MSDOS)
doc.ps          documentation in postscript format
doc.me          documentation in troff format, me macros
doc.doc         documentation in ascii format (without pictures)
man             manual pages in troff format, man macros
rex             Scanner Generator
lalr            LALR(1) Parser Generator
ell             LL(1) Recursive Descent Parser Generator
bnf             Transforms Grammars from Extended BNF to Plain BNF
front           Common Front-End of Lalr, Ell, and Bnf
reuse           Library of Reusable Modules (needed for all programs)
common          Library for estra and ell
specs           Example Specifications for the Above Tools
cg              Common Program implementing Ast and Ag
                Ast = Generator for Abstract Syntax Trees
                Ag  = Attribute Evaluator Generator
puma            Transformation Tool based on Pattern Matching
l2r             Transforms Lex input to Rex input
y2l             Transforms Yacc input to Lalr input
r2l             Transforms Rex input to Lex input
rpp             Rex PreProcessor: rpp + cg extract most of a scanner
                specification out of a parser specification
estra           Transformation of attributed trees (prototype)
hexa            contains the scanner and parser tables of Rex and Front
                (= front-end of Lalr and Bnf) converted from binary to
                ascii hexadecimal representation
dos             directory containing the MSDOS version
bin             UNIX: shell scripts (my version), MSDOS: batch scripts
lib             executables, table and data files
                (for SUN 3/SunOS 4.0 or PC/MSDOS)
(mtc            Modula-2 to C translator)

The names of the subdirectories indicate the following types of
information:

sub directory   contents
------------------------------------------------------------------------
src             source files in Modula-2
m2c             source files in C (generated from the Modula-2 sources)
src             source files in C (generated from the C sources for MSDOS)
c               source files in C (hand-written)
lib             data files, module skeletons
test            test environment for a tool

Documentation:
--------------

The directories doc.ps, doc.me, and doc.doc contain documentation in
postscript format, troff format (me macros), and ASCII format (without
pictures). The documentation for UNIX and MSDOS is the same. Therefore
the documents are stored in the UNIX directory only; they are not
repeated in the 'dos' subdirectory.

The document entitled "Toolbox Introduction" in the files intro.ps,
intro.me, or intro.doc gives an overview and introduces the toolbox.
It should be read first.

The following documents are available:

Filename        Title
------------------------------------------------------------------------
intro           Toolbox Introduction
toolbox         A Tool Box for Compiler Construction
werkzeuge       Werkzeuge für den Übersetzerbau
reuse           Reusable Software - A Collection of Modula-2-Modules
prepro          Preprocessors
rex             Rex - A Scanner Generator
scanex          Selected Examples of Scanner Specifications
scangen         Efficient Generation of Table-Driven Scanners
lalr-ell        The Parser Generators Lalr and Ell
lalr            Lalr - A Generator for Efficient Parsers
ell             Efficient and Comfortable Error Recovery in Recursive
                Descent Parsers
highspeed       Generators for High-Speed Front-Ends
autogen         Automatische Generierung effizienter Compiler
ast             Ast - A Generator for Abstract Syntax Trees
toolsupp        Tool Support for Data Structures
ag              Ag - An Attribute Evaluator Generator
ooags           Object-Oriented Attribute Grammars
estra           Spezifikation und Implementierung der Transformation
                attributierter Bäume
puma            Puma - A Generator for the Transformation of Attributed
                Trees
trafo           Transformation of Attributed Trees Using Pattern Matching
(minilax        Specification of a MiniLAX-Interpreter)
(begmanual      BEG - a Back End Generator - User Manual)

New Features
------------

Compared with Version 9209, the next release of Cocktail (Version 9311)
will have the following new features:

[snip]

- Lark is a completely new implementation of a parser generator for
  LALR(1) and
  LR(1) grammars. It is compatible with lalr and will replace lalr in
  the future. It features:
     + processes LALR(1) and LR(1) grammars
     + fast generation of information about LR-conflicts
     + trace of parsing steps during run-time available
     + supports named attributes as well as the $i notation
     + semantic predicates control parsing by conditions
     + support for backtracking parsing
     + includes bnf - no preprocessing necessary

- As a first step towards list processing, ast generates a procedure
  ForallTREE that performs an operation on all elements of a list.

- Besides evaluators based on recursive procedures, ag can now generate
  evaluators based on a stack automaton. This results in a tremendous
  reduction of the stack consumption. The recursive evaluators are now
  optimized with elimination of tail recursion.

- The code generated by puma is optimized with elimination of tail
  recursion, too.

References:
-----------

1.  J. Grosch, `Generators for High-Speed Front-Ends', LNCS, 371,
    81-92 (Oct. 1988), Springer Verlag.

2.  H. Emmelmann, F. W. Schroeer, R. Landwehr, `BEG - a Generator for
    Efficient Back Ends', Sigplan Notices, 24, 227-237 (Jul. 1989).

3.  W. M. Waite, J. Grosch and F. W. Schroeer, `Three Compiler
    Specifications', GMD-Studie Nr. 166, GMD Forschungsstelle an der
    Universitaet Karlsruhe, Aug. 1989.

4.  J. Grosch, `Efficient Generation of Lexical Analysers',
    Software-Practice & Experience, 19, 1089-1103 (Nov. 1989).

5.  J. Grosch, `Efficient and Comfortable Error Recovery in Recursive
    Descent Parsers', Structured Programming, 11, 129-140 (1990).

6.  J. Grosch, H. Emmelmann, `A Tool Box for Compiler Construction',
    LNCS, 477, 106-116 (Oct. 1990), Springer Verlag.

7.  J. Grosch, `Object-Oriented Attribute Grammars', in: Proceedings of
    the Fifth International Symposium on Computer and Information
    Sciences (ISCIS V) (Eds. A. E. Harmanci, E. Gelenbe), Cappadocia,
    Nevsehir, Turkey, 807-816 (Oct. 1990).

8.  J. Grosch, `Lalr - a Generator for Efficient Parsers',
    Software-Practice & Experience, 20, 1115-1135 (Nov. 1990).

9.  J. Grosch, `Tool Support for Data Structures', Structured
    Programming, 12, 31-38 (1991).

10. J. Grosch, `Transformation of Attributed Trees Using Pattern
    Matching', to appear (1992).