3rd year projects for 2002/3

Language Processing - Grammars: Language Processing - Implementation: Information Systems: (template)

more ideas for projects


CLASSIFICATION: Language Processing

SH: Y - UF
CM: Y - C
BM: Y - C

Grammar transformation

Project: 000  Supervisor: PJJ

The Java Compiler Compiler (JavaCC) is a parser generator created by Sun for use with their Java language. It is roughly equivalent to tools like Lex and Yacc, but has different underlying mechanisms. This makes it easier for humans to understand what it is doing, but also makes the basic system less powerful. Because of this, users have to put some effort into modifying an initial grammar to make it acceptable to JavaCC.

For example, we would normally recognise non-associative, left-associative and right-associative operators using non-recursive, left-recursive and right-recursive grammars respectively:

	  non_expression = sub_exp | sub_exp '=' sub_exp

	 left_expression = sub_exp | left_expression '+' sub_exp

	right_expression = sub_exp | sub_exp '^' right_expression
However, with JavaCC, we have to rewrite the grammars:
	  non_expression = sub_exp ( '=' sub_exp )?

	 left_expression = sub_exp ( '+' sub_exp )*

	right_expression = sub_exp ( '^' right_expression )?

This project is about producing a program that helps users to modify their grammars, to get the most out of JavaCC. The program would be interactive, so that it could point out problem areas, make suggestions, and allow the user to pick a suitable modification. Some specific areas that could be investigated are converting left recursion to iteration (as for left_expression above), left-factoring (as for non_expression and right_expression above), using look-ahead to remove ambiguities, and adding error handling.

There are several possible extensions to this project if time permits, implementing other similar transformations on grammars, such as removing empty, trivial, or unusable grammar rules.

I would expect the program to be written using Java and JavaCC.

REFERENCES:
JavaCC is now maintained by WebGain http://www.webgain.com/products/metamata/java_doc.html
A similar, but not identical, tool: http://www.iit.edu/~tc/llk.htm
A bigger example
COURSE PREREQUISITES: CS2121 would be a big help, but is not necessary


CLASSIFICATION: Language Processing

SH: Y - F
CM: Y - F
BM: Y - F

Graphical Design and Generation of Textual User Interfaces

Project: 000  Supervisor: PJJ

You are probably familiar with the idea of Graphical User Interfaces (GUIs) and the kinds of tools that can be used to create them. However, we often need to build a Textual User Interface (TUI) - for example, to recognise an expression to be drawn by a graph-plotting package, or to input a file of data into a statistical package - and although tools exist to help, they are not necessarily easy to use, particularly to begin with. What is needed is a graphical tool, that allows a user to build an informal description of the structure of the text, which can then be transformed into a formal description (grammar and/or parser program).

I would like the user to be able to "drag and drop" boxes corresponding to ideas like "a list of ..." or "a choice between ...", and to be able to nest them one inside the other as required. e.g.
a number is
a list of one or more things,
where each thing is
a digit character
an identifier is
a letter
character
followed
by
a list of none or more things,
where each thing is
a choice between
a letter character
or
a digit character
It must be possible for the user to experiment with the structure, e.g. by changing one "thing" into "a list of things" or back again. Finally, it should be possible to output the grammar as Regular Expressions or BNF, or even in a form suitable for parsing tools like lex and yacc.

If time allows, the basic concept can be extended in various ways e.g.:

REFERENCES: Some examples: http://www.cs.man.ac.uk/~pjj/projects/grammarGUI.html.
COURSE PREREQUISITES: CS2121 would be a big help, but is not necessary


CLASSIFICATION: Language Processing

SH: Y - F
CM: Y - F
BM: Y - F

Implementing a simple text-processing language, TRAC

Project: 000  Supervisor: PJJ

This project consists of implementing an interpreter for the text-processing language TRAC. As an example of TRAC, the following:

	#(ds,factorial,(#(eq,x,1,1,(#(ml,x,#(cl,factorial,#(su,x,1)))))))'
	#(ss,factorial,x)'
defines the equivalent of a function, factorial, with one parameter, and:
	#(cl,factorial,5)'
invokes it to calculate 5!.

TRAC was originally defined in 1964, with a new version in 1984. Work is currently going on to define the current version of the language (2001) and work towards a new standard (2004). An important first step in the project would be to briefly compare the different versions of the language and pick one of them (or a subset) to implement. As your work progresses, it may be that you become aware of gaps or inconsistencies in the existing definitions, or you may think of extensions, and so you may become involved in this language standardisation and development effort.

REFERENCES:
A short introduction to TRAC: http://www.cs.man.ac.uk/~pjj/projects/trac_doc
The TRAC standardisation effort: http://www.tracfoundation.org/


CLASSIFICATION: Language Processing

SH: Y - LF
CM: Y - F
BM: Y - F

Design and Implementation of a Forth-like language

Project: 000  Supervisor: PJJ

Threaded code is a technique for implementing interpreters for computer languages, and is especially suitable for reverse-polish notation. (It has nothing to do with using "threads" to get the effect of multiple processes running in parallel.) This project consists of designing and implementing a threaded code language and system based on the paper reference, available from me.

Initially you can implement any sensible subset of 'Generalisations' 1 to 11 and any 'Advanced Concepts', including those from the bibliography, such as 'defining words'. It would be reasonable to aim initially at about generalisation 8 but to design your program to be easily extended. You will have to select your own set of primitive commands, including integer arithmetic and input-output. You should expect to enhance your language by implement new facilities such as virtual memory, filestore, parallelism, data structures, or any other features to be found in computer languages.

REFERENCES:
What is threaded code: http://www.complang.tuwien.ac.at/forth/threaded-code.html
Kogge P.M. An Architectural Trail to Threaded-Code Systems IEEE computer March 1982 pp22-31


CLASSIFICATION: Information Systems

SH: Y - F
CM: Y - F
BM: Y - F

Exam Mark Input and Tabulation

Project: 000  Supervisor: PJJ

Several times each year, staff of this department have to input the marks from exams and labs and generate and tabulate the overall results for various groups of students. The results have to be generated in several different formats, depending on how they are to be used, but similar information is present in each format. This project should use existing tools (e.g. a database and a form/query front-end) to create a user-friendly system to help with this task.

Staff need to be able to:

Furthermore, users need to be able to work with incomplete marks, adding to them and changing them over many weeks.

The expectation is that you would use software available in the department, such as MS Office, or the equivalent under Linux.


CLASSIFICATION: Information Systems

SH: Y - F
CM: Y - F
BM: Y - F

A Timetabling Assistant

Project: 000  Supervisor: PJJ

Timetabling for this department is very tedious and currently performed with the minimum of computer assistance. This project should use existing tools (e.g. a database and a form/query front-end) to create a user-friendly system for exploring and controlling the interaction of: Student groups, Activities, Staff, Rooms and Time-slots.

Different users will have different requirements; a few people need to create the timetable, but any staff or students might need to access the information in it, for example to see their personal timetable. Users need to be able to:

Furthermore, users need to be able to work with an incomplete timetable, adding to it and changing it over many weeks.

The expectation is that you would use software available in the department, such as MS Office, or the equivalent under Linux.


CLASSIFICATION: Information Systems

SH: Y - F
CM: Y - F
BM: Y - F

Recording and analysing canvas results

Project: 000  Supervisor: PJJ

A well-run political campaign needs to canvass and record the opinions of the electorate. This involves putting the electoral roll onto a data-base, printing lists of the people eligible to vote in each road in a political district, asking those people how they intend to vote, and putting the results back into the data-base.

We need to be able to add extra pieces of information, such as who is displaying posters for the different parties, who the members of our party are, who wants a lift on polling day and when, and how to get access to different buildings and where the letter-boxes are to be found. We need to be able to process information about phone numbers as well as addresses, about households as well as individuals, about buildings as well as households within the same building, and about non-voters as well as voters. We need to be able to compare this roll with previous ones to identify new voters, calculate on which topics and in which physical areas it would be best to concentrate the candidate's efforts, and many other activities.

This project is about creating a suitable database, including a user interface that supports the various data-entry tasks and the extraction of useful information from the raw data. The expectation is that you would use software available in the department, such as MS Office, or the equivalent under Linux.


Template - ignore

CLASSIFICATION: goes here (see below)

SH: Y/N - and if Y, the grading (see below)
CM: Y/N - and if Y, the grading
BM: Y/N - and if Y, the grading

title goes here

Project: 000  Supervisor: PJJ

Project description goes here. Must be in HTML format. E.g. paragraphs must be separated by

REFERENCES: delete if none.
COURSE PREREQUISITES: delete if none.
EQUIPMENT: delete if none.