Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: General Applications
Max number of students who can do this project: 1
Timetabling for this school is very difficult, and it is hard to try out alternative ways of organising the timetable, such as scheduling all activities in the mornings to leave the afternoons free, or changing labs to be 1 hour each week instead of 2 hours once a fortnight, or repeating practical activities every 3 weeks instead of every 2 weeks. However, those are exactly the sorts of changes we want to be able to test as we reorganise our teaching over the next few years.
The problem is that it is unrealistic to try to automatically create genuine timetables, as it would take far too long for a program to explore all the alternatives. However, we have to have some automation to be able to explore the effects of the kinds of changes mentioned above.
One possible approach would be to get the computer to quickly come up with (one or more) outline solutions to simplified problems (e.g. treating a set of course-units as essentially interchangeable, or ignoring the effects of trying to interleave our lectures with those from other schools, or ignoring limitations of room or staff availability) using some simple heuristics (e.g. treat all hours as equivalent and just fill up the timetable from Monday 9am, starting with the biggest course-units in each year).
If an outline solution resulting from a change was not a significant improvement over a previous outline solution then there would be little need to pursue it further. If it passed this first test, then the computer would assist the user to modify it to make it more realistic.
This project is to create a system for rapidly exploring different ways of organising our timetables. It should be possible to create lists of entities (e.g. student groups, activities, staff, rooms and time-slots) and the relationships between them, and ask the system to construct possible outline timetables, with the user able to intervene to help the process along. (It may be sensible to try to re-use existing timetabling software, such as GATT or Neeps & Tatties.)
The user should then be able to take an outline timetable and add details to it to make it more realistic. This may involve: assigning time-slots to activities and being automatically notified of any clashes, asking for a list of possible time-slots and/or rooms for an activity, and obtaining timetables for particular groups of students, staff or rooms.
REFERENCES:
GATT
Neeps & Tatties
COURSE PREREQUISITES: -
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Graphical Systems and Applications
Max number of students who can do this project: 1
The basic idea is that a user would be able to input a regular expression (regexp) as in e.g. egrep, or a set of regexps as in e.g. lex, and be able to see how the regexp(s) matched parts of an actual sample text. The basic idea is that the tool could display the sample text in a window (or even several samples, each in a different window) and draw a box around each chunk of text that matched a regexp (with different colours for different regular expressions).
There are various different situations that might arise that this tool could help with:
COURSE PREREQUISITES: COMP20121 or some similar knowledge of regular expressions would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Teaching Tools
Max number of students who can do this project: 1
We would like to have a simulator for a very simple computer that can be used to support learning about how computers work, in many different teaching contexts and levels, such as university course-units like COMP10031 and COMP10211, A-levels, and GCSEs.
This simulator would be used to introduce students who might have little computing background to the concepts of programs made up from a series of instructions, executed in sequence. Therefore, what is important will be very different from, say, the KoMoDo simulator that we currently use, where it is important to simulate every detail of a very complex processor:
This project is about designing and implementing the simulator. There are various extensions that might be possible, depending on the time available and your interests:
REFERENCES:
COMP10031
COURSE PREREQUISITES: -
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Information Systems
Max number of students who can do this project: 1
A well-run political campaign needs to canvass and record the opinions of the electorate. This involves putting the electoral roll onto a data-base, printing lists of the people eligible to vote in each road in a political district, asking those people how they intend to vote, and putting the results back into the data-base.
We need to be able to add extra pieces of information, such as who is displaying posters for the different parties, who the members of our party are, who wants a lift on polling day and when, and how to get access to different buildings and where the letter-boxes are to be found. We need to be able to process information about phone numbers as well as addresses, about households as well as individuals, about buildings as well as households within the same building, and about non-voters as well as voters. We need to be able to compare this roll with previous ones to identify new voters, calculate on which topics and in which physical areas it would be best to concentrate the candidate's efforts, and many other activities.
This project is about designing and implementing a suitable database and user interface that supports the various data-entry tasks and the extraction of useful information from the raw data.
COURSE PREREQUISITES: -
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
It is easy to write Java programs that make every method and variable "public". It is much harder to have the discipline to make as much as possible "private", to make it easier to modify in the future. This project is about taking a set of Java class sources and modifying them to make as much as possible "private".
A first attempt might just be to go through the sources inserting "private" in front of every declaration, recompiling them, and dealing with the resulting error messages. Better might be to first go through the sources making a list of everything that looks like "name1.name2" using name1 to help work out which class name2 comes from, and then go through all the sources again inserting "private" in front of any names not found in the first step.
A possible extension might be to find instances where variables could not be made private as above, and then insert "getVar" and "setVar" methods so that even those variables could be made private. Another possible extension would be to try to make use of "protected" as well.
It would be reasonable to reuse an existing open-source Java parser, rather than trying to construct one from scratch.
REFERENCES:
Skinny Languages and Fat Tools
COURSE PREREQUISITES: COMP20121 or COMP30142 or some similar knowledge of compilation would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
Most programmers, at some point in their careers, need to implement a textual part of a user interface, or a textual configuration file, or similar. The implication of this is that most people who need to use grammars (i.e. regular expressions and/or BNF) will not be experts in their use. Unfortunately, learning how to make best use of grammars is nearly as hard as learning how to program for the first time, as it requires a different way of thinking about programming tasks.
One technique for resolving this sort of dilemma is to provide a "wizard", which can ask novice users simple questions and act upon the answers. In this instance, we would want to deal with common grammar-related tasks. For example, if a user wanted to recognise some kind of expressions, the wizard could ask for a list of the different operators, and their individual characteristics, such as precedence and associativity, and automatically create the resulting grammar for the user. Similarly, users could be presented with a list of various formats for identifiers and literals, and select from them, or perhaps even be able to slightly modify one of them.
The wizard might initially just deal with some small range of possibilities, but an extension would be to broaden its range. For example, it might start with some very basic questions, such as whether the user wants to be able to deal with data files or scripts or programs. The answer could be used to make some fundamental assumptions, such as whether end-of-line was significant or not, or whether there would be declarations or not.
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Graphical Systems and Applications
Max number of students who can do this project: 1
I have always been interested in the way in which human communities and economies develop, originally because I wanted to create imaginary countries to fight war-games in, but nowadays as a way of understanding history. In the past few years, simulations by social scientists, economists, archaeologists and historians have progressed to the point where this almost seems a feasible, but long-term goal.
This aim of this project is to investigate what has already been done, starting with the references below, and then pick an interesting and relevant (i.e. computer-science) aspect of this work to follow up. (Remember that this is only a third year project, and it is quite reasonable to re-implement an existing piece of work in a better way, e.g. in Java to make it more easily available. Please discuss this with me before selecting this project.)
REFERENCES:
A simple introduction:
Science News
A social-science view:
Seeing Around Corners
Two existing systems:
Swarm
and
Sugarscape
COURSE PREREQUISITES: -
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
Recognising text is a common problem, often best solved by writing a grammar and then using a parser-generator (like lex or yacc) to automatically create a recogniser from the grammar. However, writing the grammar to begin with can be very difficult. Perhaps this process could be simplified by a tool that interacted with the user to derive a grammar from an example piece of text.
A complete grammar for a programming language would probably have to be created in two phases. The first (this project) would be simply to recognise individual "words" such as numbers, names, operators and punctuation for use with e.g. lex. This would then be used to create a BNF linking the words into "sentences" such as expressions, statements and declarations for use with e.g. yacc.
The tool might recognise typical strings, such as numbers consisting of digit characters "0" to "9", and perhaps also using "+" and "-" and ".". The tool might then interact with the user to decide whether these strings really were complete words, or whether we are trying to recognise something more complex e.g. course-unit codes like "COMP30900".
A significant problem will be preventing the tool from over-using an incomplete grammar. For example, if it is told that "for" is the start of an for-statement, it shouldn't try to use the "for" inside "before".
Another problem is that of white space (spaces, tabs, newlines etc.). In some circumstances, such as in typical computer languages like Java and C, white space can usually be discarded, although it may separate words. In other circumstances, such as input to a database, white space may be very important as tabs may be used to separate fields and newlines used to separate records. It may be hard to recognise these fundamentally different situations without significant help from the user.
A simple example is a list of symbols such as:
abcdabcdddabcabcd.
The first step would be to identify boundaries between repeated groups i.e.
abcd abcddd abc abcdd.
The next step would be to distinguish between special cases and
generalisations i.e. do we permit only 0 to 3 instances of "d",
or do we permit any number of instances?
Must the list of groups contain exactly 4 instances of "abcd...", or any
number?
Must the instances be in the order "...d ...ddd ... ...dd" or is any order
possible?
REFERENCES:
Programming By Example
an example of using lex
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
Creating a grammar for a computer language can be quite hard - often, we have a good idea of the sort of program or script we want to be able to write, but much less idea how to represent that in a grammar. The idea would be to create a system that would allow the user to start from such an example text in a window and then interact with the system to draw the "parse tree", and finally derive a grammar from it.
For the first step, for example, one might tell the system that this part of the text is an expression, the next part is a declaration, the next an if statement, and so on. One could then build up (this is a list of statements) and down (this is a sub-expression; this is a number, this is a variable, this is an operator) to create the complete parse tree.
The final step would be to take these parse trees, take each node in
the tree as an example use of a grammar rule, and merge and generalise
these to build up a complete grammar. For example, we might have
examples of expressions containing several different operators, merge
them to allow any of the given operators from the example, and then
generalise to include all the operators we might actually want to use.
We could look at using concepts like Unification to help automate the
merging.
Other facilities might be to provide suggestions or alternatives to choose from, to help the user when creating the parse trees or when converting the parse trees to a grammar (e.g. to help deal with operator precedence).
REFERENCES:
Unification
(Wikipedia)
COURSE PREREQUISITES: COMP20121 or some similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Graphical Systems and Applications
Max number of students who can do this project: 1
When one is working with a non-trivial grammar, e.g. for some programming language, it can be very difficult to understand exactly what each part of the grammar is used for. It would be nice to use a tool that could display the grammar and an example program recognised by the grammar, and then be able to select either a part of a program and have the corresponding part of the grammar highlighted, or select a grammar rule and have corresponding parts of the program text highlighted.
One way of accomplishing this would be to actually run the grammar through a parser-generator such as yacc, and use it to parse the example program, creating a parse-tree as it does so. The tool could then use the parse-tree to decide which parts of the grammar or program text to highlight for the user. (Berkely yacc can be made to automatically output information about a parse, which could be input by the tool and used to create the parse-tree - see reference below.)
The tool could be enhanced to display more detailed information from the parse tree - for example, if the user selected a statement, it might highlight the individual lexemes - names, numbers, etc. - using different colours to indicate the lexeme type, and then draw nested boxes around groups of lexemes to indicate how they are recognised by the grammar rules to form sub-expressions and larger expressions etc.
REFERENCES:
Tracing information from Byacc
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Graphical Systems and Applications
Max number of students who can do this project: 1
You are probably familiar with the idea of Graphical User Interfaces (GUIs) and the kinds of tools that can be used to create them. However, we often need to build a Textual User Interface (TUI) - for example, to recognise an expression to be drawn by a graph-plotting package, or to input a file of data into a statistical package - and although tools exist to help, they are not necessarily easy to use, particularly to begin with. What is needed is a graphical tool, that allows a user to build an informal description of the structure of the text, which can then be transformed into a formal description (grammar and/or parser program).
I would like the user to be able to "drag and drop" boxes corresponding to ideas like "a list of ..." or "a choice between ...", and to be able to nest them one inside the other as required. e.g.
|
| ||||||||||||||||
If time allows, the basic concept can be extended in various ways e.g.:
REFERENCES:
examples
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Graphical Systems and Applications
Max number of students who can do this project: 1
"A Mind Map is a diagram used to represent words, ideas, tasks or other items linked to and arranged radially around a central key word or idea. It is used to generate, visualise, structure and classify ideas, and as an aid in study, organisation, problem solving, and decision making." [Wikipedia] Mind-maps are essentially hierarchical, with a tree-like structure, possibly with a few cross-links between branches. A Concept Map is a similar idea, with a similarly hierarchical structure, but typically using many more cross-links: "A well made concept map grows within a context frame defined by an explicit 'focus question,' while a mind map has branches radiating out from a central picture." [Wikipedia]
Mind maps and Concept maps seem like a good way to write things down, but most implementations seem very rigid: they assume a unique, well-defined starting point and centre for each map, and the hierarchical links are treated very differently from cross-links. This is sensible if you have a clear idea of your starting point, but less so e.g. if you want to explore the relationships between a set of ideas that you don't yet fully understand. I would like to be able to build a less rigid structure with many cross-links, and then have different ways of redrawing the network, for example if I select a new central node, or reposition individual nodes, or if I "zoom in" on a subset of the nodes, with the remainder either hidden or indicated by links going "out of view". It would also be nice to be able to search for nodes or links via their text labels. (You can see something similar in e.g. the "Genealogy of Influence", a visualisation of the connections between influential people in Western culture.)
I also want to be able to manipulate multiple maps. For example, I would like to be able to compare and perhaps merge two maps created by different people but representing similar concepts, or two maps that represent different but related concepts. I would also like to be able to partition maps, for example to give different parts of a map to different people to update, and then merge them back together.
This project consists of designing and implementing such a system.
REFERENCES:
Mind mapping (Wikipedia)
Concept mapping (Wikipedia)
Genealogy of Influence
COURSE PREREQUISITES: -
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
Recognising text is a common problem, often best solved by writing a grammar and then using a parser-generator (like lex or yacc) to automatically create a recogniser from the grammar. However, writing the grammar to begin with can be very difficult. Perhaps this process could be simplified by a tool that interacted with the user to derive a grammar from an example piece of text.
A complete grammar for a programming language would probably have to be created in two phases. The first would be simply to recognise individual "words" such as numbers, names, operators and punctuation for use with e.g. lex. This would then be used to create a BNF (this project) linking the words into "sentences" such as expressions, statements and declarations for use with e.g. yacc. For this project, therefore, we will have to assume that the first phase has already happened, by breaking the given text up into "words" using some simple algorithm.
Perhaps the user would select an example statement, and then interact with the tool to construct a simple grammar rule for it. The tool could then attempt to match other chunks of text with this rule, and the user could agree or disagree, allowing the tool to refine the grammar rule.
A significant problem will be preventing the tool from over-using an incomplete grammar. For example, if it knows how to recognise an expression consisting of names, numbers and operators, it shouldn't assume that the names in a declaration are also expressions.
Another problem is that of dealing with a hierarchical structure, for example
trying to recognise if-statements and assignments. We can assume that
we have a list of simple symbols, such as:
i(n<n){n=n;i(n)n=n;}
("i" stands for "if", "n" for a name, and so on).
We need to identify n=...; as an assignment statement, and then
more complex statement forms, such as the if-statement:
i(n) followed by a statement, and the block:
{ } containing one or more statements, and so on.
REFERENCES:
Programming By Example
an example of using lex + yacc
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Teaching Tools
Max number of students who can do this project: 1
Chart parsing is a general parsing technique that, unlike those used in most parsers, is able to deal with any context-free grammar. For example, it can handle ambiguous grammars quite simply. It pays for this generality by being significantly less efficient. However, it is straightforward to make versions with different parsing strategies, such as top-down or bottom-up. This would make it ideal for use in teaching, to demonstrate these various strategies.
Chart parsing is mainly used for recognising natural languages, such as English, and various toolkits exist for this, such as NLTK. However, facilities that are important for natural languages, such as being able to deal with enormous vocabularies, words with more than one meaning, and imprecise grammars, are less useful for dealing with computer languages. This project consists of creating a chart-parser that is suitable for use e.g. in COMP20121, perhaps by modifying NLTK or some similar package, or more likely by designing and implementing it from scratch.
There are various possible extensions. For example, the tool might have a simple GUI, to allow parse trees to be displayed, or to allow a user to "single step" through a parse. Another example would be to adapt the parser to be able to deal with EBNF as well as BNF.
REFERENCES:
Chart Parsing with NLTK
BNF and EBNF
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -
Project: $DO NOT EDIT THIS FIELD$
Supervisor: Pete Jinks
Difficulty grading: INF=F,
SH=F,
BM=F,
CM=F
Area: Programming Language Processing
Max number of students who can do this project: 1
Most programmers, at some point in their careers, need to implement a textual part of a user interface, or a textual configuration file, or similar. The implication of this is that most people who need to use grammars (i.e. regular expressions and/or BNF) will not be experts in their use. Unfortunately, learning how to make best use of grammars is nearly as hard as learning how to program for the first time, as it requires a different way of thinking about programming tasks.
One way to start could be to construct some example texts (e.g. programs) and
then try to design grammars that matched them, using a parser-generator
(like lex or yacc) to automatically create a recogniser from the grammar.
However, at the moment one has to be able to write a complete grammar at the
first attempt to follow this route. Instead, it would be nice if one could
simply test pieces of grammar - for example
experiment with using if-statements of the form
if ( ... ) ... else ...
where we don't yet want to have to worry about what "..." represents or
about what precedes or follows the if-statement.
Chart parsing is a general parsing technique that, unlike those used in most parsers, is able to deal with any context-free grammar. For example, it can handle ambiguous or incomplete grammars quite simply. It pays for this generality by being significantly less efficient. However, this does not really matter when one is designing and testing grammars, where efficiency is much less important than flexibility e.g. coping with partial matches.
Chart parsing is mainly used for recognising natural languages, such as English, and various toolkits exist for this, such as NLTK. However, facilities that are important for natural languages, such as being able to deal with enormous vocabularies, words with more than one meaning, and imprecise grammars, are less useful for dealing with computer languages. This project consists of creating a chart-parser that is suitable for use e.g. in COMP20121, perhaps by modifying NLTK or some similar package, or more likely by designing and implementing it from scratch.
REFERENCES:
Chart Parsing with NLTK
COURSE PREREQUISITES: COMP20121 or similar knowledge of grammars would help but is not essential
EQUIPMENT: -