This page contains descriptions of specific PhD projects I am interested in supervising.
If you are interested in doing a PhD with me and you are interested in a project described below, feel free to contact me.
Note, however, that if you are not interested in any project described below, you should not take that to mean that I am not interested in supervising you. Please check my research interests and, if the topic you're interested in is listed there, feel free to contact me all the same.
You should also have a look at the more general information about doing a PhD in the School of Computer Science.
The combination of advances in miniaturization, universal networks and embedded systems has brought us closer than ever to truly ubiquitous, pervasive systems. In the near future, many orders of magnitude more data will be available from, into and about smart devices that, on the one hand, link directly and explicitly to the environment they are immersed in and, on the other, are simply additional nodes in the distributed computations that shape our experience of the world. The first manifestations of these trends are now approaching maturity. Of these, the most significant are wireless sensor networks (WSNs). WSNs are evolving and blending with other devices into an Internet of Things (IoT) [http://www.internet-of-things.eu/].
The broad PhD topics below aim to explore the question as to whether a data-centric, query-based approach to the IoT would be both effective and efficient in enabling 10^6 applications to coordinate and cooperate with 10^10 devices.
The Broad Vision
The query processing paradigm (i.e., the idea that one expresses requirements in a declarative language and delegates to a compiler/optimizer the task of mapping expressions in the former into executable code) has been remarkably successful over the last four decades. One reason behind this success is the fact that the query processing paradigm accommodates many extensions. For example, it can encompass continuous streams and reactive triggers for event processing; it can model spatiotemporal properties and hence be grounded on the physical world; it can enforce quality-of-service (QoS) and quality-of-data expectations; it can seamlessly model both data and metadata; and it elegantly ranges over discovery, binding, integration, retrieval and analysis.
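To make the paradigm concrete, here is a minimal sketch of the separation it rests on: the user states what is wanted, and an optimizer chooses how to evaluate it. All names, operators and cost figures below are invented for illustration; they do not correspond to any real engine.

```python
# A minimal sketch of the query-processing paradigm: the user states WHAT
# is wanted; a tiny "optimizer" decides HOW to evaluate it. All names and
# cost estimates here are illustrative assumptions, not any real engine.

rows = [
    {"node": 1, "temp": 21.5}, {"node": 2, "temp": 30.2},
    {"node": 3, "temp": 19.8}, {"node": 4, "temp": 33.1},
]

# Declarative request: temperatures above a threshold (the WHAT).
query = {"select": "temp", "where": lambda r: r["temp"] > 25.0}

# Two candidate physical plans (the HOW), equivalent in result.
def scan_then_filter(data, q):
    # Ship all tuples to a central site, then filter there.
    return [r[q["select"]] for r in data if q["where"](r)]

def filter_at_source(data, q):
    # In a WSN this predicate would be pushed to the sensing nodes,
    # saving radio traffic; here we just simulate the same result.
    return [r[q["select"]] for r in data if q["where"](r)]

plans = [(scan_then_filter, 10.0), (filter_at_source, 2.5)]  # (plan, est. cost)
best_plan, _ = min(plans, key=lambda p: p[1])  # optimizer: pick the cheapest
print(best_plan(rows, query))  # -> [30.2, 33.1]
```

The point of the sketch is that the declarative request never changes while the chosen plan can, which is exactly what lets the extensions listed above (streams, QoS, spatiotemporal grounding) be folded in at the optimizer level.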
In our data-centric vision for the IoT, as well as a Web of Things (WoT) layered above it, devices would run embedded server and client apps (e.g., a web server, a web browser, SQL/noSQL/SPARQL endpoints, etc.). Devices would simply be nodes in a RESTful architecture: what flies around are queries and query results.
Relevant Contributions from Manchester
In the WSN case, prior work had been overcautious and lacked ambition. The Information Management Group (IMG) took up this challenge and over the last six years has designed and built the most expressive and functionally complete sensor network query processing platform in existence. It enables zero-programming interoperability and over-the-air deployment and repurposing; it provides mechanisms for the seamless integration of pulled and pushed streams with stored extents; it supports QoS-driven optimization as well as in- and off-network execution; and it integrates with service-based architectures.
The academics involved are Norman W Paton and Alvaro A A Fernandes. The work has so far resulted in 4 PhD and 6 MSc dissertations over the course of two three-year research grants from the UK and the EU. The publication record consists of some 12 publications so far [see here].
Research Challenges
Four broad research challenges can serve as a source of PhD projects (possibly jointly supervised with Prof Norman Paton) in this broad area:
Application Development and Deployment in a Data-Centric Web of Things
How to design an application development and deployment model that does justice to the openness and diversity of a Web of Things (WoT)? (E.g., does it look like apps on fridges and heaters and punnets and shelves and lorries?) Projects stemming from this broad challenge would involve studying, designing, implementing and evaluating languages, protocols and virtual machines in a WoT software/protocol/hardware stack, with a view to making a case for a WoT environment over which a varied ecology of effective and efficient applications can emerge.
A Data-Centric Web of Things as a Cloud of Smart Devices
How to make the most of the opportunities of scale in an Internet of Things (IoT)? (E.g., does it look like a cloud where nodes can be tiny in size but are available in billions?) Projects stemming from this broad challenge would focus on a different conception of cloud computing from the current one (which is already boring and stale). In this new conception, the WoT is a cloud of spatio-temporally structured smart devices whose combined storage and processing power is kilo- or megascale in isolation but exascale when connected. Such projects would involve studying, designing, implementing and evaluating a cloud computer realized in software over smart device hardware (e.g., smartphones, wireless sensing devices, smart appliances, urban furniture, etc.). It could be, e.g., an amorphous computer, but with self-* properties and bound by QoS constraints. Or it could use Berkeley-style distributed logic.
Approximate Querying in a Data-Centric Web of Things
In a WoT/IoT, best effort is the best one can aspire to. What approximations of soundness and completeness of query results are usable (e.g., rigorously and efficiently computable) and useful (e.g., precise, fresh, intelligible and comprehensive) enough that they have a fighting chance of being actually used? In a planetary-scale distributed platform running 10^6 applications over 10^10 devices, stuff happens. Projects stemming from this broad challenge would involve studying, designing, implementing and evaluating one or more (existing or new) query languages (with their associated compiler/optimizer/evaluator) whose best-effort semantics is clear and well-founded. For example, the Brewer conjecture (later proved by Gilbert and Lynch) on the relationship between consistency, availability and partition tolerance has clarified fundamental impediments to what cloud databases that lie behind web services can deliver. In the same vein, we would like to be clear as to the fundamental limits to accuracy and precision in the answers to queries that run over a WoT.
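One simple form such a best-effort semantics could take is an answer that carries explicit completeness metadata rather than silently pretending to be exact. The sketch below illustrates the idea; the node readings and failure pattern are invented, and the coverage measure is just one of many possible quality annotations.

```python
# A sketch of best-effort query semantics: some nodes are unreachable
# ("stuff happens"), so the answer carries explicit completeness metadata.
# Node data and the failure pattern are invented for illustration.

readings = {1: 20.0, 2: 21.0, 3: None, 4: 23.0, 5: None}  # None = no reply

def approximate_mean(node_readings):
    """Return (estimate, coverage): the mean over responding nodes plus
    the fraction of nodes that actually contributed to the answer."""
    replies = [v for v in node_readings.values() if v is not None]
    if not replies:
        return None, 0.0
    coverage = len(replies) / len(node_readings)
    return sum(replies) / len(replies), coverage

estimate, coverage = approximate_mean(readings)
print(f"mean = {estimate:.1f} from {coverage:.0%} of nodes")
```

A well-founded best-effort language would go much further (e.g., bounding the error the missing nodes could introduce), but even this minimal annotation makes the approximation visible to the application instead of hiding it.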
Smart Glue for a Data-Centric Web of Things
How does the WoT become both invaluable and invisible to individuals and organizations? (In other words, how do we achieve seamless integration, painless plug-in and plug-out, etc.?) Projects stemming from this broad challenge would involve studying, designing, implementing and evaluating interfacing mechanisms between application-level layers (both in resource- and service-oriented architectures) and the network (and, consequently, computing) fabric of a WoT/IoT. The focus here would be on component-level reasoning, interaction patterns, knowledge-driven coupling and composition, etc.
The Information Management Group at the School of Computer Science of The University of Manchester has been developing, over the course of two three-year research projects, the technology to express the data retrieval needs of wireless sensor network (WSN) applications as queries.
This work has led to the design and development of a declarative query language, called SNEEql, that integrates continuous and one-off queries over stored and streamed extents. SNEEql queries can be evaluated in centralized fashion (but with remote resource access) over robust networks, or in distributed fashion over sensor networks, robust networks, or both. The compiler/optimizer for SNEEql is referred to as SNEE and is distinctive in being quality-of-service (QoS) aware, i.e., a query can give rise to different evaluation plans depending on the QoS expectations which are specified for its evaluation.
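To convey what a continuous query over a streamed extent computes, here is a generic sliding-window aggregate. This is not SNEEql syntax or the SNEE evaluator, just an illustrative model of windowed stream evaluation, with made-up readings.

```python
from collections import deque

# A generic sliding-window aggregate: one output tuple per input tuple,
# as in a continuous query over a stream. NOT SNEEql/SNEE -- just an
# illustrative model, with invented temperature readings.

def windowed_avg(stream, window_size):
    """Yield the average of the last `window_size` readings as each new
    reading arrives."""
    window = deque(maxlen=window_size)  # old readings fall off the back
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

temps = [20.0, 22.0, 24.0, 26.0]
print(list(windowed_avg(temps, window_size=2)))  # -> [20.0, 21.0, 23.0, 25.0]
```

A one-off query over a stored extent would, by contrast, run once over a fixed extent and terminate; the distinctive point of a language like SNEEql is that both styles are expressed in the same declarative framework.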
This PhD project aims to investigate the problem of designing a physical deployment given a SNEEql query workload and the corresponding QoS expectations. To see what, in broad terms, this research opportunity would be, consider the fact that the physical deployment of a WSN is, at the moment, assumed by SNEE to be a given. Roughly speaking, one needs to assume that an environmental scientist, say, would have pondered the various dimensions of the problem of designing a physical deployment that would be most conducive to eliciting the observations and measurements that s/he is interested in.
However, this is, quite obviously, a complex decision-making task. Specifically, and just as an illustration, one would need an understanding of the terrain. For example, are there occluding geographic features to circumvent (say, a hill, or a building) or to bridge across (say, a river, a lake, or a gorge)? Also, what kind of spatio-temporal distribution of observations does one require (e.g., close in space and distant in time, or vice-versa, or close in both space and time, or, conversely, distant in both)? One would also need an understanding of how to use candidate sensor network hardware platforms. For example, does one use mote-level hardware only? Does one use more robust gear, such as data loggers, or relay stations capable of using mobile telephony infrastructure?
In other words, one can see that deciding on the design of a physical network is a specialized (and hence costly) task. Automating it poses an interesting, challenging scientific problem. In its full generality, it is quite likely too complex a problem for a single PhD student to tackle. In this light, one would remove some of the hardness by assuming that the design of the network benefits from access to detailed, formalized knowledge about the application workload (and the associated QoS expectations) that the physical network to be designed needs to execute. This detailed, formalized knowledge is precisely what is currently provided by the SNEE software infrastructure that has been developed in Manchester.
Thus, the PhD project would, broadly speaking, aim to investigate the question as to what is the best physical network deployment to meet a given application workload (specified, roughly, in terms of a set of queries, in some suitable representation, and a set of QoS expectations). To achieve this aim, one would have, among others, such objectives as developing functional and non-functional models of wireless sensor networks as query execution platforms, generating deployment alternatives that satisfy the physical constraints and are realizable using known hardware platforms, and devising techniques to match an application workload to alternative deployments in order to find an optimal one. By optimal here is meant a deployment that (1) is realizable by available hardware in the intended deployment site, (2) meets the functional and non-functional requirements of the application workload, and (3) is the least costly, or the one that produces the most data for longest, etc.
Concretely, the expectation would be that the PhD student, after the normal initial learning period, would spend substantial time devising some (analytical, mathematical) models that describe how a physical network would perform for a given workload and then engage in experimentation by coding this acquired knowledge in the form of simulations in an environment like MATLAB, for example. Once the models have been devised and empirically verified, they would be ready for use in driving a constrained optimization task. One would cast the models, the workload defined by the queries and the associated QoS expectations as constraints and then, given an optimization goal, use (potentially off-the-shelf) optimizers to search the space of possible deployments to find the one that best meets the given goal.
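The shape of that constrained optimization can be illustrated with a deliberately tiny sketch: enumerate candidate node placements, discard those that violate the workload's functional constraint, and pick the cheapest survivor. The sites, costs and coverage model below are all invented; a real formulation would use the empirically verified performance models and QoS constraints described above.

```python
from itertools import combinations

# A toy version of the deployment-design search: enumerate candidate node
# placements, keep those satisfying the workload's coverage constraint,
# and pick the least costly. Sites, costs and the coverage model are all
# invented for illustration.

sites = {"A": 100, "B": 60, "C": 80, "D": 40}            # site -> hardware cost
covers = {"A": {1, 2}, "B": {2}, "C": {1, 3}, "D": {3}}  # site -> regions sensed
required = {1, 2, 3}                                     # regions the workload needs

def best_deployment(max_nodes=3):
    """Exhaustively search placements of up to max_nodes nodes and return
    the (cost, sites) pair of the cheapest feasible one, or None."""
    feasible = []
    for k in range(1, max_nodes + 1):
        for combo in combinations(sites, k):
            covered = set().union(*(covers[s] for s in combo))
            if required <= covered:                      # functional constraint met
                feasible.append((sum(sites[s] for s in combo), combo))
    return min(feasible) if feasible else None           # least-cost feasible design

cost, chosen = best_deployment()
print(cost, sorted(chosen))
```

Exhaustive enumeration is only viable at toy scale; for realistic site counts the same formulation would be handed to an off-the-shelf constraint or integer-programming solver, as the paragraph above suggests.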
The University of Manchester is not responsible for the content of this page.