Semantic Matching of Bioinformatic Web Services

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences by Duncan Hull, School of Computer Science, September 2007. Successfully defended May 2008.


Understanding bioinformatic data on the Web often requires the interoperation of heterogeneous and autonomous services. Unfortunately, getting many different services to interoperate is problematic, and frequently requires cumbersome “shim” components which can be difficult to describe and discover using existing techniques.

The use of description logic reasoning has been proposed as a method for improving discovery of services, by classifying advertisements and matchmaking them with requests on the semantic Web. However, theoretical approaches to reasoning with semantic Web services have not been adequately tested on realistic scenarios, while practical approaches have not fully investigated or applied useful aspects of current theory.

This thesis investigates the use of description logic for describing Web services semantically and discovering them using reasoning. Three main contributions are made. Firstly, we provide a partial semantic classification of shim services using the Web Ontology Language (OWL). Secondly, we show where existing theories are insufficient for discovering this class of services and also demonstrate limitations of current practical approaches. Finally, we describe and evaluate a novel technique for supplementing semantic classifications of services with queries. The work combines theoretical approaches, such as WSMO and OWL-S, with practical solutions used in myGrid, Taverna and BioMOBY. The combination improves service discovery by ranking results and matching service requests to advertisements with higher precision and recall than either theoretical or practical techniques currently allow.

The work presented in this thesis should be of interest both to theoreticians of the semantic Web and to practitioners implementing “real-world” systems with bioinformatic and other services on the Web.

More (table of contents, chapters, etc.) will appear here later. Most of the content of this thesis has been published elsewhere in peer-reviewed journals; however, if you would like a copy, I can email you a PDF file.