WWW2003 Dev Day Proposal

 

Nobody Said It Would Be Easy. Semantically Discovering (Bio)Grid Services Is Tricky

 

Carole Goble, Chris Wroe and Phil Lord

The myGrid project at the Information Management Group, University of Manchester, UK

carole@cs.man.ac.uk

 

myGrid is a UK e-Science project led by the University of Manchester, involving four other UK universities, the European Bioinformatics Institute and many industrial collaborators. The project aims to exploit the growing interest in Grid technology, with an emphasis on the Information Grid, and to provide middleware layers that make it suitable for the immediate needs of bioinformatics. Specifically, myGrid is building high-level services for data and application resource integration, such as resource discovery, workflow enactment and distributed query processing. This middleware is semantically enabled, making extensive use of metadata (expressed in RDF and XML) and ontologies (represented using OWL) drawn from the Semantic Web community. This makes myGrid one of the early Semantic Grid projects.

 

One focus of myGrid is the discovery of services by their semantic descriptions. Initially we adopted the “traditional” approach followed by the DAML-S programme: concentrating on the expressivity and complexity of the semantic descriptions, and simply attaching descriptions to instances of services. This turns out not to work in practice. Having examined the complexities of real, currently available bioinformatics services, we conclude that this simple approach will also fail for OGSA-compliant Grid services, primarily because of the complexities of the service invocation models. Moreover, in myGrid we use a workflow enactment engine to compose and orchestrate services, discovering and substituting service instances as the workflow enacts, which further complicates the picture. Finally, we frequently need to describe a service at multiple levels of abstraction - class level as well as instance level - in order to formulate abstract execution templates that are not directly invocable until they are enacted by the workflow engine.
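
The distinction between class-level and instance-level descriptions can be illustrated with a minimal sketch. It is not myGrid code: every name in it (ServiceClass, ServiceInstance, resolve, the example.org endpoint and the example class names) is hypothetical, and a simple walk up a parent chain stands in for the ontology reasoning that OWL descriptions would require. It shows only how an abstract template that names classes of service might be bound to concrete instances at enactment time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServiceClass:
    """Class-level description: the kind of task a service performs (hypothetical)."""
    name: str
    parent: Optional[str] = None  # name of the more general class, if any


@dataclass(frozen=True)
class ServiceInstance:
    """Instance-level description: a concrete, invocable endpoint (hypothetical)."""
    endpoint: str
    service_class: str


def is_subclass(classes: Dict[str, ServiceClass], child: str, ancestor: str) -> bool:
    """Return True if `child` is `ancestor` or one of its descendants."""
    current: Optional[str] = child
    while current is not None:
        if current == ancestor:
            return True
        node = classes.get(current)
        current = node.parent if node is not None else None
    return False


def resolve(template: List[str],
            classes: Dict[str, ServiceClass],
            registry: List[ServiceInstance]) -> List[ServiceInstance]:
    """Bind each abstract step (a class name) to the first matching instance."""
    bound = []
    for step in template:
        match = next(
            (inst for inst in registry
             if is_subclass(classes, inst.service_class, step)),
            None,
        )
        if match is None:
            raise LookupError(f"no service instance available for {step!r}")
        bound.append(match)
    return bound


# Illustrative data only; these are assumptions, not entries from a real registry.
CLASSES = {
    "SequenceAlignment": ServiceClass("SequenceAlignment"),
    "BLAST": ServiceClass("BLAST", parent="SequenceAlignment"),
}
REGISTRY = [ServiceInstance("http://example.org/blast", "BLAST")]
TEMPLATE = ["SequenceAlignment"]  # abstract: names a class, not an endpoint
```

Calling resolve(TEMPLATE, CLASSES, REGISTRY) binds the abstract "SequenceAlignment" step to the registered BLAST instance. The template itself is not invocable until that binding has happened.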

 

These outcomes have led us to examine more closely the relationship between service and workflow descriptions and the services themselves. This has implications for the frameworks, architectures and ontologies needed for semantic discovery of (Bio)Grid services.