Integrating genomic data across species boundaries is critical to exploiting previous investment in this area. Systematic attempts to do this have so far carried a single-species focus, e.g. annotating the genome of one species using functional data from a second. Because many different views could be applied to the combined data set, a generalised 'warehousing' approach will not succeed.
We have developed a prototype system to capture the details of relationships between genomic data, either within or across species, in a way that will enable complex ad-hoc queries to be run. The aim is to demonstrate that the underlying raw data can be combined to the maximum benefit of all genomic communities.
Objectives:
- To define controlled vocabularies describing:
  - Evolutionary relationships
  - Containment relationships
  - Nomenclature relationships relevant to comparative genomics
- To implement a Web/GRID middleware layer that will support operations over the wrapped databases including the integration of data by reference to controlled vocabularies
- To demonstrate practical applications based on those web services:
  - To address biologically-relevant questions e.g. to assist in identifying candidate genes underlying QTL in farm animals or crop plant species
  - To use existing comparative genomics knowledge to infer further comparative observations and stimulate hypothesis-driven experiments
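As an illustration only, the sketch below shows one way relationships typed by a controlled vocabulary could support an ad-hoc query such as "which genes lie within a QTL, and what are their orthologues in other species?". It is not the ComparaGRID implementation: the vocabulary terms (`orthologous_to`, `contained_in`, `synonym_of`), the `Relationship` class, the `candidate_genes` function and every identifier in the sample data are hypothetical placeholders, and orthology is treated as a one-directional link for simplicity.

```python
from dataclasses import dataclass

# Hypothetical controlled-vocabulary terms, grouped by the three relationship
# families named in the objectives. The project's real vocabularies may differ.
VOCABULARY = {
    "evolutionary": {"orthologous_to"},
    "containment": {"contained_in"},
    "nomenclature": {"synonym_of"},
}


@dataclass(frozen=True)
class Relationship:
    subject: str    # placeholder identifier, e.g. "cattle:geneA"
    predicate: str  # must be one of the terms in VOCABULARY
    obj: str        # placeholder identifier of the related item


def family_of(predicate):
    """Return the relationship family for a term, or raise if it is unknown."""
    for family, terms in VOCABULARY.items():
        if predicate in terms:
            return family
    raise ValueError(f"term not in controlled vocabulary: {predicate}")


def candidate_genes(qtl, relationships):
    """Map each gene contained in `qtl` to its recorded orthologues."""
    # Reject any relationship whose predicate is outside the vocabulary.
    for rel in relationships:
        family_of(rel.predicate)

    genes_in_qtl = sorted(
        rel.subject
        for rel in relationships
        if rel.predicate == "contained_in" and rel.obj == qtl
    )
    return {
        gene: sorted(
            rel.obj
            for rel in relationships
            if rel.predicate == "orthologous_to" and rel.subject == gene
        )
        for gene in genes_in_qtl
    }


# Made-up sample data; none of these identifiers refer to real genes or QTL.
RELATIONSHIPS = [
    Relationship("cattle:geneA", "contained_in", "cattle:qtl1"),
    Relationship("cattle:geneB", "contained_in", "cattle:qtl1"),
    Relationship("cattle:geneC", "contained_in", "cattle:qtl2"),
    Relationship("cattle:geneA", "orthologous_to", "human:geneX"),
    Relationship("cattle:geneA", "synonym_of", "cattle:geneA_alias"),
]

if __name__ == "__main__":
    print(candidate_genes("cattle:qtl1", RELATIONSHIPS))
```

Keeping the relationship type as a vocabulary term, rather than hard-coding it into the data source, is what would let the same query run over data drawn from different species databases.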
Please visit the Documentation pages for more in-depth information regarding the ComparaGRID framework.
The ComparaGRID project is sponsored by the Biotechnology and Biological Sciences Research Council.

