The ComparaGrid framework relies on Maven 2 for much of the build work required to construct a publisher. If you don't already have it, you should first download and install Maven (2.0.4 onwards recommended) by following the instructions here.
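Before going further, it can be worth confirming that the Maven on your PATH meets the recommended version. The following is a minimal sketch rather than part of the ComparaGrid tooling: the helper name version_at_least is our own, and the parsing assumes that the first line of "mvn -v" contains the word "Maven" followed by a dotted version number.

```shell
#!/bin/sh
# Succeeds if dotted version $1 is at least dotted version $2.
# Assumption: each of the first three fields is a plain number.
version_at_least() {
    for field in 1 2 3; do
        # Pad with dots so that missing fields come back empty.
        have=$(printf '%s..\n' "$1" | cut -d. -f"$field")
        want=$(printf '%s..\n' "$2" | cut -d. -f"$field")
        have=${have:-0}
        want=${want:-0}
        if [ "$have" -gt "$want" ]; then return 0; fi
        if [ "$have" -lt "$want" ]; then return 1; fi
    done
    return 0
}

if command -v mvn >/dev/null 2>&1; then
    # Assumption: 'mvn -v' prints "Maven" followed by a version number.
    found=$(mvn -v 2>/dev/null | sed -n 's/.*Maven[^0-9]*\([0-9][0-9.]*\).*/\1/p' | head -n 1)
    if [ -n "$found" ] && version_at_least "$found" 2.0.4; then
        echo "Found Maven $found"
    else
        echo "Could not confirm Maven 2.0.4 or later (detected: ${found:-unknown})"
    fi
else
    echo "Maven was not found on the PATH"
fi
```

If the script reports that Maven is missing or too old, install or upgrade it before continuing.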
Once Maven is installed, we need to prepare the components ComparaGrid will use to build the DataPublisher creation tools.
Download this jar to a clean directory.
Unpack datapublisher-0.3.3.jar to a convenient directory by running:
jar xvf datapublisher-0.3.3.jar
After unpacking the jar, you should see a new directory, ./datapublisher, which contains our build script build.sh. You may need to check the permissions on this file, or modify them using:
chmod 755 build.sh
You should then be able to run the build.sh script, passing in the location of your Maven 2 settings.xml file (probably either ${user.home}/.m2/settings.xml or $M2_HOME/conf/settings.xml) as a parameter. So, for example:
./build.sh ~/.m2/settings.xml
If you wish, you can skip this script and do this yourself. For information on how to do this, see the advanced documentation.
The script provided will modify your settings.xml so that Maven can use ComparaGrid plugins, and will then build the ComparaGrid plugins we need. You should see Maven downloading required dependencies, and finally you should get a "Build successful" message.
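If you are unsure which settings.xml to pass to build.sh, the two candidate locations mentioned above can be checked in order. This is only a sketch: the function name find_maven_settings is hypothetical, and it assumes that the per-user file should be preferred over the one in the Maven installation.

```shell
#!/bin/sh
# Print the first settings.xml that exists, trying the per-user file
# first and then the one under $M2_HOME. Fails if neither is present.
find_maven_settings() {
    if [ -f "$HOME/.m2/settings.xml" ]; then
        printf '%s\n' "$HOME/.m2/settings.xml"
    elif [ -n "${M2_HOME:-}" ] && [ -f "$M2_HOME/conf/settings.xml" ]; then
        printf '%s\n' "$M2_HOME/conf/settings.xml"
    else
        return 1
    fi
}

if settings=$(find_maven_settings); then
    echo "Using settings file: $settings"
else
    echo "No settings.xml found in ~/.m2 or \$M2_HOME/conf" >&2
fi
```

The printed path is the value to pass as the parameter to ./build.sh.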
Our setup is now complete. We should be able to create a DataPublisher for a relational database using a command which will look something like:
mvn publisher:create \
  -DgroupId=my.group \
  -DartifactId=my-publisher \
  -Dversion=1.0 \
  -DdatasourceUrl=jdbc:mysql://my.host:3306/database \
  -Ddbms=mysql/postgresql \
  (-Duser=username) \
  (-Dpassword=password)
For example, the following command will build a DataPublisher for the Human Ensembl database:
Cut and paste here:
You should replace the -DgroupId and -DartifactId with something that will identify your DataPublisher sensibly. The -DdatasourceUrl should point to the URL of the relational datasource you want to publish, and the -Ddbms is a string that tells the publisher generator which datasource management system is to be used. Currently, we support MySQL and Postgres interfaces (use the string mysql or postgresql as appropriate). The -Duser and -Dpassword arguments are optional - if specified, the generator will use these arguments when connecting to the database, and if they are not specified the generator will assume it can connect to the datasource using the username "anonymous" and an empty password field.
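To make the role of each argument concrete, here is a small sketch that assembles the command from its parts and prints it rather than running it. The function name build_create_command is our own invention, and the values in the example call are the placeholders used above, not a real datasource. The sketch assumes that none of the values contain spaces.

```shell
#!/bin/sh
# Print an 'mvn publisher:create' command line built from the arguments.
# Arguments: groupId artifactId version datasourceUrl dbms [user] [password]
build_create_command() {
    group_id=$1
    artifact_id=$2
    version=$3
    url=$4
    dbms=$5
    user=${6:-}
    password=${7:-}

    # Only the two dbms strings named in the text are accepted.
    case $dbms in
        mysql|postgresql) ;;
        *) echo "unsupported dbms: $dbms" >&2; return 1 ;;
    esac

    cmd="mvn publisher:create -DgroupId=$group_id -DartifactId=$artifact_id"
    cmd="$cmd -Dversion=$version -DdatasourceUrl=$url -Ddbms=$dbms"

    # -Duser and -Dpassword are optional, so add them only when given.
    if [ -n "$user" ]; then cmd="$cmd -Duser=$user"; fi
    if [ -n "$password" ]; then cmd="$cmd -Dpassword=$password"; fi

    printf '%s\n' "$cmd"
}

# Placeholder values only; substitute your own before running the result.
build_create_command my.group my-publisher 1.0 \
    jdbc:mysql://my.host:3306/database mysql
```

Printing the command first gives you a chance to check the datasource URL and dbms string before Maven starts downloading dependencies.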
When you run a command appropriate to your datasource, you should see Maven downloading all the jar dependencies required to construct a skeleton DataPublisher.
Once you have generated the skeleton publisher framework, you should see a new directory "./ensembl-publisher" or whatever you used for the artifactId.
Now, change to the artifact directory ("./ensembl-publisher") and run the command:
mvn install
Maven may download further dependencies, and will then construct the DataPublisher for you. Once this has completed, change to the ./publisher/target directory, where you should see an "ensembl-publisher.war" file (or, again, whatever you used for your own artifactId). This is a war file that contains everything required for a DataPublisher against your datasource. Drop this into a servlet container (we've tested it using several versions of Tomcat) and you should be ready to go!
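The final copy step can be scripted if you deploy often. The sketch below is an illustration only: deploy_war is a hypothetical helper, it is meant to be run from the directory that contains your artifact directory, and it assumes a servlet container that picks up war files from a deployment directory (for Tomcat, typically $CATALINA_HOME/webapps).

```shell
#!/bin/sh
# Copy <artifactId>/publisher/target/<artifactId>.war into a servlet
# container's deployment directory.
# Arguments: artifactId deployment-directory
deploy_war() {
    artifact_id=$1
    webapps_dir=$2
    war="$artifact_id/publisher/target/$artifact_id.war"

    if [ ! -f "$war" ]; then
        echo "war file not found: $war (did 'mvn install' succeed?)" >&2
        return 1
    fi
    if [ ! -d "$webapps_dir" ]; then
        echo "deployment directory not found: $webapps_dir" >&2
        return 1
    fi
    cp "$war" "$webapps_dir/"
}
```

For example, with the function loaded into your shell, running deploy_war ensembl-publisher "$CATALINA_HOME/webapps" would copy ensembl-publisher.war into a Tomcat installation.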
This process will create a simple DataPublisher, exposing a single datasource. It uses some default parameters for accessing the datasource. For a more detailed discussion of additional configuration, such as accessing a single database schema, see the advanced documentation.
What Next?
This process has supplied you with a DataPublisher, which is a webservice capable of publishing semantically marked up versions of the data contained within a specific datasource. However, it is important to remember that the semantics of the published data are tied to the semantics of the underlying datasource, so the OWL data obtained from the DataPublisher will not generally be informative to the wider community. To tackle this problem, ComparaGrid provides integration tools so that OWL individuals obtained from a DataPublisher webservice can be mapped into another ontology that is fully descriptive. The Runcible tool suite performs this function, and a GUI is available as a Protege 4 plugin which can be used to edit the mapping rules.
