Java OWL APIs

Introduction

Lately I have been experimenting with the whole Semantic Web buzz, more specifically with OWL. I have created two small test ontologies to see what can be accomplished and how I have to use them from Java. That led me to compare some Java Semantic Web APIs.

The tests I wrote are mainly JUnit tests, since they are easy to run from my IDE (Eclipse) and report stack traces of exceptions by default, so the code does not need try statements and stack trace prints. For production use, proper exception handling is a necessity.

This is still under construction and will evolve over the next couple of days. Everything stated here reflects my personal opinion. I am no expert on ontologies or the description logic behind them (yet ;) ). Nevertheless, if you see anything suboptimal, leave a comment and I will try to improve it.

Use Case

My use case is fairly simple. Using Java, I want to fill a specific data model (= ontology) with some data. The data model, that is the OWL classes and properties, is considered static: at deploy time and at run time the ontology's class definitions never change. Changing the ontology would be considered a major development cycle.

The program reads some data and creates a bunch of individuals for the ontology at hand. This could be the first step of information collection; the individuals can be consumed later by another application which reasons on them or does something completely different.

In short I am looking for an API that achieves the following:

  1. Hide the complexity of OWL from the average Java programmer (like me) and provide simple usage of Java objects. (I am aware that this means losing the expressivity of certain OWL concepts!)
  2. Be able to save ontologies from the in-memory Java model.
  3. Be reasonably fast at this.

All statements made about the APIs are with respect to their fitness for this particular use case.

"Bucket, Stones and Stuff" Ontology

The first ontology consists of a Bucket with some properties like Engraving and Material, plus a list of the contents of this Bucket. The contents can be either Stones, which have a weight and a date property recording when they were found, or Stuff. Stuff is just a filler to see if the content lists can cope with heterogeneous contents in Java. The ontology is attached to this page: bucket.owl.

The basic Test for this ontology is as follows:

  1. Open the ontology
  2. Create a Bucket and set its properties
  3. Create 2 Stones and set their properties
  4. Put the stones in the bucket (are inverse properties set automatically?)
  5. Create "Stuff"
  6. Put "Stuff" in the bucket (is the range ok? can lists handle more than one type?)
  7. Save the new ontology to a file
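
To make the questions in steps 4 and 6 more concrete, here is a minimal plain-Java sketch of the behaviour I am hoping for. It is not generated by any of the APIs discussed here; all class and method names are hypothetical and only mirror the ontology. Setting the "is in bucket" side updates the inverse "contains" list automatically, and the list accepts both Stones and Stuff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical hand-written model; none of these names come from a generated API.
final class BucketModelSketch {

    /** Anything that may be placed in a bucket (Stone or Stuff). */
    interface Content {
        Bucket getIsInBucket();
        void setIsInBucket(Bucket bucket);
    }

    static final class Bucket {
        private final String name;
        private final List<Content> contains = new ArrayList<>();

        Bucket(String name) { this.name = name; }

        String getName() { return name; }

        /** Read-only view; the list only changes through Content.setIsInBucket. */
        List<Content> getContains() { return Collections.unmodifiableList(contains); }
    }

    /** Keeps the inverse side (Bucket.contains) in sync with the forward side. */
    static abstract class AbstractContent implements Content {
        private Bucket bucket;

        @Override public Bucket getIsInBucket() { return bucket; }

        @Override public void setIsInBucket(Bucket newBucket) {
            if (bucket != null) bucket.contains.remove(this);
            bucket = newBucket;
            if (newBucket != null) newBucket.contains.add(this);
        }
    }

    static final class Stone extends AbstractContent {
        final int weight;
        Stone(int weight) { this.weight = weight; }
    }

    static final class Stuff extends AbstractContent { }

    public static void main(String[] args) {
        Bucket bucket = new Bucket("MyPrecious");
        Stone pebble = new Stone(80);
        Stuff paper = new Stuff();
        pebble.setIsInBucket(bucket);   // inverse side is filled automatically
        paper.setIsInBucket(bucket);    // heterogeneous contents in one list
        System.out.println(bucket.getContains().size()); // prints 2
    }
}
```

Whether a real mapping API behaves like this is exactly what the tests below try to find out.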

For testing the limits of the system I will test the following:

  1. Open the ontology
  2. Create a Bucket and set its properties
  3. Create {a lot} Stones and set their properties
  4. Put the stones in the bucket
  5. Save the new ontology to a file

"Matryoshka" Ontology

I created this ontology to see how deeply individuals can be nested. If you don't know the dolls, take a look at the Wikipedia article Matryoshka Doll and the intention should become clear. The ontology consists of only one class, Matryoshka, with the properties Size and Color plus the object property Contains with its inverse Contained_in (which is obviously transitive). Check out the simple matryoshka.owl file.

Basic test:

  1. Create Matryoshka m1
  2. Set the properties
  3. Create a second Matryoshka m2
  4. Put m2 into m1
  5. Save

To test how deep the nesting can go, the following test is done:

  1. Create the root Matryoshka
  2. Loop
  3. Create a new Matryoshka and put it in the last one
  4. Until {a lot} has been reached
  5. Save

Here 'a lot' can mean anything and is usually defined by a constant. To make it more interesting: you can really break APIs with this.
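
One plausible reason why deep nesting breaks things is recursive processing, which needs one stack frame per nesting level. The following plain-Java sketch illustrates that idea under this assumption; it is not a diagnosis of any particular API. The class `Doll` is a hypothetical stand-in for a Matryoshka individual.

```java
final class NestingSketch {

    /** Hypothetical stand-in for a Matryoshka individual. */
    static final class Doll {
        final int size;
        Doll contains; // at most one directly nested doll

        Doll(int size) { this.size = size; }
    }

    /** Builds a root doll with {@code count} dolls nested inside it, largest first. */
    static Doll buildNested(int count) {
        Doll root = new Doll(count + 1);
        Doll last = root;
        for (int i = count; i > 0; i--) {
            Doll next = new Doll(i);
            last.contains = next;
            last = next;
        }
        return root;
    }

    /** Iterative traversal: works for any chain length. */
    static int depthIterative(Doll root) {
        int depth = 0;
        for (Doll d = root; d != null; d = d.contains) {
            depth++;
        }
        return depth;
    }

    /** Recursive traversal: one stack frame per doll, so deep chains can overflow the stack. */
    static int depthRecursive(Doll d) {
        return d == null ? 0 : 1 + depthRecursive(d.contains);
    }

    public static void main(String[] args) {
        Doll root = buildNested(1_000_000);
        System.out.println(depthIterative(root)); // prints 1000001
        try {
            System.out.println(depthRecursive(root));
        } catch (StackOverflowError e) {
            System.out.println("recursive traversal ran out of stack");
        }
    }
}
```

Whether the recursive call succeeds depends on the JVM's stack size, which is why the constant for 'a lot' has to be tuned per setup.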

Java API Usage

I know that it is not a trivial task to map the OWL semantics (with the Description Logic approach) to Java's frame-based class system. Still, I think that for most applications this approach gives the programmer a lot of benefits. A discussion can also be found in this Paper.

I started looking for tools and APIs for Java that can ease the pain (from a Java programmer's perspective) of working with ontologies, preferably those with Java source code generation capabilities. This is a non-exhaustive list of the APIs I found.

It has to be noted that there are mainly two types of APIs:

  1. Low level APIs that manipulate the ontology directly. They can usually be used to change anything in the model, such as classes, properties and assumptions about them, e.g. which property belongs to which classes, what data type and range a property has, etc.
  2. High level APIs that hide certain parts of the model by transforming them (usually the definition of classes and properties) into a different concept in Java (classes and fields plus getter and setter methods). This prevents the user (or program) from adding new classes and properties to the model at run time, and usually after deployment.
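
The difference between the two types can be sketched in plain Java. This is only an illustration with made-up names (`StatementStore`, `Bucket`), not the API of any library in the list: the Type 1 style works on generic statements, the Type 2 style wraps the same store in a fixed Java type.

```java
import java.util.ArrayList;
import java.util.List;

final class ApiTypesSketch {

    /** Type 1 style: a generic store of (subject, property, value) statements. */
    static final class StatementStore {
        private final List<String[]> statements = new ArrayList<>();

        void add(String subject, String property, String value) {
            statements.add(new String[] { subject, property, value });
        }

        void removeAll(String subject, String property) {
            statements.removeIf(s -> s[0].equals(subject) && s[1].equals(property));
        }

        String firstValue(String subject, String property) {
            for (String[] s : statements) {
                if (s[0].equals(subject) && s[1].equals(property)) {
                    return s[2];
                }
            }
            return null;
        }
    }

    /** Type 2 style: a fixed Java type; its properties are decided at compile time. */
    static final class Bucket {
        private final StatementStore store;
        private final String id;

        Bucket(StatementStore store, String id) {
            this.store = store;
            this.id = id;
            store.add(id, "rdf:type", "Bucket");
        }

        void setMaterial(String material) {
            store.removeAll(id, "Material"); // a setter replaces the old value
            store.add(id, "Material", material);
        }

        String getMaterial() {
            return store.firstValue(id, "Material");
        }
    }

    public static void main(String[] args) {
        StatementStore store = new StatementStore();

        // Type 1: anything can be stated, including brand new properties
        store.add("Pebble", "rdf:type", "Stone");
        store.add("Pebble", "Weight", "80");

        // Type 2: only what the Java type offers
        Bucket bucket = new Bucket(store, "MyPrecious");
        bucket.setMaterial("Copper");
        System.out.println(bucket.getMaterial()); // prints Copper
    }
}
```

The Type 2 wrapper is easier to use but cannot express anything the class author did not foresee, which is the trade-off accepted in the use case above.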

| API-Name        | Type   | Generates Java Classes | Tested   | Version  | License      | Problems / Issues / Remarks |
|-----------------|--------|------------------------|----------|----------|--------------|-----------------------------|
| Jastor          | Type 2 | Yes                    | Tried    |          | CPL          | Unable to run with Jena 2.4 or Jena 2.5 |
| Sommer          | Type 2 | ?                      | Not Yet  |          | BSD          |  |
| Protege OWL-API | 1 & 2  | Yes                    | Yes      | 3.4 beta | MPL          | none |
| RDFReactor      | Type 2 | Yes                    | Yes      | 4.5.2    | BSD / LGPL ? | More research needed to correct OWL output |
| Kazuki          | ?      | ?                      | Not Yet  |          |              |  |
| Sesame and Elmo | ?      | ?                      | Not Yet  |          | BSD          |  |
| OWL API         | Type 1 | No                     | Yes      | 2.1.1    | LGPL         | Since it is a plain low level OWL API, code looks horrible for this purpose |
| Jena            | Type 1 | No                     | Indirect | 2.4      | BSD-Style    | Is used in a lot of different higher level APIs |
| Topaz           | Type 2 | ?                      | No       |          | Apache 2.0   |  |
| Owl2Java        | Type 2 | Yes                    | No       | 1.0      | GPL          | Uses Jena for reading and writing |
| JAOB            | Type 2 | Yes                    | Yes      | 0.1      | LGPL         | My own implementation |

A future mapping API named Tripresso is apparently being planned. It seems that it could be a mix of Elmo, RDF2Java, RDFReactor and Sommer.

The tests (Eclipse project with JUnit) are available via anonymous Subversion.
URL: https://svn.iai.uni-bonn.de/repos/nsdb/malottki/OntologyTests/
User: anonymous
Password is to be left blank.

Protege-API

The Protege tool is pretty much the standard for creating and editing ontologies. The Protege-API is part of the older 3.x branch and is not included in the new 4.0 release (as of now still alpha), so it can be considered somewhat deprecated. It can be used both as a low level API and as a Java class generator; in line with the use case I only look at the class generation features.

Java class generation is straightforward and done in Protege itself with a few clicks (Code -> Generate Protege-OWL Java Code). The generated code is primarily made for Java 1.4, so you will miss typesafe collections. I haven't found any option to change that.

Since Protege-OWL itself uses Jena as its serializer, Jena (and its dependencies) has to be added to the imported libraries. A starting point is the Protege-OWL plugin directory from the Protege package plus the protege.jar and jlooks.jar from the main Protege folder. I find the jlooks.jar dependency particularly ugly, since it is only for Swing GUI stuff and should not be needed if you want to use Protege only as an API. Another point is the inclusion of different loggers (Apache commons-logging and log4j); this could be fixed by using the Simple Logging Facade for Java (SLF4J) and its bridges for commons-logging or log4j.

Let's take a look at the usage of the API itself and how to get the job done:

Bucket Tests

package ontotest.tests;
 
import java.io.File;
import java.util.Calendar;
import java.util.GregorianCalendar;
 
import javax.xml.bind.DatatypeConverter;
 
import org.junit.Before;
import org.junit.Test;
 
import edu.stanford.smi.protegex.owl.ProtegeOWL;
import edu.stanford.smi.protegex.owl.jena.JenaOWLModel;
import edu.stanford.smi.protegex.owl.model.RDFSLiteral;
import ontotest.model1.Bucket;
import ontotest.model1.MyFactory;
import ontotest.model1.Stone;
import ontotest.model1.Stuff;
import ontotest.util.TimerUtil;
 
public class OntologyTest {
 
    /** the maximum Number of Stones created for this test */
    private static final int maxStones = 20000;
 
    /* JenaOWLModel stores the Model at runtime and gives the ability to save etc.*/
    private JenaOWLModel owlModel = null;
 
    /* The Factory creates new Individuals and literals for the Ontology */
    private MyFactory mf = null;
 
    /**
     * Create a correct representation of a date for xsd:dateTime
     * @param cal Calendar to convert; if null, the current time is used
     * @return Literal for RDFS
     */
    public RDFSLiteral getDate(Calendar cal){
        if(cal == null){
            // create new Date
            cal = new GregorianCalendar();
        }    
        // Prints the Correct DateTime Format for XSD based Types
        return owlModel.createRDFSLiteral(DatatypeConverter.printDateTime(cal), owlModel.getRDFSDatatypeByName("xsd:dateTime"));
    }
 
    @Before
    public void setUp() throws Exception{
        // Load the ontology from file 
        File file = new File("bucket.owl");
        owlModel = ProtegeOWL.createJenaOWLModelFromURI(file.toURI().toString());
 
        // create the corresponding Factory
        mf = new MyFactory(owlModel);
    }
 
    @Test
    public void simpleTest() throws Exception{
 
        // Lets have a new Bucket
        // the String passed to the factory states the name of the Individual
        Bucket bucket = mf.createBucket("MyPrecious");
 
        // set the properties
        bucket.setMaterial("Copper");
        bucket.addEngraving("One Bucket to carry them all");
 
        //Ok Let's create 2 Stones
        Stone pebble = mf.createStone("Pebble");
        pebble.setWeight(80);
 
        /* Setting DateTime values usually has to be done the following way:
         * note that the DatatypeConverter comes from javax.xml.bind and is 
         * especially necessary to get the correct String representation of 
         * a Date. This is really important in international settings! 
         * (new Date()).toString() is heavily locale dependent and the output 
         * is not always parsed to the right value.
         */
        RDFSLiteral myDay = owlModel.createRDFSLiteral(
                DatatypeConverter.printDateTime(new GregorianCalendar()),
                owlModel.getRDFSDatatypeByName("xsd:dateTime"));
        pebble.setDateFound(myDay);
 
        Stone suiseki = mf.createStone("Suiseki");
        suiseki.setWeight(5000);
        suiseki.setDateFound(getDate(null));
 
        // Lets put the stones in the Bucket:
        suiseki.setIs_in_Bucket(bucket);
        pebble.setIs_in_Bucket(bucket);
 
        // inverse Properties seem to be filled automatically
        // no need to do the following:
        //bucket.getContains().add(pebble);
        //bucket.getContains().add(suiseki);
 
        // Joint Properties can be filled too
        Stuff zeug = mf.createStuff("Paper");
 
        zeug.addIs_in_Bucket(bucket);
 
        // save our new ontology
        owlModel.save((new File("smallbucket.owl")).toURI() );
 
    }
 
    @Test
    public void createBigOntology() throws Exception{
 
        // Lets have a new Bucket
        Bucket bucket = mf.createBucket("MyPrecious");
        bucket.setMaterial("Steel");
        bucket.addEngraving("One Bucket to carry them all (lot's of em)");
 
        TimerUtil tu = new TimerUtil();
 
        System.out.println("Start filling Bucket ");
 
        //Ok Let's create lots of Stones
        for( int i = 1; i <= maxStones; i++){        
            Stone pebble = mf.createStone("Pebble"+i);
            pebble.setWeight(i);
            pebble.setDateFound(getDate(null));
            // put the stone in the bucket (step 4 of the stress test)
            pebble.setIs_in_Bucket(bucket);
        }
        System.out.println("Filling "+ maxStones + " Stones into Bucket took: " + tu.getTime() + " Milliseconds");
 
        System.out.println("Start writing Full Bucket to File");
 
        owlModel.save((new File("bigbucket.owl")).toURI() );
 
        System.out.println("Wrote one Bucket in "  + tu.getTime() + " Milliseconds");
 
    }
}
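
The `TimerUtil` helper used above is not shown on this page. A minimal stand-in could look like the following sketch. It assumes that `getTime()` returns the milliseconds elapsed since the timer was created; the real class may behave differently (for example, it may reset on every call).

```java
/** Hypothetical stand-in for ontotest.util.TimerUtil; the real class is not shown here. */
final class TimerUtilSketch {

    // System.nanoTime() is monotonic, so it is not affected by wall clock changes
    private final long startNanos = System.nanoTime();

    /** Milliseconds elapsed since this timer was created (assumed semantics). */
    long getTime() {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        TimerUtilSketch timer = new TimerUtilSketch();
        Thread.sleep(20);
        System.out.println("Elapsed: " + timer.getTime() + " ms");
    }
}
```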

Matryoshka Tests

package ontotest.tests;
 
import java.io.File;
import java.util.Calendar;
import java.util.GregorianCalendar;
 
import javax.xml.bind.DatatypeConverter;
 
import ontotest.model2.Matryoshka;
import ontotest.model2.MatryoshkaFactory;
import ontotest.util.TimerUtil;
 
import org.junit.Before;
import org.junit.Test;
 
import edu.stanford.smi.protegex.owl.ProtegeOWL;
import edu.stanford.smi.protegex.owl.jena.JenaOWLModel;
import edu.stanford.smi.protegex.owl.model.RDFSLiteral;
 
public class OntologyTest2 {
    // setting this value to 2000 will kill Jena through regex limitations
    private static final int maxMatryoshkas = 1000;
 
    /* JenaOWLModel stores the Model at runtime and gives the ability to save etc.*/
    private JenaOWLModel owlModel = null;
 
    /* The Factory creates new Individuals and literals for the Ontology */
    private MatryoshkaFactory mf = null;
 
    /**
     * Create a correct representation of a date for xsd:dateTime
     * @param cal Calendar to convert; if null, the current time is used
     * @return Literal for RDFS
     */
    public RDFSLiteral getDate(Calendar cal){
        if(cal == null){
            // create new Date
            cal = new GregorianCalendar();
        }    
        // Prints the Correct DateTime Format for XSD based Types
        return owlModel.createRDFSLiteral(DatatypeConverter.printDateTime(cal), owlModel.getRDFSDatatypeByName("xsd:dateTime"));
    }
 
    @Before
    public void setUp() throws Exception{
        // init and necessary stuff
        File file = new File("matryoshka.owl");
 
        owlModel = ProtegeOWL.createJenaOWLModelFromURI(file.toURI().toString());
 
        mf = new MatryoshkaFactory(owlModel);
    }
 
    @Test
    public void simpleTest() throws Exception{
        Matryoshka m1 = mf.createMatryoshka("DarthVader");
        m1.addColor("Black");
        m1.setSize(10);
 
        Matryoshka m2 = mf.createMatryoshka("AnakinSkywalker");
        m2.addColor("Brown");
        m2.setSize(9);
 
        m2.addContained_in(m1);
 
    }
 
    @Test
    public void bigTest() throws Exception{
        Matryoshka m1 = mf.createMatryoshka("TheOne");
        m1.addColor("SoylentGreen");
        m1.setSize(maxMatryoshkas+1);
 
        TimerUtil tu = new TimerUtil();
 
        System.out.println("Building Matryoshkas");
        Matryoshka ml = m1;
        for (int i = maxMatryoshkas; i > 0; i--) {
            // creating new Matryoshka
            Matryoshka tmp = mf.createMatryoshka("No"+i);
            tmp.setSize(i);
            ml.addContains(tmp);
            ml = tmp;
        }
        System.out.println("Build of "+ maxMatryoshkas +" Matryoshkas took: " + tu.getTime() + " ms");
 
        System.out.println("Saving File ");
        owlModel.save((new File("bigmatryoshka.owl")).toURI() );
        System.out.println("took "+ tu.getTime() + " ms");
    }
}

Conclusion

Generally it gets the job done, and it looks ok from my point of view.

Pros:

  • The mapping of OWL classes to Java interfaces resembles the intention of OWL (better than mapping to Objects)
  • Lets you create additional abstract interfaces and classes, so there is no need to edit the generated code if customization is needed.

Cons:

  • Jena cannot handle items that are nested too deeply.
  • DateTime handling could be better. Not using the correct converter can give you trouble (e.g. Date.toString() in a German locale)
  • Typesafe collections would be nice.
  • It is necessary to include libs that do not belong there (jlooks.jar)
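
To illustrate the date point, here is a small sketch that contrasts a fixed ISO 8601 representation, which is what xsd:dateTime expects, with locale dependent formatting. It uses java.time instead of javax.xml.bind.DatatypeConverter, because JAXB is no longer bundled with the JDK since Java 11; this is a substitute for illustration, not what the tests above use. The date value is an arbitrary example.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

final class XsdDateTimeSketch {

    /** Formats a timestamp in the ISO 8601 form used for xsd:dateTime literals. */
    static String toXsdDateTime(OffsetDateTime time) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(time);
    }

    /** Human readable and locale dependent; not suitable as an xsd:dateTime literal. */
    static String toLocalizedText(OffsetDateTime time, Locale locale) {
        return DateTimeFormatter.ofLocalizedDateTime(FormatStyle.MEDIUM)
                .withLocale(locale)
                .format(time);
    }

    public static void main(String[] args) {
        // arbitrary example value
        OffsetDateTime found =
                OffsetDateTime.of(2008, 5, 17, 12, 30, 0, 0, ZoneOffset.ofHours(2));

        System.out.println(toXsdDateTime(found)); // prints 2008-05-17T12:30:00+02:00

        // The exact text of the next two lines depends on the JDK's locale data
        System.out.println(toLocalizedText(found, Locale.GERMANY));
        System.out.println(toLocalizedText(found, Locale.US));
    }
}
```

The first form is the same on every machine; the localized forms differ per locale, which is the kind of output a parser on the other end may misread.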

RDFReactor and RDF2Go

RDFReactor can be found here: RDFReactor. There is also a paper available from one of the authors of RDFReactor which discusses the way of mapping RDFS to Java classes.

Code generation can be accomplished with a single line of Java like:

CodeGenerator.generate("bucket.owl", "src", "ontotest.model1", Reasoning.rdfs, true, true);

It would have been nice if they also provided an Ant task like Jastor does. It also seems that this call is deprecated in newer versions, but it can still be found in the docs (and it still works).

This API uses Jena as its writer, or to be precise it can work with a lot of storage backends, since RDF2Go takes care of the mapping. Again I have to point out the different logging implementations shipped. This is even more surprising since they already use the logging facade and could have just included the bridge.

The generated code looks nice and comes with a lot of generated comments. This helps when programming against the classes, since instant help is available. However, there are some warnings regarding unnecessary casts (as seen by Java 6) in it. It could be that casts become necessary at those points in more complex ontologies.

Bucket Tests

Matryoshka Test

Conclusion

So far this one looks really nice. I will play around with it a little further and see if it does other things (loading ontologies etc.) as well.

Pros:

  • Nice, correct and simple date handling
  • Straightforward class translation, easy to code

Cons:

  • No Ant task
  • Javadoc on their homepage is a little older than the distribution package
  • Maps OWL classes to Java classes (OWL individuals can belong to multiple classes, which is not possible with Java classes)

OWL API

It can be found at [http://owlapi.sourceforge.net/]. It is a low level API for working with OWL, and I haven't found any libraries that generate Java classes from an OWL ontology while using the OWL API to manage the objects in the background. Even if I repeat myself here: this is a low level OWL API, which gives you the power to transform ontologies completely (changing classes, properties, individuals and assumptions).

Writing code with this is quite difficult if you are not used to the whole OWL semantics. I needed quite some time to get used to the model. They say that it is cleaner and closer to OWL itself, which may be true, but I am mainly looking for abstraction and easy access. I like the idea of the change manager though, because it really helps to implement undo or some other kind of transactional integrity. However, the API documentation is sparse, and that doesn't really help in getting accustomed to the thing! The samples provide an entry point, but I was missing an explanation that in this model (or in OWL in general) setting a property for an individual is merely an assertion, which is reflected heavily in the API. In retrospect, their Example4.java describes just that, but I didn't get it at the time.
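
The idea of "a property value is just an assertion, and changes go through a manager that can undo them" can be shown in a few lines of plain Java. This is a toy model with hypothetical names (`ChangeLogSketch`, `addAssertion`), not the OWL API's actual change manager.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model: an ontology as a set of assertions plus a log of applied changes. */
final class ChangeLogSketch {

    private final Set<List<String>> assertions = new HashSet<>();
    private final Deque<List<String>> applied = new ArrayDeque<>();

    /** Adds an assertion and remembers it so that it can be undone later. */
    void addAssertion(String subject, String property, String value) {
        List<String> assertion = Arrays.asList(subject, property, value);
        if (assertions.add(assertion)) {
            applied.push(assertion);
        }
    }

    /** Reverts the most recent change; returns false if there is nothing to undo. */
    boolean undo() {
        if (applied.isEmpty()) {
            return false;
        }
        assertions.remove(applied.pop());
        return true;
    }

    boolean holds(String subject, String property, String value) {
        return assertions.contains(Arrays.asList(subject, property, value));
    }

    public static void main(String[] args) {
        ChangeLogSketch ontology = new ChangeLogSketch();
        ontology.addAssertion("MyPrecious", "rdf:type", "Bucket");
        ontology.addAssertion("MyPrecious", "Material", "Plastik");

        System.out.println(ontology.holds("MyPrecious", "Material", "Plastik")); // prints true
        ontology.undo();
        System.out.println(ontology.holds("MyPrecious", "Material", "Plastik")); // prints false
    }
}
```

Because every modification is an explicit change object, undo and transaction-like behaviour fall out almost for free, which is the part of the design I like.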

Furthermore, I am not quite sure why some frameworks write:

<Stone rdf:ID="Pebble955">

while the OWL API uses the rdf:about attribute:

<Stone rdf:about="#Pebble955">

I am not quite sure of the implied differences.
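
As far as the RDF/XML syntax goes, rdf:ID="Pebble955" and rdf:about="#Pebble955" both resolve against the document's base URI and name the same resource; rdf:ID additionally has to be unique within the document. The resolution step can be reproduced with java.net.URI. The base URI below is a made-up example, not the real location of the ontology.

```java
import java.net.URI;

final class RdfIdSketch {

    /** rdf:ID="name" identifies the resource "base#name". */
    static URI fromRdfId(URI base, String id) {
        return base.resolve("#" + id);
    }

    /** rdf:about="reference" is a URI reference resolved against the base URI. */
    static URI fromRdfAbout(URI base, String reference) {
        return base.resolve(reference);
    }

    public static void main(String[] args) {
        // hypothetical base URI of the document
        URI base = URI.create("http://example.org/bucket.owl");

        System.out.println(fromRdfId(base, "Pebble955"));
        // prints http://example.org/bucket.owl#Pebble955
        System.out.println(fromRdfAbout(base, "#Pebble955"));
        // prints http://example.org/bucket.owl#Pebble955
    }
}
```

So for this use case the two spellings should be interchangeable when reading the file back.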

Bucket Tests

The Java file is overly verbose, so here is just a small example for the creation of an individual and its properties:

        // hold changes in the queue as they are made
        // (factory, ontology, ontoURI and xsdString are set up earlier in the full test)
        Queue<AddAxiom> changes = new LinkedList<AddAxiom>();
 
        OWLIndividual bucket = null;
        {
            // creating the Bucket individual and asserting its class
            bucket = factory.getOWLIndividual(URI.create(ontoURI + "#MyPrecious"));
            OWLClass bucketClass = factory.getOWLClass(URI.create(ontoURI + "#Bucket"));
            OWLAxiom bucketAxiom = factory.getOWLClassAssertionAxiom(bucket, bucketClass);
            changes.add(new AddAxiom(ontology, bucketAxiom));
        }
 
        // creating the Engraving data property assertion
        {
            OWLDataProperty owlp = factory.getOWLDataProperty(URI.create(ontoURI + "#Engraving"));
            OWLConstant owlc = factory.getOWLTypedConstant("One to hold them All", xsdString);
            OWLDataPropertyAssertionAxiom addProp = factory.getOWLDataPropertyAssertionAxiom(bucket, owlp, owlc);
            changes.add(new AddAxiom(ontology, addProp));
        }
 
        // creating the Material data property assertion
        {
            OWLDataProperty owlp = factory.getOWLDataProperty(URI.create(ontoURI + "#Material"));
            OWLConstant owlc = factory.getOWLTypedConstant("Plastik", xsdString);
            OWLDataPropertyAssertionAxiom addProp = factory.getOWLDataPropertyAssertionAxiom(bucket, owlp, owlc);
            changes.add(new AddAxiom(ontology, addProp));
        }

The full code can be seen in the attached file Test1OwlApi.java.

Matryoshka Tests

Conclusion

If you are used to doing things at a low level, this is the right stuff for you, but for the normal "object-oriented" programmer this is more or less a pain.

Pros:

  • Clean design (I have to emphasize this, after working with it a little bit longer)
  • Good change management in ontologies
  • Low level approach

Cons:

  • Low Level approach (in regard to the use case)
  • Overly verbose for normal (business rule driven) programming.
  • Documentation (JavaDoc) is sometimes sparse; although the class names are chosen very well, for a beginner this is not always sufficient to understand the whole concept.
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License