Storage and Reasoning Systems Evaluation Campaign 2010 results
The Storage and Reasoning Systems Evaluation Campaign 2010 aims at the evaluation of description logic based systems (DLBSs). These systems are based on a common family of languages, called description languages, which provide a set of constructors to build concept (class) and role (property) descriptions. Such descriptions can be used in axioms and assertions of description logic (DL) knowledge bases and can be reasoned about with respect to them.
The evaluation must provide informative data with respect to both DLBS interoperability and performance. We measure them with four metrics: the number of tests a system passes without parsing errors, the number of inference tests a system passes, the time a system needs to perform a given inference task, and the task loading time. The number of tests passed without parsing errors is a metric of a system's interoperability with respect to the relevant syntax standard. The number of inference tests passed is a metric of a system's ability to perform the standard inference services. The inference and loading times are the metrics of a system's performance.
Four evaluations reflecting the standard DLBS inference services (class satisfiability, ontology satisfiability, classification and logical entailment) are designed to assess both interoperability and performance. Our data set contains most of the ontologies that are well established and widely used for testing DLBS inference services. These ontologies reflect real-world scenarios and test cases and have become almost the de facto standard for DLBS evaluation. The data set also includes the entailment and non-entailment tests from the Web Ontology Language (OWL) 2 test repository.
The testing data was used to evaluate three DLBSs: HermiT 1.2.2, FaCT++ 1.4.1 and jcel 0.8.0. HermiT is a reasoner for ontologies written in OWL.
HermiT is the first publicly available OWL reasoner based on a novel hypertableau calculus, which provides efficient reasoning capabilities. HermiT can handle DL Safe rules, and the rules can be added directly to the input ontology in functional-style syntax or in other OWL syntaxes supported by the OWL API. FaCT++ is the new generation of the well-known FaCT OWL-DL reasoner. FaCT++ uses the established FaCT algorithms, but with a different internal architecture. Additionally, FaCT++ is implemented in C++ in order to create a more efficient software tool and to maximise portability. jcel is a reasoner for the description logic EL+. It is an OWL 2 EL reasoner implemented in Java.
OWL API 3.0 was used for the evaluation. The isSatisfiable(), isConsistent() and isEntailed() methods of the OWLReasoner class were used for the class satisfiability, ontology satisfiability and entailment evaluations. For the classification evaluation, we used the fillOntology method of the InferredOntologyGenerator class, initialized with an InferredSubClassAxiomGenerator passed as the list parameter.
The evaluation was run on two AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ machines with 2 GB of main memory. DLBSs were allowed to allocate up to 1 GB. To obtain comparable results from both machines, we executed a small subset of the class satisfiability tasks on each of them and compared the systems' performance. The results showed that the software installed on the machines influenced the systems' execution times. For example, for the HermiT reasoner the average loading time (ALT) for class satisfiability tasks differed by a factor of 0.74 between the evaluation machines, while the average reasoning time (ART) differed by a factor of 1.29. We therefore applied a correction factor to the results obtained to make them comparable.
The systems had a 10-second time frame for a single class satisfiability computation. More complex evaluations, such as entailment and classification, consist of many class satisfiability computations.
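The exact procedure used to make the timings from the two machines comparable is not spelled out above. As a rough illustration only, one plausible approach is to derive a ratio from the shared calibration subset and divide the second machine's timings by it. The function names and the sample timings below are hypothetical and are not campaign data.

```python
from statistics import mean


def calibration_factor(times_machine_a, times_machine_b):
    """Ratio of mean timings on machine B to mean timings on machine A,
    measured on the same calibration subset of tasks (assumed approach)."""
    return mean(times_machine_b) / mean(times_machine_a)


def rescale(times_machine_b, factor):
    """Express machine-B timings on machine A's scale by dividing by the factor."""
    return [t / factor for t in times_machine_b]


# Hypothetical calibration timings in milliseconds, chosen so that the
# ratio matches the 1.29 ART figure quoted in the text.
calib_a = [100.0, 200.0, 300.0]
calib_b = [129.0, 258.0, 387.0]

art_factor = calibration_factor(calib_a, calib_b)

# Hypothetical reasoning times measured on machine B for the full task set.
adjusted = rescale([645.0, 1290.0], art_factor)
```

A separate factor would be computed for loading times, since the text reports different ratios for ALT and ART.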
The storage and reasoning systems evaluation component was used in the evaluation. All three systems support OWL API 3, so the evaluation was performed using it as a common interface to the DLBSs. The systems were therefore run on the subset of the evaluation tasks that is parsable by OWL API 3. The evaluation results consist of the ALT, the ART, the number of correct results (TRUE), the number of incorrect results (FALSE), the number of errors (ERROR) and the number of evaluation tasks that were not completed within the given time frame (UNKNOWN).
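The result categories listed above can be illustrated with a minimal sketch that labels each task run and aggregates the counts and average times. The TaskRun record, the function names and the sample data are hypothetical. Averaging ALT and ART only over tasks that finished is an assumption made here for illustration, since the text does not say how the averages were taken.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Optional

TIME_LIMIT_MS = 10_000  # 10-second frame per class satisfiability task


@dataclass
class TaskRun:
    load_ms: float
    reason_ms: Optional[float]   # None if the reasoner never finished
    expected: Optional[bool]     # reference answer for the task
    actual: Optional[bool]       # answer returned by the reasoner
    error: bool = False          # parsing or reasoning error was raised


def outcome(run):
    """Label a run as TRUE, FALSE, ERROR or UNKNOWN."""
    if run.error:
        return "ERROR"
    if run.reason_ms is None or run.reason_ms > TIME_LIMIT_MS:
        return "UNKNOWN"
    return "TRUE" if run.actual == run.expected else "FALSE"


def summarise(runs):
    """Count each outcome and average the times over finished tasks."""
    counts = Counter(outcome(r) for r in runs)
    finished = [r for r in runs if outcome(r) in ("TRUE", "FALSE")]
    summary = {label: counts[label]
               for label in ("TRUE", "FALSE", "ERROR", "UNKNOWN")}
    summary["ALT"] = mean([r.load_ms for r in finished]) if finished else None
    summary["ART"] = mean([r.reason_ms for r in finished]) if finished else None
    return summary


# Hypothetical runs, one per outcome category.
runs = [
    TaskRun(120.0, 300.0, True, True),            # correct answer
    TaskRun(80.0, 500.0, True, False),            # wrong answer
    TaskRun(50.0, None, True, None),              # did not finish in time
    TaskRun(0.0, None, True, None, error=True),   # e.g. unsupported datatype
]
summary = summarise(runs)
```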
The results for the HermiT system were not obtained due to the limited time allocated for the Storage and Reasoning Systems Evaluation Campaign 2010. Most errors were related to datatypes not supported by the FaCT++ system. There were several errors related to description logic expressivity, such as NonSimpleRoleInNumberRestriction. There were also several syntax-related errors where FaCT++ was unable to register a role or a concept.
FaCT++ clearly outperformed HermiT on most of the reasoning tasks. Most errors for both FaCT++ and HermiT were related to datatypes not supported by the systems. The evaluation tasks proved to be challenging for the systems: 16 and 30 evaluation tasks, respectively, were not solved in the given time frame. The relatively poor HermiT performance can be explained by a small number of very hard tasks on which FaCT++ was orders of magnitude more efficient.
The results for the HermiT system were not obtained due to the limited time allocated for the Storage and Reasoning Systems Evaluation Campaign 2010. Most errors were related to datatypes not supported by the FaCT++ system. There were several errors related to description logic expressivity, such as NonSimpleRoleInNumberRestriction. There were also several syntax-related errors where FaCT++ was unable to register a role or a concept.
The jcel system does not support entailment evaluation tasks. Therefore, we provide the results for FaCT++ and HermiT only. The HermiT time was influenced by a small number of very hard tasks. FaCT++ produced a large number of false and erroneous results.
The HermiT time was influenced by a small number of very hard tasks.
















