Campaign 2010 results
Here we present the results and analyses for the Semantic Search Tool theme from the first SEALS Evaluation Campaign, which was conducted during summer 2010. More details can be found in SEALS Deliverable D13.3.
The list of participants and the phases in which they participated is shown in the following table.
All registered tools attempted the automated phase of the evaluation campaign. The Jena Arq tool did not participate in the user-in-the-loop phase because it was included in the campaign only as a baseline for the automated phase. Jena Arq has no user interface: it provides only programmatic access to the underlying SPARQL query engine.
Automated phase
The automated evaluation results are shown below. To simplify the presentation, the results for the ten questions have been averaged for each dataset size. The full results can be found in Appendix B of SEALS Deliverable D13.3.
EvoOnt 1k triples
EvoOnt 10k triples
EvoOnt 100k triples
EvoOnt 1M triples
EvoOnt 10M triples
User-in-the-loop phase
The user-in-the-loop evaluation results are shown below. To simplify the presentation, the results for the twenty questions have been averaged over all users, as have the experiment times and feedback scores. The full results can be found in Appendix C of SEALS Deliverable D13.3. Due to a bug in the PowerAqua SEALS wrapper identified after the user-in-the-loop experiments were completed, the values for precision, recall and f-measure had to be recalculated separately using a representative sample of the queries posed by the users in the evaluation. Hence these results do not match those documented in Appendix C of SEALS Deliverable D13.3.
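As an illustration of how per-query precision, recall and f-measure can be computed and then averaged, here is a minimal Python sketch. It is not the SEALS implementation: it assumes that each query yields a set of returned answers and a set of correct (gold standard) answers, that the f-measure is the balanced harmonic mean of precision and recall, and that averaging is a plain mean over queries. The function names and the sample data are hypothetical.

```python
from statistics import mean


def precision_recall_f(returned, relevant):
    """Return (precision, recall, f-measure) for a single query.

    Assumption: balanced f-measure (harmonic mean of precision and recall).
    """
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure


def average_scores(per_query_scores):
    """Average each metric over a list of (precision, recall, f-measure) tuples."""
    precisions, recalls, f_measures = zip(*per_query_scores)
    return mean(precisions), mean(recalls), mean(f_measures)


# Hypothetical data: (answers returned by the tool, gold standard answers).
queries = [
    (["a", "b", "c", "d"], ["a", "b"]),  # precision 0.5, recall 1.0
    (["x"], ["x", "y"]),                 # precision 1.0, recall 0.5
]

scores = [precision_recall_f(returned, gold) for returned, gold in queries]
avg_precision, avg_recall, avg_f_measure = average_scores(scores)
print(avg_precision, avg_recall, avg_f_measure)
```

Averaging per-query scores in this way gives every question equal weight regardless of how many answers it has; whether the campaign used this or another averaging scheme is documented in the deliverable, not here.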
The mean experiment time indicates how long, on average, the entire experiment (answering twenty pre-defined questions) took for each user. The mean SUS is the mean System Usability Scale score for each tool, as reported by the users themselves. A score of 0 implies that the user regards the user interface as unusable, and a score of 100 implies that the user considers the user interface to be perfect. A score of approximately 60 or above is generally considered an indicator of good usability. The mean extended questionnaire score is the average response to a questionnaire in which more detailed questions were used to establish the user's satisfaction; it is scored out of 52. The mean number of attempts shows how many times the user had to reformulate their query using the tool's interface before they obtained answers with which they were satisfied, or indicated that they were confident a suitable answer could not be found. The distinction between finding an appropriate answer after a number of attempts and the user "giving up" after a number of attempts is captured by the mean answer found rate. Input time refers to the amount of time the subject spent formulating their query using the tool's interface before submitting it.
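The measures described above can be sketched in Python as follows. This is a hedged illustration, not the campaign's own code: it assumes the standard System Usability Scale scoring procedure (ten items answered on a 1 to 5 scale, odd-numbered items contributing the response minus 1, even-numbered items contributing 5 minus the response, and the sum multiplied by 2.5), and it assumes that each question is recorded as a pair of attempt count and a flag saying whether an answer was found. The function names and the sample data are hypothetical.

```python
from statistics import mean


def sus_score(responses):
    """Convert ten SUS item responses (each 1-5) into a score from 0 to 100.

    Assumption: standard SUS scoring; the source does not spell out
    the procedure that was used.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


def summarise_attempts(trials):
    """Return (mean number of attempts, answer found rate).

    trials: list of (attempts, answer_found) pairs, one per question.
    """
    mean_attempts = mean(attempts for attempts, _ in trials)
    found_rate = mean(1.0 if found else 0.0 for _, found in trials)
    return mean_attempts, found_rate


# Hypothetical questionnaire responses from three users of one tool.
users = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],  # best possible responses
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # neutral throughout
    [1, 5, 1, 5, 1, 5, 1, 5, 1, 5],  # worst possible responses
]
mean_sus = mean(sus_score(u) for u in users)

# Hypothetical per-question records for one user.
trials = [(1, True), (3, True), (2, False), (2, True)]
mean_attempts, found_rate = summarise_attempts(trials)
print(mean_sus, mean_attempts, found_rate)
```

Keeping the answer found rate separate from the mean number of attempts preserves the distinction drawn above: two users may both need three attempts on a question, but only one of them may end with a satisfactory answer.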
Further analysis is available in Chapter 3 of SEALS Deliverable D13.3.