Elmo Tools

Creating Concepts and Ontologies

The included elmo-codegen.jar can be used from the command line to create an RDF ontology file from existing JavaBeans or generate Elmo concepts from an RDF ontology file. The command below will search the given jar (example-entities.jar) for classes in the package com.example.entities and output an OWL DL ontology in example-ontology.owl using the given ontology URI and the same namespace.

java -jar elmo-codegen.jar \
    -b "com.example.entities=http://www.example.com/rdf/2008/model#" \
    -r example-ontology.owl \
    example-entities.jar

In the example below, the ontology example-ontology.owl will be imported, and Elmo concepts that are defined by the given ontology URI will be created and compiled in example-concepts.jar. This jar will then be ready to be used for development and deployment of an Elmo application.

java -jar elmo-codegen.jar \
    -b "com.example.concepts=http://www.example.com/rdf/2008/model#" \
    -j example-concepts.jar \
    example-ontology.owl

The elmo-codegen.jar will import the ontologies and concepts from elmo-rdfs.jar and elmo-owl.jar, so they don't need to be specified on the command line. However, other dependent concepts jar files must be included at the end of the command.
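As a minimal sketch of the rule above, the function below prints the concept-generation command with any dependent concepts jars appended after the ontology file. The jar name other-concepts.jar is hypothetical and stands for a jar of concepts that example-ontology.owl builds on; the function only prints the command, so it can be reviewed before it is run.

```shell
# Print the codegen command for an ontology whose concepts depend on
# concepts already compiled into other jars.
# Any arguments (for example other-concepts.jar, a hypothetical name)
# are appended after the ontology file, at the end of the command.
codegen_command() {
    printf '%s' "java -jar elmo-codegen.jar"
    printf ' %s' \
        -b "com.example.concepts=http://www.example.com/rdf/2008/model#" \
        -j example-concepts.jar \
        example-ontology.owl \
        "$@"
    printf '\n'
}

codegen_command other-concepts.jar
```

The elmo-rdfs.jar and elmo-owl.jar files are deliberately absent from the argument list, since elmo-codegen.jar imports them itself.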

Elmo Scutter

The Elmo scutter is a generic RDF crawler that follows rdfs:seeAlso links in RDF documents, which typically point to other relevant RDF sources on the web. The Elmo scutter is based on original code by Matt Biddulph for Jena.

RDF(S) seeAlso is also the mechanism used to connect FOAF profiles, so, given a starting location, the scutter can collect FOAF profiles from the Web. Several advanced features are provided to support this scenario.

The data collected by the scutter is stored in a Sesame repository. We recommend using a Native RDF repository for scuttering, because it provides the best performance for uploads.

The scutter is available both as a Java class and as a Java servlet. The servlet provides access to all of the above features except filtering, which requires programming. To deploy the servlet, place the Elmo.war file in the web application directory of a Servlet/JSP container.

The servlet initialization parameters to be specified in the web.xml descriptor file are listed below. An example web.xml file is provided in the war file.

| Parameter | Description | Required/Default |
|---|---|---|
| server | URL of the Sesame server to store the collected data | Required |
| repository | Name of the repository on the server | Required |
| username | Username for access to the Sesame repository | Optional |
| password | Password for access to the Sesame repository | Optional |
| queue | Location of the file used to save the queue when the scutter is stopped | Required |
| start | URL(s) used to start scuttering. URLs should be separated by white space. | Optional |
| domain | Limits crawling to URLs that match the provided regular expression. | Optional |
| metadata | Produce reified statements containing information about the provenance of the statements and the time they were collected. Possible values: true/false | Optional, defaults to false. |
| autoblacklist | Enable/disable automatic blacklisting. Possible values: true/false | Optional, defaults to true (enabled). |
| vocab | Restrict crawling to FOAF-specific vocabularies (statements with predicates from the RDF, RDFS, FOAF or WGS_84 namespaces) | Optional, only possible value is 'foaf' |
| focused | Collect data about a specific set of target persons. The target persons are given as foaf:Person instances in the repository. | Optional, actual value is ignored |
| maxThreads | Maximum number of threads allowed to be running. Must be a positive integer. | Optional, defaults to 20. |
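To illustrate how these parameters might be written down, the sketch below prints init-param elements in the standard servlet deployment descriptor form, ready to be pasted inside the servlet element of web.xml. The init_param helper and every value shown (server URL, repository name, queue path, start URLs, domain pattern) are placeholders assumed for this example; the example web.xml in the war file remains the reference. Values are not XML-escaped, so characters such as & or < would need escaping by hand.

```shell
# Hypothetical helper: print one <init-param> element for a name/value pair.
init_param() {
    printf '    <init-param>\n'
    printf '        <param-name>%s</param-name>\n' "$1"
    printf '        <param-value>%s</param-value>\n' "$2"
    printf '    </init-param>\n'
}

# Required parameters (all values are placeholders).
init_param server     "http://localhost:8080/sesame"
init_param repository "scutter-data"
init_param queue      "/var/tmp/scutter.queue"

# Optional parameters: two start URLs separated by white space,
# a regular expression limiting the crawl, and a lower thread limit.
init_param start      "http://www.example.com/a.rdf http://www.example.com/b.rdf"
init_param domain     'http://www\.example\.com/.*'
init_param maxThreads "10"
```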

The request parameters accepted by the servlet are listed in the table below. For convenience, the distribution includes an HTML file for calling the various operations on the servlet.

| Parameter | Description | Value |
|---|---|---|
| start | Try to load the set of visited URLs and start the scutter | Parameter value ignored. |
| stop | Stop the scutter, save the queue to disk | Parameter value ignored. |
| preloadQueue | Preload the queue from the saved file | Parameter value ignored. |
| clear | Clear the queue and the set of visited URLs | Parameter value ignored. |
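The operations above could also be called from the command line. The following sketch builds the request URL for one operation; the base URL (host, port and servlet path) is an assumption that depends on how Elmo.war is deployed and mapped, and sending the parameter in a plain GET query string is likewise assumed rather than documented. Because the parameter value is ignored, a placeholder value is used.

```shell
# Hypothetical base URL; the real one depends on the container and on the
# servlet mapping in web.xml.
SCUTTER_URL="${SCUTTER_URL:-http://localhost:8080/Elmo/scutter}"

# Print the request URL for one operation: start, stop, preloadQueue or clear.
scutter_request_url() {
    case "$1" in
        start|stop|preloadQueue|clear)
            # The servlet ignores the value, so any placeholder will do.
            printf '%s?%s=true\n' "$SCUTTER_URL" "$1"
            ;;
        *)
            echo "unknown operation: $1" >&2
            return 1
            ;;
    esac
}

scutter_request_url start
```

With a running servlet, the request could then be sent with, for example, curl "$(scutter_request_url stop)".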

Custom filtering of statements can be implemented by passing an instance of the StatementFilter interface to the setStatementFilter method of the Scutter class. See the JavaDoc for more details.

Elmo Smusher

The task of the Elmo smusher is to find equivalent instances in large sets of data. This is a very common problem when processing collections of FOAF profiles, as several sources on the Web may describe the same individual using different identifiers or blank nodes (which are always assumed to be different). While the servlet provided is specific to smushing foaf:Person instances, the underlying mechanism is generic.

The smusher uses instances of ResourceComparator for comparing instances. Implementations of ResourceComparator are given for foaf:Person and swrc:Publication.

The smusher reports the results (matching instances) by calling methods on registered listeners. Listeners implement the SmusherListener interface. Two implementations of SmusherListener are provided: one writes out results in text, while the other represents matches using the owl:sameAs relationship and uploads such statements to a Sesame repository. While Sesame does not directly support OWL semantics, the semantics of this relationship (the equivalence of property values) can be easily axiomatized using Sesame's custom rule language.