I have started using the Sesame 2 (beta 5) RDF store for hosting the conference metadata for ISWC+ASWC2007 at http://data.semanticweb.org. The software is still in beta and a lot of the interface functionality is missing, but once you figure out how everything works (which I did, with a lot of help from Aduna's OpenRDF forum), it seems to do the job. Even though the Web interface is lacking a lot at the moment, Aduna supplies a console that can do most of what I need. Here is what I did:
Creating Repositories
After having installed Sesame as a Web application in Tomcat (5.5 works for me), I installed the console in a different location. I start it like this:
$ bin/start-console.sh
OpenRDF Sesame console 2.0-beta5
Using data dir: /home/knumoe/.aduna/openrdf-sesame-console
The following repositories are available:
+----------
|SYSTEM ("System configuration repository")
+----------
Commands end with '.' at the end of a line
Type 'help.' for help
>
At the beginning, the console (which has its own data and doesn't yet know about the Sesame Web app) knows only one repository: its own SYSTEM repository. To get started, we have to let the console know about the Web app's SYSTEM repository. This is done with the create command:
> create remote.
Please specify values for the following variables:
Sesame server location [http://localhost:8080/openrdf-sesame]: http://localhost:8080/openrdf-http-server-2.0-beta5
Remote repository ID [SYSTEM]:
Local repository ID [SYSTEM@localhost]:
Repository title [SYSTEM repository @localhost]:
Repository created
>
Here is the first pitfall: the default server location is http://localhost:8080/openrdf-sesame, whereas, at least in this release of Sesame, the server location is actually http://localhost:8080/openrdf-http-server-2.0-beta5, so we need to change the default here (and if we don't, the console won't even complain). The rest of the default values are just fine, so we leave them as they are. The console now knows two repositories:
> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
+----------
>
Note that we haven't actually created a new repository on the server; we have just made sure the console knows about the server's SYSTEM repository. Every repository that the console knows can be opened and manipulated. We open SYSTEM@localhost:
> open SYSTEM@localhost.
Opened repository 'SYSTEM@localhost'
SYSTEM@localhost>
The SYSTEM@localhost repository is for internal housekeeping, so we don't want to add any actual data to it. Instead we now create a new repository, which I will call test. Because we have previously opened the server's SYSTEM repository, the new repository will be created on the server. We can create either a native, a memory, or a memory-rdfs store. I choose native:
SYSTEM@localhost> create native.
WARNING: You are about to add a repository configuration to repository SYSTEM@localhost
Proceed? (yes|no) [yes]:
Please specify values for the following variables:
Repository ID [native]: test
Repository title [Native store]: Test store
Triple indexes [spoc,posc]:
Repository created
SYSTEM@localhost>
Now, even though we have just created a new repository on the server, the console doesn't know about it (strange, yes):
SYSTEM@localhost> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
+----------
SYSTEM@localhost>
To let the console know about it, we need to close SYSTEM@localhost and create another remote repository as a placeholder for the console:
SYSTEM@localhost> close.
Closed repository 'SYSTEM@localhost'
> create remote.
Please specify values for the following variables:
Sesame server location [http://localhost:8080/openrdf-sesame]: http://localhost:8080/openrdf-http-server-2.0-beta5
Remote repository ID [SYSTEM]: test
Local repository ID [SYSTEM@localhost]: test@localhost
Repository title [SYSTEM repository @localhost]: Test Repository
Repository created
> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
|test@localhost ("Test Repository")
+----------
>
Now we are finally done with this bit: we have created a placeholder for the server's SYSTEM repository, created a repository for our data on the server, and created another placeholder for this new data repository.
Loading Data with Context
Loading data from the Web interface in Tomcat is easy. However, it has one drawback: you cannot specify contexts (or named graphs, if you like). So, if you need named graphs, you have to resort to the console again. It's not hard though.
The first thing we do is open our newly created remote repository placeholder:
> open test@localhost.
Opened repository 'test@localhost'
test@localhost>
Now we can load data from a URL using the load command. Used without any options, it loads the data into the repository without any specific context. To add a context for the new data, we can use the -c option. I want to load Eyal's FOAF file. For the context, I just make up a URI (not good practice, I know):
test@localhost> load -c http://context.next/eyalsfoaf http://eyaloren.org/foaf.rdf.
Loading data...
Data has been added to the repository (12736 ms)
test@localhost>
So now, all the triples from http://eyaloren.org/foaf.rdf have been added to the test repository on the server, using http://context.next/eyalsfoaf as the name of the context (the "next" was a spelling error which I didn't bother to correct). Note that with the current version of Sesame (2.0-beta5), the contexts don't show up if you explore your repository with the Web interface. Don't worry though, they're there. You can see the contexts in the console (in the meantime, I have added another FOAF file with another context):
test@localhost> show c.
+----------
|http://context.net/knudsfoaf
|http://context.next/eyalsfoaf
+----------
test@localhost>
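To make the idea of a context a bit more concrete, here is a small Python sketch of my own. It is not Sesame code and says nothing about how Sesame stores data internally; it just treats every statement as a quad whose fourth element is the context. The QuadStore class and the sample triples are made up for illustration: load only mimics what load -c does conceptually, and contexts mimics show c.

```python
class QuadStore:
    """Toy in-memory store: each statement is (subject, predicate, object, context)."""

    def __init__(self):
        self.quads = set()

    def load(self, triples, context=None):
        # Conceptually like `load -c <context> <url>`: every triple from one
        # source is tagged with the same context (None = no specific context).
        for s, p, o in triples:
            self.quads.add((s, p, o, context))

    def contexts(self):
        # Conceptually like `show c`: list the distinct named contexts.
        return sorted({c for (_, _, _, c) in self.quads if c is not None})


# Hypothetical sample data standing in for the two FOAF files.
store = QuadStore()
store.load([("ex:eyal", "rdf:type", "foaf:Person")], "http://context.next/eyalsfoaf")
store.load([("ex:knud", "rdf:type", "foaf:Person")], "http://context.net/knudsfoaf")

print(store.contexts())
```

The point of the sketch is only that the context travels with each triple, which is why the data can later be queried per named graph.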
Querying with Context
Now that we have data in two different named graphs (Sesame has the concept of contexts internally, but they are the same as named graphs for my purposes), we can also make use of this with SPARQL queries. To do this, we can use SPARQL's GRAPH construct. E.g., if I simply want to find all instances of foaf:Person in the repository, I can use this query, which is just your basic SPARQL that every self-respecting SW nerd knows:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person
WHERE {
  $person a foaf:Person
}
This will give me six different instances, ignoring context (I ran the queries in the Web interface; you can also run them in the console, but that is kind of unwieldy).
The following query, which uses the GRAPH construct, also tells us which named graph/context each instance stems from:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person $g
WHERE {
  GRAPH $g {
    $person a foaf:Person
  }
}
The result of this query shows us that one instance comes from the named graph http://context.net/knudsfoaf and five from http://context.next/eyalsfoaf.
Finally, we can rewrite the query such that we will only get those instances that come from a particular graph:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person
WHERE {
  GRAPH <http://context.next/eyalsfoaf> {
    $person a foaf:Person
  }
}
The result set of this query now contains those five instances that are in the http://context.next/eyalsfoaf named graph. Hooray!
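The three queries differ only in how they treat the graph, so they could be generated from one template. Here is a minimal Python sketch, assuming a helper called person_query that I made up for this post. It only builds query strings and does not talk to Sesame at all; you would still paste the result into the Web interface or the console.

```python
FOAF_PREFIX = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>"


def person_query(graph=None, show_graph=False):
    """Build a SPARQL query for foaf:Person instances.

    graph=None, show_graph=False -> ignore contexts entirely
    show_graph=True              -> also return the named graph of each match
    graph="<some URI>"           -> only match inside that named graph
    """
    pattern = "$person a foaf:Person"
    if graph is not None:
        # Restrict matching to one named graph; note: no space after '<'.
        select, where = "$person", f"GRAPH <{graph}> {{ {pattern} }}"
    elif show_graph:
        # Bind the graph name to $g so it shows up in the results.
        select, where = "$person $g", f"GRAPH $g {{ {pattern} }}"
    else:
        select, where = "$person", pattern
    return f"{FOAF_PREFIX}\nSELECT {select}\nWHERE {{ {where} }}"


print(person_query(graph="http://context.next/eyalsfoaf"))
```

Calling person_query() with no arguments gives the plain query, person_query(show_graph=True) gives the one with the $g variable, and passing a graph URI gives the restricted one.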