Confused Development

I develop software and I often get confused in the process. I usually find the answers after a while, but a month later I can't remember them. So from now on, I will write them down here.

Monday, October 15, 2007

Realm-based authentication for Tomcat Web Apps

I needed to find out how to secure a Tomcat Web app with a password, and found the answer here. Basically, you need to edit the Web app's web.xml and add this little piece of XML to it:
<!-- Define a security constraint on this application -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Entire Application</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <!-- This role is not in the default user directory -->
    <role-name>manager</role-name>
  </auth-constraint>
</security-constraint>
<!-- Define the login configuration for this application -->
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Tomcat Manager Application</realm-name>
</login-config>
This is enough if you want to protect the Web app with the same password as the manager application (changing the realm name would probably make sense). If you want to define a new role for this Web app, you can do it as follows (again taking the example from the manager, so change the description and role name):
<!-- Security roles referenced by this web application -->
<security-role>
  <description>
    The role that is required to log in to the Manager Application
  </description>
  <role-name>manager</role-name>
</security-role>
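All that needs to be done now is to reload your Web app. Note that a newly defined role only helps if some user actually has it; with Tomcat's default realm, users and their roles are listed in conf/tomcat-users.xml.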
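To check that the protection is really active, a small script can help. The following is only a minimal Python sketch under a few assumptions: the helper names `basic_auth_header` and `fetch_status` are my own, and any URL or credentials you pass in are yours to supply. BASIC authentication simply sends "user:password" base64-encoded in an Authorization header, so a request without that header should come back as 401, and one with valid credentials for the required role should come back as 200.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(user, password):
    """Build the value of the Authorization header for HTTP BASIC auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_status(url, user=None, password=None):
    """Return the HTTP status code for url, optionally sending credentials."""
    request = urllib.request.Request(url)
    if user is not None:
        request.add_header("Authorization", basic_auth_header(user, password))
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 401 (missing or wrong credentials) and 403 (wrong role) end up here.
        return err.code
```

For example, `fetch_status("http://localhost:8080/myapp/")` (a made-up URL) would be expected to return 401 once the constraint is in place, while the same call with a user in the required role should return 200.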

Thursday, October 11, 2007

The Sesame 2 console: Creating Repositories, Loading Data with Context, SPARQL Querying with Context

I have started using the Sesame 2 (beta 5) RDF store for hosting the conference metadata for ISWC+ASWC2007 at http://data.semanticweb.org. The software is still beta and a lot of the interface functionality is missing, but once you figure out how everything works (which I did, with a lot of help from Aduna's OpenRDF forum), it seems to do the job. Even though the Web interface is lacking a lot at the moment, Aduna supplies a console that can do most of what I need. Here is what I did:

Creating Repositories

After having installed Sesame as a Web application in Tomcat (5.5 works for me), I installed the console in a different location. I start it like this:
$ bin/start-console.sh 
OpenRDF Sesame console 2.0-beta5
Using data dir: /home/knumoe/.aduna/openrdf-sesame-console

The following repositories are available:
+----------
|SYSTEM ("System configuration repository")
+----------

Commands end with '.' at the end of a line
Type 'help.' for help
>
At the beginning, the console (which has its own data and doesn't yet know about the Sesame Web app) only knows one repository: its own SYSTEM repository. To get started, we have to let the console know about the Web app's SYSTEM repository. This is done by using the create command:
> create remote.
Please specify values for the following variables:
Sesame server location [http://localhost:8080/openrdf-sesame]: http://localhost:8080/openrdf-http-server-2.0-beta5 
Remote repository ID [SYSTEM]:
Local repository ID [SYSTEM@localhost]: 
Repository title [SYSTEM repository @localhost]: 
Repository created
>
Here is the first pitfall: the default server location is http://localhost:8080/openrdf-sesame, whereas, at least in this release of Sesame, the server location is actually http://localhost:8080/openrdf-http-server-2.0-beta5. So we need to change the default here (and if we don't, the console won't even complain). The rest of the default values are just fine, so we leave them as they are. The console now knows two repositories:
> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
+----------
>
Note that we haven't actually created a new repository on the server; we have just made sure the console knows about the server's SYSTEM repository. Every repository that the console knows can be opened and manipulated. We open SYSTEM@localhost:
> open SYSTEM@localhost.
Opened repository 'SYSTEM@localhost'
SYSTEM@localhost> 
The SYSTEM@localhost repository is for internal housekeeping, so we don't want to add any actual data to it. Instead we now create a new repository, which I will call test. Because we have previously opened the server's SYSTEM, the new repository will be created on the server. We can either create a native, a memory, or a memory-rdfs store. I choose native:
SYSTEM@localhost> create native.
WARNING: You are about to add a repository configuration to repository SYSTEM@localhost
Proceed? (yes|no) [yes]:    
Please specify values for the following variables:
Repository ID [native]: test
Repository title [Native store]: Test store
Triple indexes [spoc,posc]: 
Repository created
SYSTEM@localhost>
Now, even though we have just created a new repository on the server, the console doesn't know about it (strange, yes):
SYSTEM@localhost> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
+----------
SYSTEM@localhost>
To let the console know about it, we need to close SYSTEM@localhost and create another remote repository as a placeholder for the console:
SYSTEM@localhost> close.
Closed repository 'SYSTEM@localhost'
> create remote.
Please specify values for the following variables:
Sesame server location [http://localhost:8080/openrdf-sesame]: http://localhost:8080/openrdf-http-server-2.0-beta5
Remote repository ID [SYSTEM]: test
Local repository ID [SYSTEM@localhost]: test@localhost
Repository title [SYSTEM repository @localhost]: Test Repository
Repository created
> show r.
+----------
|SYSTEM ("System configuration repository")
|SYSTEM@localhost ("SYSTEM repository @localhost")
|test@localhost ("Test Repository")
+----------
> 
Now we are finally done with this bit: we have created a placeholder for the server's SYSTEM repository, created a repository for our data on the server, and created another placeholder for this new data repository.

Loading Data with Context

Loading data from the Web Interface in Tomcat is easy. However, it has one drawback - you cannot specify contexts (or named graphs, if you like). So, if you need named graphs, you need to resort to the console again. It's not hard though. The first thing we do is open our newly created remote repository placeholder:
> open test@localhost.
Opened repository 'test@localhost'
test@localhost> 
Now we can load data from a URL using the load command. If we use the command on its own, it will load the data into the repository without any specific context. To add a context for the new data, we can use the -c option. I want to load Eyal's foaf file. For the context, I just make up a URI (not good practice, I know):
test@localhost> load -c http://context.next/eyalsfoaf http://eyaloren.org/foaf.rdf.
Loading data...
Data has been added to the repository (12736 ms)
test@localhost> 
So now, all the triples from http://eyaloren.org/foaf.rdf have been added to the test repository on the server, using http://context.next/eyalsfoaf as a name for the context (the "next" was a spelling error which I didn't bother to correct). Note that with the current version of Sesame (2b5), if you explore your repository with the Web interface, the contexts don't show up. Don't worry though, they're there. You can see the contexts in the console (in the meantime, I have added another foaf file with another context):
test@localhost> show c.
+----------
|http://context.net/knudsfoaf
|http://context.next/eyalsfoaf
+----------
test@localhost>
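If you would rather script the upload than type it into the console, the same thing should be possible over HTTP. The following Python sketch only builds the target URL. It assumes the server follows the Sesame 2 HTTP protocol as documented for the 2.x releases (not verified against beta 5), where statements are posted to `/repositories/<id>/statements` and the context is passed as an N-Triples encoded `context` parameter. The function name `statements_url` is my own.

```python
from urllib.parse import urlencode


def statements_url(server, repository_id, context_uri=None):
    """Build the URL that RDF data would be POSTed to (assumed protocol).

    The context is written in N-Triples style, i.e. wrapped in angle
    brackets, and then URL-encoded.
    """
    url = f"{server.rstrip('/')}/repositories/{repository_id}/statements"
    if context_uri is not None:
        url += "?" + urlencode({"context": f"<{context_uri}>"})
    return url


if __name__ == "__main__":
    print(statements_url(
        "http://localhost:8080/openrdf-http-server-2.0-beta5",
        "test",
        "http://context.next/eyalsfoaf",
    ))
```

The RDF/XML file itself would then go into the body of a POST request to that URL, with a matching Content-Type header.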

Querying with Context

Now that we have data in two different named graphs (Sesame has the concept of contexts internally, but they are the same as named graphs for my purposes), we can also make use of this with SPARQL queries. To do this, we can use SPARQL's GRAPH construct. E.g., if I simply want to find all instances of foaf:Person in the repository, I can use this query, which is just your basic SPARQL, that every self-respecting SW nerd knows:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person
WHERE {
   $person a foaf:Person
}
This will give me six different instances, ignoring context (I did the queries in the Web interface - you can also do them in the console, but that is kind of unwieldy).

Figure: Simple SPARQL query result in Sesame2 Web interface

The following query, which uses the GRAPH construct, also tells us which named graph/context each instance stems from:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person $g
WHERE {
   GRAPH $g {
      $person a foaf:Person
   }
}
The result of this query also shows us that one instance comes from the named graph http://context.net/knudsfoaf and five from http://context.next/eyalsfoaf.

Figure: SPARQL query result with named graphs in Sesame2 Web interface

Finally, we can rewrite the query such that we will only get those instances that come from a particular graph:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT $person
WHERE {
   GRAPH <http://context.next/eyalsfoaf> {
      $person a foaf:Person
   }
}
The result set of this query now contains those five instances that are in the http://context.next/eyalsfoaf named graph. Hooray!
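The three queries above differ only in whether and how they use GRAPH, so they are easy to generate. Here is a small Python sketch that builds the "all persons" query, optionally restricted to one named graph, and wraps it into a request URL. The function names are made up, and the URL layout (`/repositories/<id>` with `query` and `queryLn` parameters) is an assumption based on the Sesame 2 HTTP protocol of the 2.x releases, which beta 5 may or may not match.

```python
from urllib.parse import urlencode

FOAF = "http://xmlns.com/foaf/0.1/"


def persons_query(graph_uri=None):
    """Return a SPARQL query for all foaf:Person instances.

    If graph_uri is given, only instances from that named graph match.
    """
    pattern = "$person a foaf:Person"
    if graph_uri is not None:
        pattern = f"GRAPH <{graph_uri}> {{ {pattern} }}"
    return f"PREFIX foaf: <{FOAF}>\nSELECT $person\nWHERE {{ {pattern} }}"


def query_url(server, repository_id, query):
    """Build a GET URL for a SPARQL query (assumed Sesame 2 protocol)."""
    params = urlencode({"query": query, "queryLn": "sparql"})
    return f"{server.rstrip('/')}/repositories/{repository_id}?{params}"


if __name__ == "__main__":
    print(persons_query("http://context.next/eyalsfoaf"))
```

Note that there is no space between the angle bracket and the URI; SPARQL does not allow whitespace inside `<...>`.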

Figure: SPARQL query result from a particular named graph in Sesame2 Web interface

Monday, October 08, 2007

WordPress install.php not showing

Although, judging by the time that has passed since my last post, it may appear I'm no longer confused, this is not true. I'm constantly confused, about everything. In fact, just today I tried to install WordPress 2.3 on a virtual server here at DERI, and was very confused when the PHP setup script didn't run (.../blog/wp-admin/install.php). The browser showed no error message, just a blank page. After a while of scratching my head I decided to look at the httpd error logs (always a good idea in such situations, I guess), and found this:
$ sudo tail /etc/httpd/logs/error_log
...
[client 10.2.18.50] PHP Fatal error:  Allowed memory size of 8388608 bytes exhausted (tried to allocate 233472 bytes) in /var/www/html/blog/wp-admin/includes/schema.php on line 107
Allowed memory size of 8388608 bytes exhausted (tried to allocate 0 bytes)
Aha! Some memory problem. So, googling for "PHP Fatal error: Allowed memory size of" quickly brought me to this page, which had the solution: just increase the allowed memory size for PHP scripts from the default 8MB to something higher (e.g. 12MB). This can either be done at the top of the culprit script itself (didn't work for me):
ini_set("memory_limit","12M");
or in /etc/php.ini (or wherever your php.ini is):
memory_limit = 12M
Perfect. After fixing this, the installation ran smoothly.
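In case the numbers in the log look odd: 8388608 bytes is exactly the 8M default, since PHP's shorthand uses powers of 1024. A tiny Python sketch (the function name is my own invention) that converts the php.ini shorthand to bytes makes this easy to check:

```python
def php_size_to_bytes(value):
    """Convert a php.ini shorthand size such as '8M' to bytes.

    K, M and G are multiples of 1024; a plain number is already bytes.
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


if __name__ == "__main__":
    print(php_size_to_bytes("8M"))   # the old limit from the error log
    print(php_size_to_bytes("12M"))  # the new limit
```

So the old limit matches the figure in the error message, and 12M gives the installer about 4 MB of extra headroom.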