Confused Development

I develop software and I often get confused in the process. I usually find the answers after a while, but a month later I can't remember them. So from now on, I will write them down here.

Thursday, October 27, 2011

HTML Output in Puelia/Linked Data API

Puelia is a PHP implementation of the Linked Data API, which facilitates publishing Linked Data from a SPARQL endpoint in the form of a RESTful API. Puelia can return data representations in various formats, such as JSON, XML, Turtle or HTML. This is done by so-called "formatter" modules, which are responsible for creating the actual representation. However, while there are built-in formatters for a range of formats, there is none for HTML.

So, how do we get an HTML representation? There are some hints in the Linked Data API spec, but not a complete howto. The answer is: we need to create an instance of the built-in XSLT formatter, tell it to respond to the relevant mimetypes (text/html and application/xhtml+xml) to enable content negotiation, and point it to an appropriate XSLT stylesheet (there are a handful available out of the box in Puelia, but we might want to modify them for our needs). We do this in the Puelia configuration file (which uses Turtle syntax) as follows:

<#HTMLFormatter> a api:XsltFormatter ;
  api:name "html" ;
  api:mimeType "text/html" , "application/xhtml+xml" ;
  api:stylesheet "views/xslt-styles/sample.xsl" ;
.

If you're wondering what XML the stylesheet will operate on: it's the XML generated by the built-in XML formatter (api:xmlFormatter). One other thing we have to do is to register our new HTML formatter in a relevant location in the config file. This could either be the API object, or an endpoint object. Below is an example for the API itself, for which we register the built-in JSON formatter and our new HTML formatter:

<#FooAPI> a api:API ;
  rdfs:label "A RESTful API for the Foo Dataset"@en ;
  # ...
  api:formatter api:JsonFormatter, <#HTMLFormatter> ;
  # ...
.


Friday, February 11, 2011

Using the svn:keywords property (Keyword Substitution)

I always forget this, so here it goes:

svn propset svn:keywords "Id" filename.txt

Other keywords are: Date, Revision, Author, HeadURL. You can set several keywords at once like this:

svn propset svn:keywords "Date Author" filename.txt

The original documentation is in the SVN book.
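
One thing that is easy to miss: the property only has a visible effect if the file contains a matching keyword anchor such as $Id$. Below is a small sketch for checking which anchors a file contains before setting the property. The file name example.txt and the temporary directory are just assumptions for illustration.

```shell
#!/bin/sh
# Create a throwaway file with two keyword anchors (example.txt is a made-up name).
dir=$(mktemp -d)
printf '# $Id$\n# $Date$\nsome content\n' > "$dir/example.txt"

# List the keyword anchors that svn could expand in this file.
anchors=$(grep -oE '\$(Id|Date|Revision|Author|HeadURL)\$' "$dir/example.txt")
echo "$anchors"

# Inside a working copy you would then enable exactly those keywords:
#   svn propset svn:keywords "Id Date" example.txt
```

The substitution itself happens in the working copy, so the expanded values only show up once the property is set and the file has been committed or updated.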


Friday, October 29, 2010

Using CURL to access a SPARQL endpoint

You can use the command line tool curl to access a SPARQL endpoint, e.g. one provided by Sesame 2. It works like this, where ESCAPED_QUERY is the URL-encoded SPARQL query:

curl -H "Accept: application/sparql-results+xml" "http://REPOSITORY_URI?query=ESCAPED_QUERY"

The interesting bit is the -H parameter, where you specify header information. What you do here is to tell the endpoint what kind of result format you accept. I always forget how to do that...

Edit:

To get JSON back, the mimetype has to be application/sparql-results+json, at least on Sesame 2. This mimetype was defined in this W3C note.
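
If you don't want to escape the query by hand, curl can do it for you: -G sends the data as a GET query string and --data-urlencode percent-encodes it. Here is a minimal sketch; the function name sparql_select, the CURL override and the example.org endpoint are my own assumptions, not part of Sesame.

```shell
#!/bin/sh
# sparql_select ENDPOINT QUERY [ACCEPT]  (hypothetical helper)
sparql_select() {
  endpoint=$1
  query=$2
  # Default to the XML results format unless another mimetype is given.
  accept=${3:-application/sparql-results+xml}
  # CURL can be set to another command (e.g. echo) for a dry run.
  ${CURL:-curl} -G -H "Accept: $accept" --data-urlencode "query=$query" "$endpoint"
}

# Example call (needs a reachable endpoint), asking for JSON results:
#   sparql_select http://example.org/sparql 'SELECT * WHERE { ?s ?p ?o } LIMIT 5' application/sparql-results+json
```

Running it with CURL=echo prints the arguments instead of sending the request, which is handy for checking what would be sent.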


Tuesday, October 13, 2009

Reverse Proxy to Make SPARQL Endpoint Available on Nice URI

When you set up an RDF repository such as Sesame 2, you are likely to get a SPARQL endpoint at a URI such as http://my.server.org:8080/openrdf-sesame/repositories/repname. That's ok, but it doesn't look very nice. To serve the same endpoint at a nicer URI, say http://my.server.org/sparql, you can set up a reverse proxy on your Apache server. Here is how to do it on Debian:

Add the Reverse Proxy Rule

In /etc/apache2/conf.d, add a new configuration file sparql.conf. Make it look like this:

# Make the SPARQL endpoint of the Sesame server running on port 8080
# available as http://my.server.org/sparql.
#
# Configuration is done as per the Basic Example for Reverse Proxies
# in http://httpd.apache.org/docs/2.2/mod/mod_proxy.html .

<Proxy *>
 Order deny,allow
 Allow from all
</Proxy>
ProxyPass /sparql http://my.server.org:8080/openrdf-sesame/repositories/repname
ProxyPassReverse /sparql http://my.server.org:8080/openrdf-sesame/repositories/repname

Enable the Proxy Modules

In order for this to work, Apache has to load a number of proxy-related modules. The easiest way to load them is to run the a2enmod command and select all proxy-related modules:

knumoe@exp3:/etc/apache2/conf.d$ sudo a2enmod
Your choices are: actions alias asis auth_basic auth_digest authn_alias authn_anon authn_dbd authn_dbm authn_default authn_file authnz_ldap authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cern_meta cgi cgid charset_lite dav dav_fs dav_lock dbd deflate dir disk_cache dump_io env expires ext_filter file_cache filter headers ident imagemap include info ldap log_forensic mem_cache mime mime_magic negotiation php5 proxy proxy_ajp proxy_balancer proxy_connect proxy_ftp proxy_http rewrite setenvif speling ssl status substitute suexec unique_id userdir usertrack version vhost_alias
Which module(s) do you want to enable (wildcards ok)?
proxy*

Restart Apache

sudo /etc/init.d/apache2 restart


Thursday, October 08, 2009

Linux/Unix: Redirecting Output Using ">" with sudo

Problem: In Linux/Unix, you want to redirect the output of a command with > to a file you don't have write permissions for:

local$ ls -la > test.txt
-bash: test.txt: Permission denied

You think "I can solve this by using sudo"! But, alas:

local$ sudo ls -la > test.txt
-bash: test.txt: Permission denied

The shell won't even ask for your password. That is because sudo only applies to the ls command here: the redirection is handled by your own, unprivileged shell, which tries to open test.txt before it even starts sudo. Since you don't have write permissions for this file, the whole command is aborted. The way to solve this, as I found out in this blog post, is to run bash (or any other shell) with sudo and give it the entire expression as a parameter:

local$ sudo bash -c "ls -la > test.txt"
Password:
local$ more test.txt
total 16
drwxr-xr-x  18 root  wheel   612 Oct  8 16:23 .
drwxr-xr-x@ 13 root  wheel   442 Dec  6  2007 ..
drwxr-xr-x@ 13 knud  staff   442 Sep  8 11:45 arc
drwxr-xr-x  61 root  wheel  2074 Apr 17 12:30 bin
drwxr-xr-x  14 root  wheel   476 Dec  4  2007 gwTeX
drwxr-xr-x  33 root  wheel  1122 Feb 17  2009 include
drwxr-xr-x  56 root  wheel  1904 Feb 17  2009 lib
-rw-r--r--   1 root  wheel     0 Oct  8 16:23 test.txt

Note that test.txt is already in this list, proving that the file was created before the ls command was run!
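
The order of events can be reproduced without sudo. Below is a small sketch in a throwaway directory (the file name listing.txt is arbitrary): the shell creates the redirect target first, so ls already sees it.

```shell
#!/bin/sh
dir=$(mktemp -d)
# The subshell keeps the cd from affecting the rest of the script.
( cd "$dir" && ls > listing.txt )
# The only entry in the listing is listing.txt itself.
cat "$dir/listing.txt"

# An alternative to "sudo bash -c": let tee, started via sudo, open the file.
#   ls -la | sudo tee test.txt > /dev/null
```

The tee variant works because the file is opened by tee, which runs with root permissions, and not by your own shell.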


Monday, June 15, 2009

WordPress Blank Screen of Death

I'm using Subversion to keep my WordPress installation updated. This always worked perfectly fine, but recently I did an update and wasn't able to access any PHP-driven part of the blog anymore (so, basically everything). All I would get is the dreaded WordPress Blank Screen of Death.

I looked in the error logs (WP had a special error log in the Apache log folder) and saw that a Parse Error: parse error, unexpected T_SL was reported for one of the PHP files. After some googling, this post here made me realise that the problem could be related to a Subversion conflict, which leaves conflict markers such as <<<<<<< in the source files and renders them invalid (T_SL is PHP's token for <<). I checked the offending files and, sure enough, my last svn update had introduced a conflict. To resolve it, I simply deleted the offending files, executed svn update again, and all my troubles were gone!
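
To find such files quickly, you can search the working copy for Subversion conflict markers, i.e. lines beginning with <<<<<<< or >>>>>>>. Here is a minimal sketch; the function name find_conflicts and the directory you pass to it are assumptions for illustration.

```shell
#!/bin/sh
# find_conflicts DIR: list files under DIR that contain conflict markers.
find_conflicts() {
  # Conflict markers start a line with seven '<' or '>' characters
  # followed by a space (e.g. "<<<<<<< .mine" or ">>>>>>> .r123").
  grep -rlE '^(<<<<<<<|>>>>>>>) ' "$1"
}

# Example call, run from the root of the WordPress checkout:
#   find_conflicts .
```

A plain PHP shift such as 1 << 2 is not reported, because the pattern only matches seven angle brackets at the start of a line.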


Friday, January 23, 2009

Creating Bar Charts with Gnuplot

As part of the work on my thesis, I'm currently analysing lots of log files that show the access patterns to the Semantic Web Dogfood linked data server. In the end, I want to auto-generate nice-looking bar charts from them which show usage over time. Having generated lots of CSV files (using a combination of Webalizer, standard Unix commands like sed and grep, and finally some Ruby scripting), my first idea was to do this with Excel. However, while it is certainly possible and easy enough to generate one chart with Excel, I couldn't see an easy way to do it in batches (maybe with AppleScript?). So I decided to use gnuplot, which, while it has a steeper learning curve, is so much easier to configure and automate.

Installation was easy enough on Mac OS 10.5. I tried a MacPorts port first, but unfortunately that just generated lots of errors during the installation process. As a last resort, I tried installing gnuplot from source (I'm always a little scared of doing that...) - and it worked like a charm! I had expected it to be the other way around…

Anyway, after going through some short tutorials and the gnuplot documentation, I finally came up with this plot file to generate the bar charts, which reads its data from a 'numbers.dat' file:

#our dat file looks like this:
##   date    hits   files   pages  visits   sites  kbytes
#20081126    1854    1080    1811     246     136   18060

# we want the x-axis to be a time line
set xdata time

# by default, the x-axis is defined by the first column in the 
# .dat file.  the format of our dates is like "20081230", we 
# need to tell gnuplot about that:
set timefmt "%Y%m%d"

# rotate the labels on the x-axis by 90°
set xtics rotate

# set the terminal (output format) to colour EPS with black text
set term postscript eps color blacktext

# we want to output the plot to a file:
set output "hits.eps"

# we want to generate solid looking bars for our bar charts, so 
# we tell gnuplot to use a solid fill style
set style fill solid 0.5

# now plot the second (hits) and third (files) column from our 
# data file, using boxes instead of the default crosses
plot "numbers.dat" using 1:2 title 'hits' with boxes,\
     "numbers.dat" using 1:3 title 'files' with boxes

And the output is:

[Figure: Log File Visualisation with Gnuplot]

P.S.: Very handy for editing the plot file is the gnuplot bundle for TextMate.
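
Since the whole point was to generate charts in batches, here is a rough sketch of how the plot file can be parameterised from the shell. The function name plot_script and the idea of one EPS file per column are my own assumptions; the gnuplot commands are the ones from the plot file above.

```shell
#!/bin/sh
# plot_script COLUMN NAME: print a gnuplot script that charts one
# column of numbers.dat and writes the result to NAME.eps.
plot_script() {
  cat <<EOF
set xdata time
set timefmt "%Y%m%d"
set xtics rotate
set term postscript eps color blacktext
set output "$2.eps"
set style fill solid 0.5
plot "numbers.dat" using 1:$1 title '$2' with boxes
EOF
}

# Print the script for the hits column. To render hits.eps, pipe it
# into gnuplot instead:
#   plot_script 2 hits | gnuplot
plot_script 2 hits
```

Looping over the column numbers and names then gives one chart per column without touching the plot file by hand.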
