Creating Bar Charts with Gnuplot
As part of the work on my thesis, I'm currently analysing lots of log files that show the access patterns to the Semantic Web Dogfood linked data server. In the end, I want to auto-generate nice-looking bar charts from them which show usage over time. Having generated lots of CSV files (using a combination of Webalizer, standard Unix commands like sed and grep, and finally some Ruby scripting), my first idea was to do this with Excel. However, while it is certainly possible and easy enough to generate one chart with Excel, I couldn't see an easy way to do it in batches (maybe with AppleScript?). So I decided to use gnuplot, which, while it has a steeper learning curve, is so much easier to configure and automate.
Installation was easy enough on Mac OS X 10.5. I tried a MacPorts port first, but unfortunately that just generated lots of errors during the installation process. As a last resort, I tried installing gnuplot from source (I'm always a little scared of doing that...) - and it worked like a charm! I had expected it to be the other way around…
Anyway, after going through some short tutorials and the gnuplot documentation, I finally came up with this plot file to generate the bar charts, which reads its data from a 'numbers.dat' file:
# our dat file looks like this:
## date hits files pages visits sites kbytes
#20081126 1854 1080 1811 246 136 18060

# we want the x-axis to be a time line
set xdata time

# by default, the x-axis is defined by the first column in the
# .dat file. the format of our dates is like "20081230", we
# need to tell gnuplot about that:
set timefmt "%Y%m%d"

# rotate the labels on the x-axis by 90°
set xtics rotate

# set the terminal: colour EPS output with black text
set term postscript eps color blacktext

# we want to output the plot to a file:
set output "hits.eps"

# we want to generate solid looking bars for our bar charts, so
# we tell gnuplot to use a solid fill style
set style fill solid 0.5

# now plot the second (hits) and third (files) column from our
# data file, using boxes instead of the default crosses
plot "numbers.dat" using 1:2 title 'hits' with boxes,\
     "numbers.dat" using 1:3 title 'files' with boxes
And the output is:
P.S.: Very handy for editing the plot file is the gnuplot bundle for TextMate.
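Since the whole point was to do this in batches, here is a rough sketch of how one chart per column could be produced in a single run. It assumes gnuplot 4.6 or newer (for the `do for` loop) and the same 'numbers.dat' layout as above; the `names` variable and the output file names derived from it are my own invention for illustration, not something gnuplot requires.

```gnuplot
# Sketch only: assumes gnuplot >= 4.6 and the numbers.dat layout shown above
set xdata time
set timefmt "%Y%m%d"
set xtics rotate
set term postscript eps color blacktext
set style fill solid 0.5

# hypothetical helper: column names in the order they appear after the date
names = "hits files pages visits sites kbytes"

do for [i=1:words(names)] {
    # one output file per column, e.g. hits.eps, files.eps, ...
    set output word(names, i) . ".eps"
    # column 1 is the date, so the i-th name lives in column i+1
    plot "numbers.dat" using 1:(column(i+1)) title word(names, i) with boxes
}

# close the last output file
set output
```

Saved under a name of your choice (say, batch.gp), it would be run as `gnuplot batch.gp` and should leave one EPS file per column in the current directory.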
4 Comments:
Hi Knud,
looks cool :)
I've seen that Gnuplot 4.2 has a way of enabling transparency, which might make the graph look even cooler, and also show any lines that would otherwise be hidden.
Maybe
set style fill transparent solid 0.5 noborder
does the trick?
Andreas.
Good idea, I'll try that! However, in this case there are no hidden lines, because there will never be more hits than files (except if we get a lot of 404s).
The transparency doesn't work for me in postscript, and I need postscript, because the X11 terminal doesn't allow me to insert symbols. You have a solution to my problem? :-)
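One possible workaround, as a sketch rather than a tested answer: the cairo-based terminals (available from gnuplot 4.4 on, if gnuplot was built with cairo support) are supposed to handle transparent fills, so the plot file from the post could be pointed at pdfcairo instead of postscript. The output name "hits.pdf" is just an assumption, and I have not checked whether enhanced text mode covers the symbols you need.

```gnuplot
# Sketch only: assumes a gnuplot build with the pdfcairo terminal
set xdata time
set timefmt "%Y%m%d"
set xtics rotate

# cairo terminal instead of postscript; enhanced text mode switched on
set term pdfcairo enhanced color
set output "hits.pdf"

# the fill style suggested by Andreas above
set style fill transparent solid 0.5 noborder

plot "numbers.dat" using 1:2 title 'hits' with boxes,\
     "numbers.dat" using 1:3 title 'files' with boxes

# close the output file
set output
```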