Clustered/Stacked Filled Bar Graph Generator

  The Script

I wanted a scriptable bar graph generator for my PhD thesis that supported stacked and clustered bars, but couldn't find one that played well with LaTeX and had all the features I wanted, so I built my own. I followed the scheme of Graham Williams' barchart shell script: have gnuplot produce fig output and then mangle it to fill in the bars. I added support for more than just two or three clustered datasets and support for stacked bars, as well as automatic averaging and other features.

The script is bargraph.pl, released under the GPL.
Version 4.3 was released June 1, 2008, with new support for error bars.

Features:

  Usage

The script's usage message shows the command-line options:

Usage: bargraph.pl [-gnuplot] [-fig] [-pdf] [-png [-non-transparent]] [-eps] <graphfile>

File format:
<graph parameters>
<data>

Graph parameter types:
<value_param>=<value>
=<bool_param>

The script takes a single file that specifies both the data to graph and the control parameters for customizing the graph. The parameters must precede the data in the file. Anything following a # character in a graph file is a comment.
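
To make the file layout concrete, here is a minimal Python sketch that splits a graph file into value parameters, boolean parameters, and data lines according to the rules above. It is not the script's own parser (bargraph.pl is Perl). The function name parse_graph_file, the sample data lines, and the 0.5 value are hypothetical, and the sketch assumes that every line after the first data line is data.

```python
import re

# A value parameter looks like name=value; a boolean one looks like =name.
VALUE_PARAM = re.compile(r"^(\w+)=(.*)$")

def parse_graph_file(text):
    """Split graph-file text into (value_params, bool_params, data_lines)."""
    value_params, bool_params, data = {}, [], []
    for raw in text.splitlines():
        # Drop anything after '#' (a comment), then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = VALUE_PARAM.match(line)
        if data:
            # Assumption: parameters precede data, so the rest is data.
            data.append(line)
        elif line.startswith("="):
            bool_params.append(line[1:])
        elif match:
            value_params[match.group(1)] = match.group(2).strip()
        else:
            data.append(line)
    return value_params, bool_params, data

# Hypothetical input: the parameter names come from the version history,
# while the data lines and the 0.5 value are made up for illustration.
SAMPLE = """\
# hypothetical graph file
datascale=0.5   # value parameter
=nolegend       # boolean parameter
alpha 1.5
beta 2.25
"""

value_params, bool_params, data_lines = parse_graph_file(SAMPLE)
```

Here value_params holds the datascale setting, bool_params holds nolegend, and data_lines holds the two remaining lines with their comments stripped.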

By default, the script outputs Encapsulated PostScript (.eps). It first produces data to send to gnuplot, which can be seen by specifying -gnuplot. Next, the script takes the resulting fig output from gnuplot and post-processes it to fill in the bars; this final fig data can be selected via -fig. The fig data is then sent to fig2dev to produce the final figure.

I keep my data in .perf files and have my Makefile generate .eps for LaTeX and .png for slides or web pages. See Converting to Non-Vector Formats for notes on avoiding aliasing and other problems when creating images, and for some Makefile rules. My script magnifies 2x when converting to png to help avoid these problems, but for most uses that's not enough and you should follow my suggestions rather than using -png. By default, -png produces a transparent background; the -non-transparent option disables that feature.
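
For driving the script from something other than a Makefile, the following Python sketch builds the command line from the options in the usage message. The helper names bargraph_argv and render are hypothetical. The sketch also assumes the script writes its result to standard output for every format; the Makefile rules later on this page show that only for -fig.

```python
import subprocess

# Output-selecting flags taken from the usage message.
FORMAT_FLAGS = {
    "gnuplot": "-gnuplot",
    "fig": "-fig",
    "pdf": "-pdf",
    "png": "-png",
    "eps": "-eps",
}

def bargraph_argv(graphfile, fmt="eps", transparent=True, script="bargraph.pl"):
    """Return the argument list for one run of the script."""
    if fmt not in FORMAT_FLAGS:
        raise ValueError("unknown output format: %s" % fmt)
    argv = [script, FORMAT_FLAGS[fmt]]
    # -non-transparent only applies to -png, as in the usage message.
    if fmt == "png" and not transparent:
        argv.append("-non-transparent")
    argv.append(graphfile)
    return argv

def render(graphfile, outfile, **options):
    """Run the script and save what it writes to standard output (assumed)."""
    with open(outfile, "wb") as out:
        subprocess.run(bargraph_argv(graphfile, **options), stdout=out, check=True)
```

For example, render("results.perf", "results.eps") would run the script with -eps, provided bargraph.pl is on the PATH.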

The following sections describe each graph parameter.

  Multiple Datasets
  Data Manipulation
  Graph Display
  Examples

 

  Converting to Non-Vector Formats

Because fig2dev does not perform anti-aliasing, converting directly to an image format can result in very poor quality lines and text. This problem is compounded if that image is subsequently resized without any anti-aliasing, such as by your web browser: a case in point is the image on the right.

The solution is to magnify the vector data to at least 4x and then generate a lossless bitmap format, such as TIFF. From there, have a real image manipulator (such as mogrify) resize it to the size you want. For displaying in HTML, you should choose the final size at this point -- you cannot really make browser-resizable bar graphs.

Below are my Makefile rules for creating the .png images for this page, including removing the 2nd TIFF page (I don't know why fig2dev generates it). Note that mogrify preserves the image's aspect ratio by default, so asking for 700x700 asks for the image to be shrunk so that its longest dimension is 700.

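# Longest dimension, in pixels, of the final image.
SIZE=700

# Shrink the 4x TIFF to its final size; mogrify writes one PNG per TIFF page.
%.png: %.tiff
	mogrify -resize ${SIZE}x${SIZE} -format png $<
# older mogrify uses these names:
# rm $@.1
# mv $@.0 $@
# Drop the PNG made from the 2nd TIFF page and keep the first.
	rm $*-1.png
	mv $*-0.png $@

# Render the filled fig output as a TIFF at 4x magnification.
%.tiff: %.perf
	bargraph.pl -fig $< | fig2dev -L tiff -m 4 > $@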
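
The effect of asking for 700x700 while preserving the aspect ratio can be sketched in a few lines of Python. This only illustrates the arithmetic and is not how mogrify is implemented: the function name fit_within is hypothetical, and mogrify's exact rounding may differ by a pixel.

```python
def fit_within(width, height, box):
    """Scale (width, height) to fit inside a box x box square.

    The aspect ratio is kept, so only the longest dimension ends up
    equal to box; the other dimension is scaled by the same factor.
    """
    scale = min(box / width, box / height)
    # Rounding here is an assumption; real tools may round differently.
    return max(1, round(width * scale)), max(1, round(height * scale))

# A hypothetical 2800x1400 TIFF shrinks by a factor of 4 on both axes.
final_width, final_height = fit_within(2800, 1400, 700)
```

With those hypothetical dimensions the result is 700 wide and 350 high: the request names a bounding box, not an exact output size.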

For including in slides, PowerPoint does perform anti-aliasing, and I found that going straight to png from fig with a magnification of 4x was enough to be able to resize the image in PowerPoint and have it look good at any size:

%.png: %.perf
	bargraph.pl -fig $< | fig2dev -L png -m 4 > $@

  Caveats and Future Work

Some issues and future work with my current script:

  Version History

4.3 -- June 1, 2008

Added error bar support (from Mohammad Ansari), along with miscellaneous options (-non-transparent option, =color_per_datum, datascale=, datasub=, =nolegend).

4.2 -- May 25, 2007

Added support for gnuplot 4.2 (the default fig styles changed).

4.1 -- April 1, 2007

Fixed bugs in handling scientific notation and negative offsets in fonts.

4.0 -- October 16, 2006

Added support for clusters of stacked bars, font face and size changes, and negative maximum values.

3.0 -- July 15, 2006

Added support for custom table delimiters, spaces in names, and the =nocommas option.

2.0 -- January 21, 2006

Added pattern fill support and fixed issues with supporting large numbers of datasets.

  Contact

Comments to