Clustered/Stacked Filled Bar Graph Generator

  The Script

I wanted a scriptable bar graph generator for my PhD thesis that supported stacked and clustered bars, but couldn't find one that played well with latex and had all the features I wanted, so I built my own. I followed the scheme of Graham Williams' barchart shell script to have gnuplot produce fig output and then mangle it to fill in the bars. I added support for more than just two or three clustered datasets and support for stacked bars, as well as automatic averaging and other features.

The script is bargraph.pl, released under the GPL. A package that includes the script and samples is also available.

Features:

  Usage

The script's usage message shows the command-line options:

Usage: bargraph.pl [-gnuplot] [-fig] [-pdf] [-png [-non-transparent]] [-eps] <graphfile>

File format:
<graph parameters>
<data>

Graph parameter types:
<value_param>=<value>
=<bool_param>
The script takes in a single file that specifies the data to graph and control parameters for customizing the graph. The parameters must precede the data in the file. Comments can be included in a graph file following the # character.
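As a minimal sketch of that layout, the shell snippet below writes a tiny graph file and checks that no parameter line follows a data line. The parameter names =nolegend and rotateby= are taken from the version history further down. The file name, the bar labels, the numbers, the rotateby value, and the whitespace-separated "<label> <value>" layout of the data lines are illustrative assumptions, not a specification of the format.

```shell
# Write a small graph file: comments, then parameters, then data.
# ASSUMPTION: data lines use a "<label> <value>" layout; all values are made up.
cat > mygraphfile <<'EOF'
# sample graph file (hypothetical data)
=nolegend
rotateby=-45
benchA 1.25
benchB 0.80
benchC 1.10
EOF

# Sanity check: list any parameter line that appears after the first data line.
misplaced=$(awk '
    /^[[:space:]]*#/ || /^[[:space:]]*$/ { next }   # skip comments and blank lines
    /=/ { if (seen_data) print NR ": " $0; next }   # parameter line
        { seen_data = 1 }                           # anything else counts as data
' mygraphfile)

if [ -n "$misplaced" ]; then
    echo "parameters found after data:" >&2
    echo "$misplaced" >&2
fi
```

The check is only a convenience for catching ordering mistakes before running the script; bargraph.pl itself is the authority on what it accepts.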

The script's output, by default, is encapsulated postscript (.eps), which is sent to stdout. Simply redirect it to the desired output file:

   bargraph.pl mygraphfile > mygraphfile.eps

The script first produces data to send to gnuplot, which can be seen by specifying -gnuplot. Next, the script takes the resulting fig output from gnuplot and post-processes it to fill in the bars. The final fig data can also be selected via -fig. This data is then sent to fig2dev to produce a final figure.
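The stages above can be captured one at a time for inspection. The following is a sketch, assuming bargraph.pl and fig2dev are on the PATH, that -gnuplot and -fig write to stdout like the default output does, and that the graph file is called mygraphfile. The script may pass extra options to fig2dev internally, so the last step only approximates the default output.

```shell
# Save each intermediate stage of the pipeline to its own file.
# ASSUMPTION: both tools are installed; the file names are placeholders.
graph=mygraphfile
if command -v bargraph.pl >/dev/null 2>&1 && command -v fig2dev >/dev/null 2>&1; then
    bargraph.pl -gnuplot "$graph" > "$graph.gnuplot"   # stage 1: data sent to gnuplot
    bargraph.pl -fig "$graph" > "$graph.fig"           # stage 2: post-processed fig data
    fig2dev -L eps "$graph.fig" > "$graph.eps"         # stage 3: final figure
    stages_run=yes
else
    echo "bargraph.pl or fig2dev not found; skipping" >&2
    stages_run=no
fi
```

Keeping the .gnuplot and .fig files around makes it easier to tell whether a problem comes from the gnuplot input or from the fill post-processing.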

I keep my data in .perf files and have my Makefile generate .eps for latex and .png for slides or web pages. See Converting to Non-Vector Formats for notes on avoiding aliasing and other problems when creating images, and for some Makefile rules. The script magnifies 2x when converting to png to help avoid these problems, but for most uses that is not enough, and you should follow the suggestions in that section rather than using -png. By default, -png produces a transparent background; the -non-transparent option disables that.

The following sections describe each graph parameter.

  Multiple Datasets
  Data Manipulation
  Graph Display
  Examples

 

  Converting to Non-Vector Formats

Because fig2dev does not perform anti-aliasing, converting directly to an image format can result in very poor quality lines and text. This problem is compounded if that image is subsequently resized without any anti-aliasing, such as by your web browser: a case in point is the image on the right.

The solution is to magnify the vector data to at least 4x and then generate a lossless bitmap format, such as PPM or TIFF. From there, have a real image manipulator (such as mogrify) resize it to the size you want. For displaying in html, you should choose the final size at this point -- you cannot really make browser-resizable bar graphs.

Below are my Makefile rules for creating the .png images for this page. Note that mogrify preserves the image's aspect ratio by default, so asking for 700x700 asks for the image to be shrunk so that its longest dimension is 700.

SIZE=700
%.png: %.ppm
	mogrify -reverse -flatten $<
	mogrify -resize ${SIZE}x${SIZE} -format png $<
%.ppm: %.perf
	bargraph.pl -fig $< | fig2dev -L ppm -m 4 > $@
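To make the aspect-ratio note concrete, here is a small sketch of the arithmetic behind a request such as 700x700. The function name fit_box and the input sizes are hypothetical, and the rounding (to nearest) is an assumption; mogrify's own result may differ by a pixel.

```shell
# Compute the size that an aspect-preserving fit into a SIZE x SIZE box yields.
# ASSUMPTION: round-to-nearest integer arithmetic; not taken from mogrify itself.
fit_box() {
    w=$1; h=$2; size=$3
    if [ "$w" -ge "$h" ]; then
        # width is the longest side: it becomes SIZE and the height scales with it
        echo "${size}x$(( (h * size + w / 2) / w ))"
    else
        # height is the longest side: it becomes SIZE and the width scales with it
        echo "$(( (w * size + h / 2) / h ))x${size}"
    fi
}

fit_box 2800 2000 700   # landscape input, prints 700x500
fit_box 1200 2400 700   # portrait input, prints 350x700
```

In both cases only the longest dimension reaches 700; the other dimension follows from the original aspect ratio.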

The latest gnuplot patterns contain lines that are much closer together than they used to be. With a magnification of 4x or higher they blur into uniform gray and the individual lines can no longer be seen, so for a pattern plot a 2x magnification seems to work best.

For including in slides, PowerPoint does perform anti-aliasing, and I found that going straight to png from fig with a magnification of 4x was enough to be able to resize the image in PowerPoint and have it look good at any size:

%.png: %.perf
	bargraph.pl -fig $< | fig2dev -L png -m 4 > $@

  Caveats and Future Work

Use the Issue Tracker to see the current list of requested features and reported bugs. Below is a list of some key issues and future work with my current script:

  Version History

The full version history is in the bargraph public repository.

4.8 -- January 2, 2017

Added datadup= and =datadup_merge for repeated identical values. Added colorset= for specifying colors via a list of rgb values. Fixed gnuplot 5.0 problems.

4.7 -- March 25, 2012

Added xscale= and yscale= to properly scale graphs on gnuplot 4.2+. Added custfont= feature. Fixed bugs in legend centering, font bounding boxes, and yerrorbars.

4.6 -- January 31, 2010

Added automatic legend placement, including automatically finding an empty spot inside the graph. Added logarithmic y value support. Added control over leading and intra-bar spacing.

4.5 -- January 17, 2010

Improved legends with a filled background and outline and bounding box. Added sorting, horizontal line drawing, and other features.

4.4 -- August 10, 2009

Added gnuplot 4.3 support along with miscellaneous options (rotateby=, xticshift=, ylabelshift=, =stackabs).

4.3 -- June 1, 2008

Added error bar support along with miscellaneous options (-non-transparent option, =color_per_datum, datascale=, datasub=, =nolegend).

4.2 -- May 25, 2007

Added support for gnuplot 4.2 (the default fig styles changed).

4.1 -- April 1, 2007

Fixed bugs in handling scientific notation and negative offsets in fonts.

4.0 -- October 16, 2006

Added support for clusters of stacked bars, font face and size changes, and negative maximum values.

3.0 -- July 15, 2006

Added support for custom table delimiters, spaces in names, and the =nocommas option.

2.0 -- January 21, 2006

Added pattern fill support and fixed issues with supporting large numbers of datasets.

  Contact

Bugs or feature requests can be filed using the Issue Tracker.

Other comments can be sent to