This is the second in a series of two posts in which I share some thoughts I’ve accumulated over the past few years about how to draw better graphs. In the previous post, I recommended that normalised data should usually be plotted on a logarithmic scale. In this post, I argue that scatter plots can be easier to understand than bar charts. To illustrate my recommendation, I draw upon examples of graphs I found in the proceedings of PLDI 2019.
In Praise of Scatter Plots
The problem with a bar chart – and this applies whether the y-axis shows absolute values plotted on a linear scale, or normalised values plotted on a logarithmic scale – is that the x-axis tends to be underemployed. It has the capacity to represent some sort of quantity, but often is used simply to spread out a collection of benchmarks in some arbitrary order. I think that information can often be conveyed more effectively to the reader by using a scatter plot instead.
As an example, here is a classic bar chart, showing the performance of a “new” technique compared to an “old” technique over a range of benchmarks.
The graph’s structure is straightforward: when the green bar is higher, the old technique is faster, and when the red bar is higher, the new technique is faster. This graph provides all the information needed to compare the two techniques. But it is not easy to see “at a glance” which technique is better.
Here is an analogous scatter plot.
I reckon that scatter plots take longer to read than bar charts, but less time to understand. What I mean is: it’s immediately obvious from the bar chart that the height of a bar represents the time taken for that benchmark, but it takes a few moments to work out that in the scatter plot, the points above the diagonal represent benchmarks where the old technique is faster, and those below the diagonal represent benchmarks where the new technique is faster.
However, once this has been established, it becomes straightforward to compare the two techniques. One can immediately make observations like “the new technique seems to win on the shorter-running benchmarks, but to lose on the longer ones”.
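The reading rule described above can be written down as a small sketch. It assumes, as the description of the diagonal implies, that the x-coordinate of each point is the time under the old technique and the y-coordinate is the time under the new one. The `Benchmark` class, the function names, and the timings are all made up for illustration; they are not taken from my graphs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    name: str         # hypothetical benchmark label
    old_time: float   # x-coordinate: time taken by the old technique
    new_time: float   # y-coordinate: time taken by the new technique


def faster_technique(b: Benchmark) -> str:
    """Say which side of the y = x diagonal the point for `b` falls on."""
    if b.new_time < b.old_time:
        return "new"  # below the diagonal: the new technique took less time
    if b.new_time > b.old_time:
        return "old"  # above the diagonal: the old technique took less time
    return "tie"      # exactly on the diagonal


def summarise(benchmarks: list[Benchmark]) -> dict[str, int]:
    """Count how many points fall on each side of the diagonal."""
    counts = {"new": 0, "old": 0, "tie": 0}
    for b in benchmarks:
        counts[faster_technique(b)] += 1
    return counts


# Made-up timings in seconds, for illustration only.
EXAMPLE = [
    Benchmark("a", 0.5, 0.3),
    Benchmark("b", 2.0, 1.5),
    Benchmark("c", 40.0, 55.0),
    Benchmark("d", 10.0, 10.0),
]
```

On this invented data, the two short-running benchmarks fall below the diagonal and the long-running one falls above it, which is the kind of pattern a scatter plot makes visible at a glance.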
It’s worth pointing out that my scatter plot shows less information than my bar chart, because it does not identify the individual benchmarks. Perhaps, if there are not too many benchmarks, it would be possible to label the points individually, or to use a different colour for each point. Of course, this risks overcomplicating things, and we are often more concerned with general trends than with the performance of particular benchmarks. A reasonable compromise might be to colour a handful of the particularly interesting points so that they can be referred to in the surrounding text.
By the way, I have used semi-transparent markers in my scatter plot. I find this quite an attractive way to deal with multiple points being almost or exactly on top of each other. With opaque markers, coincident points could get lost.
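To see why transparency helps, here is a minimal sketch of how dark a stack of coincident markers ends up. It assumes the renderer uses ordinary "source-over" alpha compositing, where each extra marker of opacity alpha covers that fraction of whatever still shows through. The function name and the opacity values are illustrative, not the settings I used.

```python
def stacked_opacity(alpha: float, count: int) -> float:
    """Combined opacity of `count` coincident markers, each of opacity `alpha`.

    Each marker lets a fraction (1 - alpha) of the background show through,
    so `count` stacked markers let through (1 - alpha) ** count.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if count < 0:
        raise ValueError("count must be non-negative")
    return 1.0 - (1.0 - alpha) ** count
```

With alpha = 0.5, one marker gives an opacity of 0.5 and two coincident markers give 0.75, so overlapping points show up as a visibly darker spot. With opaque markers (alpha = 1), the result is 1 however many points are stacked, which is why coincident points get lost.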
Scatter plots can cope with more adventurous situations too. For instance, here is a scatter plot that compares two variants of the “new” technique against the “old” one.
And here’s a scatter plot that conveys the uncertainty surrounding each data point using ellipses. The width of the ellipse corresponds to the uncertainty in the x-value, and the height of the ellipse corresponds to the uncertainty in the y-value. Ellipses that cross the y=x diagonal represent benchmarks where we’re not sure which technique is better.
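The "crosses the diagonal" condition can be checked numerically. The sketch below assumes an axis-aligned ellipse centred on the measured point, with half-width and half-height equal to half of the ellipse's width and height, and with all values expressed in the coordinates actually plotted (so take logarithms first if the axes are logarithmic). The function name and parameters are hypothetical.

```python
import math


def crosses_diagonal(x: float, y: float,
                     half_width: float, half_height: float) -> bool:
    """Return True if the ellipse centred at (x, y) touches or crosses y = x.

    For points (u, v) inside an axis-aligned ellipse, the quantity v - u
    ranges over (y - x) +/- sqrt(half_width**2 + half_height**2).
    The ellipse meets the diagonal exactly when that range contains zero.
    """
    return abs(y - x) <= math.hypot(half_width, half_height)
```

For example, a point at (10, 10.5) with half-widths of 1 in each direction straddles the diagonal, so we cannot say which technique is better, whereas a point at (10, 15) with the same uncertainty lies clearly above it.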
Examples from PLDI 2019
I found eight PLDI 2019 papers that contained bar charts which might have been more effective as scatter plots. The full version of this article lists all of the examples I found; here I’m just going to focus on a couple of interesting ones.
The first example is from Renaissance: benchmarking suite for parallel applications on the JVM by Prokopec et al.
It uses a logarithmic scale for normalised data, as recommended in the previous post, but has an underemployed x-axis that is only being used to spread out the benchmarks in an arbitrary order. I suspect that plotting “time taken before” against “time taken afterwards” on a scatter plot would be more informative. There could be one ellipse per benchmark, sized according to the uncertainty in the measurements, and four different colours of ellipse, corresponding to the four groups of benchmarks.
The second example is from Computing summaries of string loops in C for better testing and refactoring by Kapus et al.
It is a bit different from most of the bar charts that I looked at because the benchmarks are not in an arbitrary order along the x-axis; they are in descending order of their y-values. However, I think showing “time taken originally” against “time taken afterwards” on a scatter plot would be more informative.
Summary
Graphs can be a hugely valuable tool for communicating quantitative information. And I daresay that they are the first thing readers look at when glancing through any academic paper. So it is really worthwhile getting them right. In the first post of this series, I suggested how graphs of normalised data can be made to convey speedup figures more insightfully by using logarithmic scales, and in this second post I argued that scatter plots may be more effective than bar charts when the set of benchmarks is substantial.
An extended version of this article is available on my blog. LaTeX code for the graphs drawn by me is available.
Bio: John Wickerson is a Lecturer in the Department of Electrical and Electronic Engineering at Imperial College London, where he researches programming languages and hardware design.