The best chart is always task-dependent, but let me assume that you would choose the scatterplot as the best chart and the pie as the worst. They are like oil and water: impossible to mix!
Let me tell you about a little experiment. I call it the scatterplot pie just for fun, and the idea is to display proportions using a scatterplot.
A traditional pie chart with two data points can be reduced to an angle:
The same message, no fat. And because there are no textures and arcs to deal with we can now superimpose many pies.
These are percentages of the age group 65+ in 1996 (left) and 2050 (right) for 220 countries (data from the US Census Bureau). This comparison clearly shows that the world’s population is getting older.
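For readers who want to reproduce the reduction from slice to line, here is a minimal sketch in Python. It assumes the slice starts at 12 o'clock and runs clockwise, as in a typical pie chart; the function names and the sample shares are hypothetical and are not taken from the Census Bureau data.

```python
import math


def share_to_angle(share):
    """Return the slice angle in degrees for a share between 0 and 1."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return 360.0 * share


def slice_endpoint(share, length=1.0):
    """Return the (x, y) endpoint of the line that closes the slice.

    Assumption: the slice starts at 12 o'clock and runs clockwise,
    so x uses the sine and y uses the cosine of the angle.
    """
    theta = math.radians(share_to_angle(share))
    return (length * math.sin(theta), length * math.cos(theta))


# Made-up shares of the 65+ age group, for illustration only.
for share in (0.05, 0.15, 0.25):
    x, y = slice_endpoint(share)
    print(f"{share:.0%}: {share_to_angle(share):.0f} degrees, ends at ({x:.3f}, {y:.3f})")
```

Each country then needs only one line from the origin to that endpoint, which is why many of them can share the same plot.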
One of the problems with pie charts is that you can compare proportions but you can’t compare wholes. In the images above we are comparing very different country sizes (Tuvalu and China?). With the scatterplot pie we can add this dimension:
China and India are not helping resolution, but that would happen with any other chart. We can focus on a detail:
Other things we could do:
- Group series (in this example, color-coding by continent would show us the significant differences between Europe and Africa);
- Add axes and circular grid lines to improve readability;
- Set line transparency to 50%;
- Remove the vertical line or make it look like a grid line;
- Label the more significant data points.
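The size dimension discussed earlier can be sketched the same way: the angle still encodes the share, and the line length encodes the size of the whole. The code below is only an illustration under stated assumptions: the slice starts at 12 o'clock and runs clockwise, lengths are scaled to the largest country, the optional log scale is one possible way to deal with very large countries, and the names and numbers are placeholders rather than real data.

```python
import math


def line_end(share, size, max_size, log_scale=False):
    """Endpoint of one line: angle from the share, length from the size.

    Assumptions: clockwise from 12 o'clock; lengths are scaled so the
    largest whole has length 1. With log_scale=True, sizes must be > 1.
    """
    if log_scale:
        if size <= 1 or max_size <= 1:
            raise ValueError("log scale needs sizes greater than 1")
        length = math.log10(size) / math.log10(max_size)
    else:
        length = size / max_size
    theta = 2.0 * math.pi * share
    return (length * math.sin(theta), length * math.cos(theta))


# Placeholder rows: (label, share of 65+, population). Not real data.
rows = [("A", 0.05, 10_000), ("B", 0.15, 1_000_000), ("C", 0.25, 1_000_000_000)]
largest = max(population for _, _, population in rows)

for label, share, population in rows:
    linear = line_end(share, population, largest)
    logged = line_end(share, population, largest, log_scale=True)
    print(label, "linear:", linear, "log:", logged)
```

On the linear scale the smallest placeholder country shrinks to a line too short to see, which is the resolution problem mentioned above; the log scale keeps it visible at the cost of lengths that are no longer directly proportional to population.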
I actually like this idea and I’ll test it a bit further. I’ll try to decide if it is a good alternative to stacked bar charts. It should also work with three or more slices, but just because it works doesn’t mean we should use it (like most chart options in Excel…).
So, what do you think? Would you use it?
This is not entirely mine: I was inspired by this comment. And I’m sure someone must have thought about this first. If it rings a bell, please let me know…
8 thoughts on “The best of two worlds: the scatterplot pie”
It’s always great to explore different ways to present data, but I think there are simpler ways to do this that will be clearer to read. The single pie chart visualizes a single number: the proportion. That means that the “fan plot” that overlays a bunch of angles could be better visualized by a histogram (or a dot plot, if you don’t want to bin the data). To add in country size, you can plot the scatter plot of “proportion over 65” vs. the size of the country… although log10(size) probably works better. Grouping by color to show Europe/Africa/Asia… is then easy. In summary, I think the scatter plot will visualize these data better.
Rick: while I agree that a simple scatterplot would be a better choice, the idea behind this post is to identify ways of displaying proportions with higher data density. It’s like replacing a pie chart with a bar chart: since you can’t see the whole, you gain precision but you lose that sense of proportion.
I like “fan plot”, let’s call it that.
See http://stat-computing.org/newsletter/issues/scgn-20-1.pdf for an article about fan plots. The R package plotrix provides code for fan plots. I suggest checking whether these are the same as your proposed plot to avoid the possible use of the same term for different plots.
Thanks for the link, Naomi. These are different concepts, so I’d better try to find a new name for it. Suggestions accepted.
When I search for a name I can think of, I see that someone else has already given a plot that name; e.g., angle plot.
It’s always interesting to try another way to look at things, and you seem very creative at this task. But then one has to think of what has been done, and whether it’s actually an improvement or merely a fun diversion. And my task is to be the grumpy old man.
You’ve converted X-Y values (population and some percentage thereof) into R-theta values. These are harder to read, since the R values are no longer represented by parallel vectors, and the percentages increase as the R vector points further downward. It also is conceptually corrupt, because there is no 2-pi periodicity in either of the variables.
Seems to me a standard scatterplot would be better, using X=population and Y=percentage above 65 years old. To show trends, you could use open points for 1996 and filled points for 2050, connected by thin lines. Color code for continent or for other categorical factors.
P.S. I strongly dislike the fan plot in the article Naomi has cited.
Jon: I have absolutely no doubt that a standard scatterplot would do a better job if we are not trying to compare proportions of a whole. I also believe that once people start improving their data visualization skills, “proportions of a whole” (i.e. pies) becomes a less relevant analysis, as it should be.
That said, my little experiment is about proportions of a whole, so the whole must be visible or implicit. I see a pie in the first image on the right. Do you see it? Can you not see it the moment I tell you that’s how you should read it? I accept that varying vector lengths are harder to read in terms of shaping an implicit pie.
So, this chart cannot be compared to a scatterplot. Its effectiveness should be compared to stacked bar charts or doughnut charts (or the square pie).
PS I don’t like the fan plot either.
Yeah, proportions of a whole, I get it. But we’re talking about only two values; one value, really, because the second value is what’s left after we take out the first. Percent over 65, and percent under 65.
I don’t think we need a pie or a stacked bar or a donut or a waffle or anything else to envision such a single percentage, even if the percentage is not displayed as making up part of an arbitrary shape.
If it’s crucial that this value be envisioned in terms of a fraction of 100%, then scale the value axis from 0% to 100%. Voila, fraction of total distance along the axis represents proportion of the whole.