A companion post to my NTTS2017 presentation

My presentation at NTTS 2017 is titled An evaluation of data visualization practices of statistical institutes. I’m writing this post to share a few ideas with people not familiar with my work. Some of these ideas require context, and I will not be able to provide it within my 15-minute time slot. I’ll update the post if there are questions I’m unable to answer during the session.

Core message

When we have a small table, it’s OK to use hard numbers to communicate, and perhaps we can use a chart or two to illustrate them. When our table grows, we have to shift our analysis and communication from the individual data points to their relationships. For that to happen, charts need to move to center stage, and their design must change, along with their nature. Tables and charts switch roles: now we communicate with charts, and use a few hard numbers to illustrate.

The problem with most charts in publications from Eurostat and from national statistical institutes is that, at their heart, they remain illustrations. As more data was added while their nature, purpose and design didn’t change much, they became both less effective and less efficient. I present several examples of this chart-as-illustration perspective and possible alternatives.

Effectiveness and efficiency

How good a chart is at making invisible relationships visible defines its effectiveness (more on that later). How well it manages finite resources (page/screen real estate, color constraints) defines its efficiency. While we discuss effectiveness all the time, efficiency is often overlooked (because most of the time we have enough space to display a single chart?). But now we have to take small screens into account, and designing “graphic landscapes” (infographics, dashboards) requires better management of the available resources. Effectiveness and efficiency are closely connected, and in many cases when you improve one you’ll notice a positive impact on the other.

Aesthetics…

Making a chart easy to read, making it relevant to me, or displaying unexpected patterns: grabbing the audience’s attention doesn’t always have to be about aesthetics. The problem is that adding makeup (by way of canned visual effects) is a simpler path. Vendors take advantage of that to add a few bells and whistles to “make your chart look professional and memorable”, code for “silly effects not found in Excel or PowerPoint”. It’s possible that they do grab your attention once, or even twice, but a third time will put you off for good.

If you can’t use canned effects, if most defaults are ugly or ineffective, and if a statistician is not required to possess artistic talent or graphic design skills, how do you make charts that are both effective and pleasing to the eye?

I too had to find a way to create more pleasing charts without this apparently basic talent/skill (you can’t imagine how painful it is for me to draw a recognizable stick figure). Much of my book is devoted to this.

… for mere mortals

Here is what works for me. All design choices when making a chart have an aesthetic and a functional dimension (form and function). Understanding and managing the functional dimension is much easier than the aesthetic dimension: if you want to emphasize a series in a line chart you can use a saturated color, then use pale colors to encode the remaining series and gray for axis and grid lines. You can read this as “managing stimuli intensity”, no aesthetics involved. Functional choices impact the aesthetic result (and the other way around), but my own experience tells me that, when I put aesthetics first, the end result will be ugly.
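To make the “managing stimuli intensity” idea concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the function names, the series names, the hex colors, and the specific saturation, lightness and line width values are placeholders, not recommendations from the presentation. The emphasized series keeps its saturated color, the remaining series get a pale version of their own hue, and a gray is set aside for axis and grid lines.

```python
import colorsys

# Hypothetical gray for axis and grid lines (an assumed value).
AXIS_GRAY = "#b0b0b0"


def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to a tuple of floats in the 0-1 range."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of floats in the 0-1 range to '#rrggbb'."""
    return "#" + "".join(f"{round(c * 255):02x}" for c in rgb)


def pale(hex_color, saturation=0.25, lightness=0.80):
    """Keep the hue, but lower the saturation and raise the lightness."""
    hue, _, _ = colorsys.rgb_to_hls(*hex_to_rgb(hex_color))
    return rgb_to_hex(colorsys.hls_to_rgb(hue, lightness, saturation))


def line_styles(series_colors, emphasized):
    """Return one style dict per series: strong for one, weak for the rest."""
    styles = {}
    for name, color in series_colors.items():
        if name == emphasized:
            # Strong stimulus: saturated color, thicker line, drawn on top.
            styles[name] = {"color": color, "linewidth": 2.5, "zorder": 3}
        else:
            # Weak stimulus: pale color, thin line, drawn underneath.
            styles[name] = {"color": pale(color), "linewidth": 1.0, "zorder": 2}
    return styles


# Hypothetical series names and colors, for illustration only.
styles = line_styles(
    {"Series A": "#d62728", "Series B": "#1f77b4", "Series C": "#2ca02c"},
    emphasized="Series A",
)
```

The keys in each style dict happen to match keyword arguments accepted by matplotlib’s plot function, so a call such as ax.plot(x, y, **styles[name]) should work there, but nothing in the sketch depends on a particular charting tool. The same intensity rules can be applied by hand in Excel.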

When you put function first you can play with ideas and concepts without feeling you are losing control to vague and contradictory sensations of beauty and aesthetics. The chart below is an example of a hobby of mine: trying to salvage apparently hopeless chart types, like the gauge / speedometer. It displays three pointers instead of one, and each pointer is actually a time series. The jury is still out on this chart, but it could be used in very specific cases. Except for the chart type itself, all design choices can be justified rationally.

A gauge / speedometer with time series encoded into pointers

Left brain, right brain. Really?

Later this month I’ll be in Pamplona, Spain, for Malofiej, the infographic summit and awards. On the surface, NTTS and Malofiej can hardly be more distant from each other. Most people at NTTS come from statistical institutes or similar organizations, while at Malofiej most people are graphic designers, artists, journalists. Kind of left brain vs. right brain.

I know several people attending both conferences, so maybe this is not about brain hemispheres. Maybe at a not-so-fundamental level they are more similar than expected. We can easily see this in a recent article, where Stephen Few proposes seven criteria to evaluate a data visualization’s effectiveness profile, grouped into two categories: informative (usefulness, completeness, perceptibility, truthfulness, intuitiveness) and emotive (aesthetics, engagement).

These criteria can be applied to a beautiful infographic or to a terribly distorted 3D pie chart. Both are instances of visual communication, and their effectiveness profile can be compared. That said, some criteria are valued differently from field to field, aesthetics being the obvious example. A graphic designer is supposed to be able to create a visualization that is pleasing to the eye and, in some cases, unique. At a statistical office these skills are not required or expected.

If you use data visualization to communicate, you should keep testing the effectiveness of your visualizations, and that applies to everyone.

Color

Color is a difficult subject for everyone, with or without the right skills. Apparently, when left unattended, people tend to cram as many saturated colors into a chart as possible. I would need to take a closer look, but my feeling is that national publications where no color constraints seem to be in place have more color issues than the ones following the Eurostat guidelines or similar.

The real issue in both cases (with or without guidelines) is that color is not used effectively from a data visualization point of view. Again, if you identify the functional tasks of color, using it becomes much easier (or less difficult). I identify six tasks: categorize (using colors/hues), group (using colors and tints), emphasize (using color and saturation), sequence (using tints), diverge (colors and tints) and alert (color). You also need to manage gray.
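Two of these tasks, sequence and diverge, are easy to sketch in code because they only need tints of one or two base colors. The Python sketch below is an illustration under my own assumptions: the function names, the base colors in the usage lines, the maximum tint of 0.85 and the neutral midpoint gray are all placeholders, and mixing with white in RGB is just one simple way to produce tints.

```python
# Assumed constants: how pale the lightest tint gets, and the neutral midpoint.
MAX_TINT = 0.85
NEUTRAL = "#f0f0f0"


def tint(rgb, amount):
    """Mix an RGB color (0-255 ints) with white; 0 keeps the color, 1 gives white."""
    return tuple(round(c + (255 - c) * amount) for c in rgb)


def to_hex(rgb):
    """Convert a tuple of 0-255 ints to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def sequential(rgb, steps):
    """Sequence task: tints of one color, palest first, full color last."""
    if steps == 1:
        return [to_hex(rgb)]
    return [
        to_hex(tint(rgb, MAX_TINT * (1 - i / (steps - 1))))
        for i in range(steps)
    ]


def diverging(low_rgb, high_rgb, steps_per_side):
    """Diverge task: two tint ramps that meet at a neutral midpoint."""
    low = sequential(low_rgb, steps_per_side)[::-1]  # full color down to pale
    high = sequential(high_rgb, steps_per_side)      # pale up to full color
    return low + [NEUTRAL] + high


# Hypothetical base colors, for illustration only.
blues = sequential((0, 90, 160), 4)
red_to_blue = diverging((178, 24, 43), (0, 90, 160), 3)
```

Here blues holds four tints of a single color for ordered values, and red_to_blue holds seven colors for values that move away from a midpoint in two directions. The other tasks (categorize, group, emphasize, alert) are about choosing hues and saturation rather than building ramps, so they are not covered by this sketch.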

If you try to use color effectively, you’ll probably discover two interesting things: first, we use color more often (and use more colors) than we need; second, if you remove color you’ll have to change other design options, and those changes will probably improve your chart.

Tools

There is no shortage of data visualization tools, from the so-called self-service BI tools (PowerBI, Qlik, Tableau) to a vast array of programming languages and libraries (R, Python, D3). And then you have Excel.

Of all the charts published by Eurostat and the national statistical offices, I’m not aware of a single one that couldn’t be made in Excel, and then some. There are several reasons why Excel is the right tool to make charts for these publications, and also a tool to experiment and go beyond its poor chart library. Excel charts don’t have to look the same; here is one that looks a bit different:

If you think Excel has no place in a conference titled New Techniques and Technologies for Statistics, here is a quick reply before you fall into your fake Excel-induced narcoleptic state: you’re wrong. You can use Excel to support new data visualization practices and explore new ways of doing so. And don’t worry, I believe there should be a place for you to explore new tools and cool data visualization gadgets.

You can download the presentation here and the extended abstract here. [I updated the presentation and exported it to PDF. You can find it here.]

Comments, suggestions? Leave them below. Don’t forget to follow me on Twitter (@camoesjo) and the NTTS hashtag (#NTTS2017).

[Update: You can watch the entire session on visualization in the video below. Click start to jump to my presentation]
