« « A list of 40 CAR-friendly news organizations
(my adventures in parsing the IRE directory)


Why technology matters: It’s about reporting » »

Importance of combining data analysis with context (reflections on readings from week two)

“Visual and Statistical Thinking: Displays of Evidence for Making Decisions,” Edward Tufte, Visual Explanations

This chapter gave practical examples of something I’ve been saying from almost the first day of my data analysis journey — that it’s absolutely fundamental that the decisions behind the analysis are shown to the reader/user.  Data’s never perfect, and if you think it is, that’s probably the most likely clue that something’s wrong.

That means it’s essential to tell your audience what you’re concerned about, what you were unable to verify for certain, and how you got the answer you got. Tufte gives examples of this being necessary in visualizations from nearly a century ago, but I would argue that it’s more important than ever now.

As more news organizations are posting databases online, so users can dig deeper themselves (a move supporting openness and transparency that I am absolutely behind, by the way), we must give readers all the information so they can approach the data with the same methodology we used. Providing context is a major point of Tufte’s, and especially important for journalists. One of our strengths is knowledge of the beat, and being able to take a dataset and show how it’s relevant to the ins and outs of that particular industry: education, finance, etc. If we solely provide data, or solely write stories based on observations, either way, we’re only fighting half the battle. Put beat reporting and data analysis together, though, and present it in an easily understandable way, and then we can really pack a punch.

“Getting Started with Processing” and “Mapping,” Ben Fry, Visualizing Data

In this portion of the book, we learn the basic structure of Processing. I was surprised at how similar it was to ActionScript, in terms of breaking down a program into its component parts. There are certain types of commands, or functions, that you perform once as part of setup, and others that you want to happen repeatedly.
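
To make that run-once versus run-repeatedly split concrete, here is a minimal sketch in plain Python rather than Processing itself. In Processing the two pieces are the setup() and draw() functions; the Sketch class and the run() helper below are hypothetical names I made up to imitate that lifecycle, not anything from the book.

```python
class Sketch:
    """Toy stand-in for a Processing sketch (hypothetical, for illustration)."""

    def __init__(self):
        self.setup_calls = 0
        self.frames_drawn = 0

    def setup(self):
        # One-time work goes here: canvas size, loading data, fonts.
        self.setup_calls += 1

    def draw(self):
        # Repeated work goes here: redraw the map for the current frame.
        self.frames_drawn += 1


def run(sketch, frames):
    """Call setup() once, then draw() once per frame."""
    sketch.setup()
    for _ in range(frames):
        sketch.draw()
    return sketch


sketch = run(Sketch(), frames=3)
```

The point of the split is that anything expensive, like reading a data file, happens once, while the drawing code can run again and again as the display changes.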

It was very exciting to create this map, where I was able to put the structure into practice.  The big difference between using a programming language like this to create a map, and depending on a service like Google, is the flexibility that it allows you to have.  The issue with Google Maps, at least the way I’ve been doing them, is that they don’t really look like my own. Sure, you can customize the placemarks, and the info contained when you click on each object, but it’s always going to be a Google Map.

What I loved about working through the map exercise in this book is that I could explore different types of visualization.  I saw that expressing a value by interpolating a color between red and blue doesn’t really work, because the eye doesn’t necessarily register purple as being on a certain scale in between red and blue.
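
Here is a rough sketch of what that red-to-blue interpolation does, written in plain Python instead of Processing (which has a built-in lerpColor() for this). The function names and the RGB tuples are my own assumptions for illustration.

```python
def lerp(start, stop, amt):
    """Linear interpolation: amt=0 gives start, amt=1 gives stop."""
    return start + (stop - start) * amt


def lerp_color(c1, c2, amt):
    """Interpolate each RGB channel separately and round to whole numbers."""
    return tuple(round(lerp(a, b, amt)) for a, b in zip(c1, c2))


RED = (255, 0, 0)
BLUE = (0, 0, 255)

# A value halfway through the data range lands on a purple.
midpoint = lerp_color(RED, BLUE, 0.5)
```

The halfway color is a perfectly valid blend, but nothing about purple tells the eye that it sits at the middle of a scale, which is exactly the problem I ran into.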

The size of dots in the center of a state seems to work well, although that’s a better methodology for marking a specific point, not the polygon of an entire state.  When looking at unemployment rates across a state, it would have been better to color the whole state, making it light blue for smaller values, and deep blue for larger values.  But it was easiest for me to think about these issues when playing with them in code myself, and I have a newfound respect for the work designers do.
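
A minimal sketch of the light-blue-to-deep-blue idea, again in plain Python. Everything here is an assumption for illustration: the two blue RGB values, the function names, and especially the rates, which are made-up placeholder numbers and not real unemployment data.

```python
LIGHT_BLUE = (222, 235, 247)  # assumed color for the smallest value
DEEP_BLUE = (8, 48, 107)      # assumed color for the largest value


def normalize(value, low, high):
    """Rescale value from the range [low, high] to [0, 1]."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


def state_fill(value, low, high):
    """Pick a fill between light and deep blue based on the value."""
    t = normalize(value, low, high)
    return tuple(
        round(a + (b - a) * t) for a, b in zip(LIGHT_BLUE, DEEP_BLUE)
    )


# Placeholder values only, not real data.
rates = {"State A": 3.0, "State B": 6.5, "State C": 10.0}
low, high = min(rates.values()), max(rates.values())
fills = {name: state_fill(rate, low, high) for name, rate in rates.items()}
```

Because every color stays within one hue, a darker state simply reads as "more," which is the ordering that the red-to-blue blend failed to give.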

Pick the wrong type of visualization, and you’ve lost communication with the user.  And if you’ve lost communication with the user, it doesn’t matter how great your story is.  No one will know it exists if it isn’t shared effectively with the world.  It’s essential that the user understand each data point in context with all the others in the set, because without that element, it’s just a set of jumbled numbers that isn’t conveying real information.
