Building data-driven websites has many different applications, and to say that I’m close to mastering all the needed techniques would be a lie. But if there’s one thing I learned from my “summer” internship at the LA Times, it’s that with some persistence, curiosity and masterful Googling, you can find what you need to know. I’ve learned enough to know it’s not good enough to do data journalism the way it’s always been done. And I know enough to start turning my visions of pushing things forward into reality.
I’ve been thinking a lot lately about the type of work we do. If we get a mention on Poynter or Nieman Lab, that’s a great thing. But what’s more important is creating projects that mean a lot to the general public. So, while much of past data journalism has been based in politics and investigative work, and that remains a cornerstone, there’s this other part of data journalism I think we should pursue: structuring our journalism as data across beats, and creating projects for features as well as hard news. Because news consumers with a variety of interests deserve the interactivity, personalization and cold, hard truth that data can bring to a story. We CAN do this, and we SHOULD do this.
We have control over how we structure and present our own content, and we can do a better job of linking it up. Background: I started thinking about this as a simple idea when I read Derek Willis’ Rivers of Data, which takes the concept much further. The piece is from 2005, and it’s frightening how much remains true. I also wrote about this a bit for Poynter.
So, while I’m proud of all the work I did at the LAT, from a campaign contributions application to an investigative sidecar that explores how YOUR redevelopment agency is spending its money, my redesign and relaunch of the LA Times’ bestseller list as a data project is kind of my favorite accomplishment of the summer.
If you look at the site, you don’t see the back end that takes a complex formula that was being calculated by hand week after week, and translates it into a function that calculates the lists automatically. You don’t see the nuanced details of the internal system I created, which wouldn’t have been possible without how easy the Django admin makes this sort of thing. I hope it helps reduce the manual calculations for the people who enter bestseller lists week after week.
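To give a flavor of what “a function that calculates lists automatically” can mean, here’s a minimal sketch. It is not the actual LAT formula, which I’m not spelling out here. As a stand-in, it assumes each store reports copies sold per title and simply sums them; the `build_list` name, the report format and the sample numbers are all made up for illustration.

```python
from collections import defaultdict


def build_list(store_reports, size=10):
    """Turn raw weekly reports into a ranked list.

    store_reports: iterable of (store, title, copies_sold) tuples.
    Assumption: summing copies across stores stands in for the real,
    more complex formula.
    """
    totals = defaultdict(int)
    for _store, title, copies in store_reports:
        totals[title] += copies

    # Highest total first; ties are broken alphabetically by title.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, title, total)
        for rank, (title, total) in enumerate(ranked[:size], start=1)
    ]


# Made-up sample data, purely illustrative.
reports = [
    ("Store 1", "Book A", 40),
    ("Store 2", "Book A", 25),
    ("Store 1", "Book B", 70),
    ("Store 2", "Book C", 65),
]
weekly_list = build_list(reports, size=3)
```

The point is less the arithmetic than the shape: once the weekly inputs are structured, the list becomes something the system produces rather than something a person recomputes by hand.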
But instead of standalone lists, now we can loop things together. Bestseller lists come out every week, with the same darn structure. We can, and must, take advantage of that fact. We can, and must, present those lists differently, with better interlinking, on the Web. It can, and must, be different from how we’ve done this sort of thing in print. This makes exploring bestsellers specific to the SoCal area easier for the readers/users who care about digging deep. That’s what it’s all about.
Show me the data. It’s a repeatable mantra, and one of my favorites. Don’t just tell me “The Help” has been on the list 71 weeks. Show me how it ranked, and when it ranked, each of those 71 weeks. Let me click on those weeks, and see the ranking in context.
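Here’s a rough sketch of the idea, assuming every weekly list is stored as one row per title per week. The `ListEntry` class, the function names and the sample rows are hypothetical, not the actual models behind the site.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ListEntry:
    week: date   # publication date of that week's list
    title: str
    rank: int    # 1 is the top spot


def rank_history(entries, title):
    """Return (week, rank) pairs for one title, oldest week first."""
    return sorted((e.week, e.rank) for e in entries if e.title == title)


def weeks_on_list(entries, title):
    """Count the distinct weeks in which a title appeared."""
    return len({e.week for e in entries if e.title == title})


# Made-up sample rows, purely illustrative.
entries = [
    ListEntry(date(2010, 10, 3), "Book A", 2),
    ListEntry(date(2010, 10, 10), "Book A", 1),
    ListEntry(date(2010, 10, 10), "Book B", 2),
    ListEntry(date(2010, 10, 17), "Book A", 3),
]

history = rank_history(entries, "Book A")
total_weeks = weeks_on_list(entries, "Book A")
```

With the data shaped like this, “weeks on the list” stops being a number someone types in and becomes a count over rows, and each row is a natural link back to that week’s full list.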
Not because it’s some cool way to tell a story (although, I like to think it is), but because if you’re a lit geek, it matters. The more information, the better.
And this project won’t become irrelevant because of stale data. The LAT will keep coming out with its bestseller list, and the database will slurp that structured information into an integrated site. Find the information you want, search for a specific title or author you care about, browse to learn more information about what you never knew existed.
Collaborate with reporters, producers, developers, analysts. Not one set of them, but anyone who’ll give it a shot. The data structure is well and good. But…the person on the books staff who’ll add in links to the LAT book reviews associated with each book, that makes it even better. If the computer can’t get it done, enlist the people who really know the subject. Structuring our news as data isn’t the final piece. It’s the beginning. That structure allows us to add pieces to enhance how people can explore information. We used to just access reviews by flipping through the paper. Then, we added search engines. Now, a deep dive page about a given book’s bestseller history can link you to that review. The more ways users can get to our content, the better. How effective is it? We’ll see. But we can’t stick to structuring our news the same way, and fail to give it a shot.
Successful data projects require integration and collaboration across parts of the newsroom, and that’s something we were able to accomplish on this project. Thanks to so many at the LAT for bearing with me, during this and other endeavors, listening to some crazy ideas. Can’t wait to see how the bestsellers site grows with this new framework in place.
So, data projects, what I spend my life on now. Yeah, they’re hot these days. They’re also a way we’ve been doing some parts of journalism for decades. But here’s my philosophy, as someone who’s been dabbling in and obsessing over this for a while: Don’t just do it for investigations, don’t just do it for one type of news. Don’t just see it as a technique for one type of project. Data’s not just a specialty, it’s also a way we can structure our sites themselves. All to help people explore their world. All to tell the story.
This is a philosophy I’ve been dreaming about for a while. Haven’t pushed it far enough yet. But bestsellers is a major step toward what can be: integrating data with the general newsroom process. I know it’s not an entirely new idea. But for someone who’s been doing this for about a year, having the skills to turn a vision like that into reality, to create this sort of a framework, makes it a great “parting shot” (as Ben called it) for my time at the LAT.
Bet I can do even more at PBS, as the learning continues. Less than a week till I get out there. Don’t know exactly what we’re going to put together, but it’s going to be pretty darn cool. And I hope this philosophy provides a decent jumping off point for some discussions, and some of our new content.
Watch out, Washington – I’m coming your way soon! I start at PBS on Oct. 25. It’s going to be a roller coaster of a ride!