Data Delver: Perry Swanson, The Gazette

Posted by on Jan 29, 2010 in Blog, CAR, data delvers | No Comments

When we think of computer-assisted reporting, large-scale investigative projects are often what first come to mind. There’s no question of the value and impact such endeavors can have on society.

But CAR has many other purposes, too. Among them are bringing evidence to breaking news, localizing it, and helping people find out more about their communities. One data reporter who has done a fantastic job of using CAR for local news is Perry Swanson of The Gazette in Colorado Springs. Among other projects, what first caught my eye was the Data Geek blog he runs for the paper (one of my favorite CAR-related domain names, by the way!)

He pulls information from local data sets and extracts relevant data from national ones. Quick one-off posts always provide enough analysis and context to make the information valuable, without requiring a large-scale team. Sometimes they’re more in-depth. And oh yes, he works on larger projects, writes articles and maintains the paper’s data bank, Info Center, as well. I was curious how he got to this point in his career, and what it’s like behind the scenes. We had the opportunity to chat last week about these topics, the future of data journalism, and his newly announced move from data journalist to communications director for a Denver think tank.

This profile of Swanson is part of a continuing series I’m calling “Data Delvers,” where I pass on summaries, quotes and audio clips from conversations with journalists using technology to find, analyze and convey data-driven stories and projects to the modern audience. An extended transcript of the interview is at the bottom of this post, edited for relevance and clarity.

How the data journey began

Audio: Swanson’s start in CAR. Swanson explains why he enjoys interviewing data.

- Swanson first learned about CAR from college professors in the mid-1990s, who emphasized the importance of math in journalism.
- His first job was at The Reporter in Vacaville, Calif., where he worked on Census 2000 coverage, his first professional CAR experience. As of 2001, there was one computer in the newsroom with Internet access.
- He emphasized that the complexity of the tools was less important than making the best use of the tools he did have. Sometimes he didn’t even have access to Excel. But regardless of the tools, he incorporated data work into his various reporting positions. “What I always try to do with data is just find something that’s surprising.”
- “I’ve just had tools available to me and editors I’ve been working with who wanted to support that kind of work,” Swanson said. “I’ve been lucky in that way.”

Day-to-day work: Blogging, reporting, data crunching and posting: How do you balance it?

- When he started the data job at the Gazette in April 2009, much of his work involved posting and linking to databases on Info Center, the paper’s data bank. He wanted to stay connected to writing data journalism but didn’t have time for long investigations. “Because I had so many other things going on, I didn’t feel like I could really do this sort of long-form journalism but I still wanted to find some way to do a little bit of journalism around the data.”
- He gets his data sets by signing up for as many press releases and email lists as he can, and then searching everything for “Colorado” and “Colorado Springs.”
- The blog’s audience varies, and is mostly local. The most popular post he ever did was an analysis of a database of government salaries. Swanson said this might be due to “a little bit of voyeurism.”
- He said he did receive feedback and comments, but hoped for more engagement than he got. His original idea was that people would crunch the data a different way, or suggest ways to improve his methodology. This didn’t happen often.
- He also assists reporters with the data aspects of their stories when needed, and trains his colleagues in the newsroom to help them increase their data skills.
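
The keyword screen in the second bullet above is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not Swanson’s actual workflow: the function names, the term list and the sample releases are all made up here, and it assumes the text of each release or study has already been pulled out of the email or PDF.

```python
import re

# Hypothetical search terms, taken from the places named above.
LOCAL_TERMS = ("Colorado Springs", "Colorado")


def local_mentions(text, terms=LOCAL_TERMS):
    """Count case-insensitive whole-phrase matches for each term in text."""
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def worth_a_look(releases, terms=LOCAL_TERMS):
    """Return titles of releases whose text mentions any term at least once."""
    return [
        title
        for title, text in releases.items()
        if any(local_mentions(text, terms).values())
    ]


# Made-up sample releases, for illustration only.
sample = {
    "Child poverty by state": "Rates rose in Colorado and Utah last year.",
    "National housing starts": "Starts fell nationwide in December.",
}
print(worth_a_look(sample))
```

Note that every “Colorado Springs” hit also counts as a “Colorado” hit, so the two counts overlap. A hit only means the release is worth opening; deciding whether there is a story in it is still the reporter’s job.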

Why he’s leaving data journalism

- Last week, Swanson accepted a position as communications director for the Colorado Center on Law and Policy, an institution he’s written articles about in the past.  He said some of his reasons for leaving include the instability of the industry.

Audio: Why Swanson’s choosing to leave journalism, for now. Swanson explains why he was motivated to leave journalism and accept a position as communications director for the Colorado Center on Law and Policy.

The future of data journalism

Audio: Swanson’s take on the future of data reporting in newsrooms.

-  Swanson said he wonders if data-based positions will continue to exist in newspapers, or even news organizations, but strongly believes that data work will continue to be important in the long run.  Journalists who know how to post data to the Web will be better off.

- Swanson said some journalism organizations are cutting CAR jobs as they increase their focus on more traditional reporting beats, such as crime and city hall, and that CAR isn’t viewed as a “core function.”

- More reporters are expected to have CAR skills and use them day to day, so there’s less of a role for someone who specializes in CAR.

- Earlier in his career, Swanson worked as a demographics reporter, a role he sees as incorporating aspects of CAR. When he first got the beat in 2006 and tried to connect with similar reporters, he found very few colleagues across the nation. “If I had been properly motivated, I would have written a story about it, like for an industry publication, because it was just amazing how many different news organizations used to have a demographics reporter and had eliminated the position.” He thinks this may be symptomatic of what is happening to CAR, because demographics and data are so closely linked.

Extended transcript of our Jan. 25 interview

Looking for more detail? Can’t get enough of that data-y goodness? Go data diving in the data interview! Bonus quote: Upon my asking what aspiring data journalists should know, Swanson said, “You want to go crazy on ArcView.” I DO want to go crazy on ArcView. Mapping + databases = fabulous, and often exclusive, stories! If I could just find a copy lying around Medill….

Can you explain your position at the Gazette? I see that you’re doing smaller short form pieces on your blog with computer-assisted reporting, which is different than how CAR is often thought of, when it’s used for large-scale projects.

The technical name for it is ‘data journalist,’ but really it involves more than just reporting based on data. I keep that blog and then I also do quite a bit of training with other people in the newsroom, just teaching them how to use the blog software, for example or how to use Excel or other programs, and I guess other things with multimedia; how to take video and edit it. I also do a number of special projects where I’m supporting another reporter. If they have a whole lot of data that they need to work with then I typically will come in and do the data side of it while they do the more traditional reporting side of it.

When I started that job back in April of 2009 we had that part of the site called Info Center where we store all of our databases with links to them. Another part of the job was just maintaining Info Center.

I wanted to have some kind of a blog to call attention to what’s going on in Info Center and the different databases that we maintain. Also, it was probably more of a personal interest of mine than anything, just to keep my mind in that place and keep myself on top of all the different data sets that are coming out or even if they’re not particularly new, just data sets that deserve a little more attention. Because I had so many other things going on, I didn’t feel like I could really do this sort of long-form journalism that you mentioned but I still wanted to find some way to do a little bit of journalism around the data. So I just thought a blog would be the best way to put all that together.

It seems like the interest that blog generates really varies a lot. And it varies by subject matter. I’ve had a hard time building a regular audience for it. There are certain people who go back to it all the time but that’s a pretty small group of people who are just interested in data, economic, demographic numbers. But as far as getting a big audience for it, it’s really been more around what kind of information it is. I acquired a database of all the salaries and names and titles for people who work for the city government here a while ago, and that probably blew away everything else in terms of how much attention it got. It might be a little bit of voyeurism going on there. I thought that was a good way to keep things going.

What I hoped to see happen, which hasn’t really happened yet, was if I could offer a quick snapshot look at a data set or something that I thought was interesting, readers would come in and interact with me about it and raise questions about it. Then I could follow up and do something more and different. Or they could raise questions about methodology; say that if I had done it differently I’d get more on-point results. It’s happened a little bit, but it hasn’t happened nearly as much as I’d hoped it would. I guess maybe that just builds over time.

Do you find most of the people that were interacting were local or did you get data-obsessed people from across the country?

I think it was mostly local. I can remember a couple of cases of government people who would get word of the work that I did and they’d send me a note. But I’m pretty sure it was mostly local people. I guess that was one thing that actually made doing the blog a little bit of a challenge. I always wanted it to be local. Colorado Springs-related data if possible. But certainly I tried not to do very much that was nationwide. It had to be either Colorado or Colorado Springs. The smaller the area you’re trying to examine with data, the more difficult it is to get data about it. Still, I was surprised at how much was out there, even just looking at Colorado or Colorado Springs-based data.

What were some of your strategies for finding these various data sets?

The easy, quick way of getting a blog post is always to keep an eye out for different studies and press releases and things like that, so I just signed up for as many of those releases as possible.

I have one in front of me from National Center on Children in Poverty, which is a national think tank. They will routinely come out with data that is at the county level or the metropolitan statistical area level or the state level. So that’s one way that I got quite a few data sets. I just went around to all the major think tanks and made sure to get their press releases. There’s some federal website where you can sign up for press releases from lots of statistical agencies at one time. That of course means sorting through a lot of email everyday to find something that might possibly have local information in it.

It’s pretty rudimentary what I did, I just download the PDF of the study and then look at the headline — make sure it’s something that’s remotely interesting in terms of a topic and just open up the PDF and do a search for Colorado. If I got a hit then it was likely that I’d have some chance of doing something about it. Then aside from that I’d try to look through our paper every day and just see what other reporters here were doing and a lot of times it would sort of trigger a thought about how if I pursued that subject from a slightly different angle, I bet I could work some data into it.

Also, just throughout my career I’ve always had an interest in data and so I was somewhat familiar with the different data sources. And when I was just at a loss for a blog topic, I’d go back to the same old data source I’d been to a dozen times, and try to do something different with it.

Speaking of your relationship to data through your career, can you take me through how it’s developed?

I’ve been in the business for about 10 years, so I suppose I was in college in the mid ‘90s and in those days, at least to me, computer-assisted reporting was really just starting to get going. I had some professors in college who really emphasized learning math and different computer programs. Things like that interested me. Also we talked a lot about how numbers are different than people in terms of interpreting what a person tells you versus interpreting what numbers are telling you. It’s a much different exercise.

I guess I had more fun, in some cases, working with numbers than I did with people. Although I have many years of experience with conventional reporting. Maybe the way I can capsulize it is just to say that I felt like with numbers there was a much better chance of coming across some findings that nobody could have predicted, that was just new information that you couldn’t really guess just by eyeballing it and I thought that was really interesting. That’s what I always try to do with data, is just find something that’s surprising. So that’s what attracted me to it. And then just through the years at different reporting jobs that I’ve had; it’s not like I’ve had any extraordinary tools. I’ve just had Excel, and there were times when I didn’t even have Excel. And I’ve had Access, and I’ve had MySQL. I’ve just recently been learning how to use ArcView. I’ve just had tools available to me and editors I’ve been working with who wanted to support that kind of work. I’ve been lucky in that way.

So at your first job were you hired for a traditional reporting job and then incorporated this?

My first job out of college was at The Reporter newspaper in Vacaville, Calif. That was just as the results of Census 2000 were coming out. I guess that was probably my first exposure to this kind of work in a professional setting. But even then, at that time, I know that I didn’t have Excel. I didn’t really have any tools except for the web-based tools that the Census put on its website at that time. Looking back, I had one of those green screen computers.

There were better computers available at that time; it’s just that the newspaper I was working for did not use them. There was one computer in the newsroom that had Internet access, which is kind of crazy that that was true in 2001 because, of course, the Web was established at that time. It took a while for newspapers to catch up, and even now I don’t think they’re really caught up.

I was not at all expected to do anything with data in my first job, but maybe it was that exposure to the Census 2000 numbers that piqued my interest a little bit. My job after that was with the Greeley Tribune. Greeley’s a city in northern Colorado. I did have a spreadsheet program. I started out using it just to keep my address book, because I didn’t have Outlook. Then I started doing various stories where I was looking up a lot of records and there was a need to keep lists and numbers, so I just started doing it on my own there. It wasn’t until I started working at the Gazette in December 2002 that I actually had any real database tools. But I didn’t even get Access until a few years ago.

What’s always been an issue for me is trying to get the kind of tools that are designed to do this kind of work and then, of course, learning how to use them. I would not call myself an expert by any means, but I’ve just always tried to make the most of the limited tools that I’ve had.

You’re talking about learning to use the tools, you’re mostly self-taught, it sounds like. Is it just a matter of learning as you go, any strategies?

I’ve had very little formal training, but there’s always been other people in the newsroom who have similar interests and I’ve just always tried to connect with them. And ask them, when I’m having a problem with software, or even a methodological problem, if I’m not sure how to design something in a way that’s fair and accounting for all the factors, I just take it around to a couple of people in the newsroom and I just ask for their take on it.

Of course I have contacts like government agencies or non-profits or there’s a couple of economics professors and math professors that I will call once in a while just to ask them, “How do you think I should handle this?” Really it’s just been friends and contacts who have taught me and hopefully, I have taught a little something to them too, here and there.

Now, you’ve said that you’re not going to be at The Gazette much longer. Are you able to talk about what you’re doing next?

Last week I accepted a job as communications director for the Colorado Center on Law and Policy. It’s a think tank based in Denver that studies poverty, welfare, health care reform, state budget issues.

What motivated you to do that?

I think more than anything, it’s the meltdown in the news industry. There’ve been a lot of layoffs at the Gazette and of course virtually every other newspaper. And I’ve survived six or seven rounds of layoffs here and it just felt like, to me, that my number could be coming up any day.

I’m reluctant to leave journalism; I really love it. Five years ago I would have said that I hope that never happens. But now I guess I just felt like I needed to do something proactive to find a different way of making a living because I felt like the way I’m making a living now wasn’t really very reliable.

Do you feel that the data reporters are more in trouble, less in trouble, or about the same as everybody else?

I guess I’d say about the same. I spent about two years at the Gazette as the minority affairs and demographics reporter and that involved a whole lot of data analysis particularly around the census, of course. When I first started in that job I felt like I needed to reach out to other people in the same job and see if we could just talk to each other and I could learn about what good work they were doing. So I called all around Colorado and all around the country and I found some places where they would say yeah, we used to have a demographics reporter, but we eliminated that position. If I had been properly motivated, I would have written a story about it, for an industry publication, because it was just amazing how many different news organizations used to have a demographics reporter and had eliminated the position.

And when was this?

I suppose it would have been in 2006. It was an informal little survey. I was just reaching out to try to find somebody and I think I finally found one person.

Just because demographics is so much tied up in data that it feels like that kind of work is somewhat threatened. But on the other hand, demographics and data reporting is also a whole lot tied up in the Internet and there are, of course, a lot of news organizations that are putting more resources into their Internet presence, and in that way it might make a reporter safer to be able to push data onto the website and present it in a way that’s really useful for the audience.

I think there’s no doubt that kind of work has a future. I’m not sure that it has a future within news organizations as such, but I think it definitely has a future since there are many different businesses that need to offer useful data to their customers or audiences.

Why do you think there’s less of it in news organizations? Do you think people have a hard time understanding it?

No. I think that data reporting is just not quite viewed as a core function. I wish it were viewed that way. I think a lot of news organizations are steering back to what they consider the core, like police reporting and city hall reporting.

Also, I think that there’s a growing expectation that all reporters, regardless of their beat, will have some facility with analyzing data. I wholeheartedly agree with that. I think that every reporter needs to be able to use the tools that are necessary to do their job well.

Just from what I’ve seen, it seems like reporters in conventional beats are not yet really reaching out and learning how to do these data analysis techniques. I think that’s a shame. I think they should learn how to do that, but of course they’ve got a million other demands on their time that they didn’t have several years ago. So it’s hard to build in time to learn those sorts of things.

Going back to the tools you were talking about, what was your experience with ArcView? How have you found that useful in terms of data reporting?

You know, I still have not ever used that for a real project. In preparation for Census 2010 the editors here asked me to just learn how to use it, which I thought was great. We didn’t have any real money for training, unfortunately, so the best that I could do was I just bought a book and carved out some time almost every day, and just worked my way through the book. I got about halfway through it by the time I was certain I was leaving for a different job so I’ve kind of dropped it now but I think it’s a fantastic program.

Have you had any involvement with IRE (Investigative Reporters and Editors)?

I guess I’ve been a member for almost all the time since I was in college. I’ve been to a couple of their conferences. I did acquire a data set from them one time: the National Bridge Inventory. I was one of a thousand reporters across the country that did the bridge inventory story after the Minneapolis bridge collapsed.

Any recommendations of what you’d push for an aspiring data reporter to learn; things you wish you’d known?

You want to go crazy on ArcView because that has tremendous potential not just for newspaper reporting, but it also has a lot of application outside the newspaper business, or I should say the news organization business. And it’s just fun to use.
