Public Broadcasting Service
Starting in October 2010, I took my passion over to public media for my first “real job.” I continue to experiment with how data can tell stories, and how we can make the most of it by bringing interactivity to projects. I’m delighted to have many of the same freedoms I’ve had up to this point, and now we can really get going. No more interning! Our new PBS site (and what I’m working on right now) hasn’t launched yet, but once it does, watch this space for updates, as well as the main blog.
Los Angeles Times
From April to October 2010, I created data-driven applications as an intern at the Los Angeles Times’ Data Desk and latimes.com.
- I relaunched the LAT’s bestseller site as a structured database. Now each week’s lists are connected, and you can see a title’s or author’s success ebb and flow over time.
- Created an application as part of an investigative project looking at where California redevelopment housing dollars were going. Users could see the specifics of their local agency’s budget, and whether it showed any of three “warning signs” that might mean money was being misused.
- Built an updatable application to track campaign contributions to Prop 19, which was used up to the final vote in 2010. My wonderful colleagues continued updating it even after I departed.
- As a predecessor to the previous app, I created a framework for tracking campaign contributions during elections. This was first implemented in June 2010 for a re-launch of the LA Times’ Proposition 8 Campaign Contributions database, previously done in Caspio. Programmers can do better!
Users can search contributions by many different parameters, link to specific donations, and view state-by-state breakdowns of the dollars given in support of and in opposition to the proposition.
Medill School of Journalism, Northwestern University
From January to March 2010, I embarked on an independent study of data visualization, which you can read all about in the “Class” category. The final project explored the history of art galleries in Chicago. Why have some galleries stayed in the area for decades, even through economic ups and downs, and why have others left? What does it take for a gallery to persist? I compiled a database from scratch, and used it to feed a searchable interactive application created via Django (get code on GitHub), as well as a Flash visualization. I then reported out the story as a more traditional print article, and reflected on the whole experience in my blog.
One of my side projects during this time was delving full-force into teaching myself Web development, specifically via Python and Django. People say these frameworks are powerful, but I didn’t truly grasp just how powerful until I decided to sit down and complete an app in a day. In February, before the Academy Awards, I found myself wondering at what age the most people won acting awards. The resulting application, “Age and the Academy Awards,” is here, and the source code can be found on GitHub here or you can download this zip file.
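The core of that app boils down to one calculation: each winner’s age on the night of the ceremony, tallied to find the most common winning age. A minimal sketch of that idea in plain Python (the names, sample data, and field layout here are hypothetical, not the app’s actual code or dataset):

```python
from collections import Counter
from datetime import date

# Hypothetical sample data -- the real app's dataset and fields differ.
winners = [
    {"name": "Actor A", "born": date(1937, 6, 1), "won": date(1972, 4, 10)},
    {"name": "Actor B", "born": date(1949, 11, 4), "won": date(1983, 4, 11)},
    {"name": "Actor C", "born": date(1937, 4, 22), "won": date(1972, 4, 10)},
]

def age_at(born, won):
    """Whole years between a birth date and an award date."""
    return won.year - born.year - ((won.month, won.day) < (born.month, born.day))

# Tally wins by age, then pick the most common winning age.
ages = Counter(age_at(w["born"], w["won"]) for w in winners)
most_common_age, count = ages.most_common(1)[0]
```

In the Django version, `winners` would come from a model queryset rather than a hard-coded list, but the age arithmetic is the same.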
Going back further, I confirmed my love for data-driven work through this large-scale data project, a joint effort between the Medill | Washington graduate class from Fall 2009, the Tribune Company and the Center for Responsive Politics. We took a close look at lobbyists for health care-related organizations, and analyzed their ties to Washington lawmakers and committees with a stake in the health care debate. This involved creating our own database and culling information from various sources, a task we performed as a class.
I served as one of the students on the “data team,” which was responsible for analyzing the data, finding the stories within it, and providing necessary information to the graphics and writing teams at Medill and the Tribune. We also cleaned and quintuple-checked the data, preparing it for public release in a searchable database on the Tribune web site. The Tribune story using our data hit the front pages of the Chicago Tribune and Los Angeles Times on Dec. 20, 2009, and was featured on CNN and MSNBC. The Medill-written story is here.
Also while at the Washington bureau of the Medill News Service, I covered a speech by the Greek Orthodox Ecumenical Patriarch Bartholomew, in which he spoke of a moral duty for his followers to preserve the world’s water supply. But how much danger is the world’s water supply in? I produced a story analyzing and visualizing data from the Environmental Protection Agency to delve deeper.
I did a shadow day at the Center for Responsive Politics to learn more about watchdog organizations, and while there, I wrote this data-based backgrounder on the pro-Israel lobby.
I would be remiss if I didn’t mention the work that jumpstarted my formal data journalism experience — and that was a computer-assisted reporting class I took with Derek Willis, a newsroom developer with the Interactive News Technology Group at the New York Times. Here’s the syllabus outline listing the topics we covered, where the primary skills we focused on included advanced Excel and SQLite. You can learn more about our assignments and readings by exploring the class site.
In support of a philosophy of transparency, I’ve posted the logs from my midterm and final. They follow my stream-of-consciousness thought process and the SQL statements I used to analyze data sets. The midterm log examines this data set from the National Agricultural Statistics Service on county crops, and the final log looks at border crossing data from the Bureau of Transportation Statistics (I also made an interactive graph for that project).
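To give a flavor of the kind of SQLite work those logs contain, here is a minimal sketch: loading rows into a table and running an aggregate query. The schema and sample values are invented for illustration; the actual columns in the NASS data differ.

```python
import sqlite3

# Hypothetical schema loosely modeled on county-crops data;
# the real data set's column names and values differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crops (county TEXT, commodity TEXT, acres INTEGER)")
conn.executemany("INSERT INTO crops VALUES (?, ?, ?)", [
    ("Cook", "CORN", 1200),
    ("Cook", "SOYBEANS", 800),
    ("DuPage", "CORN", 300),
])

# A typical exploratory query: total acres per county, largest first.
rows = conn.execute("""
    SELECT county, SUM(acres) AS total_acres
    FROM crops
    GROUP BY county
    ORDER BY total_acres DESC
""").fetchall()
```

The logs are essentially a running series of queries like this one, each refining the last as a story angle emerges from the data.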