« « Bringing data journalism into curricula

Data Delver: Mark Schaver, Louisville Courier » »

Self-teaching data and programming skills

Posted on Mar 25, 2010 in Blog, CAR, django, programming, theory

So, you think data journalism and programming are valuable skills you’d like to learn. Unfortunately, your school doesn’t offer this as a track, or maybe you’re not even in school anymore. You know people will help you, and you think it all seems cool, but it can be difficult knowing just where to start. I know I stood still for too many months in terms of my programming skills because I was frozen by the feeling of being overwhelmed. It shouldn’t be this hard to just figure out how to get started.

This past week, I received a few emails asking me to address this issue.  I’ve said some of this before, but here’s my perspective on how to go about teaching yourself, when you don’t have the luxury of it being pre-established in your curriculum.

Read on for an adaptation of my marathon-length email — you really don’t want to suffer through the entire piece as it was originally written at an ungodly hour, I promise.


Hello, fellow journalist seeking to conquer data and programming!  How do you get rolling?  Here’s what I did.

Before you jump into programming proper, especially if you have an eye toward Web frameworks, I would argue that you’ve got to start by understanding how relational databases work. Relational does not refer to how multiple databases relate to each other (which is what I first thought), but to how the various columns in a database, or even a spreadsheet, relate to each other.

Get familiar with Excel and Access. Hopefully you’re using some data in your reporting already. Next, install the Firefox extension SQLite Manager, and import some .csv data sets. (CSV stands for Comma-Separated Values: the values in each row are separated by commas.) You can save Excel docs as CSV, and you can sometimes get CSVs directly from government Web sites. Look at the introduction to SQL commands here. Then start using these skills on your data in SQLite Manager. Ask yourself some questions about the data, and write queries to answer them.

Think about how data is broken down into various pieces, or columns. Example: To some, an obituary is a block of text. To someone familiar with relational databases, it’s a series of columns: date of birth, date of death, occupation 1, spouse, children, cities lived in, etc. Database skills will be important both in managing data for data apps and in understanding the database-backed structure Django itself runs on. Essential.
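If you’d rather see the obituary idea in code than in a browser extension, here is a minimal sketch using Python’s built-in csv and sqlite3 modules. The sample data, the table name `obituaries`, and the column names are all made up for illustration; your own CSV will have different columns, and you would read it from a file instead of a string.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a .csv file you exported from Excel.
SAMPLE_CSV = """name,birth_year,death_year,occupation,city
Ada Smith,1920,2009,teacher,Louisville
Ben Jones,1935,2010,farmer,Lexington
Cora Diaz,1942,2010,teacher,Louisville
"""


def load_obituaries(csv_text):
    """Load CSV text into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    # Each piece of the obituary becomes its own column.
    conn.execute(
        "CREATE TABLE obituaries ("
        "name TEXT, birth_year INTEGER, death_year INTEGER, "
        "occupation TEXT, city TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders (:name, ...) are filled from each row's dictionary.
    conn.executemany(
        "INSERT INTO obituaries VALUES "
        "(:name, :birth_year, :death_year, :occupation, :city)",
        rows,
    )
    conn.commit()
    return conn


def count_by_occupation(conn):
    """Question: which occupations show up most often?"""
    return conn.execute(
        "SELECT occupation, COUNT(*) FROM obituaries "
        "GROUP BY occupation ORDER BY COUNT(*) DESC, occupation"
    ).fetchall()


def average_age(conn):
    """Question: what is the average (approximate) age at death?"""
    return conn.execute(
        "SELECT AVG(death_year - birth_year) FROM obituaries"
    ).fetchone()[0]


if __name__ == "__main__":
    conn = load_obituaries(SAMPLE_CSV)
    print(count_by_occupation(conn))
    print(average_age(conn))
```

The same SELECT statements can be pasted into SQLite Manager once you have imported a table with matching column names. The point is the habit: ask a question, then write the query that answers it.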

Think Python: How to Think Like A Computer Scientist is part of how I’ve learned so far. I still haven’t gotten through the whole thing, but I found that working up through Chapter 5 was ample prep to make me comfortable enough to dive into Django. I also highly recommend the Head First Programming book. Yes, it’s offline, and it costs money, but I found it to be well worth it. It will introduce you to programming concepts, using Python. But it’s not teaching Python; there’s a difference. What it will do is help you understand the backbone of programming. You’ll learn what classes and methods are.
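In case “classes and methods” still sounds abstract, here is a tiny sketch in Python. The `Obituary` class and its field names are my own invention for illustration, not something taken from either book. A class describes a kind of thing, and methods are the functions that belong to it.

```python
class Obituary:
    """A hypothetical class: one obituary, with its pieces stored as attributes."""

    def __init__(self, name, birth_year, death_year, occupation):
        # __init__ runs when you create a new Obituary.
        self.name = name
        self.birth_year = birth_year
        self.death_year = death_year
        self.occupation = occupation

    def age_at_death(self):
        """A method: approximate age, using only the two years."""
        return self.death_year - self.birth_year

    def summary(self):
        """A method that builds on another method."""
        return "{}, {}, died at about {}".format(
            self.name, self.occupation, self.age_at_death()
        )


if __name__ == "__main__":
    obit = Obituary("Ada Smith", 1920, 2009, "teacher")
    print(obit.summary())
```

Notice how the attributes line up with database columns. That overlap is a big part of why database skills make a framework like Django easier to pick up.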

(As an aside, Head First is how I’ve learned a lot of other stuff, too, including JavaScript. I was even able to give a mini-talk on JavaScript at NICAR based on what I learned, and answer most questions from people who do this professionally. Point is: use this series. They have a great SQL book that may also be helpful.)

Django: Django Book is great, and so are the docs, but they’re hard to jump into. Try following along with the tutorial first. It’ll walk you through the process of building an app. This may take several hours. It took me two full days. Don’t just follow the steps, but try to internalize what’s happening. Installing Django can actually be tougher than using the framework itself, especially on a Mac. After you make this practice app, start thinking about another project you want to make. Stick with a structure similar to this sample project, for starters.

Think of what you want to do, then start using the docs and Django Book and Google to look up those specific commands. This sounds like a haphazard way to learn, but it really makes things much easier. Haphazard is the journalist’s way: deadlines, adrenaline rush, etc. That’s all well and good, but I think you need a basis first. I don’t think it’s possible to do your first app on deadline. If you think it will take a day, it, well, won’t. And I’d hate to let editors down like that. But you’ll keep improving, bit by bit. In the end, you’ve just got to build. I recommend working your way through the other stuff first, because it enables you to understand what’s possible. But when you’re building your own projects, that’s when the fun truly begins. And trial and error is your friend.

Also, check out Chris Amico’s (@eyeseast) blog posts on prereqs and required reading for another viewpoint on getting started in Django, if you haven’t done so already.  He’s the resident data geek at the Newshour.  What I’ve done so far dovetails a bit with his suggestions, but I did some things a little differently.  It’s a different journey for everyone.

One thing I’ve learned about journo-programming is that you don’t get cut any slack just because you don’t come from a coding background. Maybe that sounds harsh, but that’s because it is. Your code had better be as solid as that of the person next to you, and it doesn’t matter where either of you went to school, or what you majored in. And you don’t get a more flexible deadline because it’s a new skill. Your story had better be done on time, and accurate, and easy to read. Same with your app. The refreshing part about this is that coding is a great equalizer. There’s no reason you won’t be given as much of a chance as anyone else. You don’t need money to get the software, and any newsroom can develop apps if its people have the time and the perseverance to get the skills. The skills will come easier to some, but even if it takes a while, you WILL get it.

Remember that you are not alone. If you have a specific question about how to do something, tweet it. If that fails, start pinging people specifically. And I’m certainly happy to help. Almost all of the programmer-journalists got to where they are because someone helped them. So they like to pay it forward. I’ve only gotten where I am this rapidly because of the marvelous Derek Willis and the entire CAR/programming/NICAR-L community. As I like to put it, find your Derek. There are about a thousand people on the NICAR-L listserv, last time I checked. We can spread the mentors around. Find your person and people. What you’re looking for is someone who believes in helping people get going on this. There are a lot of them.

Also, Google your questions, and subscribe to the Django-users Google group and ask questions there. You may not understand everything, or at first, much of anything, that comes across that list. I still don’t get what’s being discussed most of the time. But it helps to understand what’s possible, and to start looking into what other people are doing.

I approach figuring this out like a beat: get to know the key people, think about what I’m curious about, find the answers. Instead of reporting the answers in an article, I store them in my brain, and use them to make stuff.

And practice. A lot. Whether it’s SQL or Python or Django. Do something with coding every day if you can. Or at least three times a week. Or else it goes stale. I know it’s hard to balance, trust me. The rewards are vast, though. Not many people can do this, so if you can, you’re differentiating yourself in a key way. Plus, it’s just cool.

Let’s continue the journey together!
