« « Committing fact errors in visualizations

How to “Group By” in Excel » »

Changes in the numbers of students majoring in programming and social sciences

What is a traditional path to programming nowadays? It’s a question I’ve been thinking about a lot, esp. in the realm of the journalist-programmer. So many people from the older school of journalism came through using databases as tools to help with reporting, not because they took a class in it. That’s certainly encouraging for someone like me.

But that started the wheels turning in my brain: Are people still getting computer science degrees? Is the major more prevalent now than 5 years ago? 20 years ago? How does the number of comp. sci. majors compare to, let’s say, social science, another field that dovetails nicely with CAR (and yet another one I didn’t major in, but that’s beside the point)?

Questions like these are always good candidates for my weekly Processing projects. This week: time-related data! I went to the National Science Foundation, where they track the students getting various degrees in various fields. The data is broken down in lots of interesting ways, across gender, race, state, field of interest in the sciences, etc. A future project is the gender breakdown: where are all the female CAR/data/programmer journalists? I know a few, but still…we’re in the minority. It’s not surprising. I remember my own experiences growing up as the daughter of an Argonne computer scientist, and there weren’t many women around that building in the ’90s either, and there still aren’t as many as I would like to see.

Back to the point… For this project, I decided to look at the number of computer science degrees granted nationwide, across 40 years, from 1966 to 2006 (the most recent data available). I compared this to the same numbers for engineering and social science. The idea was to compare CS to one very technical applied field and one more theoretical one, all related to data in some capacity. I was amazed to see just how much more popular social science degrees were across the board; I hadn’t realized the difference was that stark.

I also took note of the fact that the number of computer science degrees was higher in 2006 than ever before, but not that much higher than during the dot-com boom in the late 1990s. But overall, there’s an upward trend. Not everyone’s going the self-taught route, although I imagine those with degrees must still self-teach, because you can’t possibly be taught every language that will be invented during your working life.

In terms of creating the actual project itself, I struggled with getting the data in an acceptable format for Processing. Excel was spitting out a text-delimited file, which is apparently distinct from a text-separated values file. Nuances, nuances. And then, I left the commas in the middle of numbers. It looked right to my eye, but there was no way a computer was going to understand that as a float. Note for next time!
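For anyone who hits the same comma problem, here is a minimal sketch of the cleanup step, written in Python rather than Processing. It assumes the export is tab-separated with a header row and the year in the first column; the post doesn't spell out the exact layout, so that part is a guess. The names parse_count and load_table and the sample values are made up for illustration and are not NSF figures.

```python
import csv
import io


def parse_count(text):
    """Convert a string such as '1,234' to a float by dropping thousands separators."""
    return float(text.strip().replace(",", ""))


def load_table(tsv_text):
    """Read tab-separated text: header row first, year in the first column (assumed layout)."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    rows = {}
    for record in reader:
        if not record:  # skip blank lines
            continue
        year = int(record[0])
        rows[year] = [parse_count(cell) for cell in record[1:]]
    return header, rows


# Placeholder values only, NOT real degree counts.
sample = (
    "year\tfield_a\tfield_b\n"
    '2005\t"1,200"\t"3,400"\n'
    '2006\t"1,500"\t"3,600"\n'
)

header, rows = load_table(sample)
print(header)
print(rows)
```

Stripping the commas before calling float() is the whole trick. The same cleanup could just as easily be done in Excel by changing the number format before exporting.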

I’m not completely happy with this final product. The code is taken almost verbatim from Ben Fry’s “Visualizing Data” Processing book (which I highly recommend). And for this type of comparison, I would have preferred a way to place all three graphs next to each other, so you could look at the data for engineering and CS degrees side by side. The y-axis label is also a bit close to the numbers.

But what I am proud of is that I was able to pull off a tabbed graph. This is what the lovely Tribune graphics people did for our joint health care lobbying project, and at the time I remember thinking that it must be really tough to do. Now, I understand how it’s done, and was able to do it with a book holding my hand while I did. Next time, I’ll be even more self-sufficient.

If anyone wants encouragement for embarking on the journey of learning programming, I would argue this is a darn good reason. That feeling of satisfaction when you can make things happen, and truly become master of the machine, is simply amazing. What seems like magic, and impossible, becomes real and tangible. It’s just another tool in your arsenal. And then you use your journalistic creativity to think of how best to use it. How cool is that?

  • http://www.anthonydebarros.com Anthony DeBarros

    Nice graphing, and I agree that having the lines all on one tab would be better.

    Also, you might consider re-graphing this to look at comp sci degrees as a percentage of all degrees awarded each year. The raw numbers tell part of the story, but you’re more likely to wind up reporting that xx% of bachelor’s degrees in 19xx were for computer science, compared to xx% in 19xx. Given population growth and (I assume) more degrees awarded each year, the raw numbers aren’t the best way to measure change.


  • http://www.michelleminkoff.com MichelleMinkoff

    Thanks, Tony! I was wondering about that myself. (I confess to doing this under deadline, after spending the day hanging with the Chicago Tribune apps team out here…) I’m going to expand it, revamp the graph and try to report out a more in-depth story.

    I’ve used percentages in the past, and was playing to see if this would be a case in which actual numbers worked. In retrospect, percentages would have been better.

    In your experience, are there any cases in which actual numbers are suitable for comparisons, or are percentages preferable when comparing across time in every circumstance?


  • http://www.anthonydebarros.com Anthony DeBarros

    In general, raw numbers are key when quantity itself is the story — for example, the number of people born in 1990 vs. the number of people born in 2000. But when we’re examining the components of a universe, then percentages of the whole are a better gauge — for example, the percent of the population in 1990 that was white compared with the percent in 2000. In that case, to get a sense of how America’s demographics have changed, it wouldn’t matter whether the raw number of white people was higher or lower. What’s significant is how the percentage of the population that is white has changed over time.
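
To make the raw-number versus percentage distinction concrete, here is a small Python sketch of the share-of-total calculation. The counts are invented placeholders chosen only to show the effect, not real degree data, and share_of_total is a hypothetical helper name.

```python
def share_of_total(part, total):
    """Return part as a percentage of total."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return 100.0 * part / total


# Placeholder counts, NOT real data: degrees in one field vs. all degrees awarded.
degrees = {
    1990: {"field": 200, "all": 10_000},
    2000: {"field": 300, "all": 20_000},
}

for year, counts in sorted(degrees.items()):
    pct = share_of_total(counts["field"], counts["all"])
    print(year, counts["field"], f"{pct:.1f}%")
```

In this made-up case the raw count rises from 200 to 300 while the share falls from 2.0% to 1.5%, which is exactly the kind of change a raw-number chart would hide.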
