Tutorial: Stacked bar graph on deadline with Google Chart Tools

ONA 2011, Once Upon A Datum, Michelle Minkoff

We're going to make an interactive stacked bar graph showing the change (or lack thereof) in the breakdown of categories of crime in various states between the totals for 2009 and 2010. The final piece we're heading for is this: http://michelleminkoff.com/crime-stats/. Click on any screenshot to view a full-size version. You can also download the completed project as a zipped file here, customize it, and upload it to your server. More links related to this session, and charting, may be found here.
  1. To download the raw data we're going to use, go to this link: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010/tables/10tbl04.xls. Click on the "Download Excel" link just below where it says "Crime in the United States by Region...". An Excel file should now appear in your download folder. It will look like this: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010/tables/10tbl04.xls.

  2. When we look at the raw data, we see it needs a little cleanup to get just the info we need: a list of states, and the number of incidents in each category in 2009 and again in 2010. Start by deleting every other column, starting with C. We're going to go with absolute numbers here, because the bottom line is how many incidents occur. Changes in rate and population allow for easier comparison in some instances, but regardless of population change, a certain number of people were affected by violent crime. Qualifying measures don't change that. Then we'll delete rows 1-3, and then 5, to get rid of extra headers that are cluttering up our data. This dataset also includes summaries for regions, so delete any row that's not a state, such as "Northeast". Move the items from the 2010 rows onto the same line as 2009, so it's all together, and then rename/add headers to name each column of data, also identifying the year (Larceny Theft 09, Robbery 10, etc.). Highlight all the numbers, right-click on them, click Format Cells, click the Number tab, click the Number category, uncheck the "Use Thousands Separator" box and click OK. This will change "1,000" to "1000". Commas in the middle of numbers make it hard for the program to understand that a number is a number and not a word.
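Why the commas matter can be seen in a few lines of JavaScript. This is only an illustration (the cleanup itself happens in Excel), and the helper name `toNumber` is hypothetical.

```javascript
// Illustration only: how JavaScript treats a number that contains a comma.
var raw = "1,000";

// Number() rejects the whole string, and parseInt() stops reading at the comma.
var asNumber = Number(raw);        // NaN
var asParsed = parseInt(raw, 10);  // 1

// Hypothetical helper: strip the thousands separators, then convert.
function toNumber(text) {
  return Number(String(text).replace(/,/g, ""));
}

var cleaned = toNumber(raw);       // 1000
```

Removing the separators in the spreadsheet means nothing like `toNumber` is needed later on.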

  3. And now we have our formatted Excel data!

  4. But we need to convert it into a more Web-friendly format! A popular one is called JSON, although something named XML also works quite well. It's not difficult to do this conversion, thanks to a tool released by Shan Carter of the New York Times called Mr. Data Converter. We'll need to give it some data, so copy the data, including the headers and all of the actual numbers. Go to Mr. Data Converter at http://shancarter.com/data_converter/. Paste your data into the top field.

  5. In the drop-down menu, where it says "Output", go ahead and select "JSON-Row Arrays". Now, there's a whole bunch of text in the bottom field. Before we go any further, let's talk about the structure of our project files. We'll have four: an HTML page where we tell the different content pieces where to live, a CSS page where we tell the computer how those content items should look, a JavaScript page where we write basic lines of programming to provide interactivity, and a JSON file, where we put this formatted data we're creating right now. Take the text that Mr. Data Converter spit out, copy it, and paste it into a new file. Name it whatever you want, ending it with .json. It's a good idea to put these four files in one folder. Let's make them all now. Name each one what you'd like, making sure they end in .html, .css, .js (for JavaScript), and .json, respectively.

  6. Let's tweak that JSON file a little bit now. Break out the first line, which should list all of your headers, cut it, and paste it above everything else. Add headers= before it, to give this data a name. Add allCrimeData= before the rest of the data, to give the actual numbers a name. Note that each state has its own line. That way, we can switch the data we use to make the graph easily, and jump between different states.
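As a rough sketch, the edited data file might end up looking like the block below. Every number is a placeholder rather than a real FBI figure, only two crime categories and two rows are shown, and the `var` keyword and the exact header wording are assumptions. Once the names are added, the file is really a small piece of JavaScript rather than strict JSON.

```javascript
// Sketch of the edited data file. All numbers are PLACEHOLDERS, not FBI figures.
// Assumed layout: state name first, then the 2009 columns, then the 2010 columns.
var headers = ["State", "Robbery 09", "Larceny Theft 09", "Robbery 10", "Larceny Theft 10"];

// One line per state, in the same column order as the headers.
var allCrimeData = [
  ["United States", 400, 6000, 380, 5900],
  ["Alabama", 10, 100, 9, 95]
];
```

Keeping one state per line is what makes it easy to pull out a single row later and redraw the graph for that state.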

  7. Follow the structure of the file below to create our basic HTML page. In the head section, we title the page, and add links to all those other files, so the computer knows they exist. You could leave the CSS file out, if you want to put your CSS directly on this page. Lines 7-10 must be included to make the chart work; they load the files from Google that make the charts go. Then, in the body section of the page, you really need just one line. Create a div, and give it an id name you'll remember. When we create the chart, we need to tell it what area of the page, or div, to live in. So, here it is.

  8. I'm kind of an overachiever, so I'd like to add a few more elements to the HTML page, to be more thorough with notes about the data, and make it easy to flip between different states. So, I'm going to add a few things to the HTML now, but you could skip this for the simplest chart. I added some divs to create a headline and descriptive text, or chatter, about this page. I created a very simple form (lns 32-25) that allows you to select states. Right now, there are no choices in it, but the select element has an id name so I can find it easily. And under the chart, I have a div to enter a disclaimer about certain states' data, when I need to.

  9. We said that CSS determines how things look. That's what this file does. I'm adjusting size, margin, adding some borders. If you have questions about CSS, google it, and look for a site called w3schools.com. It'll help.

  10. So far, we haven't actually made our graph, or done any coding at all. Let's go into that .js file. Different blocks of code are called functions. Just like every paragraph we write should have a topic, every function we write should have a purpose. Then, we put it all together, ultimately, just like wrapping up a story with an intro or conclusion. This first one is called drawVisualization(), so you can guess what it does. Blue lines with slashes are comments. You don't need to add them, but they explain what's happening.
    First, it's about restructuring the data into a table this graphing library understands. Instead of an Excel spreadsheet, this graphing library uses a DataTable. So, we'll create one (line 20). We'll lay out the years that will be in our graph (ln 22). Add a column to that DataTable to hold our years. Then add a column for each item in our headers and fill it with that header, just like filling out the first row in Excel.
    Now, we define how much data the DataTable should hold. This is determined by how many rows we have (52, with states + DC + nationwide data). Fill out the first item in each of those rows by pulling in the year for that particular row. Next, we'll divide the remaining data in two: half of the values are for 2009, half for 2010. We start counting our loop at 1 (even though we typically start counting loops at 0), because we skip the year. Divide that number by 2 to get how many columns must be filled for each year. Set each cell's value to the matching number from the data, then separately set a formatted value (which displays when you hover over it) with the commas added back in. Do all this for the 2009 data (lns 44-48) and the 2010 data (lns 50-54).
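The restructuring described in step 10 could be sketched roughly as follows. This is not the original drawVisualization() code. The helper names `buildChartData` and `toDataTable` are hypothetical, the sketch assumes each state's row is laid out as state name, then 2009 values, then 2010 values, and it assumes one chart row per year for a single state. The sample numbers are placeholders.

```javascript
// Hypothetical helper: turn one state's row into chart columns and rows.
// Assumed layout: [stateName, ...2009 values, ...2010 values], matching headers.
function buildChartData(headers, stateRow, formatNumber) {
  var years = ["2009", "2010"];
  // Skip index 0 (the state name); half of what remains belongs to each year.
  var perYear = (stateRow.length - 1) / 2;

  // First column holds the year; the rest hold one crime category each.
  var columns = [{ type: "string", label: "Year" }];
  for (var i = 1; i <= perYear; i++) {
    // Drop a trailing two-digit year such as " 09" so only the category remains.
    columns.push({ type: "number", label: headers[i].replace(/ \d\d$/, "") });
  }

  var rows = [];
  for (var y = 0; y < years.length; y++) {
    var row = [years[y]];
    for (var c = 1; c <= perYear; c++) {
      var value = stateRow[c + y * perYear];
      // v is the raw number the chart plots; f is the text shown on hover.
      row.push({ v: value, f: formatNumber(value) });
    }
    rows.push(row);
  }
  return { columns: columns, rows: rows };
}

// Browser only: copy the result into a Google DataTable.
function toDataTable(chartData) {
  var data = new google.visualization.DataTable();
  chartData.columns.forEach(function (col) {
    data.addColumn(col.type, col.label);
  });
  data.addRows(chartData.rows);
  return data;
}

// Usage with placeholder numbers (not real FBI figures).
var sampleHeaders = ["State", "Robbery 09", "Larceny Theft 09", "Robbery 10", "Larceny Theft 10"];
var sampleRow = ["Alabama", 1000, 20000, 900, 19000];
var sample = buildChartData(sampleHeaders, sampleRow, function (n) {
  // Stand-in formatter for whole numbers; any comma-adding function would do.
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
});
```

Splitting the work this way keeps the array juggling separate from the Google-specific calls, so the first function can be checked without a browser.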

  11. We've set up the data now, but still haven't drawn the chart. Remember we created that home for the chart, that div in HTML. We name the chart, and tell the chart to go in that special section. Then we have a part called chart.draw, which...draws the chart. If you've ever customized a Twitter widget or anything to embed, it's the same idea. Set your options. Want to add a legend? Just tell the computer what side of the graph to draw it on. We tell it the graph is stacked, which allows us to divide the bar into sections, or crime categories. Put labels on your axes, x horizontal, y vertical. Define what colors you want to use. Make the background color gray, add a border that's 1 pixel thick (strokeWidth).
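A hedged sketch of the options from step 11 follows. The helper names, the title and axis wording, the sizes, the div id passed in, and the nine hex colors are all assumptions for illustration. So is the choice of `ColumnChart` (vertical bars), since the text does not say which chart class the original uses.

```javascript
// Hypothetical helper: collect the chart options in one place.
function buildChartOptions(stateName) {
  return {
    title: "Reported crimes, 2009 vs. 2010: " + stateName, // assumed wording
    width: 700,                                            // assumed size
    height: 500,
    legend: { position: "right" },   // which side of the graph the legend sits on
    isStacked: true,                 // divide each bar into crime categories
    hAxis: { title: "Year" },                 // x axis, horizontal
    vAxis: { title: "Number of incidents" },  // y axis, vertical
    // Nine illustrative colors, one per category (placeholders).
    colors: ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00",
             "#ffff33", "#a65628", "#f781bf", "#999999"],
    // Gray background with a border 1 pixel thick.
    backgroundColor: { fill: "#eeeeee", stroke: "#000000", strokeWidth: 1 }
  };
}

// Browser only: draw the chart inside the div with the given id.
function drawChart(dataTable, divId, options) {
  var chart = new google.visualization.ColumnChart(document.getElementById(divId));
  chart.draw(dataTable, options);
}

var options = buildChartOptions("Alabama");
```

Keeping the options in their own function makes it easy to change the title when a different state is selected.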

  12. Quick note: How do you decide what colors you want to use? One of my favorite tools is ColorBrewer: http://colorbrewer2.org/. It's technically made for maps, but it's great for charts, too. Tell it how many categories you have (9 in this case), and what type of data you have. (Is it a range of numbers? Are there positives/negatives? Or just different categories? Those are sequential, diverging and qualitative data, respectively.)

  13. Remember how we had to format the numbers with commas, so they display nicely when you hover on the graph? We need to write some code to make this happen. Now, we add a separate block of code, which does this. Here's a secret lesson of coding -- you're not the only person who has needed to do this. This is the Power of Google. Look up what you want to do. Sure enough, you'll find an addCommas() function here, which you can copy/paste: http://www.mredkj.com/javascript/nfbasic.html.
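For reference, a comma-adding function in the spirit of the one at that link might look like this. It is a sketch rather than a verbatim copy, and the variable names are my own.

```javascript
// Add thousands separators to a number, for example 1234567 becomes "1,234,567".
function addCommas(value) {
  // Work on text, and keep anything after a decimal point untouched.
  var parts = String(value).split(".");
  var whole = parts[0];
  var fraction = parts.length > 1 ? "." + parts[1] : "";

  // Matches a run of four or more digits: some digits, then a final three.
  var pattern = /(\d+)(\d{3})/;
  // Each pass splits the last three digits off the leading run of digits.
  while (pattern.test(whole)) {
    whole = whole.replace(pattern, "$1,$2");
  }
  return whole + fraction;
}
```

Calling `addCommas(1000)` gives back the text "1,000", which is what we want to show on hover while the chart still plots the plain number.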

  14. We could stop here, and be almost done. But we have data for multiple states, and it'd be nice to be able to switch the graph around, depending on what info I care about. If you remember, we have a drop-down menu in our HTML page. We need to tell the data to change when we pick a new state. We'll write a quick piece of code to do that, called changeState(). We're using something called jQuery, which makes JavaScript a little easier. We brought it in on the HTML page. First, we test to make sure the selected menu item has a value. If it does, we continue. We clear our disclaimer section, so two disclaimers don't display at once. We select the value of the state that we picked in the menu, and give it a name. Then we select a new row from the data that corresponds to that state (ln 87). Now, we use what's called a conditional statement. If the value is a certain state, then display this disclaimer. We go through this with a few disclaimers that were included with the original dataset. That way, a disclaimer only shows when it applies to the selected state. After that, we redraw the graph with the new data.
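Here is a minimal sketch of how changeState() could be put together. It swaps the chain of conditional statements for a lookup table, which is my substitution, not the original approach. The element ids `stateSelect` and `disclaimer`, the helper names, the placeholder disclaimer entry, and the idea that drawVisualization() accepts a state's row are all assumptions.

```javascript
// Hypothetical lookup table. The key and the text are PLACEHOLDERS,
// not disclaimers from the real dataset.
var disclaimers = {
  "Example State": "Placeholder disclaimer text for Example State."
};

// Find the row whose first item is the state's name; null if there is none.
function findStateRow(allCrimeData, stateName) {
  for (var i = 0; i < allCrimeData.length; i++) {
    if (allCrimeData[i][0] === stateName) {
      return allCrimeData[i];
    }
  }
  return null;
}

// Return the disclaimer for a state, or an empty string if it has none.
function disclaimerFor(stateName) {
  return Object.prototype.hasOwnProperty.call(disclaimers, stateName)
    ? disclaimers[stateName]
    : "";
}

// Browser only. Assumes jQuery, a <select id="stateSelect">, a
// <div id="disclaimer">, the global allCrimeData, and a drawVisualization()
// that takes the selected state's row.
function changeState() {
  var stateName = $("#stateSelect").val();
  if (!stateName) {
    return; // nothing selected, so leave the page alone
  }
  $("#disclaimer").empty(); // so two disclaimers never display at once
  var stateRow = findStateRow(allCrimeData, stateName);
  if (stateRow === null) {
    return;
  }
  $("#disclaimer").text(disclaimerFor(stateName));
  drawVisualization(stateRow); // redraw the graph with the new data
}
```

A lookup table keeps the disclaimer wording in one spot, though a few if statements work just as well when there are only a handful of states to handle.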

  15. One last step to bring it all together. We tell this big function, like a conclusion, to wait until all the pieces are ready. First, we set up the dropdown menu, and fill in all the states from the data, adding a --- between the nationwide data selector (United States) and the individual states, to separate them (lns 112-117). Then, we attach that changeState function to run every time the selected item in the menu changes. Finally, run that changeState function for the United States, which is the default. We want to display the appropriate disclaimers, even if no one has clicked on that drop-down menu yet. And, voila! We did it!!! Contact me at meminkoff@gmail.com or @michelleminkoff on Twitter with questions. It'd be my privilege if we could continue to learn and work together.
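To illustrate step 15, here is one way the final setup could be sketched. The helper names `buildMenuEntries` and `setUpPage` and the element id `stateSelect` are hypothetical. The sketch assumes the first row of the data is the United States, that changeState() from step 14 exists, and that the page uses Google's older `google.load` loader (pages on the newer loader would call `google.charts.setOnLoadCallback` instead).

```javascript
// Build the list of menu entries, with "---" right after the nationwide row.
// Assumes the first row of the data is "United States".
function buildMenuEntries(allCrimeData) {
  var entries = [];
  for (var i = 0; i < allCrimeData.length; i++) {
    entries.push(allCrimeData[i][0]);
    if (i === 0 && allCrimeData.length > 1) {
      entries.push("---");
    }
  }
  return entries;
}

// Browser only. Assumes jQuery, a <select id="stateSelect">, the global
// allCrimeData, and the changeState() function from step 14.
function setUpPage() {
  var menu = $("#stateSelect");
  buildMenuEntries(allCrimeData).forEach(function (name) {
    // The separator gets an empty value so changeState() ignores it.
    var value = name === "---" ? "" : name;
    menu.append($("<option>").val(value).text(name));
  });
  menu.on("change", changeState);  // run every time the selection changes
  menu.val("United States");       // the default
  changeState();                   // draw once before anyone clicks
}

// Only wire things up when Google's loader is actually present.
if (typeof google !== "undefined") {
  google.setOnLoadCallback(setUpPage);
}
```

Calling changeState() once at the end is what draws the nationwide chart and any matching disclaimer on first load.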