« « Tools to help bring data to your journalism

Join the data convo at PBS NewsNav » »

Using Google spreadsheets as your database

Posted by on Nov 22, 2010 in Blog, data visualizations | 22 Comments

UPDATE 2/7/12: The whole, copyable script has been added to the bottom of this post, at the request of a few folks. Hopefully that’s helpful.

The database is a key part of data journalism.  We spend a good amount of time making structured of the unstructured, and then we need to shove it somewhere so a web app can serve up relevant pieces, perform calculations to help us understand nuance.  Straight text files just have their limits. But does that mean you can’t build a data app that dynamically updates if you don’t know the difference between MySQL and PostgreSQL and the newest Harry Potter movie? — Oh wait, that last one’s a sequel, not SQL! *Wow, that was a bad pun.  Sorry. It’s a sickness.*  No, no. no.  You can use Google Spreadsheets, a fantastic tool already being used in many newsrooms to foster collaboration. I wouldn’t trust it with my multi-year investigation, but it can serve your data into an HTML table on your website, and will adjust as you (or other approved contributors) add to the spreadsheet.

Would you like an example?  While I was at my LA Times internship, there was this thing called the Bell scandal.  If you read this blog, you’re probably plugged in enough to have heard of it, but for the uninitiated, in the city of Bell, an LA Times report on the high salaries of the city manager and other top administrators generated…interest. Turns out Bell wasn’t the only city where this was an issue.  So, people started asking, “How much does my city manager make?”  Sounds like a job for the data folks!

We came up with this.  I can’t take much of the credit here, reporting interns and staffers did the legwork to get the data, and Web producer Anthony Pesce did much of the design and did the front-end implementation, but through Tribune’s proprietary content management system, as opposed to our “guerilla warfare” separate server approach that houses the Data Desk projects.

The part for us Data Desk devs was how to pull this collaborative spreadsheet into the HTML template housed in the CMS.  I churned out a rough version of the quick backend stuff overnight, and Ken Schwencke and Ben Welsh helped make it run lighter and smoother, and made many improvements I couldn’t have thought of.

Here’s a walkthrough, in case it’s of use to anyone out there, and also because I’d love to hear about related tips, tricks and improvements you have. It’s not any big secret, all available by viewing source on the Bell database web page.

A caveat: This is ABSOLUTELY NOT to say you don’t need to use a database backend anymore. You have more control over a database, it can take bigger chunks of information, do more complicated analysis, not to mention that you know exactly when it’s going public, and what going public means. (Maybe I’m just a control freak.) But if your data set is small, and you don’t have a full development server, that shouldn’t be a complete road block. There are cool, and better yet, useful, things you can do.

This quick project will depend on Javascript, a key way to get interactivity into your website without resorting to technologies that are less compatible across platforms because of their newness (HTML5) or their proprietary platforms (Flash). I’m actually spending more time on this language than Python/Django at the moment, for PBS-related reasons.

Here we go!

1. First, create your Google Spreadsheet, add in all your data, and then hit Publish in the publication settings.  At this point, it will be open to the public.  This is necessary because the computer needs to be able to “see” your spreadsheet, which it can’t if the spreadsheet is privacy protected.

2. In your blank doc, create an area to type Javascript.  Tell the computer what to expect by typing:

<script type="text/javascript">// <![CDATA[
<pre lang="JAVASCRIPT">var pre_html = '
<table id = "salaries">
<thead id="anthonyHeader">
<tr>
<th>City</th>
<th>Name</th>
\
<th>2009 taxablecompensation*</th>
\
<th>Note</th>
<th>Pending request</th>
<th>Population</th>
<th>Median income</th>
<th>Contract</th>
</tr>
</thead>
<tbody>';

4. Create an empty Javascript variable which I called actual_html.  This is the html that correspodns to each record in the spreadsheet, that is, each row.  Leave it as an empty string (two quotes next to each other, indicating no text) for now, we’ll get back to it.

var actual_html='';

5. Create a variable to hold the end of the table which will run after all records are completed.  Close that table up.

var post_html = '</tbody>
</table>
'

6. Calculate the length of filled-out information in the spreadsheet. Json refers back to the spreadsheet, feed refers to how we’re pulling the spreadsheet, entry tells the computer we’re looking at each row as a separate item or entry, and length…well…you get that one.

&nbsp;Json.var len = json.feed.entry.length

7. Loop through the spreadsheet as many times as there are records.  After each loop, increase the counter variable by one, so the count moves forward.

for (var i=0; i<len; i++) {

8. First item of business, try to grab something from the rollover description field on the given row.  These are notes that show up if LAT reporters had additional details to add for context.

&nbsp;var d = json.feed.entry[i].gsx$rolloverdescription.$t;

9. Check if d exists, in this case, to see if there is text for a rollover or not.  Depending on whether it exists, we then generate a row with a whole block of HTML.  This isn’t a super efficient way to do it, but got the job done on a deadline.  You can go look it all up in our source (just go to the link, view source, and search for ‘function displayContent(json)’ and you’ll get to the right place), but here’s a general structure, built from an excerpt of the code.

actual_html += ['
<tr>
<td>', \
'<a target="_blank" title="Mapping L.A." href="',\
 json.feed.entry[i].gsx$mappinglink.$t, '">', json.feed.entry[i].gsx$city.$t, '</a>',&nbsp;'</td>
</tr>
'].join('');

I cut some lines out, but that starts the table row, calls in a url, generates a tooltip for it, brings in the attribute from the spreadsheet, close the row, all between the [].  Then, there’s a .join(“), which means combine everything into one large block of text, and separate them with nothing.  It makes more sense if you think of joining ten items as part of a list where you might want to combine into one string of text, but shove commas in between. One more note: The actual_html += means we append, or add, each version of this variable to the last, so as we loop through the spreadsheet, we’re recording the code for all the rows.  Otherwise, you would only see the last row.

10. Finally, grab a div you’ve set up where the table should go, by getting it by its id, and adjust its inner_html to take in all the table code.

document.getElementById('salary_table_container').innerHTML = pre_html + actual_html + post_html
11. Close your script.

// ]]></script>
12.  This is a little backward, but include a reference to your actual spreadsheet, and tell the computer from now on you will be referring to it by another name — that’s what the “callback=” does.

 
How do you make it pretty?  Use HTML and CSS.  We also used Jquery, and a fantastic plugin called DataTables.  It was sort of my not-so-secret weapon of interactivity awesomesauce over at the LAT, and you probably will see it again soon at PBS.  The beauty of open source, it can follow you wherever you go, West or East.  I’ll do a separate post on DataTables sometime soon.

Here’s a copyable version of the complete script:

<html>
    <head>
    <title>Completed sample for turning a Google Spreadsheet into an HTML Table</title>
    <!-- Add a div where the table will go -->
 
    <script type='text/javascript'>
        //Write a function that allows us to add commas to raw numbers in spreadsheet for display.  2413 becomes 2,413.
        function addCommas(nStr)
        {
        	nStr += '';
        	x = nStr.split('.');
        	x1 = x[0];
        	x2 = x.length > 1 ? '.' + x[1] : '';
        	var rgx = /(\d+)(\d{3})/;
        	while (rgx.test(x1)) {
        		x1 = x1.replace(rgx, '$1' + ',' + '$2');
        	}
        	return x1 + x2;
        }
 
        //this will be executed after we "fetch" the contents of a spreadsheet. The json parameter will hold our spreadsheet's data
        function displayContent(json) {
            //start an html table and write out our headers.  Using <td> tags so it isn't bold and centered, which is the <th> default.
            var pre_html = '<table id = "salaries"><tr><td>City</td><td>Name</td>\
    <td>2009 taxable<br />compensation*</td>\
    <td>Population</td><td>Median income</td></thead><tbody>';
            //Create an empty string to hold the HTML. We will put table data here.
            var actual_html='';
            //After we grab the table, close the HTML table.
            var post_html = '</tbody></table>'
            //figure out how many rows our spreadsheet has
            var len = json.feed.entry.length
 
            //loop through the spreadsheet, gathering data
            for (var i=0; i<len; i++) {
                //for each row, add the following to actual HTML, grabbing it as a list, and then joining the list together as one long string.
                //Uses HTML for table cells, and then grabs attributes from the spreadsheet, using appropriate syntax. Enter your table header in the Google spreadsheet between            
                //the gsx$ and .$t.
                actual_html += [
                    '<tr><td>', 
                    '<a target="_blank" title="Mapping L.A." href="', 
                    json.feed.entry[i].gsx$mappinglink.$t, 
                    '">', 
                    json.feed.entry[i].gsx$city.$t, 
                    '</a>', 
                    '</td><td>', 
                    '<a target="_blank" title="City website" href="',  
                    json.feed.entry[i].gsx$personwebsite.$t, 
                    '">', 
                    json.feed.entry[i].gsx$name.$t, 
                    '</a>', 
                    '<br /><span>', 
                    json.feed.entry[i].gsx$title.$t, 
                    '</span>', 
                    '<a target="_blank"  href="', 
                    json.feed.entry[i].gsx$email.$t, 
                    '"></a>', 
                    '</td><td>', 
                    addCommas(json.feed.entry[i].gsx$compensation.$t), 
                    '</td>', 
                    '<td>', 
                    addCommas(json.feed.entry[i].gsx$population.$t), 
                    '</td>',
                       '<td>', 
                        addCommas(json.feed.entry[i].gsx$medianincome.$t), 
                        '</td>','</tr>'
                ].join('');  
            }
            //put all three of our HTML strings into our div we made at the top of the page
            document.getElementById('salary_table_container').innerHTML = pre_html + actual_html + post_html
 
        }   
    </script>
</head>
<body>
<div id="salary_table_container"></div>
 
<!-- Grab the data from our google spreadsheet, using the callback to call the function we just created -->
<script src="http://spreadsheets.google.com/feeds/list/0AlO_LanGCFltdGlsa2hSNHhOdHlCYmpFeXdlY0ZNS2c/od6/public/values?alt=json-in-script&amp;callback=displayContent" type="text/javascript"></script>
 
</body>
</html>
Questions in the meantime?  Just holler.
« « Tools to help bring data to your journalism

Join the data convo at PBS NewsNav » »
  • http://apps.chicagotribune.com Brian Boyer

    This technique can also be applied server-side. Our homicides app (http://homicides.redeyechicago.com/) and all our table-setter apps (http://blog.apps.chicagotribune.com/2010/11/19/tostones-easy-tablesetter-testing-and-deployment/) use Google spreadsheets as a backend.

    [Reply]

    Mark Ng Reply:

    You can also combine with the google charts API to create charts that go with the data, too. I knocked together a proof of concept a while back : ( http://hack.markng.co.uk/broadband/ )

    [Reply]

  • Michelle Minkoff

    Thanks for the added examples, guys! Trying it server-side is def. on the to-do list.

    [Reply]

  • Tef

    Hi,

    I try to modify this code. I use my own spreadsheet but It won’t run. $t is undefined. Can anyone tell why $t is undefined.

    Thanks,
    T.

    [Reply]

  • CKAYE

    It ran “as is” for me. Thanks for keeping your source spread sheet available Michelle! I’d love to see the spread sheet as it is in Google Docs. Can you provide a read only link?

    [Reply]

  • Greg Papuga

    Great example, exactly what I have been looking for – Thanks! Question though, I’d like to take this just example just a little bit farther and was hoping you could help. I have a spreadsheet with dates (and other information) in it that were posted from a Google Form text field. The dates that were posted are in no particular order in the spreadsheet column, however I would like to extract that spreadsheet data via the JSON method above and have the information from the spreadsheet displayed on a webpage in date order. How easy is it to tweak the above example to meet that criteria? and can you point me in the right direction or help me with it?

    Thank you!

    [Reply]

    Michelle Minkoff Reply:

    Just two lines of code to sort that data, shouldn’t be bad. Wrote up an example here: http://michelleminkoff.com/2012/11/21/learning-how-to-sort-a-javascriptjson-object-with-underscore/, let me know if that helps. If you need more info, feel free to shoot me a follow-up email via the Contact Me tab at the top of the site.

    [Reply]

  • Pingback: Michelle Minkoff » Learning how to…sort a JavaScript/JSON object with Underscore()

  • http://www.facebook.com/people/Ben-Marwick/520471502 Ben Marwick

    This is really helpful, I can reproduce your example and modify it for my own spreadsheets. One thing I\’m curious about is how to query with the json. For example, how would your code look if you just wanted to show only the records where \’title\’ = \’City manager\’? Thanks again for sharing this great resource!

    [Reply]

  • Constantin

    Hi,

    By blank doc, do you mean making a new script within the spreadsheet (under tools?)

    [Reply]

    Michelle Minkoff Reply:

    Neither. These are instructions for creating a Web page using JavaScript, which you then can post on the Web somewhere (could be on your own website, or embedded in a WordPress post, or anywhere else. A “blank doc” is a page you create in a text editor (like TextWrangler, TextMate, Sublime Text) to write custom code for a web page, which you then publish online.

    The Google spreadsheet is being used as a place to store the data for your project, but not where the final visible display lives, in this example.

    Hope that helps.

    [Reply]

    Constantin Reply:

    Thank you for your response! the project im working on is specifically with google sites. do you know if I can just create my own webpages and still be able to use app scrips? Sorry if this is too off topic of your blog post.

    [Reply]

    Michelle Minkoff Reply:

    I don’t use Google Sites a lot myself, but here’s a link that may help — it explains how to add custom HTML, CSS and JavaScript (which we use in this tutorial/example), to a Google Site. https://support.google.com/sites/answer/2500646?hl=en

  • Yukesh

    hi i m doing a project where i get my google form response in a spreadsheet. Now i want that spreadsheet responses to be converted to a SQLite database or any database which later connected to my android app…….please help me out with a solution.

    [Reply]

  • John

    If I wanted to do this, how would I change the code to link to my public spreadsheet? I’m just attempting to copy your code and use it as a code block in squarespace which works. But when I try to change your spreadsheet URL to my public one on Google, it doesn’t work. Hmm.

    [Reply]

  • Jonathan Baillie Strong

    Great post here Michelle, but it seems like the front end example of where you published the data is no longer available?
    (http://www.latimes.com/news/local/la-me-salary-database-js,0,6202025.htmlstory doesn’t show the data)

    [Reply]

  • Shannon Augustine

    Curious – are you able to pull data from multiple worksheets? I have this spreadsheets with 5 tabs and I am trying to develop one page with two worksheet tabs data in it, but it seems to not be working – it overrides the first one source.

    [Reply]

  • Joel

    I’m curious how to use a document with multiple sheets as well… Each sheet would correspond with it’s own DIV.

    [Reply]

  • Joel

    I’ve tried virtually everything I can think of, to get this to work with my Google spreadsheet, and can’t.

    It works fine, when I copy and paste your demo, however, not with my own Spreadsheet. Nothing shows up.

    [Reply]

  • Vyacheslav Maltsev

    Миха, спасибо! Это просто охуенно! То что искал :)

    [Reply]

  • Vyacheslav Maltsev

    Code not working in html-box. It says: failed to load external url full?alt=json-in-script&callback=listEvents

    Any ideas what happened? Spreadsheet still can be opened.

    [Reply]