Data Journalism


StateImpact Reporters’ Toolbox: Where to Start?

Steven Depolo / Flickr

StateImpact Policies and Protocol:

What are These Things Called “Topic Pages”?

Building Your Posts and Adding Fun Stuff:

Here is a map of some of the wonderful features you can add to your site. Click on any highlighted feature to learn how to add that feature to a post.  Greatest hits include:

Google Fusion Maps with Joe Wertz

StateImpact Oklahoma’s Joe Wertz, mapping wizard, shows us how to create an interactive map using Google Fusion Maps and some raw data in this quick and simple tutorial.

Here is the data that you can use, if you’d like to try this at home.

(Don’t be alarmed! The audio doesn’t kick in until about ten seconds in.)

Creating a Legend

Of special note, around minute 31 (31:00), Joe walks us through creating a legend for your map. To do that (in the new and improved version):

  1. In the map view, click on the little arrow on top of the “Map of…” tab and navigate down to “change map styles.”
  2. Whether you are dealing with points or polygons, make sure that your bucket ranges are exactly as you’d like them to appear in your legend.
  3. Click on “automatic legend” (your last option on the list to the left).
  4. Create a title for your legend.
  5. Decide where you’d like your legend to appear (typically in the bottom right-hand corner, unless your map makes a different corner preferable).
  6. Hit “Save.”

Embedding an Iframe Into Your Site

At minute 32 (32:35), Joe shows us how to embed an iframe with your map and legend right into your site.

  1. Click on Share (in the upper right-hand corner) and publish it to the web. You’ll see your current sharing settings listed towards the top. If it doesn’t say “Public,” hit the blue “Change” and select “Public on the web.”

Summer Camp: Excel Formulas with Kyle Stokes

In this 28-minute webinar, StateImpact Indiana reporter Kyle Stokes walks you through some Excel basics, including:

  • using cell references!
  • copying formulas down a column!
  • calculating sum and difference!
  • and calculating rate of change!

To follow along, download the data here.
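If it helps to see the arithmetic behind sum, difference and rate of change spelled out, here is a minimal Python sketch. The enrollment figures are made up for illustration (they are not from the webinar's data), and the Excel formulas in the comments assume the two values sit in cells A2 and B2, which is our assumption, not necessarily how Kyle's spreadsheet is laid out.

```python
# Hypothetical figures for illustration only; not from the webinar's data file.
enrollment_2011 = 1200  # imagine this value in cell A2
enrollment_2012 = 1350  # imagine this value in cell B2


def percent_change(old, new):
    """Rate of change from old to new, expressed as a percentage of old."""
    if old == 0:
        raise ValueError("rate of change is undefined when the starting value is 0")
    return (new - old) / old * 100


total = sum([enrollment_2011, enrollment_2012])             # like =SUM(A2:B2)
difference = enrollment_2012 - enrollment_2011              # like =B2-A2
change = percent_change(enrollment_2011, enrollment_2012)   # like =(B2-A2)/A2, shown as a percent

print(total, difference, round(change, 1))
```

Note that the Excel version of the last formula returns a fraction (0.125 here) that you would format as a percentage, while this sketch multiplies by 100 directly.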

And for quick reference, the full post includes a sampling of some of our favorite Excel formulas for journalists.

Summer Camp: Excel Basics with Molly Bloom

Let StateImpact Reporter Molly Bloom lead you through some basics of using Excel, including:
  • Importing data from txt and csv files!
  • Sorting and filtering data!
  • and Creating easy and super-powerful pivot tables for data analysis and organization!

To experiment with the files used in this video:
  1. Report Card data: Go here and click on File –> Download as –> Microsoft Excel.
  2. Enrollment data is here.
  3. School levy data is here.
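For the curious, the same three moves (import, sort and filter, pivot) can be sketched outside of Excel, too. The example below is a rough Python illustration using only the standard library. The sample data, the column names and the helper `load_rows` are all invented for this sketch; they do not come from the files used in the video.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for a downloaded csv file.
SAMPLE_CSV = """district,school,enrollment
North,Adams Elementary,410
South,Baker Middle,620
North,Clark High,980
South,Dover Elementary,350
"""


def load_rows(text):
    """Import: parse csv text into a list of dicts, with enrollment as a number."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["enrollment"] = int(row["enrollment"])
    return rows


rows = load_rows(SAMPLE_CSV)

# Sort: largest enrollment first.
by_size = sorted(rows, key=lambda r: r["enrollment"], reverse=True)

# Filter: keep only one district.
north_only = [r for r in rows if r["district"] == "North"]

# Pivot: total enrollment per district, like a pivot table with
# "district" as the row label and a sum of "enrollment" as the value.
totals = defaultdict(int)
for r in rows:
    totals[r["district"]] += r["enrollment"]

print(by_size[0]["school"], len(north_only), dict(totals))
```

To try it on a real file, you could swap `io.StringIO(text)` for an opened file handle, though real-world data usually needs some cleaning before the number conversion will work.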

Webinar: The Basics of Data Journalism

Data: It’s not just for computer geeks anymore.

There is an awful lot that data can do for you and your stories if you get over the idea that it is something for other, more computer-savvy people and get into the habit of working it into your daily routine. Think of it as simply another–but infinitely more authoritative–source, one that allows you to speak with more authority, see beyond the clutter and the “he said-she said” and find trends and facts that aren’t otherwise available.

The digitization of government records offers a great opportunity for reporters to hold public agencies accountable, increase government transparency and inform the public–but only if we know how to ask for, obtain and use them. So don’t be afraid. Be psyched.

In the webinar below, I walk through the basics to get you started. (See below THAT for a mini-recap and a whole bunch of links to the tools and sites mentioned in the video.)

The Basics of Data Journalism (April 11, 2013) from NPR Digital Services on Vimeo.


2013 NICAR’s Greatest Hits

The annual computer-assisted reporting conference organized by IRE and NICAR is a treasure trove of tips, tools and inspiration. There is always something for just about anyone in the news industry–from the old-school newspaperman who won’t send an email to the young, enthusiastic programming geek–and everyone in between.

Pete Karl II / Flickr

People share notes, experiences and know-how at NICAR and IRE conferences. It's cool like that.

In fact, if you consider yourself to be very much in between–or maybe even slightly towards the old-school side of things–this post is for you. We’ve sifted through the labyrinth of tipsheets, blog posts and (almost) exhaustive collections of all that was generously shared, referenced or demonstrated at this year’s conference to bring you some of the most useful data-driven reporting tools on offer.

We’ve organized it into seven broad categories:

  1. General CAR Tips & Best Practices from the Pros
  2. Research Tools
  3. Social Media Tools
  4. Data Cleaning
  5. Excel
  6. Inspiration: A Small Collection of some of the Best Data-Driven Stories of 2012 (with a special emphasis on energy, education and economy stories)
  7. Advanced Coursework in Database Analysis, Super Stealth Spy Stuff and Web Scraping

We haven’t tested everything just yet, so if you find any of this particularly useful–or not–please do tell us about it in the comments section or via email.

That said, we hope you’ll find at least some of the collection below as useful and inspiring as we do. Enjoy!


Getting That Data Out of That Ugly PDF

A real-life PDF received by Amanda Loder of StateImpact New Hampshire. She had about 15 pages like this that she wanted to put into a sortable table. And she did it!

Has some government agency sent you a completely messy, crookedly scanned copy of an Excel print-out? Are they claiming that it would be impossible for them to share with you the original spreadsheet?

Don’t despair! There is still hope for you.

There is a very special trick called “optical character recognition” (or “OCR,” if you’re cool) that can help you convert those fuzzy tables into actual, usable Excel spreadsheets. While OCR software can be costly, we have found at least one website that can help you out for much less scratch: Online OCR. The only caveat is that they only let you do about five pages for free. After that, you have to sign up and get a password and pay something like 7 cents a sheet. Annoying, but still better than manual data entry.

But be warned: You’ll want to go through and make sure that your numbers still add up to whatever they add up to in your original PDFs. Depending on the quality of the scan, a 7 might look like a 1, a 3 like an 8… You get the idea. Make sure you dutifully clean and check the work. And check it again.
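One way to do that "do the numbers still add up?" check is to compare each row's line items against the total printed on the page. Here is a minimal Python sketch of the idea. The rows are invented for illustration, and the function name `find_mismatched_rows` is ours; it assumes your table happens to include a printed total for each row, which not every document will.

```python
def find_mismatched_rows(rows):
    """Return the indexes of rows whose line items do not sum to the printed total."""
    bad = []
    for i, (parts, printed_total) in enumerate(rows):
        if sum(parts) != printed_total:
            bad.append(i)
    return bad


# Hypothetical OCR output: (line items, row total as printed in the PDF).
ocr_rows = [
    ([120, 45, 30], 195),
    ([710, 18, 22], 750),
    ([110, 83, 7], 260),  # the page said 170; the OCR read the 7 as a 1
]

suspect = find_mismatched_rows(ocr_rows)
print(suspect)
```

A flagged row only tells you that something in it was misread (possibly the total itself), so you would still go back to the original page to see which digit is wrong. And a row that passes is not proof of a clean scan, since two errors can cancel out.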

And then pat yourself on the back for overcoming yet another obstacle in your quest for government transparency. Well done, you!

Legislative Coverage, StateImpact Style

Here they come! It’s time for some lawmakin’.

Some of you have one year of covering the legislature for StateImpact under your belts. For states like Texas, where the legislature convenes every other year, this will be a first spin through. Either way, it should be an interesting ride. Odd-year sessions following an election are when the bulk of state legislative business gets done. And it blows by fast. So be ready!

Here are some ideas on how to think about covering the legislature in a StateImpact sort of way.

Advanced Data Tutorial: Converting PDFs To Spreadsheets

As we’ve all discovered, many government agencies prefer releasing records in portable document format, or PDF. Sometimes that’s helpful, for example with narrative text files. But not so much for data.

This tutorial will show you one free way to convert PDFs with tabular data into spreadsheets. The data I’ll use comes from a PDF I converted recently: The number of sworn police officers for the top 50 municipalities in the United States. Here’s what the file looks like:

You can’t copy/paste this text into a spreadsheet, unfortunately, and you don’t want to waste time or risk a correction by typing the data manually. So let’s convert it.


Google Image Charts

Data visualizations in the form of bar, column and line charts can help readers spot important trends in the numbers behind the stories, and Google’s Image Chart Editor makes it relatively easy to produce simple but effective charts for your posts.

  • Only use bar or line charts
  • All axis labels and legend text set at 11px
  • No grids
  • Choose from three widths that correspond to our 12-column grid: 220px (3 columns), 300px (4 columns) or 620px (8 columns)
  • Start with a vertical dimension that will create a golden-proportion rectangle by multiplying your chosen width by 0.618. The vertical dimension should be appropriate to the distribution of the data, so adjust from your golden rectangle as necessary.
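The golden-rectangle arithmetic from the last guideline is easy to work out once for all three widths. Here is a small Python sketch. The names `GRID_WIDTHS` and `starting_height` are ours, and rounding to the nearest whole pixel is our assumption, since the guideline doesn't say how to round.

```python
# Allowed chart widths in pixels, keyed by how many grid columns they span.
GRID_WIDTHS = {3: 220, 4: 300, 8: 620}
GOLDEN_FACTOR = 0.618


def starting_height(width_px):
    """Starting chart height for a given width, rounded to the nearest pixel."""
    if width_px not in GRID_WIDTHS.values():
        raise ValueError("width must be 220, 300 or 620 pixels")
    return round(width_px * GOLDEN_FACTOR)


heights = {width: starting_height(width) for width in GRID_WIDTHS.values()}
print(heights)
```

That works out to starting sizes of 220 by 136, 300 by 185 and 620 by 383 pixels. These are only starting points; as the guideline says, adjust the height to suit the distribution of your data.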