Data: It’s not just for computer geeks anymore.
Data can do an awful lot for you and your stories if you get over the idea that it is something for other, more computer-savvy people and get into the habit of working it into your daily routine. Think of it as simply another source, but a far more authoritative one: it lets you see beyond the clutter and the “he said-she said” and find trends and facts that aren’t otherwise available.
The digitization of government records offers reporters a great opportunity to hold public agencies accountable, increase government transparency and inform the public, but only if we know how to ask for, obtain and use those records. So don’t be afraid. Be psyched.
In the webinar below, I walk through the basics to get you started. (See below THAT for a mini-recap and a whole bunch of links to the tools and sites mentioned in the video.)
Five Ways to Find the Data
You find data in the same way that you find any other source for a story: with your journalist’s curiosity, by following tips and fact-checking claims… In short, through shoe-leather reporting. Remember, data can give you insights, but it is very rarely THE story.
- Call the agency you’re interested in and ask how it keeps the data and whether the data is online. Operate under the assumption that it is public information. Even if the agency agrees to send it over, be sure to send an email outlining exactly what you want, both to be clear and to start a record of your request.
- Online forms, charts and statistics all provide clues that there is a database to be found. Follow the paper trail and think about where this information goes.
- Ask academics, industry folks, watchdog groups, activists and worker bees where to find what you’re after.
- Google’s Advanced Search function can also help you find data if you search for specific file types (such as .xls) and the url of the agency that might hold the given dataset.
- If you need to file a Freedom of Information Act or Sunshine request, check out the law the agency operates under before you make your request. Make your request clear and focused. You may also wish to explicitly ask for information in ‘disaggregated’ or ‘granular’ form.
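To make the Advanced Search tip concrete, here is a minimal sketch that assembles a query from Google's `site:` and `filetype:` operators. The function name, the `example.gov` domain and the keywords are placeholders of mine, not something from the webinar; swap in the agency you are actually after.

```python
from urllib.parse import quote_plus


def build_data_search(agency_domain, file_type="xls", keywords=""):
    """Build a Google query limited to one site and one file type.

    agency_domain and keywords are whatever you are hunting for;
    the values used below are hypothetical.
    """
    parts = []
    if keywords:
        parts.append(keywords)
    parts.append(f"site:{agency_domain}")    # only pages on this domain
    parts.append(f"filetype:{file_type}")    # only files of this type
    return " ".join(parts)


# Hypothetical example: spreadsheets of inspection reports on example.gov
query = build_data_search("example.gov", "xls", "inspection reports")
url = "https://www.google.com/search?q=" + quote_plus(query)

print(query)
print(url)
```

You can also just type the printed query straight into the Google search box. Try other file types too (csv, xlsx, pdf), since agencies are not consistent about how they post things.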
You’ve got the Data, Now Go Forth and Report
Once you have the data, analyze it. And we’re talking about understanding every field in a spreadsheet. Know where it came from and check the math for errors.
And then do the following:
- Get a data dictionary and/or code sheet to help interpret it.
- Interview your data. And take notes. Ask it simple questions to start out with (e.g., averages, maximums, minimums, top tens, etc.).
- Look at different ways to measure it (e.g., per capita, rate of change, change over time, etc.).
- Put it into groups (e.g., geographical, historical, demographic) and compare those.
- Approach data with caution. Human error happens. A lot.
- Use tools like OpenRefine to help clean and standardize your data.
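If you want to see what "interviewing your data" looks like outside a spreadsheet, here is a rough sketch in plain Python. The counties, populations and incident counts are made up purely for illustration, and the per-10,000 rate is just one reasonable way to measure per capita. It asks the simple questions first, then measures per capita, then compares groups.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only
rows = [
    {"county": "Adams", "region": "North", "population": 50000, "incidents": 120},
    {"county": "Baker", "region": "North", "population": 20000, "incidents": 90},
    {"county": "Clark", "region": "South", "population": 80000, "incidents": 160},
    {"county": "Dixon", "region": "South", "population": 10000, "incidents": 70},
]

# Simple questions: average, maximum, minimum
average = mean(r["incidents"] for r in rows)
highest = max(rows, key=lambda r: r["incidents"])
lowest = min(rows, key=lambda r: r["incidents"])

# A different measure: incidents per 10,000 residents
for r in rows:
    r["rate"] = r["incidents"] * 10000 / r["population"]

# "Top ten" style ranking, here by rate instead of raw count
ranked = sorted(rows, key=lambda r: r["rate"], reverse=True)

# Groups: add up each region, then compare the regional rates
region_totals = defaultdict(lambda: {"population": 0, "incidents": 0})
for r in rows:
    region_totals[r["region"]]["population"] += r["population"]
    region_totals[r["region"]]["incidents"] += r["incidents"]
region_rates = {
    region: t["incidents"] * 10000 / t["population"]
    for region, t in region_totals.items()
}


def percent_change(old, new):
    """Rate of change between two periods, as a percentage."""
    return (new - old) / old * 100


print("Average incidents:", average)
print("Most incidents:", highest["county"], "| fewest:", lowest["county"])
print("Ranked by rate:", [r["county"] for r in ranked])
print("Rate by region:", region_rates)
```

Notice that in this invented example the county with the most incidents has the lowest rate, and the one with the fewest has the highest. That is exactly why you look at more than one measure before you decide what the story is.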
My Favorite Tools for Data Journalism
You’ve got the data. You’ve analyzed it. Now you want to do a little more with it. Here are the essential tools to help you tell the story.
- Excel: Start with simple actions like adding and subtracting and then move to pivot tables. They allow users to count the number of times something comes up, aggregate data or work out averages. (Tip: the “Help” function in Excel is really useful for explaining pivot tables and the many other wonders of Excel.)
- Google Fusion Tables: This is a free, versatile program that offers several functions (merging files and creating graphs, charts, maps and filterable tables). Google also makes it easy to share your files and collaborate with coworkers.
- Google Charts: This is also free and helps you visualize the data.
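If pivot tables still feel mysterious, it can help to see what one is doing under the hood. This is only a sketch with invented payroll records, not a replacement for Excel: it counts how many times each agency comes up, totals the amounts and works out the averages, which are the three things a basic pivot table does.

```python
from collections import Counter, defaultdict
from statistics import mean

# Invented records for illustration only
records = [
    {"agency": "Parks", "amount": 400},
    {"agency": "Parks", "amount": 600},
    {"agency": "Police", "amount": 1000},
    {"agency": "Police", "amount": 500},
    {"agency": "Police", "amount": 300},
    {"agency": "Fire", "amount": 800},
]

# Count: how many times does each agency come up?
counts = Counter(r["agency"] for r in records)

# Collect the amounts for each agency so they can be aggregated
amounts_by_agency = defaultdict(list)
for r in records:
    amounts_by_agency[r["agency"]].append(r["amount"])

# Aggregate: total and average per agency
totals = {agency: sum(vals) for agency, vals in amounts_by_agency.items()}
averages = {agency: mean(vals) for agency, vals in amounts_by_agency.items()}

for agency in counts:
    print(agency, counts[agency], totals[agency], averages[agency])
```

In Excel terms, the agency is the row label and the count, sum and average are the values.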
Graduate-level tools that you’ll want to learn (or find someone to operate) when you get into more advanced data analysis include:
- To examine and migrate data: TextWrangler
But in the meantime, here are some of my favorite resources for getting your feet wet and becoming more comfortable with data-driven reporting:
- NICAR conference, bootcamps, listserv
- IRE Resource Center
- Numbers in the Newsroom, by Sarah Cohen
- Data Journalism Handbook (written by a consortium of international rock stars of data-driven journalism)
- UC Berkeley’s Knight Digital Media Center (especially this one on using Google Spreadsheets)
And for inspiration: