Cleaning and Recoding
Before you can analyze your data, you’ll need to check for missing data and recode some variables.
Collecting and using data is hard work! You’re bound to run into some situations where you are missing data. Maybe the program participants skipped a survey question because they didn’t understand the wording, or maybe the program staff forgot to enter the information into your nonprofit’s database. Either way, you need to know how much missing data you’re dealing with so you can find a solution. In this tutorial, you’ll learn how to automatically color-code your blank cells to visualize how much missing data you’re dealing with.
You’ll need to determine exactly how many people are in your dataset (and much missing data you’re dealing with). I teach you how to use three formulas to scan for missing data:
- =count() tells you how many cells contain numeric data
- =countblank() tells you how many of those cells are empty
- Then, use =sum() to make sure the =count() and =countblank() add up
Another important data cleaning step is to determine whether your dataset contains any duplicate entries. I teach you two strategies to scan for duplicate ID numbers:
- Pivot table tallies
- Color-coding (a conditional formatting feature)
Bonus! Download the spreadsheet used in this video.
Use =if to recode variables.
Here’s a second video showing you how to use =if to recode variables.