Intro to Survey Analysis

Lately, more people have approached me with questions on how to analyze survey data. Survey analysis raises some very interesting problems, and many of them start with mistakes made when the survey is created. Here are some general rules to follow when making surveys.

  1. For multiple choice answers, make sure that your intended audience always has an option to answer, even if that means incorporating an “other” answer.
  2. If you are performing pre- and post-survey analysis, do not change your categorical questions between the two surveys, even slightly.  E.g. if the pre-survey asks:
    1. What is your income? A) <20K    B) 20-50K    C) >50K

    Then make sure your post-survey has the exact same phrasing and answer options.  Identical categories let you compare responses before and after an experiment directly.

  3. Be careful with using too many short-answer or write-in questions.  Before you know it, all of your analysis will rely on text mining to answer questions.  Text mining can be very hard and complicated.
  4. Keep questions short and simple.  The more you confuse your target audience, the more confusing your results will be.
  5. If you are using a rating scale throughout the survey, be consistent.  Surveys that switch rating scales tend to confuse respondents.
  6. Low response rates can lead to bias, because the people who respond may differ from those who do not.  Try to get a large audience (see #7).
  7. I always encourage incentives for surveys.  Just be careful that the incentive doesn’t bias your audience.  E.g. offering an entry in a drawing for a tablet could attract a younger, more tech-savvy crowd.  If that doesn’t impact your audience, then go for it.
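
Rule 2 is easy to check mechanically before either survey goes out. Here is a minimal sketch in Python, assuming the answer options for each question are kept in a dictionary keyed by a question id. The names PRE_OPTIONS, POST_OPTIONS, and mismatched_questions are made up for illustration and do not come from any survey tool.

```python
# Hypothetical answer options for the same question in each survey.
PRE_OPTIONS = {"income": ["<20K", "20-50K", ">50K"]}
POST_OPTIONS = {"income": ["<20K", "20-50K", ">50K"]}


def mismatched_questions(pre, post):
    """Return the sorted ids of questions whose answer options differ.

    A question that is missing from one survey also counts as a mismatch.
    """
    mismatches = []
    for question in sorted(set(pre) | set(post)):
        if pre.get(question) != post.get(question):
            mismatches.append(question)
    return mismatches


if __name__ == "__main__":
    # An empty list means the pre and post surveys line up.
    print(mismatched_questions(PRE_OPTIONS, POST_OPTIONS))
```

The comparison is deliberately strict: a reordered or slightly reworded option list is flagged, since even small differences make the before/after comparison harder.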

I will not cover things related to determining audience, supervising data collection, obtaining consent, etc.

When it comes to the actual analysis, the first part is data entry.  This is a time-consuming, monotonous, and generally torturous activity.  Depending on the size of the audience and the number of responses, there are a few options.

  1. Manual labor: If you are an academic researcher, you can probably offer undergraduate credits for data entry.  If not, then you are either on your own, or you can hire someone to help you.
  2. Software conversions:  Software exists to convert paper surveys or electronic documents into data.  It can range from free to expensive, but beware: you get what you pay for.  This method will probably involve lots of data janitoring (Yes, go ahead and add the word janitoring to your word/phone dictionary).
  3. Website Survey Services:
    1. Surveymonkey
    2. Fluidsurveys
    3. Sogosurvey
    4. Wufoo
    5. Surveymoz
    6. Smartsurvey
    7. Surveygizmo
    8. Surveyanalytics

I highly recommend using one of these online services.  They help you create the survey, handle the data entry, and even do some simple high-level analysis for you (summary statistics mostly).
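
For a sense of what "summary statistics" means here, this is a rough sketch of the kind of numbers such a service might report for one rating question. It assumes the answers are on a 1-5 scale and have already been exported as a list of integers. The summarize function and the sample ratings are invented for illustration.

```python
from collections import Counter
from statistics import mean, median


def summarize(ratings):
    """Return basic summary statistics for one rating-scale question."""
    return {
        "n": len(ratings),            # number of responses
        "mean": mean(ratings),        # average rating
        "median": median(ratings),    # middle rating
        "counts": Counter(ratings),   # how often each rating was chosen
    }


if __name__ == "__main__":
    # Hypothetical responses to a single 1-5 rating question.
    print(summarize([4, 5, 3, 4, 2]))
```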

Now we get to move on to data janitoring.  Some data entry methods are more prone to errors than others.  Software data entry, in which programs try to determine what people have physically written down, is the most prone to errors and will require some heavy data cleanup.  Manual entry is probably the easiest, as the data should be clean to start with and in a form that is pretty easy to use.  The survey websites will deliver clean data, but some data janitoring might be needed to get the data into a usable form.

In order to do analysis, most surveys will require dealing with silly human errors.  Spelling errors, grammatical errors, abbreviations, and more will make it hard to do analysis.  If you have write-in/short-answer questions, you have two options to overcome this problem.  The first is overcoming it with a large enough sample size, such that one person writing “This place rox” will not outweigh the many others who write “This place rocks”.  The second option, given a smaller data set, is to run spell check.  Be careful with this option because spell check can be overactive.  For example, if people write proper nouns starting in lowercase, spell check will want to look for English words to replace them.
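
One conservative alternative to a full spell checker is a small hand-built corrections table. The sketch below assumes you have skimmed the responses and listed the misspellings you actually want fixed. The CORRECTIONS table and the function names are hypothetical, and this swaps in a lookup table rather than a real spell-check library. Because only listed words are changed, lowercase proper nouns are left alone.

```python
from collections import Counter

# Hypothetical corrections, built by hand after skimming the responses.
CORRECTIONS = {"rox": "rocks", "gr8": "great"}


def normalize(answer):
    """Lowercase an answer, strip simple punctuation, fix listed misspellings."""
    words = []
    for word in answer.lower().split():
        word = word.strip(".,!?")
        # Words that are not in the table (e.g. proper nouns) pass through.
        words.append(CORRECTIONS.get(word, word))
    return " ".join(words)


def tally(answers):
    """Count how often each normalized answer occurs."""
    return Counter(normalize(answer) for answer in answers)


if __name__ == "__main__":
    print(tally(["This place rocks", "This place rox", "this place rocks!"]))
```

With a large sample you can often skip the corrections table and just look at the most common normalized answers, since rare misspellings will sit at the bottom of the tally.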

Finally, we have arrived at the data analysis part.  Give yourself a pat on the back; the hard stuff is over.
