Survey Monkey and Mechanical Turk – The Verification Code


Mechanical Turk (MTurk) has become an important data collection tool for social scientists – especially for experimental political science research. For a very low cost, one can collect thousands of responses, and there is a growing literature on the representativeness of samples collected from MTurk (see this 2011 article by Berinsky, Huber, and Lenz). I have used MTurk several times with the Qualtrics Survey Suite, and it is easy to link Qualtrics with MTurk using their Web Service element (see this fantastic tutorial for linking Qualtrics with MTurk). Essentially, MTurk allows “requesters” to send “workers” to an external survey link. Workers verify that they took the entire survey by entering a randomly generated code into MTurk after completion. The survey software that the requester uses must be able to generate this code and store it within the dataset. The requester then checks that the code the worker entered matches the code that was generated and approves the submission so the worker can get paid.
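If you manage batches programmatically rather than through the MTurk web interface, this verify-and-approve step can be scripted. Below is a minimal sketch using boto3 (Amazon's official Python SDK); the HIT ID, the set of valid codes, and the credentials setup are placeholders you would swap for your own, not anything specific to this post.

```python
# A minimal sketch of the requester-side check, assuming boto3 is installed
# and AWS credentials are already configured. The HIT ID and valid_codes
# are hypothetical placeholders.
import xml.etree.ElementTree as ET

import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# Codes your survey software actually generated, exported from its dataset.
valid_codes = {"aB3dE9fG1h", "Zx8Kq2Lm0P"}  # hypothetical examples


def extract_code(answer_xml: str) -> str:
    """Pull the first free-text answer out of an assignment's Answer XML."""
    root = ET.fromstring(answer_xml)
    for elem in root.iter():
        if elem.tag.endswith("FreeText") and elem.text:
            return elem.text.strip()
    return ""


response = mturk.list_assignments_for_hit(
    HITId="YOUR_HIT_ID",  # placeholder
    AssignmentStatuses=["Submitted"],
)

for assignment in response["Assignments"]:
    code = extract_code(assignment["Answer"])
    if code in valid_codes:
        mturk.approve_assignment(AssignmentId=assignment["AssignmentId"])
    else:
        mturk.reject_assignment(
            AssignmentId=assignment["AssignmentId"],
            RequesterFeedback="Completion code did not match our records.",
        )
```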

I love Qualtrics! It is by far the best survey software I have ever used, and when your institution is willing to pay the cost, it is the best option. However, if you work at a smaller institution or have a limited budget, Qualtrics might be off the table. Survey Monkey can do almost anything Qualtrics can do: the base subscription costs about a third of what Qualtrics does per year, includes unlimited surveys and responses, and allows for randomization of the treatment, skip logic, question logic, etc. I switched from Qualtrics to Survey Monkey mainly because of the cost, and I have been reasonably happy. However, when I went to collect responses from MTurk, I quickly realized that Survey Monkey does not allow the user to generate random strings and display them to respondents. It also does not allow scripts or custom HTML or JavaScript. The most expensive subscription allows for custom variable collection, but at that price you might as well get Qualtrics, and the custom variable option is unnecessarily complex and does not really get the job done. I scoured the web looking for solutions and found none. I contacted Survey Monkey, and they had no solution either. So after some thought, I came up with an answer – it is not perfect, but it works!

The Solution

If you are in this position, this information might be of use to you. After creating your survey, create a text variable to be shown to the respondent at the end of the survey. Find a random string generator on the web, such as RANDOM.org, and generate some random strings. I used 10-character strings with both letters (upper and lower case) and numbers and made sure they were all unique. In Survey Monkey, I then used the Random Assignment feature to construct 20 different “text” variables, each consisting of one of the random strings generated earlier (Survey Monkey only allows random assignment for up to 20 treatments, each with a 5% chance of being displayed). When a worker completes the survey, they are shown one of these strings, which is also stored in the dataset; the worker copies the string and pastes it into MTurk. This simulates an actual random code generator and makes it easy to verify that a worker has taken your survey.
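If you would rather script this step than use a website, a few lines of Python can produce the same kind of strings. This is just one way to do it; nothing here depends on Survey Monkey itself.

```python
# A quick alternative to RANDOM.org: generate 20 unique 10-character codes
# (upper- and lower-case letters plus digits) to paste into Survey Monkey's
# Random Assignment "text" variables.
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_codes(n: int = 20, length: int = 10) -> list[str]:
    codes = set()
    while len(codes) < n:  # loop until we have n unique codes
        codes.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return sorted(codes)


for code in make_codes():
    print(code)
```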

It is not perfect, but it works. There are a few precautions. First, make sure to allow only one response per IP address so that workers cannot stack responses. Second, always verify the submitted code against the results in the Survey Monkey dataset: because each text string has a 5% chance of being displayed, the same code can be shown more than once. That does not matter as long as the code matches the variable in the dataset with the proper timestamp. If a worker were to copy a previous code and reuse it, you could check whether that code was actually displayed to them at the time of survey completion. Also, do not use the same set of 20 codes for every batch; it is time-consuming, but I replace the random assignment “text” variables with a different set of random strings for every batch. You can also confirm that different workers are taking the survey by checking their Worker IDs in MTurk.
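Since the verification itself is just a matching exercise, it is easy to script. Here is a hedged sketch using pandas; the file names and column names (“Answer.surveycode”, “verification_code”) are assumptions that depend on your HIT template and Survey Monkey export, so adjust them to your own files.

```python
# Match each worker-submitted code against the codes your Survey Monkey
# export says were actually displayed. Column names below are assumptions --
# check them against your own exports before running.
import pandas as pd

mturk = pd.read_csv("mturk_batch_results.csv")    # downloaded from MTurk
survey = pd.read_csv("surveymonkey_export.csv")   # downloaded from Survey Monkey

# Codes that were actually shown to respondents.
displayed = set(survey["verification_code"].dropna())

mturk["code_matches"] = mturk["Answer.surveycode"].isin(displayed)

# Flag codes submitted by more than one worker. Because each of the 20
# strings can be shown more than once, a duplicate is not automatically
# fraud -- check the timestamps in the survey export before rejecting.
dupes = mturk[mturk.duplicated("Answer.surveycode", keep=False)]

print(mturk[["WorkerId", "Answer.surveycode", "code_matches"]])
print("Codes submitted more than once:")
print(dupes[["WorkerId", "Answer.surveycode"]])
```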

Of course, this is not as good as having an actual random number generator built into the survey itself – as Qualtrics has successfully done. Hopefully, Survey Monkey will add something similar in the future. But it does get around the problem with a simple fix that allows for verification and gets your workers paid.

If you have comments or suggestions, I would love to hear them – maybe I am overlooking something that could go wrong with the process. Generally, all you are trying to do is get a code that appears in both MTurk and the dataset so you can verify your workers. This is the only solution I have found for Survey Monkey short of paying for more functionality and then messing around with custom variables (see the link to learn more about Custom Variables in Survey Monkey). I find it odd that Survey Monkey does not allow scripts to be embedded in their surveys when even free services like Google Docs can do this sort of thing. Anyway, the functionality in Survey Monkey is adequate, and with this simple fix, anyone can use the software with MTurk.

I hope someone finds this useful! If you have a better solution, please let me know!

Update: I want to thank Amazon’s Mechanical Turk for crediting me for this solution. Apparently, it has helped a lot of people! Here is the link to the Amazon tutorial based on this blog post.