Dr. Johanna Choumert-Nkolo, Henry Cust, Callum Taylor
In this blog post series, we present the results from our recent paper:
Choumert-Nkolo J., Cust H., Taylor C. (2019) Using paradata to collect better survey data: Evidence from a household survey in Tanzania. Review of Development Economics
Our first post in this series, ‘What are paradata?’, can be found here.
In this second post, we turn to one of the most commonly used types of survey paradata: timestamps. Timestamps are questions within the questionnaire tool that record the time at which certain actions take place within an interview, for example at the start and end of a questionnaire.
Timestamps provide useful information at every stage of a research project. They can, for example, be used to identify interviewers with relatively short or long interviewing times, revealing potential malpractice or the need for additional supervision or training. An extended list of the uses for timestamps can be found in our previous blog post or in our article.
Timestamps can be programmed at any point within a questionnaire, such as at the beginning and end of certain sections. This allows researchers to check the length of important modules and detect any interviewers who could be cutting corners. Section timestamps can also provide important information to researchers during training and piloting by helping to identify particularly time-consuming or bloated sections of the questionnaire.
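As a rough illustration of this kind of check, the Python sketch below computes interview lengths from start and end timestamps and flags interviewers whose median length is far from the overall median. The records, the field names, and the 0.5 and 1.5 thresholds are all invented for the example and are not taken from our survey; the same function could be applied to section start and end timestamps.

```python
from datetime import datetime
from statistics import median

# Invented example records: one per interview, with ISO-formatted timestamps.
interviews = [
    {"interviewer": "A", "start": "2019-03-01T09:00:00", "end": "2019-03-01T10:05:00"},
    {"interviewer": "A", "start": "2019-03-01T11:00:00", "end": "2019-03-01T12:00:00"},
    {"interviewer": "B", "start": "2019-03-01T09:00:00", "end": "2019-03-01T09:25:00"},
    {"interviewer": "B", "start": "2019-03-01T10:00:00", "end": "2019-03-01T10:20:00"},
    {"interviewer": "C", "start": "2019-03-01T09:00:00", "end": "2019-03-01T10:02:00"},
    {"interviewer": "C", "start": "2019-03-01T11:00:00", "end": "2019-03-01T11:58:00"},
]


def minutes_between(start, end):
    """Minutes elapsed between two ISO-formatted timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def flag_interviewers(records, low=0.5, high=1.5):
    """Flag interviewers whose median duration is below `low` or above `high`
    times the overall median. The thresholds are arbitrary assumptions."""
    durations = {}
    for rec in records:
        length = minutes_between(rec["start"], rec["end"])
        durations.setdefault(rec["interviewer"], []).append(length)

    # Overall median across every interview, regardless of interviewer.
    overall = median(d for ds in durations.values() for d in ds)

    flags = {}
    for name, ds in durations.items():
        ratio = median(ds) / overall
        if ratio < low:
            flags[name] = "short"
        elif ratio > high:
            flags[name] = "long"
    return flags


flags = flag_interviewers(interviews)
```

A flag of this kind is only a prompt for follow-up by supervisors; short interviews can have legitimate causes, such as small households.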
Throughout our survey, we used 24 timestamps to give us a detailed picture of the time taken to complete certain groups of questions. These included 21 hidden timestamps (triggered automatically when specified questions were answered) that recorded the times of certain sections and questions. They proved vital in enabling us to reach our target interview length.
During the pilot, we found an average length of 113 minutes per interview, which is almost double the target of 60 minutes. Through careful restructuring and elimination of questions, informed by section-level breakdowns, we were able to reduce the length of the questionnaire while still retaining the sections and questions that were important for our research. Question exclusion was considered on a section-by-section basis, targeting those sections where there was the most to be gained in terms of time and the least to be lost in terms of meeting our research aims. Between piloting and the end of fieldwork, more than 100 questions were removed from the tool. As a result, the average interview time fell below our target.
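To show what a section-level breakdown might look like, the sketch below ranks sections by mean duration and computes how far the total overshoots the target. The section names and per-section minutes are invented for illustration, chosen only so that they sum to the 113-minute pilot average; they are not our actual section timings.

```python
from statistics import mean

TARGET_MINUTES = 60  # target interview length stated in the post

# Invented pilot durations in minutes, one value per pilot interview.
pilot_sections = {
    "household_roster": [18, 22, 20],
    "assets": [30, 34, 32],
    "perceptions": [40, 44, 42],
    "consumption": [20, 18, 19],
}


def rank_sections(sections):
    """Return (section, mean minutes) pairs, longest section first."""
    means = {name: mean(values) for name, values in sections.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


ranked = rank_sections(pilot_sections)
total = sum(minutes for _, minutes in ranked)
overshoot = total - TARGET_MINUTES  # minutes that need to be cut

for name, minutes in ranked:
    print(f"{name}: {minutes:.0f} min ({minutes / total:.0%} of interview)")
print(f"total {total:.0f} min, {overshoot:.0f} min over target")
```

A ranking like this only shows where the time goes. Deciding what to cut still depends on how much each section contributes to the research aims.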
Timestamps can also be used after the conclusion of fieldwork to assess the quality of the data and to give insights into the behaviour of respondents. In our paper, we investigate two aspects of this. Firstly, we compare the time taken for two types of question: quantitative, factual questions about the household and its members, and perception-based questions where the respondent may have to consider their answers. Secondly, we look at timestamps taken on every row of a roster section of the questionnaire on the topic of household assets.
Our analysis reveals that perception-based questions take significantly longer to answer than factual questions. We also find inconsistencies in the order in which some questions were answered, and offer explanations for other interviewer idiosyncrasies.
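Both checks can be sketched in a few lines of Python. The example below is a simplified illustration and not the analysis in our paper: it averages answer times by question type without any significance test, and it lists roster rows whose timestamp is earlier than that of the row before. All data and names are invented.

```python
from datetime import datetime
from statistics import mean


def mean_seconds_by_type(durations):
    """Average answer time per question type from (type, seconds) pairs."""
    grouped = {}
    for qtype, seconds in durations:
        grouped.setdefault(qtype, []).append(seconds)
    return {qtype: mean(values) for qtype, values in grouped.items()}


def out_of_order_rows(row_times):
    """Return positions of roster rows stamped earlier than the row before."""
    parsed = [datetime.fromisoformat(t) for t in row_times]
    return [i for i in range(1, len(parsed)) if parsed[i] < parsed[i - 1]]


# Invented answer times in seconds.
answer_times = [("factual", 8), ("perception", 20), ("factual", 12), ("perception", 24)]

# Invented row timestamps for an asset roster; the third row is out of order.
asset_rows = [
    "2019-03-01T09:30:00",
    "2019-03-01T09:30:40",
    "2019-03-01T09:30:20",
    "2019-03-01T09:31:10",
]

averages = mean_seconds_by_type(answer_times)
misordered = out_of_order_rows(asset_rows)
```

An out-of-order row does not by itself indicate a problem. It may simply mean that the interviewer went back to correct an earlier answer.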
Our discussion of timestamps shows how they can be useful for the planning and preparation of survey fieldwork, how they can help researchers monitor in-field activities, and how they can be used to evaluate data quality in the post-field phase.
While our timestamps did not uncover any particular issues, researchers should be aware of the problems that such analysis can reveal, such as interviewers not following survey protocols precisely, which could lead to a wide range of biases and measurement errors. We strongly recommend that future surveys make use of multiple timestamps spread throughout all sections of the survey.