Dr. Johanna Choumert Nkolo & Marie Mallet
Why should researchers and survey managers care about data protection?
In the era of dematerialisation (i.e. a paperless era), ensuring the security of data collected in the field is paramount. Recent major data breaches and cyber-attacks call for further reflection on, and review of, the protocols used to manage and store personally identifiable information. This blog post provides key recommendations for securing data in the field and during data transfers, drawing on EDI’s 15 years of experience in the field in East Africa and beyond.
First of all, what is data security? To define it, we can use the CIA triad: Confidentiality, Integrity and Availability. Confidentiality means that data should be protected from disclosure to unauthorised parties. Integrity means that data should be protected from being altered by unauthorised parties. Availability means that data should remain accessible to authorised parties whenever they need it.
Data protection is closely related to research ethics. Institutional Review Boards (Ethics Committees, Ethical Review Boards, etc.) require organisations to have a Data Protection Policy to ensure the privacy of the personal data collected, especially for sensitive personal data.
Demand for data protection systems is increasing in a world where information is more and more digitised and where researchers and other actors in research projects are interconnected via the internet and the cloud. In this context, how can we ensure data protection? In the next section, we share insights into EDI’s expertise and protocols for protecting data in line with the CIA principles.
How does EDI ensure data security during the field data collection?
At EDI, after the project and questionnaire design, robust data security procedures start in the field at interview level, with the respondent, and are maintained through to the transfer of the final cleaned datasets to the researchers. This is integral to our policy of collecting data securely and to the highest quality. During the data collection phase of a project, we can identify three levels where strict data security protocols have to be put in place. The first level is during an interview and relates to respect for the privacy and confidentiality of the respondent. The second level is the storage of the data collected after an interview. The third level, during the fieldwork, is the transfer of the data collected from the field to the Headquarters (HQ) at the EDI Bukoba office and on to our partners.
1. During an interview
At the start of every interview, the interviewer reads the consent statement with the respondent, informing the respondent that the data collected will not be shared with anyone other than the research team. Depending on the nature of the project, the consent statement may only be read aloud by the interviewer (e.g. for a phone survey) or may need to be on a paper form so that the respondent’s signature can also be collected. The consent statement should state whether the datasets will be anonymised, i.e. whether each respondent will be assigned a unique identification number (Respondent ID). The Respondent ID ensures that the name and other personally identifiable information cannot be linked to the answers given. It is crucial that information about the storage and use of the data is shared with the respondent to build a relationship of trust between the interviewer and the respondent, especially for sensitive surveys.
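The role of the Respondent ID described above can be illustrated with a minimal Python sketch. It is not EDI’s or surveybe’s implementation: the function name `split_identifiers`, the field names and the sample records are all hypothetical. It assumes that personally identifiable fields are moved into a separate key table, so that the answers dataset carries only the Respondent ID.

```python
import secrets

def split_identifiers(records, pii_fields):
    """Separate PII from answers, linked only through a Respondent ID.

    Hypothetical helper for illustration; field names are assumptions.
    """
    key_table = []   # Respondent ID + PII: store separately, encrypted
    answers = []     # Respondent ID + answers: the dataset used for analysis
    used_ids = set()
    for record in records:
        # Random, non-sequential ID so the ID itself reveals nothing.
        respondent_id = secrets.token_hex(4)
        while respondent_id in used_ids:
            respondent_id = secrets.token_hex(4)
        used_ids.add(respondent_id)
        pii = {k: v for k, v in record.items() if k in pii_fields}
        rest = {k: v for k, v in record.items() if k not in pii_fields}
        key_table.append({"respondent_id": respondent_id, **pii})
        answers.append({"respondent_id": respondent_id, **rest})
    return key_table, answers

# Made-up example records
raw = [
    {"name": "Respondent A", "phone": "000-0001", "household_size": 5},
    {"name": "Respondent B", "phone": "000-0002", "household_size": 3},
]
key_table, answers = split_identifiers(raw, pii_fields={"name", "phone"})
```

In this sketch, anyone holding only `answers` cannot recover names or phone numbers; re-identification requires the separately stored `key_table`.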
In most surveys, during the interview, complete and strict privacy is observed. Indeed, the presence of another household member or any third person may dramatically affect the quality of the data. At EDI, we train our interviewers to establish a safe and private environment for the entire interview. This can be done by proposing to the respondent to be interviewed outside the home, in a place where the respondent feels comfortable speaking openly. Should an interview be interrupted, such as a guest coming in the room, the interviewer immediately stops the interview and locks the device to protect the data collected from being accessed by anyone else outside the research team.
2. In field after an interview
At EDI, interviews are conducted using Computer Assisted Personal Interviewing (CAPI) software, surveybe. Once the interview is completed, the data is kept securely on the interviewer’s device used to conduct the survey where the interview file is encrypted. For most EDI projects, electronic devices are used and are protected by a password only known by the EDI Project Research Team. In addition, an extra layer of security can be used in field on tablets by using file or folder locker applications that hide all interview files. The interviews will be made visible using a password. For surveys using other electronic devices, encryption software (such as VeraCrypt) may be used to encrypt the interview files once completed. The interviewers are also trained on computer/tablet best practice to make sure that the data is kept safe before being transferred to HQ.
3. During the data transfer from field to HQ
One of the numerous advantages of CAPI is that the data collected in the field can be transferred to the Coordination Team located in Bukoba (HQ) on a daily basis. At the end of each field day, the supervisor meets the interviewers and securely collects all of the interview files from that day. Each individual file is checked by the interviewers and supervisors before it is transferred via the cloud to HQ, which takes over control once the files are received from the field. Interviewers always keep an encrypted backup of all the interviews that they have conducted on their tablet/computer.
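One way to support the Integrity principle during such a transfer is to record a checksum for every file before it leaves the field and to re-check it on receipt. The Python sketch below illustrates this general technique under stated assumptions. It is not a description of EDI’s actual procedure, and the helper names and file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(folder):
    """Map each file name in the folder to its checksum (done in the field)."""
    return {p.name: sha256_of(p)
            for p in sorted(Path(folder).iterdir()) if p.is_file()}

def verify_manifest(folder, manifest):
    """Return names of files that are missing or altered (done on receipt)."""
    problems = []
    for name, expected in manifest.items():
        path = Path(folder) / name
        if not path.is_file() or sha256_of(path) != expected:
            problems.append(name)
    return problems

# Demonstration with made-up files in a temporary folder
with tempfile.TemporaryDirectory() as field_dir:
    (Path(field_dir) / "interview_001.dat").write_bytes(b"encrypted bytes 1")
    (Path(field_dir) / "interview_002.dat").write_bytes(b"encrypted bytes 2")
    manifest = build_manifest(field_dir)
    unchanged = verify_manifest(field_dir, manifest)
    # Simulate a file being altered in transit
    (Path(field_dir) / "interview_002.dat").write_bytes(b"tampered")
    changed = verify_manifest(field_dir, manifest)
```

A checksum detects accidental corruption or alteration, but it does not hide the content, so it complements encryption rather than replacing it.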
How does our electronic survey software contribute to data security?
Data security is one of our top priorities in the development of surveybe.
First of all, the software automatically encrypts each newly collected interview file. This means that no one can access the file without having both the surveybe designer and the correct surveybe questionnaire. The surveybe designer (used for building the questionnaire) is deliberately not installed on the interviewers’ tablet devices, which mitigates the risk to the data in the unlikely event of a device being stolen or lost in the field. In addition, the reference data (all the external data used from a separate table in a questionnaire, such as the geographic data of the sample or, for a follow-up survey, respondent data from the previous round) are encrypted in the questionnaire and cannot be accessed outside the surveybe implementer. The reference data are therefore encrypted on the interviewer’s device and cannot be extracted in a readable format such as an Excel spreadsheet.
Surveybe also offers the possibility to set the scope of each question at the design stage. This is particularly useful for anonymising data, because personally identifiable information can be kept out of the final datasets. This per-question security scope is a unique feature of surveybe: it lets the designer control what data is included and visible at different stages of the data cycle.
Figure 1. The different question scope modes in surveybe
An example of the private scope mode is presented in the screenshots below.
Figure 2. Setting the scope of a question in the surveybe designer
Figure 3. Visualisation of the question in the surveybe implementer in the first interview file created (image on the left) and the same interview file when re-opened (image on the right)
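The general idea of a per-question scope can be sketched in a few lines of Python. This is only an illustration of the concept and does not reproduce surveybe’s scope modes, settings or file format: the scope labels `"public"` and `"private"`, the question names and the function `export_interview` are all hypothetical.

```python
# Hypothetical scope assigned to each question at design time
QUESTION_SCOPES = {
    "respondent_name": "private",
    "phone_number": "private",
    "household_size": "public",
    "main_crop": "public",
}

def export_interview(interview, scopes):
    """Keep only answers whose question is explicitly marked public.

    Questions with no recorded scope are excluded as well, so a
    forgotten setting cannot leak data into the exported dataset.
    """
    return {question: answer
            for question, answer in interview.items()
            if scopes.get(question) == "public"}

# Made-up interview record
interview = {
    "respondent_name": "Respondent A",
    "phone_number": "000-0001",
    "household_size": 4,
    "main_crop": "banana",
}
exported = export_interview(interview, QUESTION_SCOPES)
```

Here `exported` contains only `household_size` and `main_crop`. Defaulting to exclusion is a design choice of this sketch, not a documented behaviour of any particular software.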
The surveybe team is also currently developing a self-completion functionality which will lock a particularly sensitive screen once the respondent has completed it, so that nobody (including the interviewer) can view or edit those answers in the interview file.
In addition to the above data security functionalities, surveybe is also compatible with other security software and applications, so that it can fit with a user’s own file infrastructure protocols. This flexibility allows the user to benefit from the internal protection already included in surveybe plus any additional security offered by the software used in combination. With surveybe, the user is free to choose the desired level of data security without being limited by survey software incompatibility constraints. For example, on tablets, surveybe can be combined with any locker application that makes a file or folder invisible on the device, so that the surveybe interview file and any attachments collected during the interview (such as pictures or audio recordings) can be hidden with such an application.
Further resources on data protection
- CAPI software for data encryption during fieldwork
  - Surveybe CAPI software developed by Economic Development Initiatives (EDI) Limited
- Training and readings on ethics and data protection
  - Online course on “Protecting Human Research Participants”, National Institutes of Health (NIH) Office of Extramural Research
  - Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research, The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research (1979)
  - Ethical Codes & Research Standards, Office for Human Research Protections (OHRP), U.S. Department of Health & Human Services (HHS)
  - Institutional review boards (IRB) and Federalwide Assurances (FWA), OHRP
  - General Data Protection Regulation: full Regulation, and 12 steps to take
- Data sharing and storage, and file encryption software programs
  - Box, file sharing and storage
  - Boxcryptor, file encryption software
  - Dropbox, file sharing and storage
  - Semaphor, file encryption software
  - VeraCrypt, disk encryption software