In the context of rapid urbanisation in low- and middle-income countries (LMICs), understanding individual mobility patterns is increasingly important. Technological advancements now allow GPS data to be collected remotely, offering new research and evaluation opportunities. This blog post explores the potential and implications of remote GPS data collection, detailing techniques our team developed in a study conducted on behalf of the World Bank Development Impact Evaluation (DIME) from 2022 to 2024. The study aims to record incidents of violence against women in public spaces amongst female university students in Dar es Salaam, Tanzania. Using computer-assisted telephone interviewing (CATI) surveys, our approach enables the collection of multiple GPS locations for each person, including their home, workplace, and other sites. This multi-location approach adds both depth and breadth, allowing researchers to explore a wide range of questions about transport, urbanisation, economic trends, and changing behaviours.
Comparison with other remote approaches
As research moved towards remote data collection at the start of the pandemic, initial guidance was to remove geolocations from CATI surveys[1],[2].
More recently, researchers have tested remote, respondent-led online approaches to capture a household's GPS location [3]. However, this had limited success in Sub-Saharan Africa, where internet connectivity and smartphone penetration are less widespread. Moreover, this method only lets respondents share their present location, which creates an additional logistical challenge for studies that need multiple event-based locations.
Respondent-led SMS approaches are more inclusive for respondents without smartphones; however, even with the connectivity barrier removed, these self-completed tasks often suffer from low response rates due to respondent reluctance or digital literacy barriers.
It is also possible to use an Enumerator-led, prompt-based approach during a CATI survey to zero in on a location on a map. In our experience, though, this can be cumbersome: maps are slow to load, which leads to respondent fatigue or frustration. It also relies on Enumerators being familiar with the study's geographic area during the interview.
How do we collect GPS remotely?
To address these challenges, we developed a post-survey geocoding approach. Enumerators guide respondents through prompt questions about landmarks or transportation hubs close to the desired location, such as schools, shops, public transport stations, or significant buildings. Using these answers, an Enumerator or Analyst can then use tools such as Python, Stata, Google Maps, and OpenStreetMap to identify nearby GPS coordinates that act as a proxy for these locations. By bypassing the drivers of low response rates and respondent fatigue, this method yields a high response rate and proxies of reasonable accuracy, ultimately creating a complete geographical dataset.
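As a rough illustration of the geocoding step, the sketch below turns a respondent's landmark description into a search request for OpenStreetMap's Nominatim service and reads a proxy coordinate from the response. This is not the study's actual pipeline: the function names, the example landmark text, and the sample response (including its coordinates) are made up for illustration, and the block parses a canned response instead of calling the service.

```python
import json
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(landmark, city="Dar es Salaam", country="Tanzania"):
    """Build a Nominatim free-text search URL for a landmark description."""
    params = {
        "q": f"{landmark}, {city}, {country}",
        "format": "jsonv2",  # JSON output
        "limit": 1,          # keep only the top-ranked match
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def first_hit_to_proxy(response_text):
    """Return (lat, lon) as floats from the first result, or None if no match."""
    results = json.loads(response_text)
    if not results:
        return None  # flag for manual lookup by an Analyst
    top = results[0]
    # Nominatim returns coordinates as strings
    return float(top["lat"]), float(top["lon"])


# Hypothetical landmark text and a made-up response, for illustration only
url = build_search_url("bus stand near the university gate")
sample_response = (
    '[{"lat": "-6.8000", "lon": "39.2800", '
    '"display_name": "Hypothetical landmark"}]'
)
proxy = first_hit_to_proxy(sample_response)
```

In practice, a request to the public Nominatim server should identify the application with a User-Agent header and stay within its usage policy of at most one request per second. Descriptions that return no match can be passed to an Analyst for a manual search.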
What are the advantages of this approach?
Geospatial technologies in general have opened up new possibilities for survey research in development and humanitarian settings. For instance, GPS coordinate data can provide information on population movement and mobility patterns, which is critical to addressing key information gaps. In this project specifically, we collected GPS coordinates for several locations per participant in the study sample via phone surveys, accurately and cost-efficiently. This method of remotely collecting location information is relatively new[4] in the development field and opens doors to collecting location information in remote areas, particularly where connectivity and smartphone penetration are low. Besides being cost-effective, this method can contribute to improving services and citizens' lives, for example by revealing the geospatial distribution of social issues, such as locations with a high incidence of gender-based violence.
How do we address ethical requirements?
Alongside the benefits of using geospatial technologies to identify locations in humanitarian and development contexts, their use also presents ethical dilemmas. Privacy, safety, data security, and data storage concerns are among the risks to consider when geocoding respondents' household locations. Given the sensitive nature of this study and the population it concerns, we had to minimise the risk of respondents' locations being identified or tracked. We took measures to maximise data security when collecting, storing, and processing the data. For example, we obtained participants' consent from the start and encrypted the collected data in transit, end to end, and at rest. Furthermore, we de-identified the data before processing them with Python, OpenStreetMap (OSM) and Google Maps.
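One possible way to de-identify records before they reach any geocoding tool is sketched below. It is an assumption about how such a step could look, not a description of the study's procedure: the field names, the secret key, and the choice of a keyed hash (HMAC-SHA256) to create a pseudonymous ID are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical names of fields that directly identify a respondent
DIRECT_IDENTIFIERS = {"name", "phone_number"}


def pseudonymise(record, secret_key):
    """Return a copy of the record without direct identifiers.

    The respondent ID is replaced by a keyed hash so that geocoded output
    can be linked back only by someone who holds the secret key.
    """
    token = hmac.new(
        secret_key,
        str(record["respondent_id"]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "respondent_id"
    }
    cleaned["pseudo_id"] = token[:16]  # shortened for readability
    return cleaned


# Made-up record and key, for illustration only
record = {
    "respondent_id": 1042,
    "name": "Example Respondent",
    "phone_number": "0000000000",
    "landmark_description": "bus stand near the university gate",
}
safe_record = pseudonymise(record, secret_key=b"hypothetical-secret-key")
```

Only the landmark description and the pseudonymous ID would then be passed on for geocoding. The key itself would need to be stored separately from the data, under the same access controls as the original identifiers.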
Assessing the validity of this approach
We assessed measurement error by comparing true and geocoded GPS locations, and found reasonable accuracy in urban settings: a median distance of 215 m across 7 data points (Q1 = 180 m; Q3 = 343 m), which is precise enough for sub-community-level identification. However, further validation is needed in rural settings, where fewer landmarks are available for geocoding. This could result in lower precision, or in multiple respondents from separate households being assigned the same landmark proxy.
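A validation of this kind can be sketched as follows: compute the great-circle (haversine) distance between each true location and its geocoded proxy, then summarise the distances with a median and quartiles. The function names and the example coordinate pairs are hypothetical, and the quartile method used here is an assumption; the study's own data and exact calculation are not reproduced.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarise_errors(pairs):
    """Summarise proxy error for ((true_lat, true_lon), (proxy_lat, proxy_lon)) pairs."""
    distances = [haversine_m(*true, *proxy) for true, proxy in pairs]
    # Needs at least two pairs; returns the three quartile cut points
    q1, median, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3}


# Made-up coordinate pairs, for illustration only
example_pairs = [
    ((-6.8000, 39.2800), (-6.8010, 39.2805)),
    ((-6.7800, 39.2500), (-6.7820, 39.2510)),
    ((-6.8200, 39.2700), (-6.8215, 39.2725)),
]
summary = summarise_errors(example_pairs)
```

With a small validation sample, the quartiles are sensitive to the interpolation method, so it is worth reporting which one was used alongside the figures.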
Despite not producing an exact household location, remotely geocoding proxy locations facilitates timely, cost-effective and relevant data collection. This method opens new doors for large-scale remote surveys aiming to capture the dynamics of an increasingly mobile population.
[1] DIME (Published: Nov 2021), Preparing for Remote Data Collection – Dimewiki. [online] Available at: https://dimewiki.worldbank.org/Preparing_for_Remote_Data_Collection
[2] Bhajibhakare, S., Chopra, A., Gupta, P., Patel, M. and South, J-Pal. (2020). Transitioning to CATI: Checklists and Resources. [online] Available at: https://www.povertyactionlab.org/sites/default/files/research-resources/transitioning-to-CATI-Checklists.pdf
[3] Young Lives ‘I’m (NOT) losing you’: collecting GPS data remotely to track survey participants in a pandemic | Young Lives. [online] Available at: https://www.younglives.org.uk/news/im-not-losing-you-collecting-gps-data-remotely-track-survey-participants-pandemic [Accessed 17 Apr. 2024]
[4] The Young Lives ‘I’m (NOT) losing you’ project has collected remote GPS data using text, WhatsApp, or Telegram messages sent to participants, who then self-record their GPS locations. This method has some limitations, such as GPS data loss due to signal dropouts, dead batteries, and signal loss when initialising the GPS on the mobile device.
Authors: Aurélie Gerbier, Johanna Choumert Nkolo, Patrick Minja, Rachel Bowers