Dr. Johanna Choumert Nkolo and Elena Perra
At EDI, we strive to provide the highest data quality by using and contributing to international best practices in questionnaire design, sampling, interviewer training, quality control, and the monitoring of paradata, among other areas. In previous blog posts (here and here), we highlighted the lack of consensus on the definition of “data quality” among researchers in development economics and survey methodology.
While much existing research discusses the importance of using high-quality data, whether primary or secondary, and the implications and costs of poor data quality (Jerven, 2017; Jerven and Johnston, 2015; Sandefur and Glassman, 2015), few authors give a precise definition of data quality. In this blog post, we draw on several sources to propose a definition and a typology of the dimensions of data quality.
Data quality is a multidimensional concept. Although it is often “treated as an intrinsic concept, independent of the context in which data is produced and used” (Strong et al., 1997), it is commonly defined as “fitness for use”, that is, data that are “fit for use by data consumers” (Strong et al., 1997).
Wang and Strong (1996) provide four generic categories of data quality:
- Intrinsic data quality
- Accessibility data quality
- Contextual data quality
- Representational data quality
In the table below, we list the dimensions of data quality presented in various sources and group each of them under one of the four categories listed above. This provides an overview of how data quality is treated in the literature, shows how its multiple dimensions interact with one another, and helps to pin down what data quality means in practice. Researchers should be aware of these dimensions and take them into account when designing data collection and research projects.
References
Australian Bureau of Statistics, 2011. Data Quality Statement Questions. General Purpose: Survey Data.
Eurostat Working Group, 2003. Assessment of Quality in Statistics. Presented at the Standard Quality Report, Methodological Documents, Eurostat, Luxembourg.
Jerven, M., Johnston, D., 2015. Statistical Tragedy in Africa? Evaluating the Data Base for African Economic Development. The Journal of Development Studies 51, 111–115.
Jerven, M., 2017. How Much Will a Data Revolution in Development Cost? Forum for Development Studies 44, 31–50.
OECD, 2002. Quality Framework for OECD Statistics.
Sandefur, J., Glassman, A., 2015. The Political Economy of Bad Data: Evidence from African Survey and Administrative Statistics. The Journal of Development Studies 51, 116–132.
Strong, D.M., Lee, Y.W., Wang, R.Y., 1997. Data Quality in Context. Communications of the ACM 40, 103–110.
USAID, 2010. Performance Monitoring & Evaluation TIPS: Conducting Data Quality Assessments (No. 18).
Wand, Y., Wang, R.Y., 1996. Anchoring Data Quality Dimensions in Ontological Foundations. Communications of the ACM 39, 86–95.
Wang, R.Y., Strong, D.M., 1996. Beyond Accuracy: What Data Quality Means to Data Consumers. Journal of Management Information Systems 12, 5–33.
World Bank, 2009. Data Quality Assessment Framework (DQAF) for the International Comparison Program (ICP): Paper for Session Five. International Comparison Program (ICP). World Bank Group, Washington, DC.