
Empirical research

A Comparison of Data Quality Assessment Checks in Six Data Sharing Networks


Tiffany J. Callahan,

Computational Bioscience Program, University of Colorado Denver Anschutz Medical Campus

Alan E. Bauck,

Kaiser Permanente Northwest

David Bertoch,

Children’s Hospital Association

Jeff Brown,

Harvard Pilgrim Health Care Institute

Ritu Khare,

Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia

Patrick B. Ryan,

Observational Health Data Sciences and Informatics

Jenny Staab,

Kaiser Permanente Northwest, Portland

Meredith N. Zozus,

Department of Biomedical Informatics, University of Arkansas for Medical Sciences

Michael G. Kahn

Department of Pediatrics, University of Colorado Denver Anschutz Medical Campus


Objective: To compare rule-based data quality (DQ) assessment approaches across multiple national clinical data sharing organizations.

Methods: Six organizations with established data quality assessment (DQA) programs provided documentation or source code describing current DQ checks. DQ checks were mapped to the categories within the data verification context of the harmonized DQA terminology. To ensure all DQ checks were consistently mapped, conventions were developed and four iterations of mapping performed. Difficult-to-map DQ checks were discussed with research team members until consensus was achieved.

Results: Participating organizations provided 11,026 DQ checks, of which 99.97 percent were successfully mapped to a DQA category. Of the mapped DQ checks (N=11,023), 214 (1.94 percent) mapped to multiple DQA categories. The majority of DQ checks mapped to Atemporal Plausibility (49.60 percent), Value Conformance (17.84 percent), and Atemporal Completeness (12.98 percent) categories.
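The percentages in the Results paragraph follow directly from the reported counts. The short Python sketch below recomputes them as a sanity check; the variable and function names are illustrative and do not come from the study.

```python
# Counts taken from the Results paragraph; names are illustrative only.
TOTAL_CHECKS = 11_026          # DQ checks provided by the six organizations
MAPPED_CHECKS = 11_023         # checks mapped to at least one DQA category
MULTI_CATEGORY_CHECKS = 214    # mapped checks assigned to multiple categories


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


mapped_pct = percent(MAPPED_CHECKS, TOTAL_CHECKS)
multi_pct = percent(MULTI_CATEGORY_CHECKS, MAPPED_CHECKS)
unmapped = TOTAL_CHECKS - MAPPED_CHECKS

print(f"mapped: {mapped_pct:.2f}%")
print(f"multi-category: {multi_pct:.2f}%")
print(f"unmapped checks: {unmapped}")
```

Note that the multi-category share uses the mapped checks (N=11,023) as its denominator, not the full set of 11,026.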

Discussion: Using the common DQA terminology, near-complete (99.97 percent) coverage across a wide range of DQA programs and specifications was reached. Comparing the distributions of mapped DQ checks revealed important differences between participating organizations. This variation may be related to each organization's stakeholder requirements, primary analytical focus, or the maturity of its DQA program. Although outside the scope of this study, mapping checks within the data validation context of the terminology may provide additional insights into DQA practice differences.

Conclusion: A common DQA terminology provides a means to help organizations and researchers understand the coverage of their current DQA efforts as well as highlight potential areas for additional DQA development. Sharing DQ checks between organizations could help expand the scope of DQA across clinical data networks.

How to Cite: Callahan TJ, Bauck AE, Bertoch D, Brown J, Khare R, Ryan PB, et al. A Comparison of Data Quality Assessment Checks in Six Data Sharing Networks. eGEMs (Generating Evidence & Methods to improve patient outcomes). 2017;5(1):8. DOI:
  Published on 12 Jun 2017
