Model / Framework

Design and Refinement of a Data Quality Assessment Workflow for a Large Pediatric Research Network

Authors:

Ritu Khare,

The Children's Hospital of Philadelphia, US

Levon H. Utidjian,

The Children's Hospital of Philadelphia, US

Hanieh Razzaghi,

The Children's Hospital of Philadelphia, US

Victoria Soucek,

Seattle Children’s Hospital, US

Evanette Burrows,

The Children's Hospital of Philadelphia, US

Daniel Eckrich,

Nemours Children’s Health System, US

Richard Hoyt,

Nationwide Children’s Hospital, US

Harris Weinstein,

The Children's Hospital of Philadelphia, US

Matthew W. Miller,

The Children's Hospital of Philadelphia, US

David Soler,

The Children's Hospital of Philadelphia, US

Joshua Tucker,

The Children's Hospital of Philadelphia, US

L Charles Bailey

The Children's Hospital of Philadelphia, US

Abstract

Background: Clinical data research networks (CDRNs) aggregate electronic health record data from multiple hospitals to enable large-scale research. A critical operation toward building a CDRN is conducting continual evaluations to optimize data quality. The key challenges include determining the assessment coverage on big datasets, handling data variability over time, and facilitating communication with data teams. This study presents the evolution of a systematic workflow for data quality assessment in CDRNs.

Implementation: Using a specific CDRN as a use case, the workflow was iteratively developed and packaged into a toolkit. The resulting toolkit comprises 685 data quality checks to identify data quality issues, procedures to reconcile results with a history of known issues, and a contemporary GitHub-based reporting mechanism for organized tracking.

Results: During the first two years of network development, the toolkit assisted in discovering over 800 data characteristics and resolving over 1400 programming errors. Longitudinal analysis indicated that the variability in time to resolution (15-day mean, 24-day IQR) is due to the underlying cause of the issue, the perceived importance of the domain, and the complexity of assessment.

Conclusions: In the absence of a formalized data quality framework, CDRNs continue to face challenges in data management and query fulfillment. The proposed data quality toolkit was empirically validated on a particular network and is publicly available for other networks. While the toolkit is user-friendly and effective, the usage statistics indicated that the data quality process is very time-intensive, and that sufficient resources should be dedicated to investigating problems and optimizing data for research.

How to Cite: Khare R, Utidjian LH, Razzaghi H, Soucek V, Burrows E, Eckrich D, et al. Design and Refinement of a Data Quality Assessment Workflow for a Large Pediatric Research Network. eGEMs (Generating Evidence & Methods to improve patient outcomes). 2019;7(1):36. DOI: http://doi.org/10.5334/egems.294
  Published on 01 Aug 2019
