Special Collection: HCSRN Special Collection

Model / Framework

Assessing and Minimizing Re-identification Risk in Research Data Derived from Health Care Records

Authors:

Gregory E. Simon, MD, MPH,

Kaiser Permanente Washington Health Research Institute, Seattle, WA, US

Susan M. Shortreed,

Kaiser Permanente Washington Health Research Institute, Seattle, WA, US

R. Yates Coley,

Kaiser Permanente Washington Health Research Institute, US

Robert B. Penfold,

Kaiser Permanente Washington Health Research Institute, Seattle, WA, US

Rebecca C. Rossom,

HealthPartners Institute, Minneapolis, MN, US

Beth E. Waitzfelder,

Kaiser Permanente Hawaii Center for Health Research, Honolulu, HI, US

Katherine Sanchez,

Baylor Scott and White Research Institute, Dallas, TX, US

Frances L. Lynch

Kaiser Permanente Northwest Center for Health Research, Portland, OR, US

Abstract

Background: Sharing of research data derived from health system records supports the rigor and reproducibility of primary research and can accelerate research progress through secondary use. But public sharing of such data can create risk of re-identifying individuals, exposing sensitive health information.

Method: We describe a framework for assessing re-identification risk that includes: identifying data elements in a research dataset that overlap with external data sources, identifying small classes of records defined by unique combinations of those data elements, and considering the pattern of population overlap between the research dataset and an external source. We also describe alternative strategies for mitigating risk when the external data source can or cannot be directly examined.
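The second step of the framework, finding small classes of records defined by unique combinations of overlapping data elements, can be illustrated with a short sketch. This is not the authors' implementation. The field names (`age_group`, `sex`, `visit_year`, `dx`), the toy records, and the threshold of 2 are all hypothetical, chosen only to show the counting idea.

```python
from collections import Counter

def small_classes(records, quasi_identifiers, threshold):
    """Return {combination: count} for every combination of
    quasi-identifier values shared by fewer than `threshold` records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < threshold}

# Hypothetical toy dataset; "dx" stands in for sensitive content that
# would NOT be available in an external data source.
records = [
    {"age_group": "18-29", "sex": "F", "visit_year": 2014, "dx": "a"},
    {"age_group": "18-29", "sex": "F", "visit_year": 2014, "dx": "b"},
    {"age_group": "18-29", "sex": "M", "visit_year": 2015, "dx": "c"},
    {"age_group": "65+",   "sex": "F", "visit_year": 2015, "dx": "d"},
]

# Elements assumed (for illustration) to overlap with an external source.
overlapping = ["age_group", "sex", "visit_year"]

risky = small_classes(records, overlapping, threshold=2)
for combo, n in risky.items():
    print(combo, n)
```

A small class matters only to the extent that the same individuals could also appear in the external source, which is why the framework treats population overlap as a separate consideration rather than reading class size alone as the risk.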

Results: We illustrate this framework using the example of a large database used to develop and validate models predicting suicidal behavior after an outpatient visit. We identify elements in the research dataset that might create risk and propose a specific risk mitigation strategy: deleting indicators for health system (a proxy for state of residence) and visit year.
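The effect of a mitigation such as deleting the health system and visit year indicators can be checked by recounting small classes before and after the deletion. The sketch below uses hypothetical column names, invented toy records, and an arbitrary threshold of 2. It does not reproduce the authors' data or results.

```python
from collections import Counter

def records_in_small_classes(records, quasi_identifiers, threshold):
    """Count records whose combination of quasi-identifier values
    is shared by fewer than `threshold` records in total."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] < threshold)

# Hypothetical toy dataset.
records = [
    {"health_system": "A", "visit_year": 2012, "age_group": "18-29", "sex": "F"},
    {"health_system": "B", "visit_year": 2013, "age_group": "18-29", "sex": "F"},
    {"health_system": "A", "visit_year": 2013, "age_group": "65+",   "sex": "M"},
    {"health_system": "B", "visit_year": 2012, "age_group": "65+",   "sex": "M"},
]

full = ["health_system", "visit_year", "age_group", "sex"]
reduced = ["age_group", "sex"]  # health system and visit year deleted

before = records_in_small_classes(records, full, threshold=2)
after = records_in_small_classes(records, reduced, threshold=2)
print(before, after)
```

In this toy example every record is unique under the full set of elements, and none is unique once the two indicators are dropped. Real datasets will rarely behave this cleanly, so the same comparison would need to be rerun on the actual data.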

Discussion: Researchers holding health system data must balance the public health value of data sharing against the duty to protect the privacy of health system members. Specific steps can provide a useful estimate of re-identification risk and point to effective risk mitigation strategies.

How to Cite: Simon GE, Shortreed SM, Coley RY, Penfold RB, Rossom RC, Waitzfelder BE, et al. Assessing and Minimizing Re-identification Risk in Research Data Derived from Health Care Records. eGEMs (Generating Evidence & Methods to improve patient outcomes). 2019;7(1):6. DOI: http://doi.org/10.5334/egems.270
Published on 29 Mar 2019
