US researchers have developed a way to protect patients' privacy while maintaining researchers' ability to analyse patient-specific genetic and clinical data.
The current standard for medical indexing, the International Statistical Classification of Diseases and Related Health Problems (ICD), assigns a unique code to each health condition, whether a disease, a symptom, or an injury.
Patient privacy can be threatened when personal information is linked to genetic information using ICD codes available through public databases and electronic medical records.
Grigorios Loukides and colleagues at Vanderbilt University in Nashville, Tennessee, developed a method that replaces single ICD codes with a series of related codes.
The technique generalises clinical information so that patients remain anonymous, while preserving the medical and genetic connections that researchers rely upon for correlation studies.
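The idea of swapping a single code for a set of related codes can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the grouping table `RELATED_GROUPS` and the function `generalise` are hypothetical names invented for illustration, and in the published method the groupings are chosen by the algorithm rather than fixed by hand.

```python
# Illustrative groups of related ICD-9 codes (an assumption for this sketch;
# the real groupings are produced by the authors' method, not hard-coded).
RELATED_GROUPS = [
    frozenset({"250.00", "250.01", "250.02"}),  # diabetes mellitus variants
    frozenset({"401.0", "401.1", "401.9"}),     # essential hypertension variants
]


def generalise(codes):
    """Replace each code with the group of related codes that contains it.

    A code that belongs to no group is kept as a group of one.
    """
    generalised = set()
    for code in codes:
        group = next(
            (g for g in RELATED_GROUPS if code in g),
            frozenset({code}),
        )
        generalised.add(group)
    return generalised


# Two patients with different specific codes...
patient_a = generalise({"250.01", "401.9"})
patient_b = generalise({"250.02", "401.1"})

# ...become indistinguishable once their codes are generalised.
same_after_generalisation = patient_a == patient_b
```

In this toy example both patients end up with the same pair of code groups, so a released record no longer pins down one individual, yet it still says that the patient has some form of diabetes and some form of hypertension.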
The researchers tested the algorithm's data protection performance against simulated malicious attacks using actual information from more than 2,600 patients. They assumed a potential hacker knew a patient's identity, some or all of a patient's ICD codes, and whether the patient record was included in released data.
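A simulated attack of this kind can be sketched roughly as follows. The sketch assumes, purely for illustration, that each released record is a set of code groups and that the attacker succeeds when the known codes match exactly one record; the function name `candidates` and the sample data are hypothetical and are not taken from the study.

```python
def candidates(released, known_codes):
    """Return the released records consistent with every code the attacker knows.

    Each record is a set of code groups (frozensets); a known code is
    consistent with a record if any of the record's groups contains it.
    """
    return [
        record
        for record in released
        if all(any(code in group for group in record) for code in known_codes)
    ]


# What the hypothetical attacker knows about the target patient.
known = {"250.01", "401.9"}

# Ungeneralised release: every code stands alone as a group of one.
raw_release = [
    {frozenset({"250.01"}), frozenset({"401.9"})},
    {frozenset({"250.02"}), frozenset({"401.1"})},
]

# Generalised release: specific codes replaced by groups of related codes.
diabetes = frozenset({"250.01", "250.02"})
hypertension = frozenset({"401.1", "401.9"})
generalised_release = [
    {diabetes, hypertension},
    {diabetes, hypertension},
]

raw_matches = len(candidates(raw_release, known))
generalised_matches = len(candidates(generalised_release, known))
```

Here the raw release leaves a single matching record, so the attacker can link the patient to it, whereas the generalised release leaves two equally plausible records.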
The technique resisted attempts to uncover a patient's private information, and maintained the data integrity necessary to retain useful information for validating genome-wide studies, said the authors.
The full article, "Anonymization of electronic medical records for validating genome-wide association studies", by Grigorios Loukides, Aris Gkoulalas-Divanis, and Bradley Malin, appears in Proceedings of the National Academy of Sciences (PNAS).