Enhanced Phenotype Identification of Common Ocular Diseases in Real-World Datasets.
Stein Joshua D, An Hong Su, Andrews Chris A, Pershing Suzann, Mungle Tushar, Bicket Amanda K, Rosenthal Julie M, Zhang Amy D, Lee Wen-Shin, Ludwig Cassie
AI Summary
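This study developed enhanced phenotype identification algorithms that use comprehensive electronic health record (EHR) data to identify patients with glaucoma, diabetic retinopathy (DR), and age-related macular degeneration (AMD), outperforming models based on ICD billing codes alone. More accurate cohort identification could improve real-world research and patient management.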
Abstract
Objective
For studies using real-world data, accurately identifying patients with phenotypes of interest is challenging. To identify cohorts of interest, most studies exclusively use the International Classification of Diseases (ICD) billing codes, which can be limiting. We developed a method to accurately identify the presence or absence of 3 common ocular diseases (diabetic retinopathy [DR], age-related macular degeneration [AMD], and glaucoma) using electronic health record (EHR) data.
Design
Database study.
Participants
Three thousand nine hundred fourteen eyes from 1957 patients at 2 Sight OUtcomes Research CollaborativE (SOURCE) Ophthalmology Data Repository sites.
Methods
We developed enhanced phenotype identification (EPI) algorithms that search EHR fields, including eye examination findings, orders, charges, medication prescriptions, and surgery data, for evidence that a patient has glaucoma, DR, or AMD. We trained our EPI models using gold-standard assessments of the EHR by ophthalmologists for the presence/absence of these conditions, compared the performance of our EPI models to models developed using ICD codes alone, and validated the performance of the models using data from another SOURCE site.
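The difference between an ICD-only input and an EPI-style input can be sketched as follows. This is a minimal illustration, not the study's implementation: the `EyeRecord` class, its field names, and the evidence terms in `GLAUCOMA_EVIDENCE` are all hypothetical, and the abstract does not specify which features or model type were used.

```python
from dataclasses import dataclass, field


@dataclass
class EyeRecord:
    """Hypothetical per-eye slice of the EHR; field names are illustrative."""
    icd_codes: list = field(default_factory=list)
    exam_findings: list = field(default_factory=list)
    orders: list = field(default_factory=list)
    charges: list = field(default_factory=list)
    medications: list = field(default_factory=list)
    surgeries: list = field(default_factory=list)


# Illustrative evidence terms per EHR field (assumptions, not the study's lists).
GLAUCOMA_EVIDENCE = {
    "exam_findings": {"optic nerve cupping"},
    "orders": {"visual field test"},
    "charges": {"visual field charge"},
    "medications": {"latanoprost", "timolol"},
    "surgeries": {"trabeculectomy"},
}


def icd_only_features(rec):
    """Single binary feature: any ICD-10 code in category H40 (glaucoma)."""
    return {"icd_glaucoma": int(any(c.startswith("H40") for c in rec.icd_codes))}


def epi_features(rec):
    """ICD feature plus one binary evidence flag per additional EHR field."""
    feats = icd_only_features(rec)
    for name, terms in GLAUCOMA_EVIDENCE.items():
        values = {v.lower() for v in getattr(rec, name)}
        feats[name + "_evidence"] = int(bool(values & terms))
    return feats


# An eye with no glaucoma billing code but with treatment evidence.
rec = EyeRecord(medications=["Latanoprost"], surgeries=["Trabeculectomy"])
print(icd_only_features(rec))
print(epi_features(rec))
```

In this toy record the ICD-only input carries no signal, while the EPI-style input flags the medication and surgery evidence. A classifier trained on ophthalmologist-assigned labels would then weigh such flags; that step is omitted here.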
Main outcome measures
Area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPRC), and model calibration.
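For readers less familiar with these measures, the sketch below computes each one from gold-standard labels and predicted probabilities using only the Python standard library. It makes several assumptions: AUPRC is approximated by average precision, calibration is summarized by the Brier score (the abstract does not say how calibration was assessed), and the labels and scores are toy values, not study data.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Mean of the precision measured at the rank of each positive case,
    with cases sorted by descending score (tied scores keep input order).
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    if true_pos == 0:
        raise ValueError("need at least one positive label")
    return total / true_pos


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Toy example: 1 = disease present per gold-standard review, 0 = absent.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, probs))            # 0.75
print(average_precision(labels, probs))  # approximately 0.833
print(brier_score(labels, probs))        # approximately 0.158
```

AUC depends only on how cases are ranked, AUPRC additionally reflects how rare the positive class is, and the Brier score depends on the probability values themselves.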
Results
The AUCs of our EPI models were better than ICD-only models for glaucoma (0.97 vs. 0.90), DR (0.997 vs. 0.98), and AMD (0.99 vs. 0.95). The AUPRCs of our EPI models were also much better than ICD-only models for glaucoma (0.79 vs. 0.32), DR (0.96 vs. 0.84), and AMD (0.74 vs. 0.55). When testing on patients from a second SOURCE site, the AUC and AUPRC for glaucoma (0.93, 0.74), DR (0.98, 0.77), and AMD (0.96, 0.64) were slightly worse than the primary site but still quite high. However, for all 3 conditions, model calibration was worse at the second site.
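One way a model can keep a high AUC while its calibration degrades is sketched below. AUC depends only on the ordering of predictions, so any strictly increasing distortion of the predicted probabilities leaves it unchanged while shifting the average prediction away from the observed disease rate. The data and the square-root distortion are hypothetical, and the gap between mean prediction and prevalence is only one simple calibration summary; the abstract does not describe the cause of the calibration change at the second site.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_gap(y_true, probs):
    """Mean predicted probability minus observed prevalence (0 = calibrated on average)."""
    return sum(probs) / len(probs) - sum(y_true) / len(y_true)


# Toy labels and predicted probabilities (not study data).
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
probs = [0.05, 0.10, 0.20, 0.70, 0.15, 0.80, 0.10, 0.30, 0.90, 0.40]

# Hypothetical distortion: strictly increasing, so the ranking is preserved.
distorted = [p ** 0.5 for p in probs]

print(roc_auc(labels, probs), roc_auc(labels, distorted))
print(calibration_gap(labels, probs), calibration_gap(labels, distorted))
```

Both sets of predictions give the same AUC, but the distorted probabilities overestimate the disease rate by a much larger margin, which is why discrimination and calibration are reported separately.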
Conclusions
Leveraging machine learning, we developed EPI models to accurately identify most patients with glaucoma, DR, and AMD in real-world datasets. The EPI models significantly outperform ICD-only models in identifying patients confirmed to have these conditions. These findings underscore the potential of using comprehensive EHR data combined with advanced machine learning techniques to improve the accuracy of patient phenotype identification, leading to better patient management and clinical outcomes.
Financial disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Key Concepts
The area under the receiver operating characteristic curve (AUC) for enhanced phenotype identification (EPI) models was better than ICD-only models for glaucoma (0.97 vs. 0.90), diabetic retinopathy (DR) (0.997 vs. 0.98), and age-related macular degeneration (AMD) (0.99 vs. 0.95).
The area under the precision-recall curve (AUPRC) for enhanced phenotype identification (EPI) models was much better than ICD-only models for glaucoma (0.79 vs. 0.32), diabetic retinopathy (DR) (0.96 vs. 0.84), and age-related macular degeneration (AMD) (0.74 vs. 0.55).
When testing on patients from a second SOURCE site, the AUC and AUPRC for glaucoma (0.93, 0.74), diabetic retinopathy (DR) (0.98, 0.77), and age-related macular degeneration (AMD) (0.96, 0.64) using enhanced phenotype identification (EPI) models were slightly worse than the primary site but still quite high.
Enhanced phenotype identification (EPI) models, leveraging machine learning and comprehensive electronic health record (EHR) data, accurately identify most patients with glaucoma, diabetic retinopathy (DR), and age-related macular degeneration (AMD) in real-world datasets.
Enhanced phenotype identification (EPI) models for glaucoma, diabetic retinopathy (DR), and age-related macular degeneration (AMD) were developed by searching electronic health record (EHR) fields, including eye examination findings, orders, charges, medication prescriptions, and surgery data.
Related Articles
Comparison of Deep Learning and Clinician Performance for Detecting Referable Glaucoma from Fundus Photographs in a Safety Net Population. (Cohort Study)
Artificial Intelligence and Ophthalmic Clinical Registries. (Systematic Review)
Evaluating a Foundation Artificial Intelligence Model for Glaucoma Detection Using Color Fundus Photographs. (Observational Study)
Continuous 24-hour intraocular pressure monitoring in normal Chinese adults using a novel contact lens sensor system. (Observational Study)
Feasibility and acceptance of artificial intelligence-based diabetic retinopathy screening in Rwanda. (Observational Study)