Ophthalmol Sci, December 2024, 0 citations

Artificial Intelligence Models to Identify Patients at High Risk for Glaucoma Using Self-reported Health Data in a United States National Cohort.

Ravindranath Rohith, Naor Joel, Wang Sophia Y


AI Summary

AI models successfully used self-reported health data to predict glaucoma risk (AUROC up to 0.890). This could enable prescreening in low-resource settings, guiding referrals for specialized eye exams.

Abstract

Purpose

Early glaucoma detection is key to preventing vision loss, but screening often requires specialized eye examination or photography, limiting large-scale implementation. This study sought to develop artificial intelligence models that use self-reported health data from surveys to prescreen patients at high risk for glaucoma who are most in need of glaucoma screening with ophthalmic examination and imaging.

Design

Cohort study.

Participants

Participants enrolled from May 1, 2018, to July 1, 2022, in the nationwide All of Us Research Program who were ≥18 years of age, had ≥2 eye-related diagnoses in their electronic health record (EHR), and submitted surveys with self-reported health history.

Methods

We developed models to predict the risk of glaucoma, as determined by EHR diagnosis codes, using 3 machine learning approaches: (1) penalized logistic regression, (2) XGBoost, and (3) a fully connected neural network. Glaucoma diagnosis was identified based on International Classification of Diseases codes extracted from EHR data. An 80/20 train-test split was implemented, with cross-validation employed for hyperparameter tuning. Input features included self-reported demographics, general health, lifestyle factors, and family and personal medical history.
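The training protocol described above (an 80/20 split, then cross-validation on the training portion to tune hyperparameters) can be sketched as follows. This is a minimal illustration in pure Python, not the authors' pipeline: the abstract does not state whether the split was stratified, which penalty was used, how many folds were run, or which hyperparameter grid was searched, so the stratification, the L2 penalty, the 5 folds, the grid, and the two synthetic features (`age`, `fam`) are all assumptions made for the example.

```python
import math
import random

def sigmoid(z):
    # Clip the logit so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * test_frac)
        test.extend(idx[:cut])
        train.extend(idx[cut:])
    return sorted(train), sorted(test)

def predict(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logreg(X, y, l2, epochs=100, lr=0.5):
    """L2-penalized logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [predict((w, b), xi) - yi for xi, yi in zip(X, y)]
        grad_w = [sum(e * xi[j] for e, xi in zip(errs, X)) / n for j in range(d)]
        w = [wj - lr * (gj + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * sum(errs) / n
    return w, b

def log_loss(model, X, y):
    eps, total = 1e-12, 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(model, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)

def tune_l2(X, y, grid, k=5):
    """Pick the penalty with the lowest mean validation log loss over k folds."""
    folds = [list(range(i, len(y), k)) for i in range(k)]
    def cv_loss(l2):
        losses = []
        for held in folds:
            held_set = set(held)
            tr = [i for i in range(len(y)) if i not in held_set]
            model = fit_logreg([X[i] for i in tr], [y[i] for i in tr], l2)
            losses.append(log_loss(model, [X[i] for i in held], [y[i] for i in held]))
        return sum(losses) / k
    return min(grid, key=cv_loss)

# Synthetic stand-in for survey data (hypothetical features, roughly 10% positives).
rng = random.Random(42)
X, y = [], []
for _ in range(400):
    age, fam = rng.gauss(0, 1), float(rng.random() < 0.2)
    X.append([age, fam])
    y.append(int(rng.random() < sigmoid(-2.6 + 1.0 * age + 1.2 * fam)))

train_idx, test_idx = stratified_split(y, test_frac=0.2)
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

# Hyperparameters are tuned on the training portion only; the test set is
# touched once, after the penalty has been chosen.
best_l2 = tune_l2(X_train, y_train, grid=[0.0, 0.01, 0.1])
final_model = fit_logreg(X_train, y_train, best_l2)
test_loss = log_loss(final_model, X_test, y_test)
```

Keeping the test indices out of `tune_l2` is the point of the design: the held-out 20% then gives an estimate of performance that the hyperparameter search has not seen. In practice a library implementation would replace the hand-written fitting loop.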

Main outcome measures

Models were evaluated using standard classification metrics, including area under the receiver operating characteristic curve (AUROC).
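For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it from ranks (the Mann-Whitney U formulation). The function name `auroc` and the toy labels and scores are illustrative only and do not come from the study.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: one of the four positive/negative pairs is ordered incorrectly.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect separation, which is the scale on which the reported scores should be read.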

Results

Among the 8205 patients, 873 (10.64%) were diagnosed with glaucoma. Across models, AUROC scores for identifying which patients had glaucoma from survey health data ranged from 0.710 to 0.890. XGBoost achieved the highest AUROC of 0.890 (95% confidence interval [CI]: 0.860-0.910). Logistic regression followed with an AUROC of 0.772 (95% CI: 0.753-0.795). Explainability studies revealed that key features included traditionally recognized risk factors for glaucoma, such as age, type 2 diabetes, and a family history of glaucoma.
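The abstract reports 95% confidence intervals for AUROC but does not say how they were computed. One common approach is the percentile bootstrap, sketched below under that assumption. The function names, the number of resamples, and the synthetic scores are hypothetical and serve only to show the mechanics.

```python
import random

def auroc(labels, scores):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic scores: 20 positives (10%) whose scores are shifted upward.
rng = random.Random(1)
labels = [1] * 20 + [0] * 180
scores = [rng.gauss(1.5 if y == 1 else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
```

With few positive cases the interval widens noticeably, which is one reason a confidence interval is more informative than the point estimate alone at a prevalence near 10%.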

Conclusions

Machine and deep learning models successfully utilized health data from self-reported surveys to predict glaucoma diagnosis without additional data from ophthalmic imaging or eye examination. These models may eventually enable prescreening for glaucoma in a wide variety of low-resource settings, after which high-risk patients can be referred for targeted screening using more specialized ophthalmic examination or imaging.

Financial disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.


Key Concepts

Of the artificial intelligence models evaluated, XGBoost achieved the highest AUROC, 0.890 (95% confidence interval [CI]: 0.860-0.910), for identifying patients at high risk for glaucoma using self-reported health data from surveys.

Across different artificial intelligence models (penalized logistic regression, XGBoost, and a fully connected neural network) developed to predict glaucoma risk using self-reported health data, AUROC scores ranged from 0.710 to 0.890.

Explainability studies of artificial intelligence models predicting glaucoma risk revealed that key features included traditionally recognized risk factors such as age, type 2 diabetes, and a family history of glaucoma.

In a cohort of 8205 patients from the nationwide All of Us Research Program, 873 (10.64%) were diagnosed with glaucoma based on International Classification of Diseases codes extracted from electronic health record (EHR) data.
