Ophthalmol Sci
February 2026 · Journal Article

Comparison of RETFound and a Supervised Convolutional Neural Network for Detection of Referable Glaucoma from Fundus Photographs.

Artificial Intelligence · Diagnosis & Screening

Summary

Both RETFound and VGG-19 models effectively detect referable glaucoma. RETFound excels with limited training data and in diverse populations, and cropping images to the optic nerve region improves performance but may reduce generalizability.

Abstract

PURPOSE

To compare the performance of a vision transformer-based foundation model (RETFound) and a supervised convolutional neural network (VGG-19) for detecting referable glaucoma from fundus photographs.

DESIGN

An evaluation of diagnostic technology.

PARTICIPANTS

Six thousand one hundred sixteen participants from the Los Angeles County Department of Health Services Teleretinal Screening Program.

METHODS

Fundus photographs were labeled for referable glaucoma (cup-to-disc ratio ≥0.6) by certified optometrists. Four deep learning models were trained on cropped and uncropped images (training N = 8996; validation N = 3002) using 2 architectures: RETFound, a vision transformer with self-supervised pretraining on fundus photographs, and VGG-19. Models were evaluated on a held-out test set (N = 1000) labeled by glaucoma specialists and an external test set (N = 300) from University of Southern California clinics. Performance was assessed while varying training set size and stratifying by demographic factors. XRAI was used for saliency mapping.
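
The abstract does not include training code; as a rough, hypothetical sketch of how the two architectures could be set up for this task in PyTorch, the snippet below builds an ImageNet-pretrained VGG-19 with its final layer replaced by a single-logit referable-glaucoma head, and a ViT-Large backbone of the kind RETFound is based on, created with timm. The checkpoint path and training-loop details are assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): two binary classifiers for
# referable glaucoma, assuming PyTorch, torchvision, and timm are installed.
import torch
import torch.nn as nn
from torchvision import models
import timm

# VGG-19: ImageNet-pretrained backbone, final layer swapped for 1 logit.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
vgg.classifier[6] = nn.Linear(vgg.classifier[6].in_features, 1)

# RETFound is built on a ViT-Large (patch 16, 224 px); here we create the
# architecture with timm. In practice the released self-supervised fundus
# weights would be loaded on top (checkpoint format varies by release).
retfound = timm.create_model("vit_large_patch16_224", num_classes=1)
# state = torch.load("RETFound_cfp_weights.pth", map_location="cpu")  # hypothetical path
# retfound.load_state_dict(state["model"], strict=False)

criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on the raw logit

def train_step(model, optimizer, images, labels):
    """One optimization step on a batch of fundus images and 0/1 labels."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels.float().unsqueeze(1))
    loss.backward()
    optimizer.step()
    return loss.item()
```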

MAIN OUTCOME MEASURES

Area under the receiver operating characteristic curve (AUC-ROC) and threshold-specific metrics.
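
To make these measures concrete, the toy sketch below computes AUC-ROC and then sensitivity, specificity, and F1 score at a Youden-optimal threshold with scikit-learn. The labels and scores are synthetic stand-ins, not study data.

```python
# Illustrative only: AUC-ROC and threshold-specific metrics on toy data.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                             # synthetic labels
y_prob = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 1000), 0, 1)  # synthetic scores

auc = roc_auc_score(y_true, y_prob)

# Youden's J = sensitivity + specificity - 1 = TPR - FPR; pick the
# threshold that maximizes it, then report metrics at that operating point.
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
best_thr = thresholds[np.argmax(tpr - fpr)]

y_pred = (y_prob >= best_thr).astype(int)
sensitivity = (y_pred[y_true == 1] == 1).mean()
specificity = (y_pred[y_true == 0] == 0).mean()
f1 = f1_score(y_true, y_pred)
print(f"AUC={auc:.3f}  thr={best_thr:.2f}  "
      f"sens={sensitivity:.1%}  spec={specificity:.1%}  F1={f1:.3f}")
```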

RESULTS

The cropped image VGG-19 model achieved the highest AUC-ROC (0.924 [0.907-0.940]), which was comparable (P = 0.07) to the cropped image RETFound model (0.911 [0.892-0.930]), which achieved the highest Youden-optimal performance (sensitivity 82.6% and specificity 88.2%) and F1 score (0.801). Cropped image models outperformed their uncropped counterparts (RETFound 0.889 [0.868-0.909], VGG-19 0.898 [0.879-0.917]) within each architecture (P < 0.001 for AUC-ROC comparisons). The uncropped image RETFound model performed best on external data (0.886 [0.849-0.924] vs. the next-highest 0.797 [0.746-0.848]; P < 0.001 for AUC-ROC comparisons). RETFound models had a performance advantage when trained on smaller datasets (N < 2000 images), and the cropped image RETFound model performed consistently across ethnic groups (P = 0.20), whereas the others did not (P < 0.04). Performance did not vary by age or gender. Saliency maps for both architectures consistently included the optic nerve.
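
The bracketed ranges above are 95% confidence intervals. The abstract does not state how they were computed; a nonparametric bootstrap over test cases is one standard approach, sketched here with hypothetical inputs.

```python
# Illustrative nonparametric bootstrap CI for AUC-ROC (one common method;
# not necessarily the one used in the study).
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_prob, n_boot=2000, seed=0):
    """Percentile 95% CI for AUC by resampling test cases with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if len(np.unique(y_true[idx])) < 2:   # AUC needs both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return lo, hi
```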

CONCLUSIONS

Although both RETFound and VGG-19 models performed well for classification of referable glaucoma, foundation models may be preferable when training data are limited and when domain shift is expected. Training models using images cropped to the region of the optic nerve improves performance regardless of architecture but may reduce model generalizability.
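
The abstract does not describe the cropping pipeline itself. As a purely hypothetical illustration, cropping to a fixed square window around a disc center supplied by some upstream optic disc localizer might look like this:

```python
# Hypothetical optic-nerve-region crop; (cx, cy) is assumed to come from
# an upstream disc localizer, which the abstract does not specify.
from PIL import Image

def crop_to_disc(image_path, cx, cy, half_size=256):
    """Return a square crop of side 2*half_size centered on the optic disc."""
    img = Image.open(image_path)
    left, top = cx - half_size, cy - half_size
    # PIL pads with black if the box extends past the image border.
    return img.crop((left, top, left + 2 * half_size, top + 2 * half_size))
```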

FINANCIAL DISCLOSURES

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Keywords

Artificial intelligence; Convolutional neural network; Foundation model; Glaucoma screening; RETFound
