Explaining the Rationale of Deep Learning Glaucoma Decisions with Adversarial Examples.
Jooyoung Chang, Jinho Lee, Ahnul Ha, Young Soo Han, Eunoo Bak, Seulggie Choi, Jae Moon Yun, Uk Kang, Il Hyung Shin, Joo Young Shin, Taehoon Ko, Ye Seul Bae, Baek-Lok Oh, Ki Ho Park, Sang Min Park
Abstract
PURPOSE
To evaluate the ability of adversarial explanation to explain the rationale of deep learning model (DLM) decisions for glaucoma and glaucoma-related findings, thereby illustrating what is inside the so-called black box so that clinicians can have greater confidence in the conclusions of artificial intelligence. Adversarial explanation generates adversarial examples (AEs), images that have been changed to gain or lose pathologic characteristic-specific traits, to explain the DLM's rationale.
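As a rough illustration of the general idea (not the authors' exact generator), the sketch below perturbs an input fundus image toward a target class by iterative gradient steps in pixel space. `model`, `image`, and `target_class` are hypothetical placeholders for a trained PyTorch classifier, a preprocessed image tensor, and the finding of interest.

    # Hedged sketch: targeted adversarial-example generation by iterative
    # gradient steps in pixel space. All names are hypothetical placeholders,
    # not artifacts from the study.
    import torch
    import torch.nn.functional as F

    def generate_adversarial_example(model, image, target_class,
                                     step_size=0.01, steps=50):
        """Perturb `image` so the model's prediction moves toward
        `target_class` (e.g., gaining or losing a pathologic trait)."""
        model.eval()
        adv = image.clone().detach().requires_grad_(True)
        for _ in range(steps):
            logits = model(adv.unsqueeze(0))          # add batch dimension
            loss = F.cross_entropy(logits, torch.tensor([target_class]))
            loss.backward()
            with torch.no_grad():
                adv -= step_size * adv.grad.sign()    # step toward the target class
                adv.clamp_(0, 1)                      # keep pixels in valid range
            adv.grad.zero_()
        return adv.detach()

Inspecting which image regions change, and how, is what lets a reader connect the perturbation to a pathologic feature.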
DESIGN
Evaluation of explanation methods for DLMs.
PARTICIPANTS
Health screening participants (n = 1653) at the Seoul National University Hospital Health Promotion Center, Seoul, Republic of Korea.
METHODS
We trained DLMs for referable glaucoma (RG), increased cup-to-disc ratio (ICDR), disc rim narrowing (DRN), and retinal nerve fiber layer defect (RNFLD) using 6430 retinal fundus images. Surveys presenting explanations generated with AEs and with gradient-weighted class activation mapping (GradCAM), a conventional heatmap-based explanation method, were created for 400 pathologic and healthy patient eyes. For each method, board-trained glaucoma specialists rated location explainability (the ability to pinpoint decision-relevant areas in the image) and rationale explainability (the ability to inform the user of the model's reasoning for the decision based on pathologic features). Scores were compared by the paired Wilcoxon signed-rank test.
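For context, GradCAM highlights the image regions that most influence a class score by weighting the final convolutional feature maps with their pooled gradients. The following is a minimal sketch assuming a PyTorch CNN; `conv_layer` stands for the network's last convolutional block and is a hypothetical handle, not a name from the study.

    # Hedged sketch of GradCAM, the heatmap baseline described above.
    import torch
    import torch.nn.functional as F

    def grad_cam(model, image, class_idx, conv_layer):
        """Return an [h, w] relevance map for `class_idx`."""
        activations, gradients = [], []
        fwd = conv_layer.register_forward_hook(
            lambda m, i, o: activations.append(o))
        bwd = conv_layer.register_full_backward_hook(
            lambda m, gi, go: gradients.append(go[0]))
        model.eval()
        logits = model(image.unsqueeze(0))            # add batch dimension
        logits[0, class_idx].backward()               # gradient of the class score
        fwd.remove(); bwd.remove()
        acts, grads = activations[0], gradients[0]    # both [1, C, h, w]
        weights = grads.mean(dim=(2, 3), keepdim=True)  # pooled gradient per channel
        cam = F.relu((weights * acts).sum(dim=1))[0]    # weighted channel sum
        return (cam / (cam.max() + 1e-8)).detach()      # normalize to [0, 1]

The resulting map is typically upsampled and overlaid on the fundus image; it localizes influential regions but, unlike an AE, does not show what feature change would alter the decision.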
MAIN OUTCOME MEASURES
Area under the receiver operating characteristic curve (AUC), sensitivities, and specificities of the DLMs; visualization of clinical pathologic changes in AEs; and survey scores for location and rationale explainability.
RESULTS
The AUCs were 0.90, 0.99, 0.95, and 0.79, and the sensitivities at 0.90 specificity were 0.79, 1.00, 0.82, and 0.55 for the RG, ICDR, DRN, and RNFLD DLMs, respectively. Generated AEs showed valid clinical feature changes. Survey scores for location explainability were 3.94 ± 1.33 for AEs and 2.55 ± 1.24 for GradCAM, of a possible maximum of 5 points; scores for rationale explainability were 3.97 ± 1.31 for AEs and 2.10 ± 1.25 for GradCAM. Adversarial explanation provided significantly better explainability than GradCAM.
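For readers reproducing such operating points, sensitivity at a fixed specificity can be read off the ROC curve; the toy sketch below uses scikit-learn, and `y_true` and `y_score` are made-up labels and model outputs, not the study's data.

    # Hedged sketch: AUC and sensitivity at a fixed specificity (toy data).
    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])               # illustrative labels
    y_score = np.array([.1, .2, .3, .35, .6, .4, .7, .8, .85, .9])  # illustrative scores

    auc = roc_auc_score(y_true, y_score)
    fpr, tpr, _ = roc_curve(y_true, y_score)
    spec = 1 - fpr                          # specificity = 1 - false-positive rate
    idx = np.where(spec >= 0.90)[0].max()   # best sensitivity with spec >= 0.90
    print(f"AUC = {auc:.2f}; sensitivity = {tpr[idx]:.2f} "
          f"at specificity = {spec[idx]:.2f}")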
CONCLUSIONS
Adversarial explanation increased the explainability over GradCAM, a conventional heatmap-based explanation method. Adversarial explanation may help medical professionals understand more clearly the rationale of DLMs when using them for clinical decisions.