https://doi.org/10.1051/epjconf/202636704004
Explainable Multi-Modal Skin Lesion Classification with a Hybrid CNN-Transformer
1 Department of Computer Science and Engineering, Ramaiah University of Applied Sciences, Bengaluru, India
2 Assistant Professor, Department of Computer Science and Engineering, Ramaiah University of Applied Sciences, Bengaluru, India
Published online: 29 April 2026
Abstract
Fast and accurate identification of skin lesions is important for patient outcomes. Visual evaluation of lesions is subjective, and poor-quality images may limit accuracy. Deep learning models offer an alternative; however, many of them lack interpretability or do not combine different types of data. This work presents an interpretable multimodal system for diagnosing skin lesions that addresses many of these limitations. A hybrid CNN-Transformer network with an EfficientNetV2-B0 backbone processes dermoscopy images and extracts visual patterns from them. This network is integrated with a second network that uses the HAM10000 dataset to incorporate and process historical patient information. Class balancing with SMOTE is used to ensure strong performance. The model provides transparency through Explainable AI (XAI) methods, primarily Grad-CAM for visual features and LIME for tabular features. Overall, the multimodal system yields an adaptable, reliable and effective diagnostic tool, with an overall classification accuracy of 80.04% and an Area Under the Curve (AUC) of 0.95. Our results suggest that combining multimodal data with a transparent hybrid architecture produces an effective tool for enhancing clinician support and diagnostic confidence, and provides a framework for clinical deployment in real-world practice.
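As an illustration of the class-balancing step named in the abstract, the core idea of SMOTE can be sketched in a few lines: a synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest same-class neighbours. This is a minimal sketch, not the authors' implementation. The abstract does not say which features SMOTE was applied to or which library was used, so the function name `smote_oversample`, its parameters, and the two-feature toy data below are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors from ONE class.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of the base point within the same class
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Assumed toy data: already-scaled tabular features for a rare lesion class.
rare_class = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.22], [0.12, 0.30]]
new_points = smote_oversample(rare_class, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real samples, it stays inside the range already covered by the minority class. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written one.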
© The Authors, published by EDP Sciences, 2026
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

