CRCDKD: A novel architecture for medical skin cancer classification on the imbalanced HAM10000 dataset
Abstract
To address the challenge of imbalanced data in skin cancer classification, this paper proposes a Categorical Relation-Preserving Contrastive Decoupled Knowledge Distillation (CRCDKD) framework tailored for the HAM10000 dataset, a widely used benchmark for skin disease image analysis. To mitigate bias toward majority classes and improve diagnostic reliability across all categories, the architecture integrates a mean-teacher paradigm with categorical relation-preserving contrastive learning, augmented by a newly proposed Decoupled Mean Teacher Knowledge Distillation (DMTKD) module. This combination decouples feature learning from knowledge distillation, which allows the trade-off in performance between underrepresented and dominant categories to be tuned dynamically while accelerating model convergence. The proposed framework achieves a balanced multiclass accuracy (BMA) of 84.45%, an overall accuracy of 89.41%, precision of 83.27%, recall of 84.85%, specificity of 97.50%, F1-score of 83.39%, and an area under the curve (AUC) of 98.41%, surpassing state-of-the-art techniques. The improvements over existing methods are most pronounced in minority-class accuracy and in balanced performance (BMA), and the DMTKD module provides additional flexibility to adapt decision boundaries. Beyond advancing skin cancer detection on imbalanced medical datasets, the framework offers a generalizable paradigm for fairness-aware deep learning in healthcare applications, with the aim of robust performance across diverse clinical scenarios.
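To illustrate why BMA is reported alongside overall accuracy, the following minimal sketch compares the two metrics on a toy imbalanced example. It assumes that BMA is the unweighted mean of per-class recall, which is a common convention for skin lesion benchmarks; the abstract does not state the exact definition used. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_multiclass_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition of BMA)."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / n for c, n in support.items()]
    return sum(recalls) / len(recalls)


# Toy imbalanced example (hypothetical data): 8 majority-class samples ("nv")
# and 2 minority-class samples ("mel"); the classifier always predicts "nv".
y_true = ["nv"] * 8 + ["mel"] * 2
y_pred = ["nv"] * 10

acc = overall_accuracy(y_true, y_pred)
bma = balanced_multiclass_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, BMA={bma:.2f}")  # accuracy=0.80, BMA=0.50
```

A classifier that ignores the minority class still scores 80% overall accuracy here, while the balanced metric drops to 50%, which is why a balanced measure is informative on imbalanced data such as HAM10000.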
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Copyright ©2025 CMBN
Communications in Mathematical Biology and Neuroscience