Handling severe data imbalance in chest X-Ray image classification with transfer learning using SwAV self-supervised pre-training

Hery Harjono Muljo, Bens Pardamean, Gregorius Natanael Elwirehardja, Alam Ahmad Hidayat, Digdo Sudigyo, Reza Rahutomo, Tjeng Wawan Cenggoro

Abstract


Since the COVID-19 outbreak, numerous researchers have attempted to train accurate Deep Learning (DL) models, especially Convolutional Neural Networks (CNNs), to assist medical personnel in diagnosing COVID-19 infections from Chest X-Ray (CXR) images. However, data imbalance and small dataset sizes remain obstacles when training DL models for medical image classification. Most prior work has focused on complex novel methods, and few studies have examined this problem directly. In this research, we demonstrate that pre-training DL models with Self-Supervised Learning (SSL) and then training them through Transfer Learning (TL) produces models that are more robust to data imbalance. The Swapping Assignments between Views (SwAV) algorithm in particular is known to enhance the accuracy of CNN models on classification tasks after TL. When trained on a severely imbalanced CXR dataset, a ResNet-50 model pre-trained using SwAV greatly outperformed its counterpart pre-trained in a standard supervised manner, attaining 0.952 AUROC and a macro-averaged F1 score of 0.821. These results indicate that TL using models pre-trained through SwAV can achieve better accuracy even when the dataset is severely imbalanced, which is usually the case in medical image datasets.
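The abstract reports a macro-averaged F1 score alongside AUROC. The following minimal sketch illustrates why a macro-averaged metric is informative under severe imbalance: it averages per-class F1 with equal weight, so a model that ignores the minority class is penalized even when its plain accuracy looks high. The class names, the 95/5 split, and the helper names `macro_f1` and `accuracy` are hypothetical choices for illustration; this is not the authors' evaluation code or data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: 95 "normal" and 5 "covid" images (not the paper's data).
y_true = ["normal"] * 95 + ["covid"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 100

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy input the majority-class predictor reaches 0.950 accuracy but only about 0.487 macro F1, because the F1 score for the minority class is zero.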


Published: 2023-02-06

How to Cite this Article:

Hery Harjono Muljo, Bens Pardamean, Gregorius Natanael Elwirehardja, Alam Ahmad Hidayat, Digdo Sudigyo, Reza Rahutomo, Tjeng Wawan Cenggoro, Handling severe data imbalance in chest X-Ray image classification with transfer learning using SwAV self-supervised pre-training, Commun. Math. Biol. Neurosci., 2023 (2023), Article ID 13

Copyright © 2023 Hery Harjono Muljo, Bens Pardamean, Gregorius Natanael Elwirehardja, Alam Ahmad Hidayat, Digdo Sudigyo, Reza Rahutomo, Tjeng Wawan Cenggoro. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commun. Math. Biol. Neurosci.

ISSN 2052-2541
