Comparative analysis of self-supervised pre-trained vision transformers and convolutional neural networks with CheXNet in classifying lung conditions

Gregorius Natanael Elwirehardja, Steve Marcello Liem, Maria Linneke Adjie, Farrel Alexander Tjan, Joselyn Setiawan, Muhammad Edo Syahputra, Hery Harjono Muljo

Abstract


Classifying lung diseases from images remains a challenging task for Deep Learning (DL) methods. Self-Supervised Learning (SSL) has been widely recognized as effective for pre-training, particularly with recent methods such as DINOV2 ViT-S/14 and ConvNeXt-V2. In this research, Transfer Learning (TL) was conducted on these two models using the NIH CXR-14 dataset to perform 15-class classification. Additionally, SwAV ResNet-50, DINO ViT-S/16, and CheXNet were adopted as baselines. Evaluation results showed that DINOV2 ViT-S/14, with 0.743 macro-averaged AUC, is superior to the other three models pre-trained on ImageNet, but inferior to CheXNet, which was pre-trained on the same NIH CXR-14 dataset, albeit without the “No Finding” class. Even so, CheXNet obtained only 0.773 AUC with 0.328 recall. Further analysis of feature separability showed that neither CheXNet nor DINOV2 ViT-S/14 was able to extract meaningful features that differentiate the “No Finding” class from the other 14 lung conditions, confirming the finding of a previous study that this dataset’s labels are noisy, rendering it unsuitable for downstream TL tasks. Nevertheless, DINOV2 ViT-S/14 produced attention visualizations similar to those of CheXNet on some classes despite being pre-trained on natural images from ImageNet. Therefore, despite its unsatisfactory performance on this dataset, DINOV2 holds great potential for similar future studies, although it may require pre-training on medical image datasets.
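For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a macro-averaged AUC and a per-class recall can be computed. It assumes a one-vs-rest AUC per class, averaged without weighting, and a fixed decision threshold for recall. The function names, the threshold of 0.5, and the layout of the inputs (rows as samples, columns as classes) are illustrative assumptions, not the evaluation code used in the study.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one class as the fraction of (positive, negative) pairs
    ranked correctly; tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def macro_auc(label_rows: List[List[int]], score_rows: List[List[float]]) -> float:
    """Unweighted mean of the per-class AUCs (assumed one-vs-rest setup)."""
    n_classes = len(label_rows[0])
    per_class = [
        binary_auc([row[c] for row in label_rows], [row[c] for row in score_rows])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


def recall_at_threshold(labels: Sequence[int], scores: Sequence[float],
                        threshold: float = 0.5) -> float:
    """Recall for one class: true positives / all actual positives.
    The threshold value is an assumption made for illustration."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("Recall needs at least one positive sample")
    true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    return true_pos / n_pos
```

Because AUC depends only on how scores are ranked while recall depends on a chosen threshold, a model can rank positives reasonably well and still miss many of them at a fixed cut-off, so the two metrics can diverge.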

Published: 2025-01-27

How to Cite this Article:

Gregorius Natanael Elwirehardja, Steve Marcello Liem, Maria Linneke Adjie, Farrel Alexander Tjan, Joselyn Setiawan, Muhammad Edo Syahputra, Hery Harjono Muljo, Comparative analysis of self-supervised pre-trained vision transformers and convolutional neural networks with CheXNet in classifying lung conditions, Commun. Math. Biol. Neurosci., 2025 (2025), Article ID 22

Copyright © 2025 Gregorius Natanael Elwirehardja, Steve Marcello Liem, Maria Linneke Adjie, Farrel Alexander Tjan, Joselyn Setiawan, Muhammad Edo Syahputra, Hery Harjono Muljo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commun. Math. Biol. Neurosci.

ISSN 2052-2541
