Predicting recurrence in differentiated thyroid cancer: a comparative analysis of various machine learning models including ensemble methods with chi-squared feature selection

Karli Eka Setiawan

Abstract


Differentiated thyroid cancer (DTC) is a cancer of the endocrine system and the fastest-growing cancer diagnosis globally. As its first contribution, this research developed and compared 23 machine learning models for predicting whether DTC recurs, including SVM with linear, polynomial, and radial basis function kernels; logistic regression; Naïve Bayes; Decision Tree; K-Nearest Neighbor; random forest; ensemble bagging, ensemble stacking, and ensemble boosting with AdaBoost, each using various base learners; and Gradient Boosting Machine. The work builds on a previous study that shared the Differentiated Thyroid Cancer Recurrence dataset. The second contribution is the use of the chi-squared test as a feature selection method for all 23 models. Without feature selection (Scenario 1), the best model was random forest, with 94% precision, recall, and F1-score in predicting the presence of recurrence and 98% precision, recall, and F1-score in predicting its absence. With feature selection (Scenario 2), ten models reached 100% accuracy in predicting both the presence and absence of recurrence.
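The chi-squared feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes categorical features and scores each one with the Pearson chi-squared statistic on its contingency table against the class label, then keeps the top-scoring features. The function names, feature names, and toy data are hypothetical and are not taken from the paper or its dataset; the author's actual implementation is not described here and may differ.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # cell counts of the contingency table
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_k_best(columns, labels, k):
    """Return the names of the k highest-scoring features and all scores."""
    scores = {name: chi2_score(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data (invented for illustration only).
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
columns = {
    # Perfectly associated with the label.
    "feature_a": ["high", "high", "high", "high", "low", "low", "low", "low"],
    # Independent of the label.
    "feature_b": ["x", "y", "x", "y", "x", "y", "x", "y"],
}

selected, scores = select_k_best(columns, labels, k=1)
print(selected, scores)
```

A higher score indicates a stronger association between the feature and the recurrence label. The selected features would then be passed to each classifier in place of the full feature set.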


Published: 2024-04-29

How to Cite this Article:

Karli Eka Setiawan, Predicting recurrence in differentiated thyroid cancer: a comparative analysis of various machine learning models including ensemble methods with chi-squared feature selection, Commun. Math. Biol. Neurosci., 2024 (2024), Article ID 55

Copyright © 2024 Karli Eka Setiawan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commun. Math. Biol. Neurosci.

ISSN 2052-2541

