Modeling of stroke risk using synthetic minority oversampling technique in multivariate adaptive regression spline model
Abstract
Health is the third goal of Indonesia's Sustainable Development Goals (SDGs), which aims to ensure healthy lives and promote well-being for people of all ages. Stroke, a non-communicable disease (NCD), is one of the leading causes of mortality globally. Early detection and accurate risk prediction are essential to prevent stroke and reduce related mortality. This study evaluates the impact of the Synthetic Minority Oversampling Technique (SMOTE) on Multivariate Adaptive Regression Splines (MARS) models for predicting stroke risk, particularly in addressing imbalanced datasets. The data were collected at Universitas Airlangga Hospital (RSUA) between June and August 2023. To assess the effectiveness of SMOTE, we compare the performance of MARS models with and without oversampling. The SMOTE-MARS model outperforms the MARS model without oversampling on every metric: accuracy (93.50% vs. 89.00%), sensitivity (97.70% vs. 94.50%), specificity (80.70% vs. 73.97%), and AUC (89.20% vs. 84.23%). These results underscore the importance of addressing class imbalance in stroke prediction datasets to achieve more accurate and reliable outcomes. Incorporating SMOTE into MARS models proves to be an effective approach for enhancing predictive performance, offering a valuable tool for early stroke risk detection and prevention strategies.
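As an illustration of the two ideas the abstract relies on, the following minimal Python sketch shows (a) how SMOTE-style oversampling creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and (b) how accuracy, sensitivity, and specificity follow from confusion-matrix counts. It is not the implementation used in the study: the function names `smote` and `classification_metrics`, the default `k=5`, the fixed seed, and all numbers in the usage note are assumptions made for illustration only, and the MARS model and AUC computation are omitted.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    `minority` is a list of numeric tuples (hypothetical feature vectors).
    Each synthetic point lies on the segment joining a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }
```

For example, with made-up minority points `[(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]`, calling `smote(points, n_synthetic=6, k=2)` returns six new points inside the region spanned by the originals, which can be appended to the training set before fitting a classifier. Oversampling would normally be applied to the training split only, so that the evaluation metrics are computed on untouched data.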
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Editorial Office: office@scik.org
Copyright ©2025 CMBN