Enhancing breast cancer detection using machine learning on data from Cuban women
Abstract
Breast cancer is becoming the predominant cause of cancer-related mortality among women globally. Accurate and early diagnosis can help reduce mortality rates. This research explores the capability of machine learning to detect the presence of breast cancer in patients undergoing screening, based on patient background parameters. The study uses a publicly available dataset, Breast Cancer Risk Factors in Cuban Women, obtained from Mendeley Data. Our contribution is an experimental comparison of several machine learning models: support vector machines (SVM) with various kernels as the proposed model, logistic regression as the baseline model, and random forest as a comparison with the best model reported in the previous research that provided this dataset. The results show that our methodology, particularly its data preprocessing and feature engineering, enables most of the tested models to achieve perfect scores (100% accuracy, precision, recall, and F1-score), the exception being the SVM with a radial basis function (RBF) kernel.
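For readers less familiar with the four reported metrics, the following minimal sketch shows how accuracy, precision, recall, and F1-score are conventionally computed for a binary screening task. It is an illustration only and not the authors' evaluation code: the function name `binary_metrics`, the label convention (1 = cancer present, 0 = absent), and the toy labels are all assumptions introduced here, and no values come from the Cuban dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumed convention (not taken from the paper): 1 = cancer present, 0 = absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels, for illustration only.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 0, 1, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)

# A classifier that reproduces every label scores 1.0 on all four metrics,
# which corresponds to the "perfect scores" case described in the abstract.
perfect_scores = binary_metrics(toy_true, toy_true)
```

Under these definitions, a 100% result on all four metrics means the held-out evaluation contained no false positives and no false negatives.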
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Copyright ©2025 CMBN
Communications in Mathematical Biology and Neuroscience