Improving the accuracy of machine learning predictive models for analyzing the CHD dataset
Abstract
Classifying big data is an important problem in machine learning. Among the many available classification methods, the support vector machine (SVM) has become a widely used tool for data scientists. In this paper we examine several modifications of the SVM algorithm that achieve better accuracy, F1 precision, and CPU time than the standard SVM algorithm when classifying test observations. To make the modifications faster than the standard SVM, we use a methodology that splits the input dataset into n folds and combines this split with input data transformations. In each execution of the process, one fold is held out as the test subset and the remaining folds are used for training. The process is executed n times. The methodology searches for the pair of training and test subsets that yields the highest accuracy, and the model trained on this pair is saved as the output SVM model.
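The fold-selection procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `best_fold_model`, the `make_model` factory, the shuffling seed, and the synthetic two-cluster data are all hypothetical, and a tiny nearest-centroid classifier stands in for the SVM so that the sketch runs with the Python standard library only. The input data transformations mentioned in the abstract are not specified there and are omitted.

```python
import random


class NearestCentroid:
    """Tiny stand-in classifier (NOT an SVM), used only to keep the sketch dependency-free."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        # One centroid (mean vector) per class label.
        self.centroids_ = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist2(row, self.centroids_[c]))
            for row in X
        ]


def best_fold_model(X, y, n_folds, make_model, seed=0):
    """Split the data into n_folds, train n_folds times, keep the most accurate run.

    Returns (accuracy, index of the held-out fold, fitted model).
    """
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # assumption: folds are drawn at random
    folds = [indices[i::n_folds] for i in range(n_folds)]

    best = None
    for k, test_idx in enumerate(folds):
        # Every fold except fold k is used for training.
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = make_model().fit(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        preds = model.predict([X[i] for i in test_idx])
        accuracy = sum(p == y[i] for p, i in zip(preds, test_idx)) / len(test_idx)
        if best is None or accuracy > best[0]:
            best = (accuracy, k, model)
    return best


# Hypothetical synthetic data: two well-separated Gaussian clusters.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

best_acc, best_k, best_model = best_fold_model(
    X, y, n_folds=5, make_model=NearestCentroid
)
print(f"best held-out fold: {best_k}, accuracy: {best_acc:.3f}")
```

Any classifier whose `fit(X, y)` returns the fitted object and which offers `predict(X)` can be passed as `make_model`; an SVM implementation such as scikit-learn's `SVC` follows this interface and could replace the stand-in.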
Copyright ©2024 JMCS