Supervised learning for the imbalanced sleep stage classification problem
Abstract
Sleep is closely associated with physical and mental health, so monitoring sleep quality through the dynamics of sleep stages during the night can be valuable. Data from wearable devices have the potential to serve as predictors of sleep stage, and machine learning methods have been proposed to learn patterns in such data for sleep-wake classification. The main challenge is class imbalance: the data contain more sleep-stage samples than wake-stage samples. In this study, we used five supervised methods complemented by three strategies for handling imbalanced data. We applied Random Forest, Support Vector Machine, XGBoost, Dense Neural Network (DNN), and Long Short-Term Memory (LSTM) models to a publicly available dataset consisting of three features captured by a consumer wearable device together with the labelled sleep stages. Among all the models, the DNN performed best, achieving a 12% higher specificity score (predictive capability for the minority class) when all features were used in the model. This result was driven by the custom class weights and the SMOTE oversampling strategy. The class weight parameter prevented the model from ignoring the minority class by giving that class more weight in the loss function. The feature engineering process seemed to obscure the time-series characteristics of the data, which is why LSTM, one of the best methods for time-series data, failed to perform well in this classification task. Our proposed method can therefore provide insight into constructing more robust ML-based sleep quality prediction pipelines.
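The effect of class weighting on the loss function can be illustrated with a minimal sketch. It is not the configuration used in this study: the label encoding (1 = sleep, 0 = wake), the "balanced" weighting rule n_samples / (n_classes × class_count), the toy data, and the function names `balanced_class_weights`, `weighted_bce`, and `specificity` are all assumptions made for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_bce(y_true, p_pred, weights, eps=1e-12):
    """Mean binary cross-entropy with per-class weights.

    p_pred holds the predicted probability that the label is 1 (sleep).
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        sample_loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        total += weights[y] * sample_loss
    return total / len(y_true)


def specificity(y_true, y_pred, negative=0):
    """True-negative rate, with wake (0) treated as the negative class."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p == negative)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p != negative)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Hypothetical toy labels: 8 sleep samples (majority), 2 wake samples (minority).
y = [1] * 8 + [0] * 2
weights = balanced_class_weights(y)

# A model that always predicts "sleep" with probability 0.9.
p_always_sleep = [0.9] * len(y)
uniform = {0: 1.0, 1: 1.0}
loss_unweighted = weighted_bce(y, p_always_sleep, uniform)
loss_weighted = weighted_bce(y, p_always_sleep, weights)

# Hard predictions from the same model, thresholded at 0.5.
y_hat = [1 if p >= 0.5 else 0 for p in p_always_sleep]
spec = specificity(y, y_hat)
```

Under these assumptions, the model that always predicts sleep incurs a larger loss once the wake class is up-weighted, even though its specificity is zero in both cases. This is the mechanism by which class weights discourage a classifier from ignoring the minority class.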
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Editorial Office: [email protected]
Copyright ©2024 CMBN