A hybrid multi-GARCH transformer model for stock price volatility forecasting
Abstract
Accurate forecasting of financial market volatility remains a central challenge due to nonlinear dynamics, volatility clustering, and regime-dependent behavior in financial time series. While econometric volatility models capture conditional heteroskedasticity, deep learning architectures provide flexible nonlinear modeling capabilities. However, existing approaches often treat these paradigms independently or combine them post hoc, limiting their ability to jointly exploit statistical structure and representation learning. This study addresses this gap by proposing a hybrid architecture that integrates GARCH-family models directly within an attention-based Transformer through representation-level learning. We develop a Multi-GARCH Transformer architecture that embeds conditional variance estimates from multiple GARCH-family models into the input representation of a Transformer encoder. This hybrid architecture functions as a heterogeneous ensemble in which complementary volatility dynamics are integrated at the feature level and adaptively weighted through self-attention. The analysis uses daily stock returns of Safaricom from 2015 to 2024. The study combines descriptive statistics, econometric modeling, and Transformer-based hybrid forecasting. Model performance is evaluated using quasi-likelihood (QLIKE), RMSE, and MAE, alongside statistical comparison using the Diebold-Mariano test. Empirical findings confirm that the Multi-GARCH Transformer ensemble achieves superior out-of-sample volatility forecasting performance by embedding complementary econometric signals within an attention-based architecture. This hybrid model delivers the strongest performance under QLIKE and exhibits statistically significant improvements validated through Diebold-Mariano testing. The observed training behavior indicates more stable convergence and stronger generalization than non-hybrid architectures, reinforcing the effectiveness of representation-level ensemble learning.
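The evaluation criteria named above can be illustrated with a minimal Python sketch. It makes several assumptions not stated in the abstract: QLIKE is taken in the normalized form sigma2/h - log(sigma2/h) - 1, the Diebold-Mariano statistic is computed for one-step-ahead forecasts without an autocovariance correction, and all function names are hypothetical.

```python
import math

def qlike(realized, forecast):
    """Mean QLIKE loss in the (assumed) form x - log(x) - 1, with x = realized / forecast."""
    ratios = [r / f for r, f in zip(realized, forecast)]
    return sum(x - math.log(x) - 1.0 for x in ratios) / len(ratios)

def rmse(realized, forecast):
    """Root mean squared error between realized and forecast variance."""
    return math.sqrt(sum((r - f) ** 2 for r, f in zip(realized, forecast)) / len(realized))

def mae(realized, forecast):
    """Mean absolute error between realized and forecast variance."""
    return sum(abs(r - f) for r, f in zip(realized, forecast)) / len(realized)

def diebold_mariano(loss_a, loss_b):
    """DM statistic and two-sided normal p-value for per-period losses.

    Assumes one-step-ahead forecasts, so the variance of the loss
    differential is estimated without autocovariance terms.
    """
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

Under this sign convention, a positive statistic means the first model has the larger average loss, so the second model forecasts better on that criterion.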
The results highlight the advantages of integrating econometric theory with attention-based deep learning through feature-level hybridization. By jointly modeling structured volatility dynamics and nonlinear temporal dependencies, the approach improves forecasting robustness, interpretability, and predictive accuracy. This study contributes to the intersection of financial econometrics and deep learning by introducing a hybrid ensemble modeling strategy for complex and evolving market environments.
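Feature-level hybridization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the two GARCH-family members (a symmetric GARCH(1,1) and a GJR-type asymmetric variant), the parameter values, and the function names are all hypothetical, and in practice the parameters would be estimated from the data. Each time step receives a feature vector holding the return and one conditional variance per model, which is the kind of input a Transformer encoder could then weight through self-attention.

```python
def garch_variance(returns, omega, alpha, beta, gamma=0.0):
    """Conditional variance path h_t = omega + (alpha + gamma*1[r<0]) * r^2 + beta * h_{t-1}.

    gamma = 0 gives a symmetric GARCH(1,1); gamma > 0 gives a GJR-type variant.
    The recursion starts at the sample variance, and h[t] uses returns up to t-1 only.
    """
    mean_r = sum(returns) / len(returns)
    h = [sum((r - mean_r) ** 2 for r in returns) / len(returns)]
    for r in returns[:-1]:
        leverage = gamma if r < 0 else 0.0
        h.append(omega + (alpha + leverage) * r * r + beta * h[-1])
    return h

def build_features(returns, specs):
    """Stack the return and each model's conditional variance into one row per time step."""
    paths = [garch_variance(returns, **spec) for spec in specs]
    return [[r] + [path[t] for path in paths] for t, r in enumerate(returns)]

# Illustrative (not estimated) parameters for two GARCH-family members.
example_specs = [
    {"omega": 1e-6, "alpha": 0.08, "beta": 0.90},
    {"omega": 1e-6, "alpha": 0.08, "beta": 0.90, "gamma": 0.05},
]
example_returns = [0.01, -0.02, 0.015, -0.005]
example_features = build_features(example_returns, example_specs)
```

Each row of `example_features` has three entries (return, GARCH variance, GJR variance). After the negative return at the second step, the asymmetric model reports the higher variance, which is the kind of complementary signal the attention mechanism is meant to exploit.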
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Copyright ©2025 CMBN
Communications in Mathematical Biology and Neuroscience