Evaluation of the performance of LASSO, group LASSO, and sparse group LASSO in identifying factors that have a significant influence on dengue hemorrhagic fever in Indonesia

C. Wirdiastuti, D. Sulistiowati, F.K. Mutya, R.N. Amalina, N. Yahputri

Abstract


Dengue Hemorrhagic Fever (DHF) remains a major public health challenge in Indonesia, driven by complex interactions among climatic, demographic, socio-economic, health capacity, and environmental factors. Identifying significant determinants in high-dimensional data with grouped explanatory variables requires appropriate regularization techniques. This study evaluates and compares the performance of LASSO, Group LASSO, and Sparse Group LASSO (SGL) in identifying factors that influence DHF incidence across Indonesian provinces. The three penalized regression models were applied to data comprising 29 predictors organized into five thematic groups. Model performance was assessed using the coefficient of determination (R²) on the test data, and variable selection patterns were analyzed to evaluate parsimony and epidemiological interpretability. LASSO was used to identify dominant individual predictors, Group LASSO to assess the contribution of thematic variable groups, and SGL to perform selection at the group and individual levels simultaneously. The results indicate that SGL provides the best predictive performance, with an R² of 94.94%, followed by LASSO (88.91%) and Group LASSO (73.94%). LASSO produced the most parsimonious model, selecting only four socio-economic and demographic variables, but it overlooked meaningful group structures. Group LASSO retained three complete groups (Demographic, Socio-Economic, and Sanitation/Residential Environment) but was less accurate because it included weakly contributing variables. In contrast, SGL selected 23 relevant variables while preserving group structures, revealing that DHF incidence is primarily associated with socio-economic factors (unemployment, labor force size, poverty), demographic pressure, sanitation and housing conditions, and rainfall.
In conclusion, Sparse Group LASSO emerges as the most effective of the three methods for identifying significant DHF determinants, offering the best balance between predictive accuracy, model parsimony, and epidemiological interpretability. The findings support the adoption of integrated, data-driven regularization approaches in developing targeted DHF prevention and control strategies in Indonesia.
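As a rough illustration of how the three penalties compared in the abstract differ, the following minimal Python sketch implements a sparse group LASSO penalty, its proximal (shrinkage) step, and the R² score. It is not the authors' implementation: the parameter names `lam` and `alpha`, the square-root-of-group-size weights, and the toy coefficients and groups are all assumptions chosen for illustration. Under this parameterization, `alpha = 1` reduces to LASSO and `alpha = 0` to Group LASSO.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def sgl_penalty(beta, groups, lam, alpha):
    """Sparse group LASSO penalty (assumed parameterization).

    lam * (alpha * sum_j |beta_j|
           + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2)
    where p_g is the size of group g.
    """
    l1 = sum(abs(b) for b in beta)
    l2 = sum(
        math.sqrt(len(g)) * math.sqrt(sum(beta[j] ** 2 for j in g))
        for g in groups
    )
    return lam * (alpha * l1 + (1.0 - alpha) * l2)


def sgl_prox(beta, groups, lam, alpha, step=1.0):
    """Proximal operator of step * sgl_penalty for non-overlapping groups.

    Individual coefficients are soft-thresholded first (variable-level
    selection), then each group is shrunk as a block and dropped entirely
    if its remaining norm is too small (group-level selection).
    """
    out = list(beta)
    for g in groups:
        z = [soft_threshold(beta[j], step * lam * alpha) for j in g]
        norm = math.sqrt(sum(v * v for v in z))
        t = step * lam * (1.0 - alpha) * math.sqrt(len(g))
        scale = max(1.0 - t / norm, 0.0) if norm > 0 else 0.0
        for j, v in zip(g, z):
            out[j] = scale * v
    return out


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical coefficients: group 0 holds one strong and one weak
# variable, group 1 holds two weak variables.
toy_beta = [3.0, -0.5, 0.2, 0.1]
toy_groups = [[0, 1], [2, 3]]

lasso_like = sgl_prox(toy_beta, toy_groups, lam=1.0, alpha=1.0)
group_like = sgl_prox(toy_beta, toy_groups, lam=1.0, alpha=0.0)
sparse_group = sgl_prox(toy_beta, toy_groups, lam=1.0, alpha=0.5)

for name, b in [("LASSO", lasso_like), ("Group LASSO", group_like),
                ("SGL", sparse_group)]:
    kept = [j for j, v in enumerate(b) if v != 0.0]
    print(name, "keeps variables", kept)
```

On these toy values, the group-only setting keeps every variable of a retained group, including the weak one, whereas the mixed setting keeps the group but can still zero out individual members. This mirrors the qualitative behaviour the abstract describes, though it says nothing about the reported R² values.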


Published: 2026-03-26

How to Cite this Article:

C. Wirdiastuti, D. Sulistiowati, F.K. Mutya, R.N. Amalina, N. Yahputri, Evaluation of the performance of LASSO, group LASSO, and sparse group LASSO in identifying factors that have a significant influence on dengue hemorrhagic fever in Indonesia, Commun. Math. Biol. Neurosci., 2026 (2026), Article ID 29

Copyright © 2026 C. Wirdiastuti, D. Sulistiowati, F.K. Mutya, R.N. Amalina, N. Yahputri. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Commun. Math. Biol. Neurosci.

ISSN 2052-2541
