Nutrients, Vol. 18, Pages 660: A Maturation-Aware Machine Learning Framework for Screening the Nutritional Status of Adolescents
Nutrients doi: 10.3390/nu18040660
Authors:
Hatem Ghouili
Zouhaier Farhani
Narimen Yousfi
Halil İbrahim Ceylan
Amel Dridi
Andrea de Giorgio
Nicola Luigi Bragazzi
Noomen Guelmami
Ismail Dergaa
Anissa Bouassida
Background: Malnutrition in adolescents remains a significant public health issue worldwide, with undernutrition and overweight often coexisting. Accurate nutritional screening during adolescence is complicated by variability in biological maturation and class imbalance, particularly among underweight adolescents. Objective: This study aims to develop and validate machine learning models for classifying the nutritional status of adolescents, accounting for class imbalance and biological maturation, and to evaluate model stability and variable importance at different stages of peak height velocity (PHV). Methods: In this cross-sectional study, 4232 adolescents aged 11 to 18 years were recruited from nine educational institutions in Tunisia. Their nutritional status was classified according to the International Obesity Task Force (IOTF) BMI thresholds into three categories: underweight (14.4%), normal weight (68.3%), and overweight (17.2%). Ten anthropometric, behavioral, and maturation-related predictors were analyzed. Six supervised machine learning algorithms were evaluated using a 70/30 stratified split between training and test sets, with five-fold cross-validation. Class imbalance was addressed by ROSE combined with cost-sensitive learning. Model performance was assessed using accuracy, Cohen’s kappa coefficient, macro F1 score, sensitivity, specificity, and AUC. Results: The cost-sensitive Random Forest (RF) model achieved the best overall performance, with an accuracy of 0.830, a macro F1 score of 0.767, a macro-AUC of 0.921, and a macro-sensitivity of 0.743. The class-specific sensitivities were 0.70 (underweight), 0.91 (normal weight), and 0.62 (overweight), with no major misclassification between the extreme categories. Performance remained stable across the different maturation phases (accuracy from 0.823 to 0.839), with optimal discrimination in the pre-PHV (macro-AUC = 0.936; sensitivity for underweight = 0.82) and post-PHV (macro-AUC = 0.931) periods.
Body mass was the main predictor (importance = 1.00), followed by waist circumference (0.34–0.53). The importance of age for classifying underweight increased significantly from the pre-PHV (0.10) to the post-PHV (0.75) period. A two-stage hierarchical model further improved underweight detection (stage 1 AUC = 0.911; sensitivity = 0.732). Conclusions: A cost-sensitive RF model, combined with ROSE, provides robust classification of adolescents’ nutritional status across maturation stages, significantly improving underweight detection while preserving overall accuracy. This approach is particularly well-suited to public health screening in schools as a first-stage assessment that requires clinical confirmation and promotes a maturation-aware interpretation of nutritional risk among adolescents.
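
As a minimal sketch of two ideas in the abstract, the Python snippet below recomputes the macro-sensitivity as the unweighted mean of the three reported class-specific sensitivities, and derives illustrative class weights from the reported class proportions. The weighting rule (inverse prevalence, w_c = 1 / (K * p_c)) is an assumption chosen for illustration; the abstract does not state which misclassification costs the authors used. The function names `macro_average` and `balanced_weights` are hypothetical.

```python
from statistics import mean

# Class-specific sensitivities (recall) as reported in the abstract.
sensitivity = {"underweight": 0.70, "normal": 0.91, "overweight": 0.62}

def macro_average(per_class):
    """Unweighted mean over classes, so each class counts equally
    regardless of how many adolescents fall into it."""
    return mean(per_class.values())

macro_sensitivity = macro_average(sensitivity)

# Class proportions as reported in the abstract.
prevalence = {"underweight": 0.144, "normal": 0.683, "overweight": 0.172}

def balanced_weights(prev):
    """ASSUMPTION: inverse-prevalence weights, w_c = 1 / (K * p_c).
    This is a common heuristic for cost-sensitive learning, not
    necessarily the cost scheme used in the study."""
    k = len(prev)
    return {label: 1.0 / (k * p) for label, p in prev.items()}

weights = balanced_weights(prevalence)

print(round(macro_sensitivity, 3))  # 0.743, matching the reported value
print({label: round(w, 2) for label, w in weights.items()})
```

Under this assumed scheme, an error on an underweight adolescent would be penalized roughly four to five times as heavily as an error on a normal-weight adolescent, which is the general mechanism by which cost-sensitive learning shifts a classifier toward minority classes.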
