An Interpretable Voting Ensemble Model for Obesity Classification Using Lifestyle and Physical Health Attributes
DOI: https://doi.org/10.64751/ijdim.2024.983
Abstract
Obesity is a significant global health concern, and reliable, interpretable classification can support early intervention and reduce long-term health complications. This study presents an explainable ensemble learning framework for multi-class obesity classification using behavioral, lifestyle, and anthropometric attributes from the UCI Machine Learning Repository obesity dataset. The pipeline performs data cleaning, categorical label encoding, and feature scaling, then applies Recursive Feature Elimination with Cross-Validation (RFECV) using Logistic Regression for feature selection and the Synthetic Minority Over-sampling Technique for Nominal and Continuous features (SMOTENC) to address class imbalance. A comparative evaluation covers multiple machine learning algorithms: AdaBoost, Perceptron, GaussianNB, SGD, SVM, KNN, MLP, Decision Tree, ExtraTrees, Bagging Classifier, Random Forest, Gradient Boosting, LogisticRegressionCV, XGBoost, and LightGBM. Based on this comparison, an Extended Voting ensemble combining Gradient Boosting, XGBoost, LightGBM, and CatBoost is selected as the final prediction framework. To improve transparency, the Explainable Artificial Intelligence (XAI) techniques LIME and SHAP are used to provide local and global explanations of model behavior. The final trained model is deployed through a Flask-based web application for real-time obesity level prediction. In the reported experiments, the model achieves 99.3% accuracy, with comparably high precision, recall, and F1-score, and a ROC-AUC score of 1.000, supporting its use for obesity classification in practical decision-support applications.
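The voting step named in the abstract can be illustrated with a minimal sketch. The abstract does not state whether the ensemble uses soft or hard voting, so both are shown here as assumptions. The class labels and the per-model probabilities are hypothetical placeholders, not outputs of the Gradient Boosting, XGBoost, LightGBM, or CatBoost models described above.

```python
from collections import Counter

# Hypothetical class labels; the abstract does not list the dataset's classes.
CLASSES = ["normal", "overweight", "obese"]


def soft_vote(prob_rows):
    """Average the per-class probabilities of all models and pick the top class."""
    n_models = len(prob_rows)
    avg = [sum(row[i] for row in prob_rows) / n_models for i in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda i: avg[i])
    return CLASSES[best], avg


def hard_vote(prob_rows):
    """Each model votes for its most probable class; the most common vote wins."""
    votes = [CLASSES[max(range(len(row)), key=lambda i: row[i])] for row in prob_rows]
    return Counter(votes).most_common(1)[0][0]


# Made-up probabilities for a single sample, one row per base model.
model_probs = [
    [0.10, 0.60, 0.30],
    [0.05, 0.45, 0.50],
    [0.20, 0.55, 0.25],
    [0.10, 0.50, 0.40],
]

soft_label, avg_probs = soft_vote(model_probs)
hard_label = hard_vote(model_probs)
print(soft_label, hard_label)
```

Soft voting uses each model's confidence, while hard voting counts only each model's top choice, so the two can disagree when one model is strongly confident and the others are marginal.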
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.






