Classification of Diabetes Using Ensemble Machine Learning Techniques

Ashisha GR; Anitha Mary X; Mahimai Raja J

doi:10.12694/scpe.v25i4.2873

Published: Jun 16, 2024

Keywords:

Diabetes, Machine learning, Gradient boost, Ensemble Voting Classifier, Random Over Sampling

Ashisha GR

Electronics and Instrumentation Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India

Anitha Mary X

Robotics Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India

Mahimai Raja J

Computer Science Engineering, Karunya Institute of Technology and Sciences, Coimbatore, India

Abstract

Diabetes is a widespread chronic condition that affects people all over the globe and requires a clear and timely diagnosis. Untreated diabetes can lead to retinopathy, nephropathy, and damage to the nervous system. In this context, Machine Learning (ML) can be used to detect health problems early, diagnose them, and track their progress. Ensemble techniques are a promising approach that combines several classifiers to improve predictive accuracy and robustness. This study investigates the classification of diabetes using an ensemble machine learning technique known as a voting classifier. The ensemble combines several classifiers, namely Light Gradient Boosting Machine (LightGBM), Gradient Boost classifier (GBC), and Random Forest (RF), and aggregates their predictions by voting to obtain the final classification result. The research is carried out on two benchmark datasets: the Pima Indian Diabetes Dataset (PIDD) and the German Dataset. The Boruta technique is used to select the most relevant attributes from the datasets, Random Over Sampling is used to balance the class distribution, and outliers are removed using the interquartile range method. The findings show that the combination of the Boruta feature selection algorithm and the ensemble Voting Classifier achieved the highest accuracy on both datasets: 93% on PIDD and 90% on the German dataset. This research can help medical professionals predict diabetes early and save physicians’ time.
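
The preprocessing and voting steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The function names, the fence multiplier `k=1.5`, the use of majority (hard) voting, and the order of the steps are all assumptions made for illustration, since the abstract does not specify them. Boruta feature selection and the three tree-based learners are omitted; in practice these would come from dedicated libraries.

```python
import random
import statistics
from collections import Counter


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences Q1 - k*IQR and Q3 + k*IQR (k is assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, labels, column, k=1.5):
    """Keep only the rows whose value in `column` lies inside the IQR fences."""
    low, high = iqr_bounds([r[column] for r in rows], k)
    kept = [(r, y) for r, y in zip(rows, labels) if low <= r[column] <= high]
    return [r for r, _ in kept], [y for _, y in kept]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def hard_vote(predictions):
    """Majority vote; `predictions` holds one list of labels per classifier."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data (made up): one feature per row, with 900 as an obvious outlier.
rows = [[85], [90], [95], [100], [105], [110], [900]]
labels = [0, 0, 0, 0, 1, 1, 1]

clean_rows, clean_labels = drop_outliers(rows, labels, column=0)
balanced_rows, balanced_labels = random_oversample(clean_rows, clean_labels)

# Hypothetical per-sample predictions from three classifiers.
final = hard_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

On the toy data, the outlier filter drops the row with value 900, oversampling then duplicates minority-class rows until both classes have four samples, and the vote returns the label chosen by at least two of the three classifiers for each sample.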

Issue

Vol. 25 No. 4 (2024)

Section

Special Issue - Unleashing the power of Edge AI for Scalable Image and Video Processing
