Department of Computer Sciences

Permanent URI for this collection

https://repository.run.edu.ng/handle/123456789/72

A Performance Study of Selected Machine Learning Techniques for Predicting Heart Diseases
(Springer, 2025-04) Olorunfemi, Blessing O.
Heart disease remains a leading cause of mortality worldwide, and it is rising at an alarming rate, making early prediction crucial for effective prevention and timely intervention. Diagnosing heart disease is a difficult process that requires technical skill and accuracy. With improvements in technology, computing has helped simplify the diagnosis of various health problems. Machine learning uses historical data to predict future outcomes. Various machine learning techniques have been developed over the years and used to predict heart disease, with varying levels of performance. Identifying the best-suited technique for prediction can be a challenging task. This research work analyses the performance of seven (7) machine learning techniques: AdaBoost, KNN, Logistic Regression, Naïve Bayes Classifier, Random Forest, SVM, and XGBoost. The heart disease dataset was downloaded from the UCI repository and analysed using the Python programming language in the Jupyter Notebook environment. A comparative analysis of the seven (7) techniques was performed based on Accuracy, Precision, and Recall. From the results obtained, KNN, Random Forest, and XGBoost showed superior performance over the others with an accuracy of 100%, followed by AdaBoost at 92.2%, SVM at 91.71%, and the Naïve Bayes Classifier at 88.29%, while Logistic Regression had the lowest accuracy at 86.34%. KNN, RF, and XGBoost outperformed AdaBoost, SVM, and LR.
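As an illustration of the three comparison metrics named in the abstract above, the following is a minimal sketch of how Accuracy, Precision, and Recall can be computed for a binary classifier. It is not the authors' code: the function names `confusion_counts` and `evaluate` are hypothetical, and the toy labels are invented for the example and are not taken from the UCI heart disease dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, and recall from predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (assumption, not data from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = evaluate(y_true, y_pred)
print(scores)
```

Running `evaluate` once per model on the same held-out labels would give a like-for-like comparison of the kind the abstract describes.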
Efficient diagnosis of diabetes mellitus using an improved ensemble method
(Scientific Reports, 2025-01) Olorunfemi, Blessing O.
Diabetes is a growing health concern in developing countries and contributes to considerable mortality. While machine learning (ML) approaches have been widely used to improve early detection and treatment, several studies have reported low classification accuracies due to overfitting, underfitting, and data noise. This research employs parallel and sequential ensemble ML approaches paired with feature selection techniques to boost classification accuracy. The Pima Indians Diabetes dataset from the UCI ML Repository served as the data source. Data preprocessing included cleaning the dataset by replacing missing values with column means and selecting highly correlated features using forward and backward selection methods. The dataset was split into two parts: training (70%) and testing (30%). Python was used for classification in Jupyter Notebook, and there were two design phases. The first phase utilized J48, Classification and Regression Tree (CART), and Decision Stump (DS) to create a random forest model. The second phase employed the same algorithms alongside sequential ensemble methods (XGBoost, AdaBoostM1, and Gradient Boosting), using an average voting algorithm for binary classification. Evaluation revealed that XGBoost, AdaBoostM1, and Gradient Boosting achieved classification accuracies of 100%, with performance metrics including F1 score, MCC, Precision, Recall, AUC-ROC, and AUC-PR all equal to 1.00, indicating reliable predictions of diabetes presence. Researchers and practitioners can leverage the predictive model developed in this work to make quick predictions of diabetes mellitus, which could save many lives.
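The preprocessing and voting steps mentioned in the abstract above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: missing values are assumed to be encoded as `None`, "average voting" is assumed to mean averaging each model's predicted probability of the positive class, and the helper names `impute_column_means`, `split_70_30`, and `average_vote` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Row = List[Optional[float]]


def impute_column_means(rows: Sequence[Row]) -> List[List[float]]:
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def split_70_30(rows: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the rows and split them into 70% training, 30% testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * 0.3)
    return shuffled[n_test:], shuffled[:n_test]


def average_vote(probabilities: Sequence[float],
                 threshold: float = 0.5) -> int:
    """Average per-model positive-class probabilities, then threshold."""
    mean_p = sum(probabilities) / len(probabilities)
    return 1 if mean_p >= threshold else 0


# Toy values for illustration only (assumption, not the Pima data).
raw = [[1.0, None], [3.0, 4.0], [None, 8.0]]
clean = impute_column_means(raw)
train, test = split_70_30(list(range(10)))
label = average_vote([0.9, 0.6, 0.2])
```

Fixing the shuffle seed keeps the split reproducible, which matters when comparing several ensembles on the same held-out 30%.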