Machine Learning-Driven Stroke Prediction with Explainable Models and Real-Time Analysis

Authors

  • Syed Althaf
  • Mohammed Ammar
  • Mohammed Abdul Rab Rayyan
  • Mohammed Abdul Bari Ayaan
  • Abdullah Saleh Hasan Abdul Hak

DOI:

https://doi.org/10.64751/

Abstract

The need for early detection of stroke has driven the development of intelligent healthcare prediction systems that analyze patient data efficiently and accurately. This project presents a machine learning-based classification system that labels records in a healthcare dataset as normal or stroke. The application provides a user-friendly interface in which users can upload a healthcare dataset, view the distribution of normal and stroke cases as a graph, and preprocess the data by splitting it into 80% training and 20% testing sets. The system applies seven machine learning algorithms for classification: Random Forest, Logistic Regression, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naïve Bayes, XGBoost, and CatBoost. Each model is trained individually, and its performance is evaluated using accuracy and a confusion matrix. Random Forest and CatBoost achieved the highest accuracy, 95%, while KNN and XGBoost also performed well, at 91% and 89% respectively. Logistic Regression, SVM, and Naïve Bayes gave moderate results and complete the comparison across techniques. The confusion matrix visualization shows how predictions are distributed: correct classifications dominate and misclassifications are few. The system also includes a comparison module that evaluates all algorithms on multiple performance metrics, so users can identify the most effective model. A prediction feature lets users upload new test data and obtain real-time results indicating whether each record corresponds to a stroke or a normal condition. Overall, the proposed system reduces manual effort in healthcare data analysis and supports fast, accurate decision-making. By combining multiple machine learning techniques with interactive visualization, it serves as a practical tool for stroke prediction and healthcare data classification.
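
The evaluation steps described in the abstract (an 80/20 split, training a classifier, then reading accuracy off a confusion matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, a small hand-written KNN stands in for the library models named above, the data is synthetic, and every function name (`train_test_split`, `knn_predict`, `confusion_matrix`, `accuracy`, `make_synthetic_rows`) is a hypothetical helper introduced here for illustration.

```python
import random


def make_synthetic_rows(n=200, seed=0):
    """Build n synthetic (features, label) rows; label 0 = normal, 1 = stroke.

    Assumption: two numeric features drawn around a class-dependent center.
    Real patient data would be loaded from the uploaded dataset instead.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 3.0 if label else 0.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test), 80/20 by default."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=3):
    """Predict 0 or 1 by majority vote among the k nearest training rows."""
    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))

    nearest = sorted(train, key=squared_distance)[:k]
    votes_for_stroke = sum(label for _, label in nearest)
    return 1 if votes_for_stroke * 2 > k else 0


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for truth, predicted in zip(y_true, y_pred):
        matrix[truth][predicted] += 1
    return matrix


def accuracy(matrix):
    """Fraction of predictions on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = matrix[0][0] + matrix[1][1]
    return correct / total if total else 0.0


rows = make_synthetic_rows()
train, test = train_test_split(rows)
y_true = [label for _, label in test]
y_pred = [knn_predict(train, features) for features, _ in test]
cm = confusion_matrix(y_true, y_pred)
print("confusion matrix [[TN, FP], [FN, TP]]:", cm)
print("accuracy:", accuracy(cm))
```

Repeating the last block once per model and collecting the accuracy values would give the kind of side-by-side comparison the abstract describes. For imbalanced stroke data, the off-diagonal cells (especially false negatives) are usually more informative than accuracy alone.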

Published

2026-04-16

How to Cite

Syed Althaf, Mohammed Ammar, Mohammed Abdul Rab Rayyan, Mohammed Abdul Bari Ayaan, & Abdullah Saleh Hasan Abdul Hak. (2026). Machine Learning-Driven Stroke Prediction with Explainable Models and Real-Time Analysis. International Journal of Data Science and IoT Management System, 5(1), 987-994. https://doi.org/10.64751/