From Data to Diagnosis: Effective Machine Learning-based Heart Disease Prediction

Authors

  • Tianyi Lu

DOI:

https://doi.org/10.61173/7fqjnk03

Keywords:

Machine learning, heart disease, artificial intelligence

Abstract

Heart disease is a leading cause of mortality worldwide, contributing to nearly 18 million deaths annually. Early detection is critical but remains challenging because traditional diagnostic methods can be prone to human error. This study aims to improve heart disease prediction with machine learning (ML) by comparing three models: Support Vector Machine (SVM), Random Forest (RF), and XGBoost. The dataset contains 12 features from 918 patients; preprocessing included one-hot encoding of categorical variables and MinMax scaling of numerical features. The models were trained and evaluated using 5-fold cross-validation to ensure robustness. Random Forest achieved the highest accuracy at 82.78%, followed closely by SVM (82.67%) and XGBoost (81.58%). Feature importance analysis identified ST_Slope as the most significant predictor of heart disease. Although Random Forest outperformed the other models, the study also highlights the need for better interpretability in ML models, especially in medical applications where understanding the relationships between features is crucial. Future research should focus on improving model transparency to bridge the gap between accuracy and practical application in clinical settings.
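The preprocessing and validation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the author's code: the helper names (`min_max_scale`, `one_hot`, `k_fold_indices`), the toy age values, and the ST_Slope category labels are assumptions made for illustration, and the models themselves (SVM, RF, XGBoost) are left out. Only the dataset size of 918 and the choice of 5 folds come from the abstract.

```python
import random


def min_max_scale(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Return the sorted category list and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Toy values, assumed for illustration only (not taken from the study's data).
ages = [40, 49, 37, 54, 60]
slopes = ["Up", "Flat", "Up", "Down", "Flat"]  # hypothetical ST_Slope labels

scaled_ages = min_max_scale(ages)
slope_categories, slope_onehot = one_hot(slopes)

# 918 patients and 5 folds, as stated in the abstract.
splits = list(k_fold_indices(918, k=5))
```

In a full pipeline, each of the five splits would train a model on the `train` indices and score it on the `test` indices, with the reported accuracy being the mean over folds. The scaling minimum and maximum would typically be computed on the training fold only, so that no information from the held-out fold leaks into preprocessing.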

Published

2024-10-29

Section

Articles