HEARTSMART: IMPROVED CVD RISK PREDICTION VIA RECURSIVE FEATURE ELIMINATION: VALIDATION ON EXTENDED DATASET

Authors

  • Waqas Tariq Paracha
  • Haleema Inam
  • Maliha Manzoor

Keywords:

HEARTSMART, IMPROVED CVD RISK PREDICTION, RECURSIVE FEATURE ELIMINATION, VALIDATION ON EXTENDED DATASET

Abstract

Heart disease remains among the leading causes of mortality worldwide and poses serious challenges for healthcare systems. Early detection and accurate risk prediction are vital for reducing death rates and enabling timely medical intervention. This study investigates a machine learning-based framework that improves heart disease prediction using Recursive Feature Elimination (RFE), a feature selection technique that iteratively removes the least significant features to improve model performance and reduce computational cost. The research initially used a real-world dataset of 70,000 patient records sourced from Kaggle [1]. To strengthen the analysis, the dataset was expanded to 100,000 samples using the Synthetic Minority Oversampling Technique (SMOTE) in Python, giving a more balanced and enriched data representation. Several machine learning algorithms were then applied, namely Random Forest, Decision Tree, Naïve Bayes, K-Nearest Neighbor (KNN), and XGBoost, to evaluate their predictive capabilities. Among these, the Random Forest classifier again performed best, achieving an accuracy of 99.55% and an AUC of 1.00 on the augmented dataset, a minor improvement over its performance on the original dataset. The findings confirm the effectiveness of RFE in isolating the most relevant features, which improves interpretability and model efficiency: by removing redundant or irrelevant features, RFE keeps the model focused on the most critical indicators of heart disease risk. This research contributes a predictive framework that can assist healthcare professionals in making more informed clinical decisions. With an accurate and efficient model, early detection and proactive treatment planning become more feasible, ultimately improving patient outcomes and reducing the global burden of heart disease through the integration of machine learning in medical diagnostics.
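
The feature selection step described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. It is not the authors' pipeline: the data are synthetic stand-ins generated with make_classification (not the Kaggle records), the SMOTE expansion is omitted, and the feature count, the number of retained features, and the forest sizes are arbitrary values chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: NOT the dataset used in the study).
X, y = make_classification(
    n_samples=2000, n_features=11, n_informative=5, n_redundant=2,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE repeatedly fits the estimator and drops the least important feature
# (step=1) until n_features_to_select remain. The value 6 is illustrative.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=6,
    step=1,
)
selector.fit(X_train, y_train)

# Refit a classifier using only the retained features.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(selector.transform(X_train), y_train)

# Evaluate on held-out data restricted to the same features.
X_test_sel = selector.transform(X_test)
accuracy = accuracy_score(y_test, model.predict(X_test_sel))
auc = roc_auc_score(y_test, model.predict_proba(X_test_sel)[:, 1])

print("kept feature mask:", selector.support_)
print(f"accuracy={accuracy:.3f}  auc={auc:.3f}")
```

The boolean mask in selector.support_ marks which columns survived elimination, and selector.ranking_ gives the order in which the others were dropped. The scores printed here describe only the synthetic data and say nothing about the figures reported in the abstract.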

Published

2025-06-30

How to Cite

Waqas Tariq Paracha, Haleema Inam, & Maliha Manzoor. (2025). HEARTSMART: IMPROVED CVD RISK PREDICTION VIA RECURSIVE FEATURE ELIMINATION: VALIDATION ON EXTENDED DATASET. Spectrum of Engineering Sciences, 3(6), 1093–1120. Retrieved from https://sesjournal.com/index.php/1/article/view/551