Predicting the Risk of Heart Failure Based on Clinical Data

Jasminder Kaur Sandhu; Umesh Kumar Lilhore; Poongodi M; Navpreet Kaur; Shahab S. Band; Mounir Hamdi; Celestine Iwendi; Sarita Simaiya; M.M. Kamruzzaman; Amirhosein Mosavi
In: Human-centric Computing and Information Sciences (HCIS), Vol. 12, Pages 1322-1355, Korea Information Processing Society (KIPS-CSWRG), 12/2022.


Cardiovascular disease (CVD) is a disorder that directly affects the heart and the blood vessels. According to World Health Organization reports, CVDs are the leading cause of mortality worldwide, claiming nearly 23.6 million lives annually. CVD includes coronary heart disease, stroke, transient ischemic attack (TIA), peripheral arterial disease, and aortic disease. Most CVD fatalities are caused by strokes and heart attacks, and an estimated one-third of these deaths occur before the age of 60. The New York Heart Association (NYHA) categorizes the stages of heart failure as Class I (no symptoms), Class II (mild symptoms), Class III (comfortable only at rest), Class IV (severe condition or patient is bed-bound), and Class V (unable to determine the class). Machine learning-based methods play an essential role in clinical data analysis. This research assesses the importance of various attributes related to heart disease using a hybrid machine learning model. The proposed hybrid model, SVM-GA, combines a support vector machine (SVM) with a genetic algorithm (GA). The study analyzes a dataset available online from the UCI Machine Learning Repository, containing the medical records of 299 patients who suffered heart failure and were classified as NYHA Class III or IV. The dataset was collected during the patients' follow-up and checkup periods and comprises thirteen clinical features. The machine learning models were used to calculate feature importance. The proposed model and existing well-known machine learning models, namely the Bayesian generalized linear model, ANN, Bagged CART, Bag Earth, and SVM, were implemented in Python, and several performance measures were calculated: accuracy, processing time, precision, recall, and F-measure.
Experimental analysis shows that the proposed SVM-GA model outperforms the existing methods in accuracy, processing time, precision, recall, and F-measure.
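
To make the reported performance measures concrete, the following minimal Python sketch computes accuracy, precision, recall, and F-measure from binary predictions. It is an illustration only, not the authors' code: the function name `classification_metrics` and the sample labels are hypothetical, and the F-measure is assumed to be the F1 score (the harmonic mean of precision and recall), which the abstract does not specify.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the event of interest (assumed here to be label 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels, not data from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Processing time, the remaining measure named in the abstract, could be obtained by wrapping model training and prediction in calls to `time.perf_counter()`.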