Heart disease detection using machine learning methods: a comprehensive narrative review
Review Article

Mohammadreza Hajiarbabi

Computer Science Department, Purdue University Fort Wayne, Fort Wayne, IN, USA

Correspondence to: Mohammadreza Hajiarbabi, PhD in Computer Science. Assistant Professor, Computer Science Department, Purdue University Fort Wayne, 2101 E Coliseum Blvd, Fort Wayne, IN 46805, USA. Email: hajiarbm@pfw.edu.

Background and Objective: Heart disease is responsible for the deaths of millions of people worldwide each year and is one of the main diseases affecting middle-aged and elderly people. The rising number of cases, the high mortality rate, and the cost of medical treatment make early diagnosis essential. The aim of this study is to conduct an extensive review of state-of-the-art methods in heart disease detection and to perform a comparative analysis of their outcomes.

Methods: Given the seriousness of these issues, researchers have developed various methods for detecting heart diseases. This paper presents a systematic and detailed review of methods used for heart disease detection. Research papers published in well-known journals relevant to the topic of heart disease diagnosis were analyzed. Research findings are presented in a table for better understanding.

Key Content and Findings: Based on the search conducted, research in heart disease detection can be divided into three main categories: heart disease detection based on standard clinical information, heart disease detection based on electrocardiogram (ECG) and phonocardiogram (PCG), and heart disease detection based on X-ray images. The state-of-the-art methods in each field are discussed.

Conclusions: For heart disease detection based on standard clinical information, extreme gradient boosting machine (XGBoost), Random Forest, Ensemble Learning, and Neural Network methods were found to perform better than other classic machine learning methods. Applying dimensionality reduction techniques to the data, such as principal component analysis (PCA) and feature selection, can produce better results. For heart disease detection based on ECG signals, convolutional neural networks still appear to be among the best methods.

Keywords: Heart disease detection; machine learning; electrocardiogram signals (ECG signals); phonocardiogram signals (PCG signals); X-ray

Received: 06 November 2023; Accepted: 26 March 2024; Published online: 26 June 2024.

doi: 10.21037/jmai-23-152


Heart disease is one of the major causes of human deaths. According to the World Health Organization (WHO), around one-third of all deaths worldwide, approximately 18 million people, are related to heart disease (1). Factors such as alcohol and tobacco use, poor diets, and insufficient exercise increase the likelihood of developing heart disease, which can manifest symptoms such as obesity and high blood pressure (2). Since these symptoms can overlap with those of other diseases, obtaining an accurate diagnosis is crucial to reduce life-threatening risks. Machine learning techniques offer a promising avenue for predicting and diagnosing heart disease.

Various methods exist for diagnosing heart problems. In this review, we focus on three main research areas for heart disease detection: detection based on standard clinical information, detection based on electrocardiogram (ECG) signals, and detection based on X-ray images. Other types of heart disease detection are not covered in this review paper.

In recent years, significant research has been conducted on this topic. It is essential to gather and analyze this research and to classify and summarize the results. To create a valuable review of heart disease detection techniques using machine learning methods, our search focused on publications in journals with a high impact factor (above 3.0), well-known conferences, or highly cited papers (at least 20 citations). This paper is divided into four main sections. Section 1 is the introduction. Section 2 describes the methods of the search, including the search criteria and information sources. Section 3 contains the evaluation of the research papers and a detailed survey of heart disease detection methods. Finally, Section 4 presents a summary of the study and the conclusion. The author presents this article in accordance with the Narrative Review reporting checklist (available at https://jmai.amegroups.com/article/view/10.21037/jmai-23-152/rc).


Research methodology

The primary objective of this literature review was to identify the most effective methods for detecting heart disease using machine learning techniques. Unlike other review papers that often focus on a single aspect of heart disease, this review paper addresses two main questions:

  • What are the major machine learning techniques used for heart disease detection?
  • What are the main characteristics of datasets available for heart disease research?

By addressing these questions, this review aims to provide a comprehensive understanding of the various machine learning approaches and datasets utilized in heart disease detection research.

Search strategy

Applying a well-planned search is crucial to gather data in the desired field. Various sources such as conference papers, journal papers, case studies, etc., were examined. Additionally, websites containing relevant keywords such as “heart disease” and “machine learning methods for heart disease” were searched. Table 1 presents the search terms utilized in the search process, while Table 2 provides a summary of the search strategy.

Table 1

Search terms that were used in the searching process

Search terms
     “Heart Disease”
     “Deep Learning”
     “Convolutional Neural Network”
     “Cardiovascular Disease”
     “Machine Learning”
     “Scar segmentation”
     “Heart segmentation”
     “Atrial segmentation”
     “Vision transformers”
     “gradient boosting”
     “Random Forest”
     “Decision Trees”
     “Support Vector Machine”
     “Naive Bayes”

Table 2

The search strategy summary

Items Specification
Date of search Jan 8th 2023 (first search), Jan 1st 2024 (second search)
Databases and other sources searched IEEE, ACM, Springer, and Google Scholar
Search terms used Refer to Table 1 in the paper
Timeframe Jan 1st 2015–Jan 1st 2024
Inclusion criteria Only papers written in English and published between the years 2015 and 2024 were considered for inclusion
Selection process All work was done by the author

Resources of search

The search was conducted across databases and publishers including IEEE, ACM, Springer, and Google Scholar. In the initial stage, every paper that exhibited the specified characteristics was selected. Subsequently, approximately 100 papers were chosen for analysis from the pool of relevant papers identified during the search. Only papers written in English and published between the years 2015 and 2024 were considered for inclusion. Each of these papers underwent a thorough evaluation and analysis.

Selection and evaluation procedure

In the subsequent stage, the selected papers underwent further evaluation. Among papers with similar content, those that provided more detailed discussions of the methods and reported more extensive experiments were prioritized. Each paper was read carefully, and the most significant information was extracted. As previously mentioned, the primary emphasis was placed on specified fields related to heart disease detection, including data based on standard clinical information, ECG signals, and chest X-rays.

Heart disease detection review

Heart disease detection based on data from standard clinical information

Das et al. conducted research on heart disease detection utilizing various methods (3). The employed methods included extreme gradient boosting machine (XGBoost), Bagging, Random Forest, Decision Tree, K-nearest neighbor, and Naive Bayes. They utilized the “Key Indicators of Heart Disease” dataset available on Kaggle (4), which comprised 319,795 cases and over 300 features, of which 18 were selected for analysis. Among these features, nine were Boolean, five were string, and four were decimal variables. Additionally, several data processing techniques were applied, such as data cleaning, removal of duplicates, and conversion of categorical variables. An 80–20% split was employed for training and testing, respectively. Various evaluation metrics including accuracy, sensitivity, precision, F1-score, and area under the curve (AUC) were utilized for comparing the results.
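
The evaluation protocol described above (a hold-out split followed by accuracy, sensitivity, precision, and F1-score) can be illustrated with a minimal sketch. The function names, the random seed, and the toy labels below are assumptions made for illustration; they are not taken from the reviewed paper.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), precision and F1 for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "precision": precision, "f1": f1}


# Toy labels (hypothetical): 3 true positives, 3 true negatives, 1 FP, 1 FN.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# An 80-20 split of ten hypothetical record ids.
train_ids, test_ids = train_test_split(range(10), test_fraction=0.2)
```

On a heavily imbalanced dataset, accuracy alone can look high even when few positive cases are found, which is one reason the reviewed studies report several metrics side by side.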

Gangadhar et al. utilized the Cleveland dataset, which comprises 76 features, although only 14 features were employed in their study (5). Their research considered factors such as age, gender, chest pain, cholesterol, and resting blood pressure. The methods employed in their study included artificial neural network (ANN), support vector machine (SVM), Random Forest, Decision Tree, and K-nearest neighbors (KNN). The highest accuracy result, reaching 84.44%, was achieved using the Neural Network method.

Khader Basha et al. employed hybrid machine learning algorithms, specifically Decision Trees and the Adaptive Boosting (AdaBoost) algorithm (6). They utilized the Framingham Heart Laboratory dataset, a subset of the Framingham Heart Study (FHS) dataset. The Framingham study is a long-term investigation focusing on genetics and environmental variables contributing to cardiovascular disease (CVD) in both men and women. The dataset comprises 16 features, with 70% of the data allocated for training and the remaining portion for testing.

Jahed et al. employed stochastic gradient descent (SGD), Decision Trees, and Random Forest methods (7). These methods were applied to the Kaggle dataset (4). Numerous experiments were conducted based on splitting the data by sex and race, revealing increased accuracy when stratifying the data by these variables.

Chopra et al. utilized Random Forest, K-nearest neighbors, Decision Trees, Logistic Regression, Naive Bayes, and Ensemble Learning (8). Principal component analysis (PCA) was also employed to reduce the dimensionality of the data. The study compared results obtained with and without applying PCA, demonstrating that employing PCA enhances the detection rate. The Cleveland dataset, containing 303 instances and 14 features, was used in their research.

Gola et al. employed the Satin Bowerbird Optimization (SBO) algorithm (9), which selects the most important features, followed by the utilization of a deep learning model for classification. Their preprocessing approach involved handling missing data, normalization, feature selection, and feature weighting using a modified Kalman filter, backpropagation, min-max normalization, and the SBO algorithm (10). The results demonstrated the superiority of this method over other approaches (9).

Shail et al. used machine learning methods similar to those of other researchers (11), including Random Forest, Decision Tree, KNN, and Logistic Regression. A notable aspect of their study was that they applied these methods to several datasets and reported accuracy and error metrics for each. The datasets used included the University of California Irvine (UCI) heart disease dataset, Framingham dataset, Cleveland dataset, and cardiovascular disease dataset. Logistic regression exhibited superior performance compared to other methods in their analysis.

Shukla et al. focused primarily on preprocessing in their work (12). Their preprocessing phase encompassed handling missing values, feature selection, feature scaling, and class balancing (13,14). To address missing data, they employed Multivariate Imputation by Chained Equations (MICE). Subsequently, they utilized a hybrid approach combining a genetic algorithm with recursive feature elimination (GARFE) for feature selection. Standard scaling was applied to modify features to have a zero mean and unit standard deviation. They employed the Synthetic Minority Oversampling Technique (SMOTE) to evenly distribute samples (15). The methods used for classification included Naive Bayes, SVM, Logistic Regression, Random Forest, and AdaBoost. Logistic Regression and Random Forest exhibited the highest accuracy, while SVM had the lowest. However, details regarding the kernel and other parameters used for SVM were not provided in the paper.
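
The standard scaling step mentioned above can be sketched as follows. This is a minimal illustration for a single feature column, assuming the population standard deviation is used; the function name and the sample values are hypothetical.

```python
import math


def standard_scale(values):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical resting blood pressure readings.
scaled = standard_scale([120, 130, 140, 150, 160])
```

In practice the mean and standard deviation would be estimated on the training split only and then reused for the test split, so that no information from the test data leaks into preprocessing.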

Sen et al. implemented a soft voting ensemble approach utilizing various methods (16). In soft voting, the average of all probabilities generated by different methods is computed, and the class label is assigned based on the highest average probability. Gaussian Naive Bayes, CatBoost, LightGBM, XGBoost, Random Forest, and Multilayer Perceptron models were employed for soft voting. They created a dataset by combining the UCI heart disease dataset and the UCI Statlog dataset. The soft voting ensemble effectively reduced false negative and false positive values. In the context of heart disease, minimizing false negatives is preferred. The soft voting ensemble model demonstrated a reduction in false negatives compared to other machine learning methods. Standard scaling was utilized to normalize the data, and hyperparameter tuning was employed to enhance the performance of all methods.
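
Soft voting as described above reduces to averaging per-class probabilities and taking the class with the largest average. The sketch below assumes each model outputs a probability for every class; the function name and the example probabilities are hypothetical.

```python
def soft_vote(probabilities):
    """Average per-class probabilities over models and return (label, averages).

    `probabilities` holds one list per model: [p_class0, p_class1, ...].
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averages = [sum(model[c] for model in probabilities) / n_models
                for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: averages[c])
    return label, averages


# Two models lean weakly towards "no disease" (class 0), while one is
# confident about "disease" (class 1).
label, averages = soft_vote([[0.55, 0.45], [0.55, 0.45], [0.10, 0.90]])
```

In this toy case a simple majority (hard) vote would pick class 0, whereas soft voting picks class 1 because the confident model outweighs the two uncertain ones.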

Jain et al. utilized the cardiovascular disease dataset created by Svetlana Ulianova (17), containing 12 features and one target variable. They employed feature selection techniques (18), including recursive feature elimination and tree-based feature selection. Recursive feature elimination involves iteratively eliminating features based on their coefficient attributes, removing the least significant features from the feature set. Tree-based feature selection utilized the Extra-Tree-Classifier, which calculates feature importance using the Gini index. Features located closer to the root node are considered more important and retained. The methods employed for classification were Random Forest, SVM, KNN, and Neural Networks. Among these methods, Neural Networks demonstrated superior results.
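
The Gini index underlying tree-based feature importance can be illustrated with a short sketch. It computes the impurity of a set of labels and the impurity decrease produced by one candidate split; tree-based importance scores aggregate such decreases over all splits that use a feature. The function names and the toy labels are assumptions, not the exact procedure of the reviewed paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Reduction in Gini impurity obtained by splitting `parent` into two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


# A hypothetical feature threshold that separates the two classes perfectly.
parent = [0, 0, 1, 1]
decrease = impurity_decrease(parent, left=[0, 0], right=[1, 1])
```

A feature whose splits produce large decreases tends to be chosen near the root, which is consistent with the observation that features closer to the root are treated as more important.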

Varshini et al. utilized PCA and Relief Feature Selection prior to applying machine learning methods (19). PCA was employed to identify patterns in the data, while Relief Feature Selection was used to select the most important features. Random Forest yielded superior results compared to other methods. The dataset used was the heart disease dataset from IEEE data, comprising 11 clinical characteristics with 1,190 instances. Some characteristics included age, sex, and resting blood pressure. Results were reported both before and after transformation.

Mahmud et al. applied various machine learning methods to a Kaggle dataset (20). Additionally, they employed different ensembling methods such as Bagging, Voting, and Stacking. For stacking, SVM, Decision Tree, and Random Forest with hyperparameter tuning were used as base classifiers, and XGBoost served as the meta-classifier. Their proposed methods demonstrated better results compared to other techniques.

Ramesh et al. conducted a comprehensive data analysis on the Kaggle dataset (21). Results indicated that 55% of the samples had heart disease, with the remaining 45% not having heart disease. The proportion of males to females was approximately 1:0.27, indicating a higher number of males. The average cholesterol value was 198, and most resting ECGs showed normal values. Outliers were observed in cholesterol and resting blood pressure features. Correlation analysis revealed a high positive correlation between exercise angina and Oldpeak, while MaxHR and ST_Slope exhibited a high negative correlation. Gradient Boost and Random Forest yielded superior results. Feature importance analysis using Gradient Boost and Random Forest highlighted ST_Slope, ChestPainType, MaxHR, and Oldpeak as the most important features.

Prasanna et al. applied reinforcement learning (RL) to the Cleveland dataset (22), considering three features: trestbps, chol, and age. Their proposed method, based on Q-Learning, outperformed other machine learning methods. Q-Learning estimates a Q-value for each state-action pair; the agent receives a positive or negative reward for each action it chooses and updates its estimates accordingly.
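
The tabular Q-Learning update can be sketched as below. The state and action names, the reward of 1.0 for a correct prediction, and the learning rate and discount factor are all assumptions made for illustration; the reviewed paper's exact formulation of states and rewards is not reproduced here.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One Q-Learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: a state is a discretised patient record and an action
# is the predicted class.
actions = ["predict_disease", "predict_healthy"]
q_table = defaultdict(float)  # unseen state-action pairs start at 0.0
new_value = q_update(q_table, "patient_0", "predict_disease",
                     reward=1.0, next_state="patient_1", actions=actions)
```

Repeating this update over many records moves each Q-value towards the expected reward of its action, so actions that are rewarded more often end up with higher values.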

Abdellatif et al. employed SMOTE to address class imbalance (23). They utilized six different machine learning classifiers to detect patient status, alongside hyper-parameter optimization (HPO) for tuning classifier hyperparameters with SMOTE. The hyperband method (HB) was used to determine the best SMOTE hyperparameters. Their proposed method, Extra Trees and SMOTE optimized with hyperband, demonstrated superior performance compared to other approaches. The workflow involved data cleaning and preprocessing, removal of missing data, and min-max normalization. Subsequently, the data was split into training and test sets, followed by applying SMOTE to balance the data distribution. Next, machine learning models were built and evaluated, with hyperparameters updated if the model did not reach optimal performance. The SMOTE hyperparameters were also adjusted in this step. Finally, the model was evaluated and compared with other models, adhering to stopping criteria for further iterations if necessary.
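
Two steps of this workflow, min-max normalization and SMOTE oversampling, can be sketched in a few lines. The code below is a simplified illustration of the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) and is not the implementation used in the reviewed paper; the function names, the value of k, and the sample points are assumptions.

```python
import random


def min_max_scale(values):
    """Map one feature column to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority sample by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the segment from base to neighbour
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]


# Hypothetical minority-class samples with two already normalised features.
minority = [[0.10, 0.20], [0.15, 0.25], [0.30, 0.10], [0.20, 0.40]]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Applying oversampling after the train-test split, as in the workflow above, keeps synthetic points out of the test set, so the reported metrics reflect only real patient records.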

Mohan et al. proposed a hybrid random forest and linear method (HRFLM) (24). HRFLM comprises four algorithms. The first algorithm involves a Decision-Tree-based partition. The second algorithm applies a machine learning algorithm to minimize the error rate. The third algorithm focuses on feature extraction using the classifier with minimal errors. Finally, the fourth part applies the classifier on the extracted features.

Ali et al. introduced an expert system for heart disease detection that stacks two SVM models (25). The first SVM model is a linear model regularized using the L1 regularization method, which eliminates irrelevant features by setting their coefficients to zero. The second SVM model utilizes L2 regularization and serves as a predictive model. They proposed a hybrid grid search algorithm (HGSA) to simultaneously optimize the two SVM models. Six evaluation metrics, including accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), receiver operating characteristic (ROC) curve, and AUC, were used to assess the effectiveness of HGSA. The proposed method demonstrated superior results compared to other machine learning methods. The Cleveland dataset was utilized, with 70% of the samples used for training and the remaining for testing.
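
Of the metrics listed above, the Matthews correlation coefficient is the least familiar, so a short sketch of its standard definition may help. The function name and the confusion-matrix counts below are illustrative assumptions.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventionally reported as 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy confusion matrix: 3 true positives, 3 true negatives, 1 FP, 1 FN.
mcc = matthews_corrcoef(tp=3, tn=3, fp=1, fn=1)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it more informative than accuracy when the two classes are of unequal size.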

Tama et al. devised a heart disease detection system comprising three phases (26): feature selection, classifier modeling, and validation analysis. In the initial phase, they sought the best features for heart disease detection using the correlation-based feature selection (CFS) method, optimized through particle swarm optimization (PSO). Subsequently, a two-tier ensemble was established in the second phase, comprising XGBoost, gradient boosting machine (GBM), and Random Forest models. By stacking these ensembles, a final prediction was generated. The third phase involved evaluating the performance of the two-tiered ensemble across different datasets, including Z-Alizadeh, Statlog, Cleveland, and Hungarian datasets.

Fitriyani et al. developed a decision support system (DSS) for early-stage heart disease diagnosis (27). The algorithm employed for heart disease prediction was XGBoost.

Pandya et al. (28) conducted a study on heartbeat acoustic events using the Cardiac-200 and PhysioNet datasets. The PhysioNet dataset comprises 1,500 heartbeat acoustic samples without augmentation and 1,950 samples with augmentation, including events such as murmur, normal, and artifact. Their research involved analyzing various heartbeat acoustic events using different audio processing libraries to extract information from recorded heartbeat sound signals. They classified acoustic images using long short-term memory (LSTM), convolutional neural network (CNN), recurrent neural network (RNN), K-means clustering, and SVM methods. Their InfusedHeart method demonstrated superior performance compared to all other methods.

Table 3 provides a summary of the methods, datasets used, and results mentioned in this section.

Table 3

Analysis of heart disease detection based on data from standard clinical information

Reference Classifier and training algorithm Dataset description Results
(3) XGBoost, Bagging, Random Forest, Decision Tree, K-nearest neighbor, and Naive Bayes Key Indicators of Heart Disease Accuracy: 91.30%, 90.10%, 90.20%, 86.32%, 91%, 91%
Sensitivity: 92%, 92%, 92.25%, 93%, 91.75%, 91.31%
Precision: 99%, 97.30%, 97.44%, 92.18%, 99%, 99%
F1-score: 95.40%, 94.72%, 94.78%, 92.50%, 95.26%, 95%
AUC: 0.83, 0.73, 0.78, 0.58, 0.70, 0.64
(5) ANN, SVM, Random Forest, Decision Tree, KNN Cleveland dataset Accuracy: 88.44%, 83.33%, 81.67%, 73.33%, 61.67%
(6) Decision Trees, Ada-boost Framingham Heart Laboratory True positive rate: 66.2%, 95.67%
Specificity: 80%, 94.65%
Accuracy: 82.64%, 97.43%
(7) SGD, Decision Tree, Random Forest Key Indicators of Heart Disease Accuracy: 76%, 95%, 96%
(8) Random Forest, K-nearest neighbors, Decision Trees, Logistic Regression, Naive Bayes and Ensemble Learning (after applying PCA) Cleveland dataset Accuracy: 77.04%, 83.06%, 78.16%, 80.32%, 85.24%, 86.88%
Precision: 79.41%, 83.33%, 77.14%, 82.35%, 82.05%, 84.21%
Recall: 79.41%, 88.23%, 79.41%, 82.35%, 94.11%, 94.11%
F1-score: 79.42%, 85.14%, 78.62%, 82.35%, 87.67%, 88.88%
(9) Satin Bowerbird optimization Cleveland and Hungarian dataset Accuracy: 90%
Precision: 94%
Recall: 91.3%
F1-score: 92.6%
(11) Logistic Regression, KNN, Random Forest, Decision Tree University of California Irvine Dataset Repository, Cleveland, cardiovascular disease, Framingham datasets Accuracy: 64.39%, 58.9%, 72.74%, 73.04%
Precision: 84.59%, 80.97%, 83%, 82.63%
Recall: 93.55%, 93.01%, 90%, 74%
F1-score: 83.33%, 90%, 83.33%, 76.67%
(12) Naive Bayes, SVM, Logistic Regression, Random Forest and AdaBoost Cleveland dataset Accuracy: 84.81%, 79.49%, 83.79%, 83.79%, 82.09%
Sensitivity: 80.47%, 74.79%, 79.09%, 77.71%, 79.87%
Specificity: 88.51%, 83.49%, 87.79%, 89.01%, 84.11%
Precision: 85.51%, 79.39%, 84.59%, 85.69%, 81.05%
F-measure: 82.89%, 77.11%, 81.81%, 81.49%, 80.39%
(16) Decision Tree, Logistic Regression, Support Vector Machine, Gaussian Naive Bayes, CatBoost, LGBM, Random Forest, XGBoost, Soft Voting (Gaussian Naive Bayes, CatBoost, LGBM, XGBoost, Random Forest, and Multilayer Perceptron) Combination of University of California Irvine Heart Disease dataset and the University of California Irvine Statlog dataset Accuracy: 80.98%, 86.41%, 86.41%, 88.04%, 90.22%, 88.59%, 88.59%, 90.22%, 91.85%
F1-score: 78.63%, 85.32%, 85.32%, 89.22%, 91.18%, 87.85%, 88.57%, 90.38%, 91.43%
Precision: 90.20%, 91.18%, 91.18%, 89.22%, 91.18%, 92.16%, 91.18%, 92.16%, 94.12%
Recall: 84.02%, 88.15%, 88.15%, 89.22%, 91.18%, 89.95%, 89.86%, 91.26%, 92.75%
AUC: 0.8677, 0.8951, 0.8883, 0.9241, 0.9313, 0.9264, 0.9293, 0.9339, 0.9344
(17) SVM, KNN, Random Forest, and Neural networks Cardiovascular disease dataset Accuracy using recursive feature elimination: 74.59%, 68.05%, 72.66%, 80.89%
Accuracy using Extra Tree: 73.84%, 67.85%, 72.83%, 78.52%
(19) KNN, Decision Tree, SVM, Random Forest, Neural Network, Naive Bayes, Logistic Regression Heart disease dataset from the IEEE data port Accuracy (after data transformation): 85%, 86%, 80%, 90%, 87%, 80%, 83%
(20) SVM, KNN, Logistic Regression, Random Forest, Decision Tree, XGBoost, Stacking, Voting, Bagging XGBoost, Ensemble Hybrid voting Kaggle dataset Accuracy: 83%, 82%, 82%, 81%, 83%, 84%, 83.97%, 83.71%, 84%, 84.036%
Precision: 86%, 84%, 84%, 83%, 88%, 86%, 86%, 87%, 87%, 87%
Recall: 83%, 82%, 82%, 81%, 83%, 84%, 84%, 84%, 84%, 84%
F1-score: 84%, 83%, 83%, 82%, 83%, 85%, 84%, 84%, 85%, 85%
(21) Logistic regression, SVM, Random Forest, Decision Tree, XGBoost, Naive Bayes, Gradient Boosting Classifier, KNN, AdaBoost Kaggle dataset Accuracy: 83%, 84%, 87%, 79%, 86%, 86%, 88%, 70%, 85%
Precision: 83%, 84%, 87%, 78%, 86%, 86%, 88%, 69%, 85%
Recall: 83%, 83%, 87%, 79%, 86%, 85%, 87%, 69%, 85%
F1-score: 83%, 83%, 87%, 79%, 86%, 86%, 88%, 69%, 84%
(22) Q-Learning, Decision Tree, KNN Cleveland dataset Accuracy: 87.98%, 77.15%, 75.24%
Precision: 99.21%, 69.71%, 65.42%
Recall: 58.32%, 56.33%, 55.65%
F1-score: 0.7654, 1.1212, 1.3211
AUC: 0.8065, 0.7153, 0.7697
(23) HB + SMOTE + Extra Trees Cleveland and Statlog datasets collected from the University of California Irvine Accuracy*: 99.2%, 98.52% (*The first number is for Cleveland dataset and the second number is for Statlog dataset)
Precision: 98.7%, 98.13%
Recall: 99.33%, 98.09%
Specificity: 99.12%, 98.72%
(24) HRFLM Cleveland dataset Accuracy: 88.4%
Precision: 90.1%
Recall: 92.8%
F-measure: 90%
Specificity: 82.6%
(25) HGSA Cleveland dataset Accuracy*: 91.11%, 92.22% (*The first one is the Simulation results of L1-linear SVM model stacked with L2 linear SVM model. The second one is the simulation results of L1-linear SVM model cascaded with L2 SVM model with RBF kernel)
Recall: 87.80%, 82.92%
Specificity: 93.87%, 100%
MCC: 0.820, 0.851
(26) Two-tier ensemble PSO-based feature selection Z-Alizadeh, Statlog, Cleveland and Hungarian datasets Accuracy: 98.13%, 93.55%, 86.49%, 91.18%
F1-score: 96.60%, 91.67%, 86.49%, 90.91%
AUC: 98.70%, 93.42%, 85.86%, 92.98%
(27) HDPM Statlog, Cleveland datasets Accuracy: 95.90%, 98.40%
Precision: 97.14%, 98.57%
Recall: 94.67%, 98.33%
F1-score: 95.35%, 98.32%
MCC: 92%, 97%
(28) InfusedHeart Framework Cardiac-200 and PhysioNet Accuracy: 89.36%

XGBoost, extreme gradient boosting machine; ANN, artificial neural network; SVM, support vector machine; KNN, K-nearest neighbors; SGD, stochastic gradient descent; PCA, principal component analysis; AdaBoost, Adaptive Boosting; LGBM, light gradient-boosting machine; HB, hyperband method; SMOTE, Synthetic Minority Oversampling Technique; HRFLM, hybrid random forest and linear method; HGSA, hybrid grid search algorithm; MCC, Matthews correlation coefficient; PSO, particle swarm optimization; HDPM, heart disease prediction model.

Heart disease detection based on ECG signals

Another area of research in heart disease detection uses heart signals such as ECG and PCG recordings. The papers in this area are reviewed below.

Fernando et al. proposed a novel method for segmenting phonocardiogram (PCG) signals into heart states (29). RNN and attention-based learning were used to segment the PCG signal. Their method demonstrated state-of-the-art performance on both human and animal recordings. Different features such as envelope features, wavelets, and Mel-frequency cepstral coefficients (MFCC) were analyzed. Quantitative measurements were used to show which features were more important.

Considerable work has been done on ECG monitoring and analysis systems. Meng et al. (30) used the IREALCARE2.0 Flexible Cardiac Monitor Patch to collect ECG signals and create ECG datasets. For the automatic classification and analysis of ECG signals, a deep CNN method called time-spatial convolutional neural network (TSCNN) was used. In the first stage, the original ECG signals were divided into separate heartbeats and fed to the TSCNN. Second, the features for each heartbeat were extracted. In the final stage, cascaded small-scale kernel convolution was used to improve classification performance and reduce the number of network parameters. Methods such as dropout and batch normalization were employed to reduce overfitting.

Venkataramanaiah et al. developed a virtual environment named the VH-doctor machine to offer guidance on heart diseases (31). This virtual environment can detect heart problems in their early stages without requiring the presence of a physician. Utilizing biomedical sensors, an ARM processor, and a field-programmable gate array (FPGA), the system is capable of detecting, testing, analyzing, and reporting the normality or abnormality of cases. In their study, ECG signal processing, feature extraction, and KNN were employed. A comparison with other state-of-the-art methods demonstrated that they achieved superior results.

Zhang et al. designed an automated system using deep learning to diagnose heart disease (32). The training and testing sets consisted of 259,789 and 18,018 ECG signals, respectively, collected from a hospital. The dataset covered more than 90% of clinical diagnoses and included 18 different classes, with 17 of them representing abnormalities and one representing normal ECGs.

Kumar et al. designed a model named Fuzz-ClusNet to detect arrhythmia from ECG signals (33). Their model was a hybrid model combining deep learning and fuzzy clustering. The algorithm had five phases. In the first phase, the ECG signals were denoised. Then, the ECG signals were segmented. The third phase involved data augmentation, where the number of training sets was increased using techniques such as shifting, flipping, etc. The next step was feature extraction, which was performed using CNN. In the fifth and final step, the extracted features were classified using fuzzy clustering.
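
The augmentation phase can be illustrated with a minimal sketch of shifting and flipping a one-dimensional signal segment. Two assumptions are made here because the review does not specify them: shifting is implemented as a circular shift, and flipping is interpreted as time reversal. The function names and the sample values are hypothetical.

```python
def shift_signal(signal, offset):
    """Circularly shift a 1D segment by `offset` samples (positive = to the right)."""
    signal = list(signal)
    offset %= len(signal)
    if offset == 0:
        return signal
    return signal[-offset:] + signal[:-offset]


def flip_signal(signal):
    """Reverse the segment in time (one possible reading of 'flipping')."""
    return list(reversed(signal))


def augment(segment, offsets=(1, 2)):
    """Return the original segment plus shifted and flipped copies."""
    augmented = [list(segment)]
    augmented += [shift_signal(segment, o) for o in offsets]
    augmented.append(flip_signal(segment))
    return augmented


# A short hypothetical ECG segment.
copies = augment([0.0, 0.1, 1.0, 0.2, 0.0])
```

Each augmented copy keeps the label of the original segment, so one labelled heartbeat yields several training examples.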

Jamil et al. focused their research on valvular heart diseases (VHDs), known for their high mortality rates (34). To swiftly and accurately diagnose VHDs, they utilized PCG signals, aiming to reduce mortality rates. Deep learning methods were applied, employing three different frameworks for analyzing both 1D and 2D PCG raw signals. For 1D PCG, features such as linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC) were extracted, while deep convolutional features were derived for 2D PCG analysis. Additionally, the authors employed particle swarm intelligence and genetic algorithms to automatically and efficiently select relevant features from the PCG signal data. To enhance classifier performance, they implemented vision transformer (ViT) technology. A thorough performance analysis was conducted, with ViT exhibiting the most promising results among the methods evaluated.

Le et al. utilized deep learning to detect heart problems from ECG signals (35). While the standard 12-lead ECG provides comprehensive information about the heart’s electrical activity, a simpler ECG with fewer leads could offer greater accessibility and convenience, especially for integration into wearable devices. In their study, they proposed a novel deep learning approach to identify heart problems using just three ECG leads: I, II, and V1 (indicating their placement on the patient’s body). Their architecture involved three one-dimensional CNNs to extract features from these three input ECG leads, followed by an attention module to integrate the outputs.
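
One common way to integrate per-lead feature vectors with attention is to score each lead, normalise the scores with a softmax, and take the weighted sum of the features. The sketch below shows that generic scheme only; the review does not describe the internals of the authors' attention module, so the dot-product scoring, the query vector, and all names and values are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(lead_features, query):
    """Fuse one feature vector per ECG lead into a single vector.

    Each lead is scored by the dot product of its features with `query`
    (a stand-in for learned parameters), and the fused output is the
    softmax-weighted sum of the lead features.
    """
    scores = [sum(f * q for f, q in zip(features, query))
              for features in lead_features]
    weights = softmax(scores)
    dim = len(lead_features[0])
    fused = [sum(w * features[d] for w, features in zip(weights, lead_features))
             for d in range(dim)]
    return fused, weights


# Hypothetical 2-dimensional features for leads I, II and V1.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = attention_fuse(features, query=[0.5, 2.0])
```

With a zero query every lead gets the same weight and the fusion reduces to a plain average; a trained query lets the network emphasise whichever lead is most informative for a given recording.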

De Marco et al. developed a system for detecting premature ventricular contractions (PVC) using ECG signals (36). PVC is challenging to detect, particularly via ECG. The study employed various classifiers, including Decision Tree, Random Forest, LSTM, bidirectional LSTM (BLSTM), ResNet-18, MobileNetv2, and ShuffleNet. They utilized the MIT-BIH Arrhythmia Dataset for evaluation, with MobileNetv2 exhibiting superior performance. Interestingly, the study did not employ any specific feature extraction method.

Another study by De Marco et al. focused on detecting PVC using ECG signals (37). They explored five distinct deep learning architectures: LSTM, AlexNet, GoogleNet, Inception V3, and ResNet-50. Their findings indicated that ResNet-50 yielded superior results.

Table 4 provides a summary of the methods employed, datasets utilized, and the results obtained in this segment.

Table 4

Analysis of heart disease detection based on ECG and PCG signals

Reference | Classifier and training algorithm | Dataset description | Results
(29) | BLSTM with attention | PCC, M3-Hu, M3-An | Accuracy: 96.9%, 97.1%, 96.0%; precision: 96.3%, 93.1%, 91.1%; recall: 97.2%, 96.7%, 95.4%; specificity: 97.5%, 96.7%, 96.2%; F1-score: 96.70%, 94.70%, 93.25%
(30) | TSCNN | Self-made dataset | Accuracy: 87.20%
(31) | ECG signal processing, feature extraction and KNN classifier | Self-made dataset | Accuracy: 99.00%
(32) | ECG using CNN | Self-made dataset | Accuracy: 95%; accuracy for diagnosis of normal rhythm/atrial fibrillation: 99.15%
(33) | ECG using fuzzy clustering and deep neural network | MIT-BIH Arrhythmia Dataset | Accuracy: 98.66%, precision: 98.92%, recall: 93.88%, F1-score: 96.34%
 | | PTB Diagnostic ECG Dataset | Accuracy: 95.79%, precision: 96.29%, recall: 85.38%, F1-score: 80.37%
(34) | ViT | Collected and arranged data in form of dataset | Mean average accuracy: 99.90%; F1-score: 99.95%
(35) | ECG using deep learning | Chapman and CPSC-2018 | F1-score: 0.9718 and 0.8004
(36) | Detection of PVC | MIT-BIH Arrhythmia Dataset | Accuracy: MobileNetv2 0.9990, ResNet-18 0.9984, ShuffleNet 0.9967, BLSTM 0.9941, LSTM 0.9938, Random Forest 0.9927, Decision Tree 0.9871
(37) | Detection of PVC | MIT-BIH Arrhythmia Dataset | Accuracy: LSTM 98.53%, AlexNet 99.74%, GoogleNet 99.76%, Inception V3 99.70%, ResNet-50 99.85%

ECG, electrocardiogram; PCG, phonocardiogram; BLSTM, bidirectional LSTM; LSTM, long short-term memory; PCC, prediction cepstral coefficients; TSCNN, time-spatial convolutional neural network; KNN, K-nearest neighbors; CNN, convolutional neural network; ViT, vision transformer; PVC, premature ventricular contractions.
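The metrics reported in Table 4 (accuracy, precision, recall, specificity, and F1-score) follow the standard confusion-matrix definitions. The sketch below computes them for a binary classifier. The counts are made up for illustration, and the function names are hypothetical rather than taken from any of the reviewed studies.

```python
def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": _safe_div(tn, tn + fp),
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


def f1_from_precision_recall(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return _safe_div(2 * precision * recall, precision + recall)


# Illustrative counts only (not from any reviewed paper).
m = binary_metrics(tp=90, fp=10, tn=85, fn=15)
print(m)

# Harmonic mean of the precision and recall reported for (33) on MIT-BIH.
print(round(f1_from_precision_recall(98.92, 93.88), 2))
```

Because F1 is the harmonic mean of precision and recall, it can be recomputed from those two columns alone. For the MIT-BIH row of (33), the rounded precision and recall give roughly 96.33%, in line with the reported 96.34%.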

Heart disease detection based on X-ray images

Pant et al. took a different approach from other researchers (38), presenting a method named CardioHelp that uses a CNN to predict the likelihood of a patient developing CVD. The ChestX-ray14 dataset was employed, enabling the program to identify 14 different abnormalities. The study utilized ResNet-38, ResNet-50, and ResNet-101 among its methods. Images were normalized by subtracting the mean and dividing by the standard deviation. Notably, the model outperformed a radiologist in detecting 11 of the 14 abnormalities but performed worse on the remaining three, including cardiomegaly (an unusually large heart).
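The normalization step mentioned above (subtract the mean, then divide by the standard deviation) can be sketched as follows. The source does not say whether the statistics were computed per image or over the whole dataset, so this example assumes per-image statistics. The function name and pixel values are hypothetical.

```python
import statistics


def normalize_image(pixels):
    """Z-score normalize a 2D list of pixel intensities.

    Assumption: mean and standard deviation are computed per image;
    a dataset-level mean and std could be substituted instead.
    """
    flat = [v for row in pixels for v in row]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat)  # population standard deviation
    if std == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in pixels]
    return [[(v - mean) / std for v in row] for row in pixels]


# Tiny made-up 2x2 "image" with intensities in the 0-255 range.
image = [[0, 50], [100, 150]]
print(normalize_image(image))
```

After this transformation the pixel values have zero mean and unit standard deviation, which generally helps gradient-based training behave consistently across images with different exposure.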

Nasser et al. devised a system for diagnosing heart and lung diseases from chest X-ray images (39). Their approach involved a two-step process. Initially, the images were categorized into three classes: normal, lung disease, and heart disease. Subsequently, the specific heart and lung diseases were classified into one of seven categories. The dataset encompassed 26,316 chest X-ray images. The researchers proposed two distinct deep learning methods. The first, DC-ChestNet, relied on ensembling deep CNN models. The second method, VT-ChestNet, was based on a transformer architecture. Notably, VT-ChestNet outperformed state-of-the-art models such as Xception.

Choudhary et al. developed a CNN comprising 12 layers (40). The architecture included three activation layers, three pooling layers, three additional activation layers, and three more pooling layers. Finally, three fully connected layers were followed by a softmax function that outputs class probabilities. They utilized the ChestX-ray14 dataset (41).

Aram et al. employed the VGG16 architecture along with the ChestX-ray14 dataset (42). Notably, their study focused on reporting the detection rate of each disease individually.

Table 5 summarizes the methods, datasets used, and results of the approaches mentioned in this section.

Table 5

Analysis of heart disease detection based on X-ray images

Reference | Classifier and training algorithm | Dataset description | Results
(38) | CNN | Segmented images | Accuracy: 93.46%
(39) | DC-ChestNet, VT-ChestNet | Merging CheXpert and VinDr-CXR datasets | DC-ChestNet: F1-score: 81.00%, specificity: 95.65%, sensitivity: 74.35%, accuracy: 81.10%, AUC: 94.89%; VT-ChestNet: F1-score: 79.34%, specificity: 98.60%, sensitivity: 84.21%, accuracy: 82.36%, AUC: 95.13%
(40) | CNN | ChestX-ray8 | Accuracy of proposed CNN: 89.77%; accuracy of AlexNet: 76.09%; accuracy of LiNet: 67.90%
(42) | VGG16 | ChestX-ray14 | Accuracy: 92.6%; sensitivity: 92.9%

CNN, convolutional neural network.

Heart disease detection: other research

Some research has been conducted on fibrosis and scar segmentation from cardiac magnetic resonance imaging (MRI) images. Wu et al. conducted a comprehensive survey in this field (43), with a particular emphasis on deep learning methods. Additionally, other studies have explored left atrium segmentation and heart failure (44-52).


This review discussed a wide range of modern methods for heart disease detection, covering several types of detection. The findings from the most recent papers in this field are as follows:

  • Regarding datasets:
    • For heart disease detection based on patient clinical information, the Cleveland, Framingham, Kaggle, Statlog, and Hungarian datasets were used more than others.
    • For heart disease detection based on ECG signals, the popular datasets were the MIT-BIH Arrhythmia Dataset and the PTB Diagnostic ECG Dataset. The others were self-made datasets.
    • For heart disease detection based on X-ray images, the ChestX-ray14 dataset is the most popular, as it covers 14 different diseases.
  • Regarding algorithms, the findings differ by type of input data:
    • For heart disease detection based on patient clinical information, many methods were used; among them, XGBoost, Random Forest, Ensemble Learning, and Neural Network performed better than other classic machine learning methods. Many new algorithms and techniques have also been applied. Among them, the following seem to have the greatest impact on the results:
      • Applying dimensionality reduction methods such as PCA to the data can produce better results.
      • Some features have no discriminative power. It is better to detect and remove them during data processing, as they only increase the dimensionality of the feature space. The SBO algorithm, random forest, hybrid genetic algorithm recursive feature elimination (GARFE), recursive feature elimination, quantitative measurements, and Extra-Tree-Classifier can be used to choose the most important features from the data.
      • Balancing the class distribution of the samples is another technique that improved the results. SMOTE and Synthetic Minority Over-Sampling Technique Edited Nearest Neighbor (SMOTE-ENN) are popular algorithms in this field.
      • Some methods, such as soft voting ensemble models, can reduce false negatives.
      • The standard scaler is important for normalizing the data.
      • For detecting and eliminating outliers, density-based spatial clustering of applications with noise (DBSCAN) can be used.
    • For heart disease detection based on ECG signals, CNN still seems to be among the best methods. For X-ray images, ViT can be tested.
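To illustrate the point about soft voting in the list above, the following sketch contrasts soft voting (averaging predicted probabilities) with hard voting (majority of thresholded votes). The three models, their probabilities, and the 0.5 threshold are hypothetical, and the example only shows one way soft voting can recover a positive case that hard voting misses; it does not reproduce any reviewed study.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average P(disease) across models, then threshold.

    prob_lists: one list per model, each holding P(disease) per patient.
    """
    n_models = len(prob_lists)
    averaged = [sum(col) / n_models for col in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def hard_vote(prob_lists, threshold=0.5):
    """Threshold each model first, then take the majority vote."""
    n_models = len(prob_lists)
    preds = []
    for col in zip(*prob_lists):
        votes = sum(1 for p in col if p >= threshold)
        preds.append(1 if votes * 2 > n_models else 0)
    return preds


# Made-up probabilities from three hypothetical models for two patients.
probs = [
    [0.45, 0.20],  # model A
    [0.45, 0.10],  # model B
    [0.90, 0.30],  # model C
]
print("soft:", soft_vote(probs))
print("hard:", hard_vote(probs))
```

For the first patient, two models fall just under the threshold while the third is highly confident. Hard voting discards that confidence and predicts negative, whereas the averaged probability (0.6) crosses the threshold, so soft voting flags the patient.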


Funding: None.


Reporting Checklist: The author has completed the Narrative Review reporting checklist. Available at https://jmai.amegroups.com/article/view/10.21037/jmai-23-152/rc

Peer Review File: Available at https://jmai.amegroups.com/article/view/10.21037/jmai-23-152/prf

Conflicts of Interest: The author has completed the ICMJE uniform disclosure form (available at https://jmai.amegroups.com/article/view/10.21037/jmai-23-152/coif). The author has no conflicts of interest to declare.

Ethical Statement: The author is accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


  1. Bani Hani SH, Ahmad MM. Machine-learning Algorithms for Ischemic Heart Disease Prediction: A Systematic Review. Curr Cardiol Rev 2023;19:e090622205797.
  2. Cardiovascular Disease (CVD)-World Heart Federation (accessed Jan 11, 2023). Available online: https://world-heart-federation.org/what-is-cvd/
  3. Das CR, Das CM, Hossain MA, et al. Heart Disease Detection Using ML. 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC). Las Vegas, NV, USA: IEEE; 2023.
  4. Pytlak K. Indicators of Heart Disease (2022 UPDATE). Available online: https://www.kaggle.com/datasets/kamilpytlak/personal-key-indicators-of-heart-disease
  5. Gangadhar MS, Sai KVS, Kumar SHS, et al. Machine Learning and Deep Learning Techniques on Accurate Risk Prediction of Coronary Heart Disease. 2023 7th International Conference on Computing Methodologies and Communication (ICCMC). Erode: IEEE; 2023.
  6. Khader Basha S, Roja D, Santhj Priya S, et al. Coronary Heart Disease Prediction and Classification using Hybrid Machine Learning Algorithms. 2023 International Conference on Innovative Data Communication Technologies and Application (ICIDCA). Uttarakhand: IEEE; 2023.
  7. Jahed R, Asser O, Al-Mousa A. Using Personal Key Indicators and Machine Learning-based Classifiers for the Prediction of Heart Disease. 2023 International Conference on Smart Computing and Application (ICSCA). Hail: IEEE; 2023.
  8. Chopra S, Karla N, Rani R. Identification of Cardiovascular Disease using Machine Learning and Ensemble Learning. 2023 International Conference on Innovative Data Communication Technologies and Application (ICIDCA). Uttarakhand: IEEE; 2023.
  9. Gola KK, Arya S. Satin Bowerbird Optimization-Based Classification Model for Heart Disease Prediction Using Deep Learning in E-Healthcare. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW). Bangalore: IEEE; 2023.
  10. Samareh Moosavi SH, Khatibi Bardsiri V. Satin bowerbird optimizer: A new optimization algorithm to optimize ANFIS for software development effort estimation. Engineering Applications of Artificial Intelligence 2017;60:1-15.
  11. Shail MA, Sreeja R, Zainab S, et al. Improving Accuracy of Heart Disease Prediction through Machine Learning Algorithms. 2023 International Conference on Innovative Data Communication Technologies and Application (ICIDCA). Uttarakhand: IEEE; 2023.
  12. Shukla A, Khan IR, Sharma V, et al. A Novel Prediction System to Diagnose Heart Disease. 2023 International Conference on Inventive Computation Technologies (ICICT). Lalitpur: IEEE; 2023.
  13. Bathla G, Singh P, Singh RK, et al. Intelligent fake reviews detection based on aspect extraction and analysis using deep learning. Neural Computing and Applications 2022;34:20213-29.
  14. Aggarwal A, Gaba S, Singh P. Character Recognition using Approaches of Artificial Neural Network: A Review. CEUR Workshop Proceedings 2022;3309:186-93.
  15. Srivastava D, Chui KT, Arya V. Analysis of Protein Structure for Drug Repurposing Using Computational Intelligence and ML Algorithm. International Journal of Software Science and Computational Intelligence 2022;14:1-11.
  16. Sen K, Verma B. Heart Disease Prediction Using a Soft Voting Ensemble of Gradient Boosting Models, RandomForest, and Gaussian Naive Bayes. 2023 4th International Conference for Emerging Technology (INCET). Belgaum: IEEE; 2023.
  17. Jain AK, Kumar K, Tiwari RG, et al. Machine Learning-Based Detection of Cardiovascular Disease using Classification and Feature Selection. 2023 IEEE 12th International Conference on Communication Systems and Network Technologies (CSNT). Bhopal: IEEE; 2023.
  18. Sakshi, Kukreja V. A dive in white and grey shades of ML and non-ML literature: a multivocal analysis of mathematical expressions. Artif Intell Rev 2022;56:7047-135.
  19. Varshini G, Ramya A, Sravya CL, et al. Improving Heart Disease Prediction of Classifiers with Data Transformation using PCA and Relief Feature Selection. 2023 Second International Conference on Electronics and Renewable Systems (ICEARS). Tuticorin: IEEE; 2023.
  20. Mahmud T, Barua A, Begum M, et al. An Improved Framework for Reliable Cardiovascular Disease Prediction Using Hybrid Ensemble Learning. 2023 International Conference on Electrical, Computer and Communication Engineering (ECCE). Chittagong: IEEE; 2023.
  21. Ramesh HV, Pathinarupothi RK. Performance Analysis of Machine Learning Algorithms to Predict Cardiovascular Disease. 2023 IEEE 8th International Conference for Convergence in Technology (I2CT). Lonavla: IEEE; 2023.
  22. Prasanna KSL, Challa NP, Nagaraju J. Heart Disease Prediction using Reinforcement Learning Technique. 2023 Third International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT). Bhilai: IEEE; 2023.
  23. Abdellatif A, Abdellatef H, Kanesan J, et al. An Effective Heart Disease Detection and Severity Level Classification Model Using Machine Learning and Hyperparameter Optimization Methods. IEEE Access 2022;10:79974-85.
  24. Mohan S, Thirumalai C, Srivastava G. Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE Access 2019;7:81542-54.
  25. Ali L, Niamat A, Khan JA, et al. An Optimized Stacked Support Vector Machines Based Expert System for the Effective Prediction of Heart Failure. IEEE Access 2019;7:54007-14.
  26. Tama BA, Im S, Lee S. Improving an Intelligent Detection System for Coronary Heart Disease Using a Two-Tier Classifier Ensemble. Biomed Res Int 2020;2020:9816142.
  27. Fitriyani NL, Syafrudin M, Alfian G, et al. HDPM: An Effective Heart Disease Prediction Model for a Clinical Decision Support System. IEEE Access 2020;8:133034-50.
  28. Pandya S, Gadekallu TR, Reddy PK, et al. InfusedHeart: A Novel Knowledge-Infused Learning Framework for Diagnosis of Cardiovascular Events. IEEE Transactions on Computational Social Systems 2022.
  29. Fernando T, Ghaemmaghami H, Denman S, et al. Heart Sound Segmentation Using Bidirectional LSTMs With Attention. IEEE J Biomed Health Inform 2020;24:1601-9.
  30. Meng L, Ge K, Song Y, et al. Long-term Wearable Electrocardiogram Signal Monitoring and Analysis Based on Convolutional Neural Network. IEEE Transactions on Instrumentation and Measurement 2021;70:1-11.
  31. Venkataramanaiah B, Kamala J. ECG signal processing and KNN classifier-based abnormality detection by VH-doctor for remote cardiac healthcare monitoring. Soft Comput 2020;24:17457-66.
  32. Zhang X, Gu K, Miao S, et al. Automated detection of cardiovascular disease by electrocardiogram signal analysis: a deep learning system. Cardiovasc Diagn Ther 2020;10:227-35.
  33. Kumar S, Mallik A, Kumar A, et al. Fuzz-ClustNet: Coupled fuzzy clustering and deep neural networks for Arrhythmia detection from ECG signals. Comput Biol Med 2023;153:106511.
  34. Jamil S, Roy AM. An efficient and robust Phonocardiography (PCG)-based Valvular Heart Diseases (VHD) detection framework using Vision Transformer (ViT). Comput Biol Med 2023;158:106734.
  35. Le KH, Pham HH, Nguyen TBT, et al. LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification. Biomedical Signal Processing and Control 2023;85:104963.
  36. De Marco F, Ferrucci F, Risi M, et al. Classification of QRS complexes to detect Premature Ventricular Contraction using machine learning techniques. PLoS One 2022;17:e0268555.
  37. De Marco F, Finlay D, Bond RR, et al. Classification of Premature Ventricular Contraction Using Deep Learning. 2020 Computing in Cardiology. Rimini: IEEE; 2020.
  38. Pant A, Rasool A, Wadhvani R, et al. Heart disease prediction using image segmentation Through the CNN model. 2023 13th International Conference on Cloud Computing, Data Science & Engineering (Confluence). Noida: IEEE; 2023.
  39. Nasser AA, Akhloufi MA. Deep Learning Methods for Chest Disease Detection Using Radiography Images. SN Comput Sci 2023;4:388.
  40. Choudhary A, Hazra A, Choudhary P. Diagnosis of Chest Diseases in X-Ray images using Deep Convolutional Neural Network. 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT). Kanpur: IEEE; 2019.
  41. Wang X, Peng Y, Lu L, et al. ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE; 2017:3462-71.
  42. Aram S, Sadeghian R, Abdellatif I, et al. Diagnosing Heart Disease Types from Chest X-Rays Using a Deep Learning Approach. 2019 International Conference on Computational Science and Computational Intelligence (CSCI). Las Vegas, NV, USA: IEEE; 2020.
  43. Wu Y, Tang Z, Li B, et al. Recent Advances in Fibrosis and Scar Segmentation From Cardiac MRI: A State-of-the-Art Review and Future Perspectives. Front Physiol 2021;12:709230.
  44. Chen J, Zhang H, Mohiaddin R, et al. Adaptive Hierarchical Dual Consistency for Semi-Supervised Left Atrium Segmentation on Cross-Domain Data. IEEE Trans Med Imaging 2022;41:420-33.
  45. Chen J, Yang G, Khan H, et al. JAS-GAN: Generative Adversarial Network Based Joint Atrium and Scar Segmentations on Unbalanced Atrial Targets. IEEE J Biomed Health Inform 2022;26:103-14.
  46. Gao Y, Zhou Z, Zhang B, et al. Deep learning-based prognostic model using non-enhanced cardiac cine MRI for outcome prediction in patients with heart failure. Eur Radiol 2023;33:8203-13.
  47. Kuang M, Wu Y, Alonso-Álvarez D, et al. Three-Dimensional Embedded Attentive RNN (3D-EAR) Segmentor for Left Ventricle Delineation from Myocardial Velocity Mapping. In: Ennis DB, Perotti LE, Wang VY. Editors. Functional Imaging and Modeling of the Heart. FIMH 2021. Cham: Springer International Publishing; 2021.
  48. Liu T, Gao Y, Wang H, et al. Association between right ventricular strain and outcomes in patients with dilated cardiomyopathy. Heart 2021;107:1233-9.
  49. Li L, Wu F, Yang G, et al. Atrial scar quantification via multi-scale CNN in the graph-cuts framework. Med Image Anal 2020;60:101595.
  50. Yang G, Chen J, Gao Z, et al. Simultaneous left atrium anatomy and scar segmentations via deep learning in multiview information with attention. Future Gener Comput Syst 2020;107:215-28.
  51. Zhuang X, Li L, Payer C, et al. Evaluation of algorithms for Multi-Modality Whole Heart Segmentation: An open-access grand challenge. Med Image Anal 2019;58:101537.
  52. Shi Z, Zeng G, Zhang L, et al. Bayesian VoxDRN: A Probabilistic Deep Voxelwise Dilated Residual Network for Whole Heart Segmentation from 3D MR Images. In: Frangi A, Schnabel J, Davatzikos C, et al. editors. Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. MICCAI 2018. Cham: Springer International Publishing; 2018.
doi: 10.21037/jmai-23-152
Cite this article as: Hajiarbabi M. Heart disease detection using machine learning methods: a comprehensive narrative review. J Med Artif Intell 2024;7:21.
