Harder to mend than to break?—counterfactual explainable artificial intelligence for lifestyle medicine and heart disease prediction
Highlight box
Key findings
• Machine learning (ML) models such as naive Bayes, random forest, and support vector machine maintained robust predictive accuracy across different counterfactual (CF) generation methods when applied to cardiovascular health predictions.
• The choice of CF generation algorithm played a significant role in the quality of the CFs produced, impacting the number of features that needed modification to change health outcomes.
• Our analysis showed that transitions from an adverse health to a positive state involved more feature alterations than transitions from positive to negative, hinting at an inherent asymmetry in health state changes with critical implications for patient care and treatment planning.
What is known, and what is new?
• It is known that ML can predict health outcomes with high accuracy, and CFs have the potential to explain these predictions. However, the effect of different CF generation methods on the performance and interpretability of these models needs to be understood.
• The study is novel in its systematic comparison of three distinct CF generation methods across three different ML models in cardiovascular health. It offers new insights into the dynamics of ML and CF explanations.
What is the implication, and what should change now?
• Given the significant influence of CF generation methods on the interpretability of model predictions, practitioners should prioritize the selection of CF algorithms that provide clear, actionable insights for intervention strategies.
• Healthcare professionals and policymakers should consider integrating these findings into clinical practice, favoring ML models and CF methods that facilitate understanding minimal and impactful changes to prevent negative health outcomes.
Introduction
Background
An estimated 32% of total mortality was attributed to cardiovascular disease (CVD) in 2019, which also resulted in substantial financial loss (1). CVD is influenced by a myriad of factors, including exercise, consumption habits, and non-communicable diseases (2,3). Counterfactual (CF) methods expand the realm of explainable artificial intelligence (XAI) by providing a real or synthetic scenario that yields a prediction of the opposite class from the original. The new scenario can be evaluated using several statistics, primarily focused on proximity to the original instance.
Rationale
This study examines the use of CF methods on a medium-sized data set concerning behavioral and pathological risk factors for heart disease. Three different prediction methods [naive Bayes (NB), random forest (RF), and support vector machine (SVM)] were used to examine the risk factors for heart disease. Each model was examined with three CF methods: multi-objective CF (MOC), Nested and Interpolated Counterfactual Explanations (NICE), and What-If. We tested two possible transition types in every case: one upward (unhealthy to healthy) and one downward (healthy to unhealthy). We asked three questions:
- Will there be significant differences between machine learning (ML) methods?
- Will the unhealthy-to-healthy transition have different outcomes than the opposite?
- Will different CF methods produce different opposing instances?
Objective
The objective was to assess how different ML and CF approaches interact and to study whether different ML approaches yield different CF explanations. This may influence the utility of predictive models in clinical decision-making.
Heart disease and lifestyle medicine
Modifiable risk factors play a major contributory role in the development of all heart diseases, accounting for nearly 70% of the incidence and mortality rates. Higher body mass index (BMI) is associated with more coronary lesions (4), and there is a known U-shaped association between BMI and long-term mortality, where the riskiest groups are those patients who are obese and those who are overly lean (5,6). It has been found that BMI predicts the development of coronary heart disease (CHD) in adulthood (7,8).
Individuals exposed to smoke (9,10) have an increased risk of CHD (11). Moderate drinking may be beneficial for decreasing heart disease risk (12,13), but binge drinking carries a higher risk (14,15). Physical health is linked to CVD (16-19). Psychological interventions have shown promise in patients with CVD (20), while depression and anxiety may have a negative effect (21,22). Greater walking capacity and better walking performance are linked with better health and a lower risk of CVD (23-25). Both prediabetes and diabetes are associated with a higher likelihood of developing CVD through various pathways (26,27). Physical activity is known to have a positive impact (28-30). There is a non-linear association between sleep duration and CVD (31). A higher disease prevalence is observed when sleeping more than 9 hours or less than 6 hours (32-34). Another ML model has identified congestive heart disease and a family history of heart attack as risk factors for insomnia, thus establishing a bidirectional relationship (35). There are known associations between asthma and both CVD (36,37) and CHD (38,39). The relationship between kidney disease and CVD is bidirectional (40-42). Skin cancer is a significant cause of morbidity in transplant patients (43,44) and is linked to increased CVD risk (45-47).
The multinational Prospective Urban Rural Epidemiology (PURE) study described the effect of modifiable risk factors on the incidence and mortality of CVD. We extracted data relating to various risk factors, such as overall population attributable fraction (PAF) (Table 1).
Table 1
Risk factors | Overall PAF (%) |
---|---|
Hypertension | 22.3 |
Cholesterol | 8.1 |
Smoking | 6.1 |
Diet (alcohol consumption) | 6.1 |
Abdominal obesity (BMI) | 5.7 |
Diabetes | 5.1 |
Physical activity | 1.5 |
PAF, population attributable fraction; BMI, body mass index.
Heart disease prediction on tabular data
Multiple studies have utilized ML algorithms in the prediction of heart failure survival (48), e.g., SVMs (49), light gradient boosting machine (LightGBM) (50), and Extra Tree classifiers (51). Some examples are displayed in Figure 1. Accuracies typically range between 80% and 90% (55-57), and enhanced hybrid approaches have also been reported (58,59). We present this article in accordance with the STROBE reporting checklist (available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-83/rc).
Methods
Sample and features
The data set is the United States Centers for Disease Control and Prevention (CDC) annual observational survey of >400,000 American adults about their health and lifestyle habits, collected through the Behavioral Risk Factor Surveillance System (BRFSS). A stratified sample of 53,030 observations was judged to be sufficient for the desired statistical power and was drawn randomly such that equal numbers of persons with a positive (yes) and a negative (no) value of the response variable (CHD) were selected (26,515 rows for each cohort). A 20–80% random split was used to assign each row to either the train or test set.
The selected columns can be seen in Appendix 1. All values are self-reported.
ML heart disease prediction models
ML models have been used in multiple contexts for predicting CVD (53,60-63).
NB
Bohacik and Zabovsky (64) used a probability approach to recognize CVD, employing NB for numeric attributes specific to CVD patients.
RF
RF is a powerful ensemble learning method that combines the predictions of multiple decision trees to produce an accurate and robust model. Each decision tree in the RF is trained on a subset of the dataset using a bootstrap sampling technique, and at each split, a random subset of features is considered. The final prediction of the RF is made by aggregating the predictions of all the individual trees through a voting mechanism or averaging the predictions. RF techniques outperformed other classifiers for prediction of CHD (65,66).
SVM
SVMs have been reported to be highly reliable and accurate for CHD prediction, which offers great potential for clinical diagnosis, especially when combined with RF (67,68).
None of the three models used (52,53,69) was tuned for hyperparameters; the default values of their respective R packages were used.
Performance measures
The performance of any binary ML classification algorithm is usually assessed from the confusion matrix of its predictions on the test set, as displayed in Table 2.
Table 2
Real class | Predicted true | Predicted false |
---|---|---|
True | TP | FN |
False | FP | TN |
TP, true positive; FN, false negative; FP, false positive; TN, true negative.
Six common benchmarks measure the success rate of classification tasks (Table 3).
Table 3
Measures | Formula | Definition |
---|---|---|
Accuracy | (TP + TN)/(TP + TN + FP + FN) | % All correct |
Sensitivity | TP/(TP + FN) | % With disease correct |
Specificity | TN/(TN + FP) | Detection of patients without disease |
Precision | TP/(TP + FP) | Probability of successful positive prediction |
Negative predictive value | TN/(TN + FN) | Proportion of negative predictions that are correct |
AUC | Derived from ROC | Proportion of the unit square under the ROC |
ML, machine learning; TP, true positive; TN, true negative; FP, false positive; FN, false negative; AUC, area under the curve; ROC, receiver operating characteristic.
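As a minimal sketch of how the confusion-matrix measures in Table 3 follow from the four counts in Table 2, the function below computes them using the standard definitions. The function name and the example counts are hypothetical and are not taken from the study; AUC is omitted because it requires the full ROC curve.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return the confusion-matrix measures of Table 3 (AUC excluded)."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,       # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),       # share of true cases detected
        "specificity": tn / (tn + fp),       # share of non-cases detected
        "precision": tp / (tp + fp),         # share of positive predictions that are correct
        "npv": tn / (tn + fn),               # share of negative predictions that are correct
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
m = classification_metrics(tp=75, fn=25, fp=40, tn=60)
```

With these illustrative counts, sensitivity is 75/100 and specificity is 60/100, while precision (75/115) and negative predictive value (60/85) depend on the predicted rather than the real class totals.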
Nearest CF (NC)
An important first step in many tasks related to CF is the identification of the NC, i.e., which of the actual observations in the data set is most closely related to the analyzed instance. The distance measure between any two instances would depend on the data types present, which in this case is a mix of continuous, discrete, and dichotomic data (Appendix 1). The benchmark for distance is therefore the Gower distance as presented below.
Gower distance
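Euclidean distance has been used in multiple contexts with strictly numeric data, e.g., pulse rate frequency (70) and K-nearest neighbors (71). With a mix of nominal and numeric features, the Gower distance is often used. For two instances x1 and x2 described by p features:

$$d_G(x_1, x_2) = 1 - \frac{1}{p}\sum_{j=1}^{p} s_j(x_1, x_2)$$

The function sj(x1, x2) depends on the feature type. For a numeric variable with observed range Rj:

$$s_j(x_1, x_2) = 1 - \frac{|x_{1j} - x_{2j}|}{R_j}$$

While for a categorical feature, it assumes the Kronecker delta function:

$$s_j(x_1, x_2) = \begin{cases} 1, & x_{1j} = x_{2j} \\ 0, & x_{1j} \neq x_{2j} \end{cases}$$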
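The Gower distance can be sketched in a few lines, assuming the common form in which the distance is one minus the mean per-feature similarity, numeric differences are scaled by the feature's observed range, and categorical features use the Kronecker delta. The function name, the `ranges` argument, and the example ranges are assumptions made for illustration; the feature values are borrowed from three rows of Table 7.

```python
def gower_distance(a, b, ranges):
    """Gower distance between two records given as dicts.

    `ranges` maps each numeric feature to its observed range (max - min),
    assumed to be non-zero; features absent from `ranges` are treated as
    categorical.
    """
    similarities = []
    for name in a:
        if name in ranges:
            # Numeric: similarity falls linearly with the range-scaled difference.
            similarities.append(1.0 - abs(a[name] - b[name]) / ranges[name])
        else:
            # Categorical: Kronecker delta, 1 if equal and 0 otherwise.
            similarities.append(1.0 if a[name] == b[name] else 0.0)
    return 1.0 - sum(similarities) / len(similarities)


healthy = {"BMI": 23.0, "SleepTime": 8, "Smoking": "No"}
unhealthy = {"BMI": 34.5, "SleepTime": 5, "Smoking": "Yes"}
# Hypothetical ranges; real ones would be computed from the data set.
d = gower_distance(healthy, unhealthy, ranges={"BMI": 46.0, "SleepTime": 12.0})
```

Because every per-feature term lies between 0 and 1, the distance does too, which is what makes numeric and categorical features comparable on one scale.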
NC
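The base case (BC) xi is assigned a prediction P(xi) by the selected ML algorithm. The NC is then defined as the instance with the lowest Gower distance to the BC among those the ML model predicts as belonging to another class, i.e.,

$$\mathrm{NC}(x_i) = \underset{x_k:\; P(x_k) \neq P(x_i)}{\arg\min}\; d_G(x_i, x_k)$$

where dG denotes the Gower distance and xk ranges over the observations in the data set.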
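The NC search described above can be sketched as a filter followed by a minimum. This is an illustration only: `toy_predict` is a hypothetical stand-in for a fitted classifier, the data rows and the BMI range are invented, and the distance helper assumes the range-scaled form of the Gower distance.

```python
def gower(a, b, ranges):
    """Mean per-feature dissimilarity: range-scaled for numeric, 0/1 for categorical."""
    parts = [
        abs(a[k] - b[k]) / ranges[k] if k in ranges else float(a[k] != b[k])
        for k in a
    ]
    return sum(parts) / len(parts)


def nearest_counterfactual(base, data, predict, ranges):
    """Return the observed row closest to `base` that the model puts in another class."""
    base_class = predict(base)
    candidates = [row for row in data if predict(row) != base_class]
    if not candidates:
        return None  # no observation of the opposing class exists
    return min(candidates, key=lambda row: gower(base, row, ranges))


# Hypothetical rule standing in for a trained model.
def toy_predict(row):
    return "Yes" if row["Smoking"] == "Yes" and row["BMI"] > 30 else "No"


base = {"BMI": 23.0, "Smoking": "No"}
data = [
    {"BMI": 24.0, "Smoking": "No"},
    {"BMI": 31.0, "Smoking": "Yes"},
    {"BMI": 45.0, "Smoking": "Yes"},
]
nc = nearest_counterfactual(base, data, toy_predict, ranges={"BMI": 40.0})
```

The first row is closest to the base case but shares its predicted class, so it is filtered out before the minimum is taken.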
CF instance generation on tabular data
Three different methods for generating CF observations were used: MOC, NICE (72), and What-If.
MOC
MOCs consider multiple outcomes simultaneously and aim to find the optimal intervention to maximize all outcomes of interest. This approach is particularly beneficial in scenarios where interventions may present trade-offs between different outcomes. For example, a medication may reduce the risk of CVD but increase the risk of kidney damage (73,74).
To quantitatively assess the trade-offs between different outcomes, we utilize the concept of “distance to origin” in a multi-dimensional outcome space. Each axis in this space represents a different outcome, and the origin represents the absence of intervention. The distance to origin, which we calculate with the Gower distance, serves as a measure of the overall impact of the intervention across all outcomes. This metric provides a straightforward method to compare the efficacy of different interventions by their distance from the origin, where shorter distances may indicate less overall intervention impact.
In comparing ML methods to CF methods, we focus on how these approaches model transitions between states in response to interventions. ML methods typically predict outcomes based on historical data and are often used to simulate the effects of different intervention types before implementation. In contrast, CF methods generate hypothetical alternatives (CFs) that might have occurred under different interventions. By measuring the distances generated by these CFs to the origin, we can objectively compare the potential impacts of different types of interventions, ensuring the comparisons are reasonable and comparable across studies. This approach has been employed successfully in various settings, including medical treatments (75) and public health policies (76), providing a robust framework for decision-making in complex scenarios involving multiple objectives.
NICE
The NICE CF method generates hypothetical scenarios that depict how changing a particular input variable would alter the prediction (77). It works by overwriting feature values with individual values copied directly from other instances. First, the NC is found, as described in the NC section above. The algorithm then proceeds in successive rounds, in which it overwrites individual feature values of the BC with the corresponding values of the NC. As a simple illustration, imagine a data set with three numerical and two binary features. The BC and NC are shown in Table 4.
Table 4
Instances | x1 | x2 | x3 | x4 | x5 |
---|---|---|---|---|---|
BC | 4 | Yes | 2 | Yes | 1 |
NC | 3 | Yes | 3 | No | 4 |
CF11 | 3† | Yes | 2 | Yes | 1 |
CF12 | 4 | Yes | 3† | Yes | 1 |
CF13 | 4 | Yes | 2 | No† | 1 |
CF14 | 4 | Yes | 2 | Yes | 4† |
CF21 | 3† | Yes | 3† | Yes | 1 |
CF22 | 3† | Yes | 2 | No† | 1 |
CF23 | 3† | Yes | 2 | Yes | 4† |
CF24 | 4 | Yes | 3† | No† | 1 |
CF25 | 4 | Yes | 3† | Yes | 4† |
CF26 | 4 | Yes | 2 | No† | 4† |
†, changed feature from the previous instance. NICE, Nested and Interpolated Counterfactual Explanations; BC, base case; NC, nearest CF; CF, counterfactual.
In the first round, four CFs are created, as shown in Table 4. This corresponds to all possible ways of replacing one value in the BC by one value in the NC. In the second round, all possible ways to replace two values in the BC from the NC are generated (Table 4). This is repeated until all combinations of the features on which the NC and BC differ have been generated. Predictions for each instance are obtained through the same ML model, and only those new instances leading to a class change are retained.
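The round-by-round enumeration described above can be sketched as follows, using the BC and NC rows of Table 4. This follows the exhaustive description in the text and is not necessarily how the NICE implementation in any package searches; `toy_predict` is a hypothetical stand-in for a fitted model, and the function names are invented for illustration.

```python
from itertools import combinations


def enumerate_candidates(base, nearest):
    """Yield (round, candidate): round r copies r differing values from `nearest`."""
    differing = [k for k in base if base[k] != nearest[k]]
    for size in range(1, len(differing) + 1):
        for subset in combinations(differing, size):
            candidate = dict(base)
            candidate.update({k: nearest[k] for k in subset})
            yield size, candidate


def retained(base, nearest, predict):
    """Keep only candidates whose predicted class differs from the base case."""
    base_class = predict(base)
    return [c for _, c in enumerate_candidates(base, nearest)
            if predict(c) != base_class]


# BC and NC rows from Table 4.
bc = {"x1": 4, "x2": "Yes", "x3": 2, "x4": "Yes", "x5": 1}
nc = {"x1": 3, "x2": "Yes", "x3": 3, "x4": "No", "x5": 4}


# Hypothetical decision rule, not one of the study's classifiers.
def toy_predict(row):
    return "B" if row["x4"] == "No" and row["x5"] >= 4 else "A"


per_round = {}
for size, _ in enumerate_candidates(bc, nc):
    per_round[size] = per_round.get(size, 0) + 1

kept = retained(bc, nc, toy_predict)
```

With four differing features, the first two rounds produce four and six candidates, matching CF11 to CF14 and CF21 to CF26 in Table 4. Under the toy rule, the sparsest retained instance changes only x4 and x5, which corresponds to CF26.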
What-If
The What-If approach is arguably the most straightforward technique, as it simply uses entire instances of the data set with an opposing classification compared to the BC. One of the What-If CFs will hence always be the NC. This approach has been used as standard support for the treatment and control groups in causal inference (78), for estimating the effect of small changes in parameters (79), and for highlighting model flaws (80). Its main advantage is that it can aid understanding (81), perioperative optimization, and planning (82).
CF quality measures
The quality of CF explanations benchmarks the effectiveness and usability of a model. Several measures are suggested to control the quality of CFs. These benchmarks are based on concepts from human-computer interaction (Table 5).
Table 5
Measures | Concept |
---|---|
Proximity | Are instances close to base case? (83) |
Sparsity | How many features were changed? (84) |
Diversity | Are CF instances heterogeneous? (77) |
Actionability | Can we control the changes? (85) |
Feasibility | Feature changes doable? (77) |
Causality | Feature changes causal? (77) |
Validity | Feature changes lead to desired outcome? (86) |
Fidelity | Are CF instances faithful during the decision process? (86) |
Understandability | Can the end users understand CF instances? (87) |
CF, counterfactual.
Number of changed features
The fewer features that need to be changed, the more understandable and actionable the CF explanation is for an end user (88,89).
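As a small illustration of this sparsity measure, the sketch below lists the features on which a CF differs from its BC. The function name is hypothetical; the two rows are the BC and CF22 from Table 4.

```python
def changed_features(base, counterfactual):
    """Return the names of features whose values differ between the two instances."""
    return [k for k in base if base[k] != counterfactual[k]]


bc = {"x1": 4, "x2": "Yes", "x3": 2, "x4": "Yes", "x5": 1}
cf22 = {"x1": 3, "x2": "Yes", "x3": 2, "x4": "No", "x5": 1}

changed = changed_features(bc, cf22)
n_changed = len(changed)
```

Tallying these lists over many CFs would give the per-feature change frequency.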
Frequency of feature changes
Feature change frequency quantifies how often each feature is changed across the generated CFs. Some CF methods are optimized for minimal feature changes to improve interpretability and actionability for users when robust explanations are sought; in that case, the CFs with the fewest feature changes are selected for further analysis. However, feature attribution methods are only sometimes efficient (90).
Statistical tests
Standard inference tests with a significance level of 0.05 were used to analyze differences in mean between independent groups. For numeric features such as Gower distance and number of changes, the one-way analysis of variance (ANOVA) test was used comparing means between ML models and CF techniques (three groups in each category), while for the determination of results based on transition type (two options), the Mann-Whitney non-parametric test was selected. We employed the R and Excel software for descriptives and inference.
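To make the rank-based comparison concrete, the sketch below computes the Mann-Whitney U statistic for two independent groups by counting pairwise wins, with ties counted as one half. It yields the statistic only, not a P value, and the numbers are invented for illustration; they are not the study's data. In practice a library routine such as `scipy.stats.mannwhitneyu` would be used to obtain the P value.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: each (x, y) pair scores 1 if x > y, 0.5 if tied."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )


# Hypothetical numbers of changed features for two transition types.
upward = [5, 6, 7]
downward = [1, 2, 6]

u_up = mann_whitney_u(upward, downward)
u_down = mann_whitney_u(downward, upward)
```

The two statistics always sum to the number of pairs, here 3 × 3 = 9, so a value of `u_up` close to 9 indicates that the upward group tends to have the larger values.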
Results
Performance
The overall performance characteristics for each model are in Table 6.
Table 6
Measures | NB | RF | SVM |
---|---|---|---|
Accuracy | 0.66 | 0.69 | 0.69 |
Sensitivity | 0.75 | 0.73 | 0.74 |
Specificity | 0.60 | 0.66 | 0.65 |
AUC | 0.67 | 0.69 | 0.69 |
PPV | 0.56 | 0.69 | 0.67 |
NPV | 0.78 | 0.69 | 0.72 |
CF, counterfactual; ML, machine learning; NB, naive Bayes; RF, random forest; SVM, support vector machine; AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value.
Starting points
Based on the described data set, we selected two points of interest or BCs for CF generation (Table 7).
Table 7
Features | Value negative | Value positive |
---|---|---|
HeartDisease | No | Yes |
BMI (kg/m2) | 23.0 | 34.5 |
Smoking | No | Yes |
AlcoholDrinking | No | No |
PhysicalHealth (days) | 0 | 2 |
MentalHealth (days) | 0 | 15 |
DiffWalking | No | Yes |
Diabetic | No | Yes |
PhysicalActivity | Yes | Yes |
SleepTime (hours) | 8 | 5 |
Asthma | No | Yes |
KidneyDisease | No | No |
SkinCancer | No | No |
BMI, body mass index.
The first person was healthy on all accounts except for diabetes. CFs for this observation were sought as new instances predicted as not healthy with a probability between 0.7 and 1. The opposite case describes an individual with a multitude of health problems. The CF region for this (heart disease) case was between P(HeartDisease = 0) = 0.7 and P(HeartDisease = 0) = 1. The class predictions for each ML method are shown in Table 8. It should be noted that the negative case has a higher mean class probability than the positive and thus starts further away from the decision boundary.
Table 8
ML model | Healthy BC | Unhealthy BC [P(HeartDisease = yes)] |
---|---|---|
NB | 0.09 | 0.91 |
RF | 0.02 | 0.93 |
SVM | <0.01 | >0.99 |
BC, base case; ML, machine learning; NB, naive Bayes; RF, random forest; SVM, support vector machine.
CFs
For each triad of ML method (NB, RF, and SVM), transition type [healthy to unhealthy (downward) and unhealthy to healthy (upward)], and CF generation technique (MOC, NICE, and What-If), the R counterfactuals package (https://cran.r-project.org/web/packages/counterfactuals/vignettes/introduction.html) was used to generate several CFs.
Distance to the BC
The mean Gower distance to the BC, aggregated by ML method, CF technique, and transition type, is shown with error bars in Figure 2A-2C.
A significant (P=0.006) difference was found via ANOVA based on the ML method used. For the comparison of means between upward and downward transitions, the rather small sample size (nine transitions of each kind) meant that we selected a non-parametric test. The Mann-Whitney U test confirmed that the mean distance to the BC is higher for the positive to negative CFs than in the opposite direction (P<0.001). The ANOVA test (P<0.001) confirmed that multi-objective synthetic instances have a lower distance than the two others.
Mean number of feature changes
The mean number of features changed for each configuration is shown in Figure 3.
An ANOVA test (P=0.01) confirmed that more features were modified to create instances from the NB than from the other methods, which were almost equal. A Mann-Whitney U test confirmed that the mean number of features changed is higher for the positive to negative CFs than in the opposite direction (P<0.001). The ANOVA test (P<0.001) confirmed that MOCs altered fewer values than the others, and that What-Ifs modified more features than the NICE CFs.
Discussion
Key findings
A key takeaway from our analysis is the relative insensitivity of ML algorithms to CF generation methods, suggesting a level of robustness in these models when applied to healthcare data. This finding is particularly relevant in the context of CVD prediction, where the stability of model predictions in the face of hypothetical ‘What-If’ scenarios can significantly influence clinical decision-making and patient counseling.
Strengths and limitations
Our results provide significant insights into the nuanced interplay between these variables, shedding light on their implications for healthcare predictive analytics and decision-making processes. This study simplifies the real-world situation by classifying patients into either healthy or diseased states. Obviously, such classification is simplistic and does not reflect the range of illnesses or their severity experienced by actual patients. However, clinicians must make subjective decisions as to when patients can be considered as requiring treatment, so such simplification was considered to reflect actual real-world clinical experience. Future work could therefore include trials on whether the recommendation of these CFs through personalized medicine interfaces leads to the desired outcomes.
Comparison with similar research
NB appears to change a larger number of features while remaining closer to the BC. This supports the findings of Taha et al. (91) and Arar and Ayan (92) stating that NB can effectively select a smaller number of features while maintaining classification accuracy. CF explanations generated by NB may change more features than those from SVM due to the inherent characteristics of the algorithms. NB is a probabilistic classifier that assumes independence between features, which can lead to changes in multiple features to generate a CF explanation. It should be noted that each ML algorithm working on tabular data is associated with an inductive bias. RFs are known for a feature selection bias; NB assumes conditional independence; and SVMs have an instance selection bias.
On the other hand, SVM aims to find the optimal hyperplane that best separates classes, which may result in fewer features being changed in a CF explanation. The decision boundaries created by SVM are generally smoother, leading to more focused changes in features to achieve a different prediction.
Explanation of findings
Results indicate that a move from unhealthy to healthy requires changing more habits than the opposite, for which there are multiple possible reasons (93,94). Transitioning from an unhealthy state to a healthy state often involves addressing underlying causes and making significant lifestyle changes, e.g., weight loss through changes in diet, exercise, and overall lifestyle habits (95,96). These changes involve multiple attributes such as food choices, physical activity levels (97), and stress management techniques. Going from healthy to unhealthy may involve a few unhealthy choices or habits that are easier to adopt, such as high-calorie foods or a sedentary life (98).
Complexity of positive outcomes: achieving and maintaining good health often requires a multifaceted approach that involves many interconnected attributes. For instance, improving cardiovascular health may involve factors like diet (99), exercise (100), sleep quality (101), stress management, and genetic predispositions. On the other hand, slipping into an unhealthy state may involve neglecting or reverting to a few negative habits or behaviors, which may require less complexity or effort. Pathogenesis of CVD is a gradual process. Continuous exposure to oxidative stress, along with inflammatory mediators, causes damage to endothelial cells of blood vessels, finally setting in as coronary artery disease (102). Exposure to different risk factors, such as cigarette smoking and low-density lipoprotein (LDL) cholesterol deposition in the walls of blood vessels, can help set in or maintain this tissue-destroying inflammatory process (103,104).
Protective hormones: adaptive changes as a part of host response still provide some relief. Anti-atherogenic hormones present in the body reduce the risk of developing CVD. Estrogen is a hormone widely described to have such effects, hence contributing to the lower incidence of CVD in women of reproductive age (105). Other anti-atherogenic hormones that can retard the development of CVD include adiponectin and insulin, which try to repair and prevent further endothelial damage (106).
Associated development of other comorbidities: once the foundation of CVD has set in, other diseases begin to develop due to the loss of practical function of the heart. One of the most striking connections is that cancer and CVD share a bidirectional association. While cancers like skin cancer can increase the risk of CVD (107), CVD itself can promote cancer development due to atherosclerosis (108). Common malignancies that are observed more in CVD patients include cancers of the lung, bladder, colon, and more. Furthermore, the decreased blood supply to organs due to the narrowing of blood vessels or reduced ejection fraction of the heart can lead to end organ damage in various tissues. These manifest as chronic kidney disease, retinopathy, and peripheral vascular disease. These associated comorbidities make it more challenging for CVD patients to revert to a ‘negative’ disease-free state.
The features involved are dependent on the starting point, so one should not consider the selected attribution as an overall model selection. The CFs give a higher emphasis on mental health, difficulty walking, and diabetes, as blood pressure was not present.
When looking at the type of CF generation method, the results show that the synthetic process requires fewer changes and creates instances that are closer to the decision boundary, i.e., have a lower distance to the starting point. This was expected, as unlike the other two, the MOC method can set feature values through a continuous spectrum.
Implications and actions needed
Here, three commonly used ML algorithms were selected. Although this brings in risk of bias, by using those that are most frequently used in real-world applications for tabular data this risk is reduced. However, the comparison using CF provides a safeguard against bias in ML algorithm selection. The use of CF analysis provides an attempt to objectively assess different ML methods and select those most suitable, thus reducing risk of biased ML selection. The findings of this study have important implications for clinical practice and future research. In clinical settings, the robustness of ML algorithms to CF generation methods suggests their reliability for CVD prediction, allowing for more tailored preventive measures. Policymakers could use this evidence to support the integration of ML-based predictive analytics into healthcare systems. Additionally, there is a need for further research to understand the mechanisms underlying the insensitivity of ML algorithms to CF generation methods and to explore their applicability in other healthcare contexts. Education for healthcare professionals on the potential and interpretation of ML models in disease prediction is also crucial. Further validation in real-world clinical settings is needed to assess the practical applicability and effectiveness of the findings. Finally, the stability of model predictions in ‘What-If’ scenarios can inform patient counseling, providing personalized advice on lifestyle changes.
Conclusions
The significant difference observed in the number of features changed by NB compared to other ML models and the consequent implications for the interpretability and actionability of CFs underscore the importance of selecting appropriate ML models for health outcome prediction tasks. This aspect is crucial for developing interventions and recommendations that are both effective and feasible for patients. Additionally, our findings regarding the asymmetry in transition types—from positive to negative health states and vice versa—highlight the complexity of health state transitions and the potential need for more personalized, multifaceted intervention strategies in clinical practice. This insight aligns with the broader understanding in healthcare that improving health outcomes often requires comprehensive lifestyle and behavioral changes, which can be more challenging to implement than adopting unhealthy behaviors leading to adverse health states.
Acknowledgments
Funding: The project was funded by
Footnote
Reporting Checklist: The authors have completed the STROBE reporting checklist. Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-83/rc
Peer Review File: Available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-83/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jmai.amegroups.com/article/view/10.21037/jmai-24-83/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. No ethics permission nor informed consent was required as there are no human experiments involved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Pogosova N. Costs associated with cardiovascular disease create a significant burden for society and they seem to be globally underestimated. Eur J Prev Cardiol 2019;26:1147-9. [Crossref] [PubMed]
- Santos-Parker JR, LaRocca TJ, Seals DR. Aerobic exercise and other healthy lifestyle factors that influence vascular aging. Adv Physiol Educ 2014;38:296-307. [Crossref] [PubMed]
- Münzel T, Steven S, Frenis K, et al. Environmental Factors Such as Noise and Air Pollution and Vascular Disease. Antioxid Redox Signal 2020;33:581-601. [Crossref] [PubMed]
- Islam MS, Talukder R, Sakib AM, et al. Study of relation between body mass index (BMI) and angiographically severity of coronary artery disease. KYAMC Journal 2011;1:39-42. [Crossref]
- Chen Z, Yang G, Zhou M, et al. Body mass index and mortality from ischaemic heart disease in a lean population: 10 year prospective study of 220,000 adult men. Int J Epidemiol 2006;35:141-50. [Crossref] [PubMed]
- Benderly M, Boyko V, Goldbourt U. Relation of body mass index to mortality among men with coronary heart disease. Am J Cardiol 2010;106:297-304. [Crossref] [PubMed]
- Baker JL, Olsen LW, Sørensen TI. Childhood body-mass index and the risk of coronary heart disease in adulthood. N Engl J Med 2007;357:2329-37. [Crossref] [PubMed]
- Huang RC, Beilin LJ. Adolescent BMI is independently associated with the development of coronary heart disease. Evid Based Med 2012;17:35-6. [Crossref] [PubMed]
- Hammond EC. Smoking in relation to heart disease. Am J Public Health Nations Health 1960;2:20-6. [Crossref] [PubMed]
- Khoramdad M, Vahedian-Azimi A, Karimi L, et al. Association between passive smoking and cardiovascular disease: A systematic review and meta-analysis. IUBMB Life 2020;72:677-86. [Crossref] [PubMed]
- Benowitz NL, Liakoni E. Tobacco use disorder and cardiovascular health. Addiction 2022;117:1128-38. [Crossref] [PubMed]
- Stockley CS. The relationships between alcohol, wine and cardiovascular diseases–A review. Nutrition and Aging 2015;3:55-88. [Crossref]
- Buemann B, Dyerberg J, Astrup A. Alcohol drinking and cardiac risk. Nutr Res Rev 2002;15:91-121. [Crossref] [PubMed]
- Maugeri A, Hlinomaz O, Agodi A, et al. Is Drinking Alcohol Really Linked to Cardiovascular Health? Evidence from the Kardiovize 2030 Project. Nutrients 2020;12:2848. [Crossref] [PubMed]
- Chudzińska M, Wołowiec Ł, Banach J, et al. Alcohol and Cardiovascular Diseases–Do the Consumption Pattern and Dose Make the Difference? J Cardiovasc Dev Dis 2022;9:317. [Crossref] [PubMed]
- Hamer M, O'Donovan G, Stamatakis E. Association between physical activity and sub-types of cardiovascular disease death causes in a general population cohort. Eur J Epidemiol 2019;34:483-7. [Crossref] [PubMed]
- Saquib N, Brunner R, Desai M, et al. Association between physical health and cardiovascular diseases: Effect modification by chronic conditions. SAGE Open Med 2018;6:2050312118785335. [Crossref] [PubMed]
- Kasargod Prabhakar CR, Stewart R. Physical activity and mortality in patients with stable coronary heart disease. Curr Opin Cardiol 2018;33:653-9. [Crossref] [PubMed]
- Lavie CJ, Carbone S, Kachur S, et al. Effects of Physical Activity, Exercise, and Fitness on Obesity-Related Morbidity and Mortality. Curr Sports Med Rep 2019;18:292-8. [Crossref] [PubMed]
- Huffman JC, Legler SR, Boehm JK. Positive psychological well-being and health in patients with heart disease: a brief review. Future Cardiol 2017;13:443-50. [Crossref] [PubMed]
- Sowden GL, Huffman JC. The impact of mental illness on cardiac outcomes: a review for the cardiologist. Int J Cardiol 2009;132:30-7. [Crossref] [PubMed]
- Chaddha A, Robinson EA, Kline-Rogers E, et al. Mental Health and Cardiovascular Disease. Am J Med 2016;129:1145-8. [Crossref] [PubMed]
- Lima AH, Soares AH, Cucato GG, et al. Walking Capacity Is Positively Related with Heart Rate Variability in Symptomatic Peripheral Artery Disease. Eur J Vasc Endovasc Surg 2016;52:82-9. [Crossref] [PubMed]
- Lee HJ, Kim HK, Han KD, et al. Age-dependent associations of body mass index with myocardial infarction, heart failure, and mortality in over 9 million Koreans. Eur J Prev Cardiol 2022;29:1479-88. [Crossref] [PubMed]
- Pettee KK, Larouere BM, Kriska AM, et al. Associations among walking performance, physical activity, and subclinical cardiovascular disease. Prev Cardiol 2007;10:134-40. [Crossref] [PubMed]
- Tousoulis D, Oikonomou E, Siasos G, et al. Diabetes Mellitus and Heart Failure. Eur Cardiol 2014;9:37-42. [Crossref] [PubMed]
- Matheus AS, Tannus LR, Cobas RA, et al. Impact of diabetes on cardiovascular disease: an update. Int J Hypertens 2013;2013:653789. [Crossref] [PubMed]
- Rigotti NA, Thomas GS, Leaf A. Exercise and coronary heart disease. Annu Rev Med 1983;34:391-412. [Crossref] [PubMed]
- Leon AS. Physical activity levels and coronary heart disease. Analysis of epidemiologic and supporting studies. Med Clin North Am 1985;69:3-20. [Crossref] [PubMed]
- Biscaglia S, Campo G, Sorbets E, et al. Relationship between physical activity and long-term outcomes in patients with stable coronary artery disease. Eur J Prev Cardiol 2020;27:426-36. [Crossref] [PubMed]
- Wang D, Li W, Cui X, et al. Sleep duration and risk of coronary heart disease: A systematic review and meta-analysis of prospective cohort studies. Int J Cardiol 2016;219:231-9. [Crossref] [PubMed]
- Partinen M, Putkonen PT, Kaprio J, et al. Sleep disorders in relation to coronary heart disease. Acta Med Scand Suppl 1982;660:69-83. [Crossref] [PubMed]
- Zeng R, Jiang Y, Chen T, et al. Longitudinal associations of sleep duration and sleep quality with coronary heart disease risk among adult population: classical meta-analysis and Bayesian network meta-analysis. Sleep Biol Rhythms 2021;19:265-76. [Crossref]
- Kuehn BM. Sleep Duration Linked to Cardiovascular Disease. Circulation 2019;139:2483-4. [Crossref] [PubMed]
- Huang AA, Huang SY. Use of machine learning to identify risk factors for insomnia. PLoS One 2023;18:e0282622. [Crossref] [PubMed]
- Wee JH, Park MW, Min C, et al. Association between asthma and cardiovascular disease. Eur J Clin Invest 2021;51:e13396. [Crossref] [PubMed]
- Pollevick ME, Xu KY, Mhango G, et al. The Relationship Between Asthma and Cardiovascular Disease: An Examination of the Framingham Offspring Study. Chest 2021;159:1338-45. [Crossref] [PubMed]
- Gurgone D, McShane L, McSharry C, et al. Cytokines at the Interplay Between Asthma and Atherosclerosis? Front Pharmacol 2020;11:166. [Crossref] [PubMed]
- Mishra P, Hisalkar P, Mallick N. Asthma in Relation to Coronary Heart Disease: A Systematic Review and Meta-analysis. Indian J Med Biochem 2021;25:39. [Crossref]
- Metra M, Cotter G, Gheorghiade M, et al. The role of the kidney in heart failure. Eur Heart J 2012;33:2135-42. [Crossref] [PubMed]
- Damman K, Testani JM. The kidney in heart failure: an update. Eur Heart J 2015;36:1437-44. [Crossref] [PubMed]
- Shiba N, Shimokawa H. Chronic kidney disease and heart failure–Bidirectional close link and common therapeutic goal. J Cardiol 2011;57:8-17. [Crossref] [PubMed]
- Ong CS, Keogh AM, Kossard S, et al. Skin cancer in Australian heart transplant recipients. J Am Acad Dermatol 1999;40:27-34. [Crossref] [PubMed]
- Fortina AB, Caforio AL, Piaserico S, et al. Skin cancer in heart transplant recipients: frequency and risk factor analysis. J Heart Lung Transplant 2000;19:249-55. [Crossref] [PubMed]
- Kwa MC, Silverberg JI. Association Between Inflammatory Skin Disease and Cardiovascular and Cerebrovascular Co-Morbidities in US Adults: Analysis of Nationwide Inpatient Sample Data. Am J Clin Dermatol 2017;18:813-23. [Crossref] [PubMed]
- Hojman L, Karsulovic C. Cardiovascular Disease-Associated Skin Conditions. Vasc Health Risk Manag 2022;18:43-53. [Crossref] [PubMed]
- Kitsis RN, Riquelme JA, Lavandero S. Heart Disease and Cancer: Are the Two Killers Colluding? Circulation 2018;138:692-5. [Crossref] [PubMed]
- Moreno-Sanchez PA. Development of an explainable prediction model of heart failure survival by using ensemble trees. In: 2020 IEEE International Conference on Big Data (Big Data). IEEE; 2020:4902-10.
- Guleria P, Naga Srinivasu P, Ahmed S, et al. XAI framework for cardiovascular disease prediction using classification techniques. Electronics 2022;11:4086. [Crossref]
- Mamun M, Farjana A, Al Mamun M, et al. Heart failure survival prediction using machine learning algorithm: am I safe from heart failure? In: 2022 IEEE World AI IoT Congress (AIIoT). IEEE; 2022:194-200.
- Ishaq A, Sadiq S, Umer M, et al. Improving the prediction of heart failure patients’ survival using SMOTE and effective data mining techniques. IEEE Access 2021;9:39707-16.
- Shyrokykh K, Girnyk M, Dellmuth L. Short text classification with machine learning in the social sciences: The case of climate change on Twitter. PLoS One 2023;18:e0290762. [Crossref] [PubMed]
- Hai PM, Tinh PH, Son NP, et al. Mangrove health assessment using spatial metrics and multi-temporal remote sensing data. PLoS One 2022;17:e0275928. [Crossref] [PubMed]
- Zou ZM, Chang DH, Liu H, et al. Current updates in machine learning in the prediction of therapeutic outcome of hepatocellular carcinoma: what should we know? Insights Imaging 2021;12:31. [Crossref] [PubMed]
- Dave D, Naik H, Singhal S, et al. Explainable AI meets healthcare: A study on heart disease dataset. arXiv:2011.03195 [Preprint]. 2020. Available online: https://arxiv.org/abs/2011.03195
- Sang X, Yao QZ, Ma L, et al. Study on survival prediction of patients with heart failure based on support vector machine algorithm. In: 2020 International Conference on Robots & Intelligent System (ICRIS). IEEE; 2020:636-9.
- Devi SK, Krishnapriya S, Kalita D. Prediction of heart disease using data mining techniques. Indian Journal of Science and Technology 2016; [Crossref]
- Dickerman BA, Hernán MA. Counterfactual prediction is not only for causal inference. Eur J Epidemiol 2020;35:615-7. [Crossref] [PubMed]
- Pahwa K, Kumar R. Prediction of heart disease using hybrid technique for selecting features. In: 2017 4th IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics (UPCON). IEEE; 2017:500-4.
- Hasan SMM, Mamun MA, Uddin MP, et al. Comparative analysis of classification approaches for heart disease prediction. In: 2018 International Conference on Computer, Communication, Chemical, Material and Electronic Engineering (IC4ME2). IEEE; 2018:1-4.
- Mirza I, Mahapatra A, Rego D, et al. Human heart disease prediction using data mining techniques. In: 2019 International Conference on Advances in Computing, Communication and Control (ICAC3). IEEE; 2019:1-5.
- Powar A, Shilvant S, Pawar V, et al. Data mining & artificial intelligence techniques for prediction of heart disorders: a survey. In: 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN). IEEE; 2019:1-7.
- Ananey-Obiri D, Sarku E. Predicting the presence of heart diseases using comparative data mining and machine learning algorithms. Int J Comput Appl 2020;176:17-21. [Crossref]
- Bohacik J, Zabovsky M. Naive Bayes for statlog heart database with consideration of data specifics. In: 2017 IEEE 14th International Scientific Conference on Informatics. IEEE; 2017:35-9.
- Abdul A, Isiaka RM, Babatunde RS, et al. An Improved Coronary Heart Disease Predictive System Using Random Forest. Asian J Res Comput Sci 2021;11:17-27. [Crossref]
- Barry KA, Manzali Y, Flouchi R, et al. Exploring the use of association rules in random forest for predicting heart disease. Comput Methods Biomech Biomed Engin 2024;27:338-46. [Crossref] [PubMed]
- Gong W, Wang S. Support vector machine for assistant clinical diagnosis of cardiac disease. In: 2009 WRI Global Congress on Intelligent Systems. IEEE; 2009:588-91.
- Suresh T, Assegie TA, Rajkumar S, et al. A hybrid approach to medical decision-making: diagnosis of heart disease with machine-learning model. Int J Elec Comp Eng 2022;12:1831-8. [Crossref]
- Wellawatte GP, Seshadri A, White AD. Model agnostic generation of counterfactual explanations for molecules. Chem Sci 2022;13:3697-705. [Crossref] [PubMed]
- Jain A, Keller J, Popescu M. Explainable AI for dataset comparison. In: 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE; 2019:1-7.
- Rahmat D, Putra AA, Setiawan AW. Heart disease prediction using K-nearest neighbor. In: 2021 International Conference on Electrical Engineering and Informatics (ICEEI). IEEE; 2021:1-6.
- Brughmans D, Leyman P, Martens D. NICE: an algorithm for nearest instance counterfactual explanations. Data Mining and Knowledge Discovery 2024; [Crossref]
- Bello AK, Qarni B, Samimi A, et al. Effectiveness of Multifaceted Care Approach on Adverse Clinical Outcomes in Nondiabetic CKD: A Systematic Review and Meta-analysis. Kidney Int Rep 2017;2:617-25. [Crossref] [PubMed]
- Fried TR, Tinetti ME, Iannone L, et al. Health outcome prioritization as a tool for decision making among older persons with multiple chronic conditions. Arch Intern Med 2011;171:1854-6. [Crossref] [PubMed]
- Nguyen QVH, Zheng K, Weidlich M, et al. What-if analysis with conflicting goals: Recommending data ranges for exploration. In: 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE; 2018:89-100.
- Wulkow H, Conrad TOF, Djurdjevac Conrad N, et al. Prediction of Covid-19 spreading and optimal coordination of counter-measures: From microscopic to macroscopic models to Pareto fronts. PLoS One 2021;16:e0249676. [Crossref] [PubMed]
- Mothilal RK, Sharma A, Tan C. Explaining machine learning classifiers through diverse counterfactual explanations. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM; 2020:607-17.
- Stoll H, King G, Zeng L. WhatIf: R software for evaluating counterfactuals. J Stat Softw 2005; [Crossref]
- Arsham H, Kahn AB. “What-if” analysis in computer simulation models: A comparative survey with some extensions. Mathematical and Computer Modelling 1990;14:101-6. [Crossref]
- Wexler J, Pushkarna M, Bolukbasi T, et al. The what-if tool: Interactive probing of machine learning models. IEEE Trans Vis Comput Graph 2019;26:56-65. [Crossref] [PubMed]
- Lutters D, Vaneker THJ, van Houten FJAM. ‘What-if’ design: a synthesis method in the design process. CIRP Annals 2004;53:113-6. [Crossref]
- Anstey MH, Senthuran S. The what-if approach to perioperative planning. Anaesth Intensive Care 2023;51:168-9. [Crossref] [PubMed]
- Pawelczyk M, Broelemann K, Kasneci G. Learning model-agnostic counterfactual explanations for tabular data. In: Proceedings of the Web Conference 2020. ACM; 2020:3126-32.
- Zhou S, Islam UJ, Pfeiffer N, et al. SCGAN: Sparse CounterGAN for counterfactual explanations in breast cancer prediction. IEEE Trans Autom Sci Eng 2023; [Crossref]
- Frosch CA, Egan SM, Hancock EN. The effect of controllability and causality on counterfactual thinking. Thinking & Reasoning 2015;21:317-40. [Crossref]
- Ribeiro FDS, Xia T, Monteiro M, et al. High fidelity image counterfactuals with probabilistic causal models. arXiv:2306.15764 [Preprint]. 2023. Available online: https://arxiv.org/abs/2306.15764
- Tan J, Xu S, Ge Y, et al. Counterfactual explainable recommendation. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. ACM; 2021:1784-93.
- Byrne RMJ. Counterfactuals in explainable artificial intelligence (XAI): Evidence from human reasoning. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). IJCAI; 2019:6276-82.
- de Oliveira RMB, Martens D. A framework and benchmarking study for counterfactual generating methods on tabular data. Applied Sciences 2021;11:7274. [Crossref]
- Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J 2023;65:e2200302. [Crossref] [PubMed]
- Taha AM, Mustapha A, Chen SD. Naive Bayes-guided bat algorithm for feature selection. ScientificWorldJournal 2013;2013:325973. [Crossref] [PubMed]
- Arar ÖF, Ayan K. A feature dependent Naive Bayes approach and its application to the software defect prediction problem. Applied Soft Computing 2017;59:197-209. [Crossref]
- Feng W, Sun J, Zhang L, et al. A support vector machine based naive Bayes algorithm for spam filtering. In: 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC). IEEE; 2016:1-8.
- Cheng S, Shih FY. Feature Reduction for Support Vector Machines. In: Wang J, editor. Encyclopedia of Data Warehousing and Mining. 2nd ed. Hershey: IGI Global; 2009:870-7.
- Lucieer A. Visualization of Hyperplanes for SVM classification. In: 2007 IEEE International Geoscience and Remote Sensing Symposium. IEEE; 2007:2034-5.
- Daw J, Margolis R, Wright L. Emerging Adulthood, Emergent Health Lifestyles: Sociodemographic Determinants of Trajectories of Smoking, Binge Drinking, Obesity, and Sedentary Behavior. J Health Soc Behav 2017;58:181-97. [Crossref] [PubMed]
- Stubbs RJ, Lavin JH. The challenges of implementing behaviour changes that lead to sustained weight management. Nutr Bull 2013;38:5-22. [Crossref]
- Macfarlane DJ, Thomas GN. Exercise and diet in weight management: updating what works. Br J Sports Med 2010;44:1197-201. [Crossref] [PubMed]
- Jakicic JM, Rogers RJ, Davis KK, et al. Role of Physical Activity and Exercise in Treating Patients with Overweight and Obesity. Clin Chem 2018;64:99-107. [Crossref] [PubMed]
- Chaput JP, Klingenberg L, Astrup A, et al. Modern sedentary activities promote overconsumption of food in our current obesogenic environment. Obes Rev 2011;12:e12-20. [Crossref] [PubMed]
- Morera LP, Marchiori GN, Medrano LA, et al. Stress, Dietary Patterns and Cardiovascular Disease: A Mini-Review. Front Neurosci 2019;13:1226. [Crossref] [PubMed]
- Doughty KN, Del Pilar NX, Audette A, et al. Lifestyle Medicine and the Management of Cardiovascular Disease. Curr Cardiol Rep 2017;19:116. [Crossref] [PubMed]
- Kaar JL, Luberto CM, Campbell KA, et al. Sleep, health behaviors, and behavioral interventions: Reducing the risk of cardiovascular disease in adults. World J Cardiol 2017;9:396-406. [Crossref] [PubMed]
- Shaito A, Aramouni K, Assaf R, et al. Oxidative Stress-Induced Endothelial Dysfunction in Cardiovascular Diseases. Front Biosci (Landmark Ed) 2022;27:105. [Crossref] [PubMed]
- Ueda K, Adachi Y, Liu P, et al. Regulatory Actions of Estrogen Receptor Signaling in the Cardiovascular System. Front Endocrinol (Lausanne) 2020;10:909. [Crossref] [PubMed]
- Han SH, Quon MJ, Kim JA, et al. Adiponectin and cardiovascular disease: response to therapeutic interventions. J Am Coll Cardiol 2007;49:531-8. [Crossref] [PubMed]
- Miao J, Wang Y, Gu X, et al. Risk of Cardiovascular Disease Death in Older Malignant Melanoma Patients: A Population-Based Study. Cancers (Basel) 2022;14:4783. [Crossref] [PubMed]
- Bell CF, Lei X, Haas A, et al. Risk of Cancer After Diagnosis of Cardiovascular Disease. JACC CardioOncol 2023;5:431-40. [Crossref] [PubMed]
Cite this article as: Lane H, Valko M, Rath S, Walker MD, Olson ML, Kramer S. Harder to mend than to break?—counterfactual explainable artificial intelligence for lifestyle medicine and heart disease prediction. J Med Artif Intell 2025;8:3.