Background: Machine learning (ML) is a field of artificial intelligence that supplies tools for knowledge discovery in large datasets. Mortality following ST-segment elevation myocardial infarction (STEMI) varies considerably. Risk scores for predicting mortality 30 days after STEMI have been developed using conventional statistical approaches. The complex nature of STEMI data motivated us to apply ML to predictive modeling. Using ML algorithms, the current study sought to predict 30-day mortality following STEMI and to rank the contribution of individual variables. Its aim was to validate the ML approach against the most accurate risk score available for patients presenting with STEMI.
Methods and Results: This was a retrospective, supervised learning, data mining study. Out of a cohort of 13,422 patients from the Acute Coronary Syndrome Israeli Survey (ACSIS) registry, 2,782 patients fulfilled the inclusion criteria, and 54 variables were considered. Prediction models for overall mortality 30 days after STEMI were developed using six ML algorithms. Models were compared to each other and to the Global Registry of Acute Coronary Events (GRACE) and Thrombolysis In Myocardial Infarction (TIMI) scores.
Depending on the algorithm, and using all available variables, the prediction models’ performance, measured as the area under the receiver operating characteristic curve (AUC), ranged from 0.73 to 0.90. The best-performing models performed similarly to the GRACE score (0.89±0.07) and outperformed the TIMI score (0.81±0.09, p<0.05). Performance of most algorithms plateaued once 15 variables were included. Among the top predictors were creatinine, Killip classification at admission, blood pressure, glucose level, and age.
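For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen patient who died is assigned a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that definition only, not the study's pipeline; the function name `auc` and the toy labels and risk scores are hypothetical and are not drawn from ACSIS data.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    labels: iterable of 0/1 outcomes (here, 1 = died within 30 days).
    scores: predicted risks, higher meaning higher predicted risk.
    Tied scores between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8, 0.1, 0.3]
toy_auc = auc(labels, scores)
print(f"AUC = {toy_auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients; a rank-based implementation would likely be preferred for a cohort of several thousand.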
Conclusions: We present a data mining approach for predicting mortality after STEMI. The selected algorithms showed competence in prediction across an increasing number of variables. Machine learning may be used for outcome prediction in high-dimensional cardiology settings.