Introduction: The development of acute heart failure (AHF) is a critical decision point in the natural history of heart failure and carries a dismal prognosis. The lack of appropriate risk-stratification tools for AHF patients limits physicians' ability to tailor patient-specific therapy regimens at this important juncture. Machine learning (ML)-based strategies may enhance risk stratification by analyzing high-dimensional patient data with multiple covariates and by applying novel prediction methodologies. In the current study, we aimed to evaluate the drivers of success in prediction models and to establish an institute-tailored, ML-based prediction model for real-time decision support.
Methods: We used a cohort of all 10,868 AHF patients admitted during a 12-year period. A total of 372 covariates were collected from admission to the end of hospitalization (demographics, laboratory tests, medical therapies, echocardiographic and administrative data). Data pre-processing included feature cleaning, train-test split, imputation and normalization to reduce data noise and to handle missing data. We assessed model performance along two axes: (1) type of prediction method and (2) type and number of covariates. The primary outcome was one-year survival from hospital discharge. For the model-type axis we tested seven methods: Logistic Regression (LR) with L1 or L2 regularization, Random Forest (RF), a Cox model (Cox), XGBoost, a deep neural network (NeuralNet) and an ensemble model.
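The pre-processing order named in Methods (train-test split, then imputation and normalization) can be illustrated with a minimal sketch. It assumes median imputation and z-score normalization, neither of which is specified in the abstract, and it runs on toy values rather than study data; all function names are hypothetical. What it shows is that the imputation and scaling parameters are learned from the training rows only and then applied unchanged to the test rows.

```python
import random
import statistics


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and split the rows into train and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


def fit_preprocessor(train_rows):
    """Learn per-column median, mean and standard deviation from training rows only.

    Assumes every column has at least one observed (non-None) training value.
    """
    params = []
    for j in range(len(train_rows[0])):
        observed = [row[j] for row in train_rows if row[j] is not None]
        median = statistics.median(observed)
        filled = [median if row[j] is None else row[j] for row in train_rows]
        mean = statistics.fmean(filled)
        std = statistics.pstdev(filled) or 1.0  # guard against constant columns
        params.append((median, mean, std))
    return params


def transform(rows, params):
    """Impute missing values with the training median, then z-score each column."""
    return [
        [((median if value is None else value) - mean) / std
         for value, (median, mean, std) in zip(row, params)]
        for row in rows
    ]


# Toy data (hypothetical): two covariates per patient, None marks a missing value.
rows = [
    [70, 1.2], [65, None], [80, 0.9], [None, 1.5], [58, 1.1],
    [90, None], [75, 2.0], [62, 1.0], [85, 1.3], [68, 0.8],
]

train_rows, test_rows = split_rows(rows)
params = fit_preprocessor(train_rows)   # fitted on the training split only
X_train = transform(train_rows, params)
X_test = transform(test_rows, params)   # reuses training statistics
```

Fitting the preprocessor after the split keeps information from the test rows out of the imputation and scaling statistics, which is one common reason to order the steps this way.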
Results: Data pre-processing combined with multiple covariates yielded an AUROC of more than 80% with most prediction models: L1/L2-LR (80.4%/80.3%); Cox (80.1%); XGBoost (80.7%); NeuralNet (80.5%). The number of covariates was a significant modifier of prediction success (p<0.001): the full set of 372 covariates performed better (AUROC 80.4% for L1-LR) than a set of known clinical covariates (AUROC 77.8%).
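For readers less familiar with the metric reported in Results, AUROC can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison; the function name, labels and scores are hypothetical toy values and are not study data.

```python
def auroc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 1/2).

    labels: 1 for patients with the outcome, 0 for patients without it.
    scores: predicted risk scores, higher meaning more likely to have the outcome.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): 8 of the 9 positive-negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of patients, so it is meant for illustration; a rank-based formulation gives the same value more efficiently on a cohort-sized dataset.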
Conclusions: The choice of predictive modeling method is secondary to the number and type of covariates for predicting AHF prognosis. Structured data pre-processing combined with the use of multiple covariates results in accurate, institute-tailored risk prediction in AHF.