Development of machine learning models to predict risk of hospitalisation and 90-day readmission among patients with cardiovascular risk factors using community health survey data

Written on 31/12/2025
by Arinze Nkemdirim Okere

BMJ Health Care Inform. 2025 Dec 31;32(1):e101742. doi: 10.1136/bmjhci-2025-101742.

ABSTRACT

OBJECTIVES: This study aimed to develop and validate machine learning (ML) models to predict all-cause hospital admissions and 90-day readmissions using structured, patient-reported survey data.

METHODS: A cross-sectional survey was conducted between 3 July 2021 and 18 December 2022 among US adults aged ≥18 years with at least one cardiovascular risk factor. Participants were recruited through social media, community pharmacies and outpatient clinics. The final sample included 1318 participants. The primary outcomes were any all-cause hospitalisation and readmission within 90 days. Eight supervised ML models were trained using an 80:20 train-test split and 10-fold cross-validation. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC), precision, recall, F1 score and calibration metrics. SHapley Additive exPlanations (SHAP) values were used to identify key predictors.
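The evaluation protocol described above (an 80:20 split, 10-fold cross-validation and AUROC) can be sketched in plain Python. This is an illustration only, not the authors' pipeline: the function names, the random seed, the unstratified shuffling and the choice to cross-validate within the training partition are all assumptions, since the abstract does not specify them.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists.

    The seed and the unstratified shuffle are assumptions for illustration.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=10):
    """Yield (fit, validate) index lists for each of k near-equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validate in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Sample size taken from the abstract; everything else is hypothetical.
train, test = train_test_split_indices(1318)
folds = list(k_fold(train, k=10))
print(len(train), len(test), len(folds))
```

With 1318 participants, this split leaves 1054 records for training and 264 for testing, and each cross-validation fold holds out 105 or 106 training records.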

RESULTS: Among 1318 participants, 35.0% reported at least one hospitalisation and 10.4% reported a 90-day readmission. The Extra Trees (ET) model demonstrated the best performance across both outcomes. For hospitalisation, ET achieved an AUROC of 0.93, precision of 0.83 and recall of 0.87. For readmission, AUROC was 0.99 with precision of 0.95 and recall of 0.96. SHAP analysis identified heart disease, medication burden, race/ethnicity, employment and insurance status as the most influential predictors.
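The abstract lists F1 score among the evaluation metrics but reports only precision and recall. As a rough check, the F1 values implied by those figures can be derived from the harmonic mean. The helper name below is hypothetical, and because the inputs are rounded to two decimals, the results are approximations rather than values reported by the study.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall for the Extra Trees model, as reported in the abstract.
reported = {
    "hospitalisation": (0.83, 0.87),
    "90-day readmission": (0.95, 0.96),
}

implied_f1 = {
    outcome: f1_from_precision_recall(p, r) for outcome, (p, r) in reported.items()
}

for outcome, f1 in implied_f1.items():
    print(f"{outcome}: implied F1 of about {f1:.2f}")
```

This gives an implied F1 of about 0.85 for hospitalisation and about 0.95 for 90-day readmission.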

DISCUSSION: Patient-reported data reflecting behavioural, social and clinical factors can predict hospitalisations with high accuracy, complementing traditional EHR-based models.

CONCLUSIONS: Integrating such patient-reported and behavioural data into electronic health records could enable earlier identification of high-risk individuals and support targeted, preventive interventions to improve healthcare outcomes.

PMID:41475883 | DOI:10.1136/bmjhci-2025-101742