Digit Health. 2026 Jan 22;12:20552076251408534. doi: 10.1177/20552076251408534. eCollection 2026 Jan-Dec.
ABSTRACT
OBJECTIVE: To evaluate the performance of machine learning (ML)-based survival models for 10-year cardiovascular disease (CVD) risk prediction using large-scale electronic health records (EHRs). The study benchmarks these models against the QRISK3 score and conventional Cox proportional hazards (CoxPH) models currently used in UK primary prevention, with the aim of assessing their potential to capture complex risk patterns beyond traditional approaches.
METHODS: This study utilized individual-level data from CPRD Aurum, covering 40 million UK primary care records from 2011 to 2021. A total of 469,496 patients aged 40-85 were analysed. Predictor variables were selected based on QRISK3 definitions, with additional phenotyping for comorbidities and pre-stratified risk scores. ML models, including deep neural networks (e.g., DeepSurv and DeepHit) and ensemble survival models (e.g., random survival forest [RSF] and gradient boosting), were developed for CVD risk prediction. Model performance was assessed using calibration and discrimination metrics, with 'spatial external validation' conducted using a London-held dataset.
RESULTS: A total of 849,651 records were analysed, including 117,421 for 'spatial validation' and 732,230 for development. QRISK3 scores effectively differentiated CVD patients, with stronger predictive performance among females. Ensemble methods and neural networks outperformed CoxPH models, with RSF achieving the best discrimination and calibration: AUROC values of 0.738 (95% CI: 0.723-0.752) for males and 0.778 (95% CI: 0.762-0.793) for females, with Brier scores of 0.088 and 0.055, respectively.
CONCLUSION: ML models enhance CVD risk prediction, outperforming conventional approaches in calibration and discrimination. Integrating pre-stratified risk scores further improves performance, highlighting the value of augmenting tools like QRISK.
PMID:41602938 | PMC:PMC12833136 | DOI:10.1177/20552076251408534
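
ILLUSTRATIVE NOTE: The abstract reports discrimination as AUROC and overall accuracy as a Brier score. The sketch below shows one simple way such quantities can be computed for a binary outcome at a fixed horizon (e.g., a CVD event within 10 years). It is not the study's code: the function names and the toy data are hypothetical, and it assumes every patient's outcome at the horizon is known, so it ignores the censoring that a survival analysis would need to handle (for example through inverse-probability-of-censoring weights).

```python
from typing import Sequence


def brier_score(events: Sequence[int], risks: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome (0 or 1)."""
    if len(events) != len(risks) or not events:
        raise ValueError("events and risks must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(events, risks)) / len(events)


def auroc(events: Sequence[int], risks: Sequence[float]) -> float:
    """Probability that a randomly chosen case has a higher predicted risk
    than a randomly chosen non-case; ties count as one half."""
    cases = [p for y, p in zip(events, risks) if y == 1]
    non_cases = [p for y, p in zip(events, risks) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data: 1 = event within the horizon, 0 = no event.
toy_events = [0, 0, 1, 0, 1, 0]
toy_risks = [0.05, 0.10, 0.40, 0.30, 0.20, 0.02]

if __name__ == "__main__":
    print("AUROC:", auroc(toy_events, toy_risks))
    print("Brier:", brier_score(toy_events, toy_risks))
```

A lower Brier score and a higher AUROC both indicate better performance, which is the sense in which the RSF figures in RESULTS are described as best. The pairwise AUROC loop is quadratic in sample size and is meant only for illustration.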