Building and validating a machine learning model to predict coronary heart disease risk based on non-invasive indicators

Posted on 7 December 2025
by Bo Wu

Comput Methods Programs Biomed. 2025 Nov 30;275:109186. doi: 10.1016/j.cmpb.2025.109186. Online ahead of print.

ABSTRACT

BACKGROUND: Coronary heart disease (CHD) remains a leading global cause of death. Early identification of high-risk individuals and timely intervention are crucial. This study developed and evaluated a predictive CHD risk model using machine learning (ML) techniques.

METHODS: The Behavioral Risk Factor Surveillance System (BRFSS) data were randomly split into a training set and an internal validation set in a 7:3 ratio. Variable screening was performed using univariate and multivariate logistic regression analyses. Subsequently, predictive models were developed using eight machine learning algorithms. Model performance on the internal validation set was evaluated using the area under the curve (AUC), sensitivity, specificity, and accuracy, and the optimal model was selected based on these metrics. The National Health and Nutrition Examination Survey (NHANES) dataset was used for external validation of the optimal model. SHapley Additive exPlanations (SHAP) analysis was employed to visualize the importance of features.
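
The split and the evaluation metrics named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the function names, the random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example, and the abstract does not state which cutoff was used for sensitivity and specificity.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 7:3 into (train, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a fixed (assumed) cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy data invented for illustration only (not study data).
train, validation = split_70_30(range(10))
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

The pairwise AUC is quadratic in sample size, so it is only suitable for small examples; on survey-scale data a rank-based implementation from an established library would be the practical choice.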

RESULTS: Eight machine learning models were developed based on 12 clinical features. Among these models, the Light Gradient Boosting Machine (LightGBM) demonstrated the best performance, with an internal-validation cohort AUC of 0.825 (95% CI 0.821-0.829), sensitivity of 0.800, and specificity of 0.700, significantly outperforming the other models. The external-validation cohort achieved an AUC of 0.851 (95% CI 0.835-0.867). SHAP analysis identified age, sex, hypertension, and dyslipidaemia as key risk factors. A web-based calculator was developed based on the LightGBM model to predict CHD.
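
The abstract reports 95% confidence intervals for the AUC but does not say how they were computed. One common approach is the percentile bootstrap, sketched below under that assumption; the function names, the number of resamples, the seed, and the toy data are hypothetical and chosen only for illustration.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (labels must be 0 or 1)."""
    n = len(y_true)
    if not 0 < sum(y_true) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    aucs = []
    while len(aucs) < n_boot:
        # Resample cases with replacement, keeping label and score paired.
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if 0 < sum(labels) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data invented for illustration only (not study data).
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
toy_scores = [0.1, 0.3, 0.2, 0.6, 0.7, 0.9, 0.4, 0.8, 0.5, 0.65]
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores)

# Perfectly separated scores give an AUC of 1.0 in every resample.
sep_low, sep_high = bootstrap_auc_ci([0, 0, 0, 1, 1, 1], [0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
```

With very large validation sets the resampling variation shrinks and the interval narrows accordingly, whereas smaller cohorts yield wider intervals.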

CONCLUSION: The LightGBM-based prediction model exhibits high accuracy in assessing CHD risk and holds promise as an effective tool for the early screening and prevention of CHD.

PMID: 41353986 | DOI: 10.1016/j.cmpb.2025.109186