J Clin Transl Sci. 2026 Mar 6;10(1):e57. doi: 10.1017/cts.2026.10722. eCollection 2026.
ABSTRACT
BACKGROUND: Carotid atherosclerosis is associated with increased coronary heart disease (CHD) risk, yet current risk models lack specificity and interpretability for this population. This study aimed to develop explainable machine learning (ML) models to predict CHD in these patients.
METHODS: We retrospectively analyzed 487 patients with carotid atherosclerosis (191 with CHD, 296 without CHD) from January 2022 to July 2025. Thirty-eight variables were collected, including demographic, clinical, and biochemical indicators. Least absolute shrinkage and selection operator (LASSO) regression identified six key predictors. Seven ML models were trained and evaluated using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (PRC-AUC), calibration curves, and decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) were applied to interpret the best-performing model.
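As a minimal sketch of the two discrimination metrics named above, the following computes AUC as a pairwise ranking probability and PRC-AUC as step-wise average precision. The labels and scores are invented toy values, not study data, and the function names are hypothetical. The abstract does not state how PRC-AUC was computed, so average precision is an assumption here, and the sketch assumes no tied scores when ranking.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Assumption: scores are distinct, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    n_pos = sum(y_true)
    tp = 0
    ap = 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            ap += tp / k  # precision at each rank where a positive is recovered
    return ap / n_pos


# Toy example (hypothetical values): 1 = CHD, 0 = non-CHD
y_toy = [1, 0, 1, 1, 0, 0, 1, 0]
score_toy = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

auc_value = roc_auc(y_toy, score_toy)
ap_value = average_precision(y_toy, score_toy)
print(f"AUC = {auc_value:.4f}, PRC-AUC (average precision) = {ap_value:.4f}")
```

Because PRC-AUC depends on outcome prevalence while AUC does not, reporting both is informative when the classes are imbalanced, as with 191 CHD versus 296 non-CHD patients.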
RESULTS: The logistic regression model achieved the highest test-set performance (AUC = 0.827; PRC-AUC = 0.752), with strong generalizability and calibration. SHAP analysis identified age and diastolic blood pressure as the most influential features, consistent with the model coefficients. DCA showed superior clinical net benefit for the logistic regression model across probability thresholds.
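The net benefit that DCA plots can be illustrated with a short sketch using the standard definition: true positives per patient minus false positives per patient, weighted by the odds of the threshold probability. The toy labels and predicted probabilities are invented for illustration, the function names are hypothetical, and the abstract does not report which thresholds the authors examined.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    NB = TP/N - (FP/N) * threshold / (1 - threshold)
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example (hypothetical values): 1 = CHD, 0 = non-CHD
y_toy = [1, 0, 1, 1, 0, 0, 1, 0]
prob_toy = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

nb_model = net_benefit(y_toy, prob_toy, threshold=0.5)
nb_all = net_benefit_treat_all(y_toy, threshold=0.5)
print(f"model: {nb_model:.3f}, treat-all: {nb_all:.3f}, treat-none: 0.000")
```

A decision curve repeats this calculation over a range of thresholds. A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all and treat-none (zero) strategies.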
CONCLUSION: A six-variable logistic regression model provides accurate and interpretable CHD risk prediction in patients with carotid atherosclerosis. Its transparency and clinical utility support its integration into personalized risk management.
PMID:41960597 | PMC:PMC13058763 | DOI:10.1017/cts.2026.10722