Explainable machine learning for predicting activities of daily living at discharge in stroke patients: A retrospective study using SHAP interpretability

Posted on 2 July 2026

by Qian Ye

PLoS One. 2026 Jul 2;21(7):e0351468. doi: 10.1371/journal.pone.0351468. eCollection 2026.

ABSTRACT

PURPOSE: We aimed to develop a machine learning model to predict activities of daily living (ADL) at discharge in stroke patients and identify key predictors to guide rehabilitation decisions.

MATERIALS AND METHODS: Data from 589 stroke inpatients (2019-2024) were divided into good (Barthel Index [BI] ≥ 60) and poor (BI < 60) ADL groups. Continuous variables were Z-score normalized; features were then screened by univariate regression (P < 0.05) and selected by LASSO regression (lambda.1se = 0.0488). The selected features were used to train and validate ten machine learning algorithms, with 30% of the dataset (n = 177) held out as an independent test set for model evaluation. SHAP analysis was used to interpret the best-performing model.
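The preprocessing steps named above (outcome grouping at BI 60, Z-score normalization, and a 30% hold-out) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the use of the sample standard deviation, the random seed, and the plain shuffled split (rather than, say, a stratified one) are all assumptions made for illustration.

```python
import random
import statistics


def label_adl(bi_score, threshold=60):
    """Return 1 for good ADL (BI >= threshold), 0 for poor ADL (BI < threshold)."""
    return 1 if bi_score >= threshold else 0


def z_score(values):
    """Z-score normalize a list of numbers: (x - mean) / standard deviation.

    Assumption: the sample standard deviation is used; the abstract does not
    say which variant the authors applied.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def split_indices(n, test_fraction=0.30, seed=0):
    """Shuffle row indices and hold out a fraction as a test set.

    Assumption: a simple random split with a hypothetical seed.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


# With 589 patients, a 30% hold-out gives 177 test rows and 412 training rows.
train_idx, test_idx = split_indices(589)
print(len(train_idx), len(test_idx))
```

In practice the normalization mean and standard deviation would usually be estimated on the training rows only and then applied to the test rows, so that no test-set information leaks into training; the abstract does not state how this was handled.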

RESULTS: Six of 41 features were retained. Random forest achieved the best performance (AUC = 0.958; accuracy = 0.936; sensitivity = 0.934; specificity = 0.950). SHAP identified the top drivers: admission Barthel Index, standing balance, Brunnstrom stages (upper and lower limb), dressing, and grooming abilities.
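For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed for a binary classifier. The counts and scores are made-up toy values, not data from the study, and the function names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive case outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy example only: these counts and scores are invented for illustration.
toy_metrics = classification_metrics(tp=45, fn=5, tn=40, fp=10)
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(toy_metrics, toy_auc)
```

Which class counts as "positive" (good or poor ADL) determines which of sensitivity and specificity is which; the abstract does not specify the authors' choice.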

CONCLUSION: The machine learning-based ADL risk prediction model, particularly the random forest model, shows excellent predictive performance and clinical interpretability, making it valuable for individualized risk assessment of activities of daily living in stroke patients at discharge.

PMID:42391297 | DOI:10.1371/journal.pone.0351468