Sci Rep. 2026 Feb 5. doi: 10.1038/s41598-025-34060-w. Online ahead of print.
ABSTRACT
The monitoring of daily life in nursing home residents generates diverse and heterogeneous sources of information. Artificial Intelligence (AI) is increasingly used to predict a wide range of outcomes in both research and clinical practice, including mortality and cognitive impairment (CI). A key challenge is determining which information sources (IS) provide the most accurate predictions. In this work, we present an integrative AI-based framework that combines harmonized temporal modeling, Bayesian hyperparameter optimization, XGBoost, and explainable AI (SHAP) to predict CI in nursing home residents using 13 years of heterogeneous longitudinal data from 2,608 individuals. Our approach enables interpretable predictions of CI-related clinical scales such as the Mini-Mental State Examination (MMSE), the Global Deterioration Scale (GDS), and the Barthel Scale while revealing the relative contributions of diverse IS, including clinical metrics and activity records. Using a nested 5 × 3 cross-validation scheme with patient-level grouping and temporal blocking, the Bayesian-optimized XGBoost regressors achieved robust predictive performance, with mean squared error (MSE) values of 2.12 (MMSE), 0.47 (GDS), and 4.55 (Barthel) when using only Clinical Variables, and further improvements when integrating all information sources (MMSE: 1.85; GDS: 0.42; Barthel: 4.30). The MMSE severity classifier achieved a macro-averaged AUC of 0.89 (95% confidence interval: 0.87-0.91), with the highest F1-scores in the Normal (0.80) and Severe (0.86) impairment categories. Clinical Variables consistently emerged as the most informative source across regression and classification tasks. Overall, this integrative framework enhances CI prediction from heterogeneous long-term care data while providing interpretable insights that may support more personalized and data-informed care strategies.
PMID:41644966 | DOI:10.1038/s41598-025-34060-w
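
The nested 5 × 3 cross-validation with patient-level grouping mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (group_k_fold, nested_group_cv), the round-robin fold assignment, and the toy data are assumptions made for illustration, and the temporal blocking, Bayesian hyperparameter search, and XGBoost fitting described in the abstract are deliberately left out. The sketch only shows how all records from one resident can be kept on the same side of every outer and inner split.

```python
import random


def group_k_fold(groups, n_splits, seed=0):
    """Yield (train_idx, test_idx) so that no group appears on both sides.

    `groups` holds one group label (e.g. a resident ID) per record.
    """
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    # Deal the shuffled groups round-robin into n_splits folds.
    fold_of = {g: i % n_splits for i, g in enumerate(unique)}
    for k in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == k]
        train = [i for i, g in enumerate(groups) if fold_of[g] != k]
        yield train, test


def nested_group_cv(groups, outer=5, inner=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits).

    All indices refer to positions in the full dataset. The inner splits
    partition the outer training set only, so hyperparameters would be
    tuned without ever seeing the outer test residents.
    """
    for o, (train, test) in enumerate(group_k_fold(groups, outer, seed)):
        train_groups = [groups[i] for i in train]
        inner_splits = [
            ([train[i] for i in fit], [train[i] for i in val])
            for fit, val in group_k_fold(train_groups, inner, seed + o + 1)
        ]
        yield train, test, inner_splits


# Toy data (hypothetical): 15 residents with 4 assessments each.
groups = [pid for pid in range(15) for _ in range(4)]
folds = list(nested_group_cv(groups))
```

In practice the inner splits would drive the hyperparameter search and the outer test fold would be used once per outer iteration to estimate generalization error; library utilities such as scikit-learn's GroupKFold offer the same grouping guarantee.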

