PLoS One. 2026 May 8;21(5):e0348670. doi: 10.1371/journal.pone.0348670. eCollection 2026.
ABSTRACT
Accurate disease prediction using clinical datasets is essential for improving early diagnosis and clinical decision-support systems; however, many existing deep learning approaches are disease-specific, computationally intensive, and difficult to generalize across heterogeneous biomedical datasets. This study addresses this challenge by proposing a unified, dataset-aware deep learning framework for accurate and interpretable disease prediction across diverse clinical datasets. The framework adopts a modular architecture that selects appropriate models based on dataset characteristics such as feature dimensionality, sample size, and class imbalance. It integrates multiple deep learning architectures, including a multilayer perceptron (MLP), a one-dimensional convolutional neural network (CNN), FT-Transformer, autoencoder-based classifiers, and ensemble strategies. Robust preprocessing, fold-safe feature selection, and nested cross-validation are incorporated to ensure reliable performance evaluation. The framework is evaluated on three heterogeneous benchmark datasets: the UCI Heart Disease dataset (303 samples, 13 clinical features), the PIMA Indians Diabetes dataset (768 samples, 8 metabolic features), and the Parkinson's disease voice dataset (195 recordings, 22 acoustic features). Experimental results demonstrate competitive predictive performance relative to classical baselines across the three tasks. The FT-Transformer + Autoencoder ensemble achieved an area under the ROC curve (AUC) of 0.8980 (±0.0483) for heart disease prediction, while the CNN + Autoencoder ensemble obtained an AUC of 0.8451 (±0.0270) for diabetes classification. For Parkinson's disease detection, the MLP achieved an AUC of 0.7538 with perfect specificity. Overall, the deep learning models achieved AUC values comparable to those of classical machine learning (ML) baselines. The study contributes a scalable and interpretable deep learning framework that improves reliability, generalization, and practical applicability for multi-disease prediction in real-world healthcare environments.
PMID:42102106 | DOI:10.1371/journal.pone.0348670
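
The abstract mentions fold-safe feature selection and nested cross-validation without giving implementation details. The following is a minimal sketch of how these two ideas are commonly combined using scikit-learn; it is not the authors' code. The synthetic data (sized to match the 195 × 22 Parkinson's voice dataset), the class weights, the logistic regression classifier used as a stand-in for the deep models, the univariate F-test selector, and the hyperparameter grid are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 195 samples, 22 features,
# with an illustrative class imbalance. Not the real clinical data.
X, y = make_classification(
    n_samples=195, n_features=22, n_informative=6,
    weights=[0.25, 0.75], random_state=0,
)

# Scaling and feature selection live inside the pipeline, so they are
# re-fitted on the training portion of every fold ("fold-safe"):
# no statistics from held-out samples leak into preprocessing.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),  # stand-in classifier
])

# Hypothetical search grid: number of kept features and regularization.
param_grid = {"select__k": [5, 10, 22], "clf__C": [0.1, 1.0]}

# Inner loop tunes hyperparameters; outer loop estimates performance
# on folds that the tuning procedure never saw.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner_cv)
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)

# Mean and standard deviation of AUC across the outer folds.
print(f"AUC: {outer_auc.mean():.4f} (±{outer_auc.std():.4f})")
```

Because the selector is a pipeline step, passing the whole search object to `cross_val_score` keeps both tuning and feature selection confined to each outer training split. A deep model could be substituted for the `clf` step, provided it exposes the scikit-learn estimator interface.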

