Prediction of post-COVID chronic fatigue syndrome using data mining and machine learning techniques in Isfahan COVID cohort study

Posted on 18/04/2026
by Shayan Nematbakhsh

J Infect Public Health. 2026 Apr 3;19(6):103218. doi: 10.1016/j.jiph.2026.103218. Online ahead of print.

ABSTRACT

BACKGROUND: Post-COVID Fatigue (PCF) is one of the most common issues people face after recovering from COVID-19. Due to the heterogeneity of clinical manifestations and the lack of objective diagnostic criteria, the identification and prediction of PCF remain challenging. This study aimed to use data mining and machine learning techniques for prediction of PCF.

METHODS: We analyzed data from 3850 patients enrolled in the Isfahan COVID cohort study. Key factors linked to PCF were identified and used to build predictive models. Class-balancing techniques such as Random Under Sampling, Random Over Sampling, and the Synthetic Minority Over-sampling Technique (SMOTE) were applied before model training. We then applied several machine learning models, including Logistic Regression, Support Vector Machines, Decision Trees, and Random Forest, to predict PCF, and compared their performance using different evaluation criteria. The analysis was done in RStudio version 4.1.2.
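The two random balancing techniques named above can be illustrated with a minimal sketch. It is written in Python rather than the R environment the authors used, and the function names, the seed, and the toy labels are assumptions made for illustration only; it does not reproduce the study's pipeline or data.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect the rows belonging to each class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):  # sampling without replacement
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows so every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Made-up toy data: 8 records without fatigue (0) and 2 with fatigue (1).
toy_rows = [[i] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2

_, under_labels = random_under_sample(toy_rows, toy_labels)
_, over_labels = random_over_sample(toy_rows, toy_labels)
print(Counter(under_labels))  # 2 records per class
print(Counter(over_labels))   # 8 records per class
```

SMOTE differs from random over-sampling in that it synthesizes new minority-class records by interpolating between neighbouring minority records instead of duplicating existing ones. In general practice, resampling of any kind is applied to the training split only, so that the evaluation data keeps the original class proportions.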

RESULTS: We found 37 factors that were significantly related to fatigue. Among the different models, the Random Forest (RF) model performed best, with an accuracy of 85%. According to this model, the top predictors of PCF were, in order: anxiety levels, Body Mass Index (BMI), depression levels, post-COVID increased irritability, memory issues, a history of fatty liver disease, tingling in the hands and feet, and post-COVID excessive sweating.
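Accuracy is one of several criteria by which classifiers can be compared. The abstract does not list the other criteria used, so the sketch below assumes sensitivity and specificity as typical companions; the function names and the toy labels are hypothetical and are not the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of true fatigue cases the model catches.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of non-fatigue cases correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Made-up example: 10 fatigue cases among 100 records, and a trivial
# "model" that predicts "no fatigue" for everyone.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
print(evaluate(y_true, y_pred))
# → {'accuracy': 0.9, 'sensitivity': 0.0, 'specificity': 1.0}
```

The toy case shows why a single accuracy figure is usually read alongside other criteria when classes are imbalanced: a model can score high accuracy while missing every positive case.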

CONCLUSIONS: Using data mining and machine learning techniques can help healthcare professionals identify key factors contributing to PCF, allowing them to develop better strategies for prevention and treatment.

PMID: 42000587 | DOI: 10.1016/j.jiph.2026.103218