Front Endocrinol (Lausanne). 2026 Jan 28;17:1686082. doi: 10.3389/fendo.2026.1686082. eCollection 2026.
ABSTRACT
OBJECTIVE: Diabetes mellitus (DM) poses a major global public health challenge. Prediabetes, a critical stage in the progression to DM, represents a pivotal window for intervention and prevention. This study aimed to develop and validate a machine learning-based model that predicts reversal to normoglycemia in Chinese individuals with prediabetes, in order to support interventions that promote such reversal in this population.
METHODS: This study analyzed data on Chinese adults from the Dryad database, with a follow-up period from 2010 to 2016. LASSO regression was used to select variables. The selected variables were then used to construct models using random forest, gradient boosting decision tree, eXtreme gradient boosting, Naive Bayes, adaptive boosting, support vector machine (SVM), and a Cox model. Discriminative ability was assessed by calculating the area under the curve (AUC) for each model. Predictive performance was evaluated by computing time-dependent AUC (t-AUC), accuracy, precision, recall, F1, and C-index. Shapley additive explanations (SHAP) analysis was applied to interpret the key variables identified by the optimal model, and Kaplan-Meier curves were plotted for the key variables associated with glycemic improvement to explore differences between groups.
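As an illustration of one of the metrics named above, the following is a minimal sketch of Harrell's concordance index (C-index) for right-censored data, where the "event" is reversal to normoglycemia. It is not the authors' implementation: the function name `harrell_c_index`, the handling of tied follow-up times (skipped), and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not from the study).

    times       -- follow-up time per subject
    events      -- 1 if the event (here: glycemic reversal) was observed, 0 if censored
    risk_scores -- model output; a higher score means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # assumption: pairs with tied times are skipped
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # model ranks the earlier event as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data, for illustration only (not from the Dryad dataset).
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.8, 0.3, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
print(f"C-index on toy data: {toy_c:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. On the toy data, 7 of the 8 comparable pairs are ordered correctly.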
RESULTS: A total of 1792 adults with prediabetes were enrolled. During 5 years of follow-up, 942 achieved normoglycemia, yielding a reversal rate of 52.6%. After differential analysis and LASSO regression screening, 12 feature variables were retained for model construction. The 3-year, 4-year, and 5-year AUC values for the Cox model all exceeded 0.61. Six machine learning algorithms were employed to construct predictive models. The SVM demonstrated the best overall performance, with a t-AUC of 0.711, accuracy of 0.652, precision of 0.620, recall of 0.661, F1 of 0.639, and a C-index of 0.709, outperforming the other algorithms. SHAP analysis indicated that age, FPG, BMI, SBP, DBP, and triglycerides are key factors influencing reversal to normoglycemia in individuals with prediabetes.
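The reported figures can be checked with simple arithmetic. The sketch below recomputes the reversal rate from the stated counts and the F1 score from the stated precision and recall. It assumes that precision, recall, and F1 all refer to the same (positive) class, which the abstract does not state explicitly; the helper name `f1_from_precision_recall` is introduced here for illustration.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Counts and metrics as stated in the abstract.
n_enrolled = 1792
n_reversed = 942
reversal_rate = n_reversed / n_enrolled

# Recomputed from the rounded precision and recall, so small rounding
# differences from the reported F1 of 0.639 are expected.
f1 = f1_from_precision_recall(0.620, 0.661)

print(f"reversal rate: {reversal_rate:.1%}")
print(f"F1 from rounded inputs: {f1:.3f}")
```

The reversal rate reproduces the reported 52.6%. The recomputed F1 is about 0.640, which agrees with the reported 0.639 to within the rounding of the inputs.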
CONCLUSION: We developed an SVM model to predict glycemic reversal in the prediabetic population in China, and identified key factors influencing glycemic improvement. This work provides a scientific basis for both this population and clinicians to implement early targeted interventions, thereby aiding in reducing the incidence of DM and alleviating the healthcare burden.
PMID:41685231 | PMC:PMC12890694 | DOI:10.3389/fendo.2026.1686082