Identification of glycosylation-related gene signatures associated with coronary artery disease via integrated transcriptomic and machine learning analyses

Posted on 7 February 2026
by Kangjun Fan

Eur J Med Res. 2026 Feb 6. doi: 10.1186/s40001-026-04000-z. Online ahead of print.

ABSTRACT

BACKGROUND: Coronary artery disease (CAD) remains one of the leading causes of morbidity and mortality worldwide. Glycosylation, a vital post-translational modification, has been linked to a range of cardiovascular conditions, including CAD. This study aims to systematically identify and validate glycosylation-based prognostic biomarkers for CAD, with the ultimate goal of developing a novel clinical diagnostic tool.

METHODS: We obtained and pre-processed three CAD data sets (GSE20680, GSE20681, GSE42148) from the Gene Expression Omnibus (GEO) database; after batch-effect removal, these comprised 199 CAD samples and 162 control samples. Differential expression analysis was performed in R to identify glycosylation-related differentially expressed genes (GRDEGs), which were then subjected to functional enrichment analysis. A diagnostic model was built using logistic regression (LR), support vector machine (SVM) algorithms, and least absolute shrinkage and selection operator (LASSO) regression, and was used to calculate risk scores. The model was validated using receiver operating characteristic (ROC) curves, calibration plots, and decision curve analysis (DCA). Immune cell infiltration in the high-risk group (HRG) and low-risk group (LRG) was then compared using single-sample gene set enrichment analysis (ssGSEA) to explore the effect of the GRDEG-based prognostic signatures on the immune response. Furthermore, interaction networks were constructed to explore potential regulatory mechanisms and therapeutic approaches based on the GRDEGs. Finally, the levels of the GRDEG-based prognostic signatures in CAD samples were measured using qRT-PCR and western blotting.
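
As an illustration of how a risk score from such a model can be computed and evaluated, the following is a minimal sketch and not the authors' implementation. It assumes the score is the linear predictor of a logistic model over the seven signature genes, that samples are split into HRG and LRG at the median score, and that discrimination is summarized by the ROC area under the curve. The coefficients, the intercept, and the median cutoff are hypothetical placeholders; the abstract reports none of them.

```python
import math
import statistics

# HYPOTHETICAL coefficients for illustration only: the abstract names the
# seven signature genes but does not report their weights or an intercept.
COEFFICIENTS = {
    "F5": 0.8, "MGAT4A": -0.5, "HSPG2": 0.6, "ARSB": 0.4,
    "TUBB3": 0.3, "ST6GAL1": -0.7, "CAMLG": -0.2,
}
INTERCEPT = 0.0


def risk_score(expression):
    """Linear predictor: intercept plus the coefficient-weighted expression sum."""
    return INTERCEPT + sum(c * expression[g] for g, c in COEFFICIENTS.items())


def probability(score):
    """Logistic transform of the risk score into a predicted probability."""
    return 1.0 / (1.0 + math.exp(-score))


def split_by_median(scores):
    """Label each sample HRG if its score exceeds the median, else LRG.

    The median cutoff is an assumption; the abstract does not state one.
    """
    cutoff = statistics.median(scores)
    return ["HRG" if s > cutoff else "LRG" for s in scores]


def roc_auc(scores, labels):
    """ROC AUC as the probability that a random case (label 1) scores higher
    than a random control (label 0); ties count as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve and avoids any dependence on a plotting or machine learning library; for data sets of a few hundred samples its quadratic cost is negligible.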

RESULTS: Thirty-nine GRDEGs were significantly differentially expressed between the CAD and control groups; these genes predominantly participate in protein glycosylation and N-glycan biosynthesis pathways. Among them, seven GRDEGs were identified as prognostic risk signatures: F5, MGAT4A, HSPG2, ARSB, TUBB3, ST6GAL1, and CAMLG. Based on these genes, the CAD samples were separated into HRG and LRG, and immune cell infiltration differed significantly between the two groups. The constructed interaction networks further suggested that the seven GRDEGs are regulated by several microRNAs (miRNAs), transcription factors (TFs), and RNA-binding proteins (RBPs), and that they are targets of several chemical drugs. Finally, qRT-PCR and western blotting showed that expression of these seven key genes differed significantly in CAD.

CONCLUSIONS: This study establishes the critical role of GRDEGs in CAD pathogenesis and presents a robust diagnostic model with potential clinical utility. Furthermore, our findings reveal novel molecular targets that may inform future therapeutic strategies for CAD.

PMID:41652481 | DOI:10.1186/s40001-026-04000-z